Use Cases

Using Default Hearst Patterns
Adding additional Hearst Patterns
Using Dependencies
Using Lexicalizations

This page walks through several use cases of the web application and the general effect of some configuration changes.

The example text used for testing can be found at https://worksheets.codalab.org/rest/bundles/0xb4ab264671fe4e3bae00e9367a88eaeb/contents/blob/

Using Default Hearst Patterns

The predefined Hearst patterns are a set of Hearst patterns that are already stored in the web application. Enabling them simply means applying extra Hearst patterns during extraction, which should improve the results. Take the following text, for example:

New York—often called New York City or the City of 
New York to distinguish it from the State of 
New York, of which it is a part—is the most populous 
city in the United States and the center of the New York 
metropolitan area, the premier gateway for legal 
immigration to the United States and one of the most 
populous urban agglomerations in the world. A global 
power city, New York exerts a significant impact upon 
commerce, finance, media, art, fashion, research, 
technology, education, and entertainment, its fast pace 
defining the term New York minute. Home to the 
headquarters of the United Nations, New York is an 
important center for international diplomacy and has 
been described as the cultural and financial capital of 
the world.

Running the application on this text with the following two configurations gives these results (RDF):

Without Using Predefined Hearst Patterns

We get no triples

Using Predefined Hearst Patterns

@prefix ns1: <http://predicateProperty.org/> .
@prefix ns2: <http://xmlns.com/foaf/0.1/> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix xml: <http://www.w3.org/XML/1998/namespace> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

<http://dbpedia.org/resource/United_and_uniting_churches> a ns2:Document ; ns1:typeOf <http://dbpedia.org...> .
ns1:attribute a rdf:Property ; ns2:name "attribute" .
ns1:typeOf a rdf:Property ; ns2:name "typeOf" .
<http://dbpedia.org/resource/List_...> a ns2:Document .
[] a ns2:Document ; ns1:attribute [ a ns2:Document ; ns2:name "center" ] ; ns2:name "headquarters" .
[] a ns2:Document ; ns1:attribute [ a ns2:Document ; ns2:name "center" ] ; ns2:name "headquarters" .
[] a ns2:Document ; ns1:attribute [ a ns2:Document ; ns2:name "part" ] ; ns2:name "new_york" .
[] a ns2:Document ; ns1:attribute [ a ns2:Document ; ns2:name "part" ] ; ns2:name "new_york" .

Enabling the predefined Hearst patterns thus gives better results: triples are extracted where previously there were none.

Adding additional Hearst Patterns

Adding Hearst patterns works just like the process above and increases the chances of extracting triples from a given text. This cannot be guaranteed every time, since it depends on the text given, but the chances do increase significantly.
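To illustrate why extra patterns raise the chance of a match, here is a minimal sketch using plain regular expressions. It is not the web application's implementation: the pattern list, the single-word approximation of a noun phrase, and the names DEFAULT_PATTERNS and extract_pairs are all assumptions made for this example.

```python
import re

# Simplification (assumption): a "noun phrase" is a single word here.
# A real system would use a chunker or parser instead.
WORD = r"[A-Za-z]+"

# Hypothetical stand-ins for the predefined patterns.
DEFAULT_PATTERNS = [
    # "HYPERNYM such as HYPONYM"
    re.compile(rf"(?P<hyper>{WORD}) such as (?P<hypo>{WORD})"),
    # "HYPONYM and other HYPERNYM"
    re.compile(rf"(?P<hypo>{WORD}) and other (?P<hyper>{WORD})"),
]


def extract_pairs(text, patterns):
    """Return (hyponym, hypernym) pairs for every pattern match in text."""
    pairs = []
    for pattern in patterns:
        for match in pattern.finditer(text):
            pairs.append((match.group("hypo"), match.group("hyper")))
    return pairs


text = "He visited cities such as Boston. Trout is a kind of fish."

# Only the default patterns: the second sentence yields nothing.
default_pairs = extract_pairs(text, DEFAULT_PATTERNS)

# An additional, user-supplied pattern: "HYPONYM is a kind of HYPERNYM".
extra_pattern = re.compile(rf"(?P<hypo>{WORD}) is a kind of (?P<hyper>{WORD})")
extended_pairs = extract_pairs(text, DEFAULT_PATTERNS + [extra_pattern])
```

In this sketch the second sentence only produces a pair once the extra pattern is added, which mirrors the point above: more patterns help only when the text happens to contain a matching construction.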

Using Dependencies

The dependency-based triple extraction method relies on the dependencies obtained from the text in order to extract triples. Unlike Hearst patterns, this process does not depend explicitly or directly on the sentence structure, so a generalized algorithm can be written for it. Using the above text, we run the app on two configurations:

Predefined Hearst Patterns, Spotlight

Predefined Hearst Patterns, Dependencies, Spotlight

The triples obtained from each configuration:

Predefined Hearst Patterns

Predefined Hearst Patterns + Dependencies

@prefix ns1: <http://xmlns.com/foaf/0.1/> .
@prefix ns2: <http://predicateProperty.org/> .
@prefix ns3: <http://purl.org/linguistics/gold/> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix xml: <http://www.w3.org/XML/1998/namespace> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

<http://dbpedia.org/resource/City> a ns1:Document ; ns2:hypernym_low_confidence <http://dbpedia.org/resource/United_States> ; ns3:hypernym <http://dbpedia.org/resource/New_York> .
<http://dbpedia.org/resource/Milecastle> a ns1:Document ; ns2:hypernym_low_confidence <http://dbpedia.org/resource/Immigration> .
<http://dbpedia.org/resource/New_York_City> a ns1:Document ; ns2:hypernym_low_confidence <http://dbpedia.org/resource/New_York> .
<http://dbpedia.org/resource/U.S._state> a ns1:Document ; ns2:hypernym_low_confidence <http://dbpedia.org/resource/New_York> .
<http://dbpedia.org/resource/United_and_uniting_churches> a ns1:.. ; ns2:typeOf <http://dbpedia.org/resource/List_of_metropolitan_.> .
<http://dbpedia.org/resource/Urban_area> a ns1:Document ; ns2:hypernym_low_confidence [ a ns1:Document ; ns1:name "world" ] .
ns2:attribute a rdf:Property ; ns1:name "attribute" .
ns2:describ_as a rdf:Property ; ns1:name "describ_as" .
ns2:hypernym_low_confidence a rdf:Property ; ns1:name "hypernym_low_confidence" .
ns2:typeOf a rdf:Property ; ns1:name "typeOf" .
ns3:hypernym a ns1:Property .
<http://dbpedia.org/resource/Area> a ns1:Document .
<http://dbpedia.org/resource/Capital_city> a ns1:Document ; ns2:hypernym_low_confidence [ a ns1:Document ; ns1:name "world" ] .
<http://dbpedia.org/resource/Diplomacy> a ns1:Document .
<http://dbpedia.org/resource/Home> a ns1:Document ; ns2:hypernym_low_confidence [ a ns1:Document ; ns1:name "headquarters" ] .
<http://dbpedia.org/resource/Immigration> a ns1:Document ; ns2:hypernym_low_confidence <http://dbpedia.org/resource/United_States> .
<http://dbpedia.org/resource/List_of_metropolitan_areas_in_Pakistan> a .. .
<http://dbpedia.org/resource/United_Nations> a ns1:Document .
<http://dbpedia.org/resource/United_States> a ns1:Document .
<http://dbpedia.org/resource/New_York> a ns1:Document .
[] a ns1:Document ; ns2:describ_as <http://dbpedia.org/resource/Capital_city> ; ns1:name "center" .
[] a ns1:Document ; ns2:hypernym_low_confidence <http://dbpedia.org/resource/Area> ; ns1:name "center" .
[] a ns1:Document ; ns2:attribute [ a ns1:Document ; ns1:name "center" ] ; ns1:name "headquarters" .
[] a ns1:Document ; ns2:hypernym_low_confidence <http://dbpedia.org/resource/United_Nations> ; ns1:name "headquarters" .
[] a ns1:Document ; ns2:attribute [ a ns1:Document ; ns1:name "part" ] ; ns1:name "new_york" .
[] a ns1:Document ; ns2:hypernym_low_confidence <http://dbpedia.org/resource/Diplomacy> ; ns1:name "center" .
[] a ns1:Document ; ns2:attribute [ a ns1:Document ; ns1:name "center" ] ; ns1:name "headquarters" .
[] a ns1:Document ; ns3:hypernym <http://dbpedia.org/resource/Home> ; ns1:name "center" .
[] a ns1:Document ; ns2:attribute [ a ns1:Document ; ns1:name "part" ] ; ns1:name "new_york" .

As the output above shows, the number of extracted triples increases drastically when dependencies are enabled, so this method is highly useful.
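The idea of reading triples off dependencies rather than off surface patterns can be sketched as follows. This is only an illustration under stated assumptions, not the algorithm used by the web application: the parse is written by hand for a shortened version of one sentence from the example text, the labels (nsubj, dobj, compound) follow a common dependency convention, and the names Token, children, phrase and extract_triples are made up for this example.

```python
from collections import namedtuple

# One token of a dependency parse: position, surface text, dependency
# label, and the position of its head token.
Token = namedtuple("Token", "index text dep head")


def children(tokens, head_index, dep):
    """Tokens attached to head_index with the given dependency label."""
    return [
        t for t in tokens
        if t.head == head_index and t.dep == dep and t.index != head_index
    ]


def phrase(tokens, token):
    """Join a token with its compound modifiers, in sentence order."""
    parts = children(tokens, token.index, "compound") + [token]
    return " ".join(t.text for t in sorted(parts, key=lambda t: t.index))


def extract_triples(tokens):
    """Emit (subject, verb, object) for every head with nsubj and dobj children."""
    triples = []
    for verb in tokens:
        subjects = children(tokens, verb.index, "nsubj")
        objects = children(tokens, verb.index, "dobj")
        for subj in subjects:
            for obj in objects:
                triples.append(
                    (phrase(tokens, subj), verb.text, phrase(tokens, obj))
                )
    return triples


# Hand-written parse (assumption) of "New York exerts a significant impact".
sentence = [
    Token(0, "New", "compound", 1),
    Token(1, "York", "nsubj", 2),
    Token(2, "exerts", "ROOT", 2),
    Token(3, "a", "det", 5),
    Token(4, "significant", "amod", 5),
    Token(5, "impact", "dobj", 2),
]

triples = extract_triples(sentence)
```

Because the rule only looks at grammatical relations, the same function applies to any sentence with a subject and a direct object, regardless of word order or intervening modifiers. That is the sense in which a generalized algorithm is possible here.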

Using Lexicalizations

Lexicalizations mainly determine the property used in the generated RDF. Since RDF requires a valid resource for the predicate, unlike the subject and object, lexicalizations play a key role. In the above text, using the dependencies method, suppose we add lexicalizations for a few properties, such as

describ_as -> http://xyz.org/ontology/Description

With this lexicalization, the following RDF

ns2:describ_as a rdf:Property ; ns1:name "describ_as" .

becomes

<http://xyz.org/ontology/Description> a rdf:Property
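The effect of a lexicalization can be pictured as a lookup from an extracted property name to a resource URI. The sketch below is an assumption about how such a mapping might behave, not the application's code: the fallback to the http://predicateProperty.org/ namespace is inferred from the sample output above, and the names LEXICALIZATIONS and predicate_uri are hypothetical.

```python
# Namespace seen on unmapped properties in the sample RDF output.
DEFAULT_NAMESPACE = "http://predicateProperty.org/"

# User-supplied lexicalizations: extracted property name -> resource URI.
LEXICALIZATIONS = {
    "describ_as": "http://xyz.org/ontology/Description",
}


def predicate_uri(name, lexicalizations):
    """Return the mapped URI if one exists, else a default-namespace URI.

    The fallback behaviour is an assumption made for this sketch.
    """
    return lexicalizations.get(name, DEFAULT_NAMESPACE + name)


mapped = predicate_uri("describ_as", LEXICALIZATIONS)
unmapped = predicate_uri("typeOf", LEXICALIZATIONS)
```

Here describ_as resolves to the configured ontology URI, while typeOf, which has no lexicalization, keeps a generated URI in the default namespace.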
