The final week in GSoC invloved major bug fixes and improvements to the web application, the source code documentation. along with development of the web application’s manual.
Manual Creation
The manual for DBpedia Extract was created and hosted on the projects github pages. The manual includes basic information regarding how to use the web application
- Setting up the configuration, this inclues the following sub-steps
- Selecting the required methods for the configurations
- Adding additional input parameters
- Language
- Running the application on the text
- Understanding the results
and descriptions of various use cases which can be done through the application. The use cases include the following:-
- Using Default Hearst Patterns
- Adding additional Hearst Patterns
- Using Dependencies
- Using Lexicalizations
The manual for the web-application can be found here - link
Source Code Documentation
functions and different modules used in the web application were documented in the code itself, so that further work can be done smoothly.
class TripleExtraction(object):
"""
Stanford treegex utility for extracting triplets. Utilises the stanford coreNLP API for execution. Make sure that it
is running while running.
"""
The rest of the modules of the GSoC code-base were updated to its latest version. This mainly included the SyntaxNet-Triples
module which contains source code for the methods used in the process. The datasets and results were also updated.