StrepHit
An intelligent reading agent that understands text and translates it into Wikidata statements.
As pointed out in https://lists.wikimedia.org/pipermail/wikidata/2016-November/010028.html
Hello, StrepHit seems very interesting. I've installed it on Windows. Perl and TreeTagger are working and in the PATH. When I execute the following command line: python -m strephit extraction...
The [script](../blob/master/strephit/annotation/post_job.py) is currently implemented for English.
Encountered typical 8-bit encodings interpreted as UTF-8: `elected at the AcadÃ©mie FranÃ§aise in 1816`. Running `iconv -f utf8 -t latin1` fixes that: `elected at the Académie Française in 1816`.
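The double encoding described above can be reproduced and reversed in a few lines of Python. This is only a sketch of the round trip that `iconv -f utf8 -t latin1` performs, assuming Python 3 and that the 8-bit encoding involved is Latin-1. The variable names are illustrative and not part of StrepHit.

```python
# Text as it should appear.
original = "elected at the Académie Française in 1816"

# Simulate the bug: UTF-8 bytes are wrongly read as Latin-1 characters.
garbled = original.encode("utf-8").decode("latin-1")

# Reverse it: turn the characters back into the original bytes (Latin-1),
# then decode those bytes as the UTF-8 they really are.
repaired = garbled.encode("latin-1").decode("utf-8")

print(garbled)
print(repaired)
```

The repair step only works when every garbled character fits in Latin-1. Text that was damaged under a different 8-bit encoding (for example Windows-1252) would need that codec instead.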
Missing features in the current [MediaWiki page](https://www.mediawiki.org/wiki/StrepHit), compared to what I consider a gold standard, i.e., http://docs.python-requests.org/en/latest/api/, ordered by priority:
1. **usage** commands (only needed for prominent functions of...
Decide which entity linking service should be supported. Current list in alphabetical order:
- [Alchemy](http://www.alchemyapi.com/)
- [Babelfy](http://babelfy.org/)
- [Cogito Intelligence](http://www.intelligenceapi.com/)
- [Dandelion](https://dandelion.eu/docs/api/datatxt/nex/v1/)
- [DBpedia Spotlight](http://spotlight.dbpedia.org/)
- [Open Calais](http://www.opencalais.com/)
- [TagMe](http://tagme.di.unipi.it/tagme_help.html)...
```
python -m strephit commons pos_tag -t nltk samples/corpus.jsonlines bio en
```
breaks with an error
```
TypeError: tag_many() takes at most 3 arguments (5 given)
```
since the signature...
Got a segmentation fault when running the command `python -m strephit extraction process_semistructured`. Full stack trace attached. [stacktrace.txt](https://github.com/Wikidata/StrepHit/files/314179/stacktrace.txt)
It should fit the biography domain. The gazetteer should be automatically built via queries to a knowledge base and should contain instances of:
1. Artist
2. Writer
3. Scientist
4. ...
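One way such a query could be assembled is sketched below. It assumes the knowledge base is Wikidata, that "instances of Artist" is modelled through the occupation property (`wdt:P106`), and that the caller supplies the item IDs for the occupations. The function name `build_gazetteer_query` is hypothetical and not part of StrepHit, and the code only builds the query text without sending it anywhere.

```python
def build_gazetteer_query(occupation_ids, limit=1000):
    """Return a SPARQL query listing people with one of the given occupations.

    occupation_ids: Wikidata item IDs such as "Q36180" (assumed to be
    looked up and checked by the caller).
    """
    values = " ".join("wd:%s" % qid for qid in occupation_ids)
    return (
        "SELECT ?person ?personLabel WHERE {\n"
        "  VALUES ?occupation { %s }\n"
        "  ?person wdt:P106 ?occupation .\n"
        "  SERVICE wikibase:label { bd:serviceParam wikibase:language \"en\" . }\n"
        "}\n"
        "LIMIT %d" % (values, limit)
    )


# The IDs below are meant to stand for artist and writer; verify them
# against Wikidata before relying on them.
query = build_gazetteer_query(["Q483501", "Q36180"], limit=10)
print(query)
```

The labels returned by such a query would then form the gazetteer entries. Subclasses of each occupation are not followed here, so a real gazetteer would likely need a broader pattern.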
Instead of HTML visualizations of RDF resources, where it is hard to find the relevant piece of information.
1. `ULAN`: replace http://vocab.getty.edu/ulan/500015963 with http://vocab.getty.edu/page/ulan/500015963
2. `British Museum`: http://collection.britishmuseum.org/id/person-institution/159306 should be...
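The `ULAN` replacement in item 1 amounts to a prefix swap, which could be sketched as follows. The function name `rewrite_ulan_url` and the constants are hypothetical, and only the ULAN case is covered because the British Museum target is not given above.

```python
ULAN_ID_PREFIX = "http://vocab.getty.edu/ulan/"
ULAN_PAGE_PREFIX = "http://vocab.getty.edu/page/ulan/"


def rewrite_ulan_url(url):
    """Swap the ULAN identifier prefix for the page prefix.

    URLs that do not start with the identifier prefix are returned unchanged.
    """
    if url.startswith(ULAN_ID_PREFIX):
        return ULAN_PAGE_PREFIX + url[len(ULAN_ID_PREFIX):]
    return url


print(rewrite_ulan_url("http://vocab.getty.edu/ulan/500015963"))
```

Because already rewritten URLs no longer match the identifier prefix, applying the function twice gives the same result as applying it once.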