StrepHit

An intelligent reading agent that understands text and translates it into Wikidata statements.

Results 17 StrepHit issues

As pointed out in https://lists.wikimedia.org/pipermail/wikidata/2016-November/010028.html

task

Hello, StrepHit seems very interesting. I've installed it on Windows. Perl and TreeTagger are working and in the PATH. When I execute the following command line: python -m strephit extraction...

pull request welcome

[Script](../blob/master/strephit/annotation/post_job.py) currently implemented for English.

task
pull request welcome

Encountered typical 8-bit encodings interpreted as UTF-8: `elected at the AcadÃ©mie FranÃ§aise in 1816`. Running `iconv -f utf8 -t latin1` fixes that: `elected at the Académie Française in 1816`
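
The encoding problem reported above can be reproduced and undone in Python. This is a minimal sketch, not StrepHit code: the helper name `fix_mojibake` is hypothetical, and it assumes the text was UTF-8 that got decoded as Latin-1, which is the case the `iconv -f utf8 -t latin1` round trip repairs.

```python
def fix_mojibake(text):
    """Undo UTF-8 text that was wrongly decoded as Latin-1 (hypothetical helper)."""
    try:
        # Same idea as `iconv -f utf8 -t latin1`: map each character back to
        # a single byte, then read the resulting bytes as UTF-8.
        return text.encode("latin-1").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        # The text is not cleanly double-encoded, so leave it unchanged.
        return text


# Simulate the damage: UTF-8 bytes read with the wrong (Latin-1) decoder.
garbled = "elected at the Académie Française in 1816".encode("utf-8").decode("latin-1")
print(garbled)
print(fix_mojibake(garbled))
```

Falling back to the unchanged input keeps the helper safe to apply to text that is already correct.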

minor
pull request welcome

Missing features in the current [Mediawiki page](https://www.mediawiki.org/wiki/StrepHit), as compared to what I consider a gold standard, i.e., http://docs.python-requests.org/en/latest/api/ ordered by priority:
1. **usage** commands (only needed for prominent functions of...

pull request welcome

Decide which entity linking service should be supported. Current list in alphabetical order:
- [Alchemy](http://www.alchemyapi.com/)
- [Babelfy](http://babelfy.org/)
- [Cogito Intelligence](http://www.intelligenceapi.com/)
- [Dandelion](https://dandelion.eu/docs/api/datatxt/nex/v1/)
- [DBpedia Spotlight](http://spotlight.dbpedia.org/)
- [Open Calais](http://www.opencalais.com/)
- [TagMe](http://tagme.di.unipi.it/tagme_help.html)...

minor
pull request welcome

```
python -m strephit commons pos_tag -t nltk samples/corpus.jsonlines bio en
```

breaks with an error

```
TypeError: tag_many() takes at most 3 arguments (5 given)
```

since the signature...

Got a segmentation fault when running the command `python -m strephit extraction process_semistructured`. Full stack trace attached. [stacktrace.txt](https://github.com/Wikidata/StrepHit/files/314179/stacktrace.txt)

major

It should fit the biography domain. The gazetteer should be automatically built via queries to a knowledge base and should contain instances of:
1. Artist
2. Writer
3. Scientist
4....
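
One way such knowledge base queries might look is sketched below. Everything specific here is an assumption, since the issue does not name a knowledge base: the sketch targets Wikidata's SPARQL endpoint, models the classes through the occupation property (`P106`), and uses the item IDs in `OCCUPATIONS` for the three classes visible in the list. The function `build_gazetteer_query` is hypothetical and only builds the query text; it does not contact any service.

```python
# Assumed mapping from gazetteer class to a Wikidata occupation item.
OCCUPATIONS = {
    "Artist": "Q483501",
    "Writer": "Q36180",
    "Scientist": "Q901",
}


def build_gazetteer_query(occupation_qid, language="en", limit=1000):
    """Return a SPARQL query listing people with the given occupation and their labels."""
    return (
        "SELECT ?person ?personLabel WHERE {\n"
        f"  ?person wdt:P106 wd:{occupation_qid} .\n"
        "  ?person rdfs:label ?personLabel .\n"
        f"  FILTER(LANG(?personLabel) = \"{language}\")\n"
        "}\n"
        f"LIMIT {limit}"
    )


# Print one query per gazetteer class.
for name, qid in OCCUPATIONS.items():
    print(f"# {name}")
    print(build_gazetteer_query(qid, limit=10))
```

The labels returned by each query would then become the gazetteer entries for that class.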

task

Instead of HTML visualizations of RDF resources, where it is hard to find the relevant piece of information.
1. `ULAN`: replace http://vocab.getty.edu/ulan/500015963 with http://vocab.getty.edu/page/ulan/500015963
2. `British Museum`: http://collection.britishmuseum.org/id/person-institution/159306 should be...
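
The `ULAN` replacement above could be automated along these lines. This is only a sketch: the function name `ulan_page_url` and the regular expression are illustrative assumptions, and the British Museum case is left out because its target URL is not given in full.

```python
import re

# Matches ULAN identifier URLs such as http://vocab.getty.edu/ulan/500015963
ULAN_ID_URL = re.compile(r"^http://vocab\.getty\.edu/ulan/(\d+)$")


def ulan_page_url(url):
    """Rewrite a ULAN identifier URL to its /page/ counterpart (hypothetical helper)."""
    match = ULAN_ID_URL.match(url)
    if match is None:
        # Not a ULAN identifier URL: return it untouched.
        return url
    return "http://vocab.getty.edu/page/ulan/" + match.group(1)


print(ulan_page_url("http://vocab.getty.edu/ulan/500015963"))
```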

major