
Named entity recognition

Open robdefeo opened this issue 12 years ago • 14 comments

Do you have any plans for named entity recognition? I have seen that it would require a sequential classifier. It would be useful to be able to train it with your own data set (a JSON document) of POS tags and other key attributes.

robdefeo avatar Nov 26 '13 03:11 robdefeo

This is very interesting! Is there a plan?

sabatinim avatar Feb 25 '14 13:02 sabatinim

We don't currently have a plan, but if anyone wants to tackle this, that would be great!

kkoch986 avatar Feb 25 '14 14:02 kkoch986

Anyone have more thoughts on what algorithm might be best to implement for this?

mbc1990 avatar Dec 15 '14 22:12 mbc1990

@mbc1990 I think a CRF is the best model for NER. The pipeline is tokenize -> POS tag -> NER. The challenge is that you need to find NER training data, which is hard work.
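
To illustrate the shape of that pipeline, here is a minimal, self-contained sketch. It does not use natural's API, and the NER step is only a trivial rule that groups consecutive proper nouns, standing in for a trained CRF. The names TOY_LEXICON, tokenize, posTag and recognize are hypothetical.

```javascript
// Toy pipeline: tokenize -> POS tag -> NER. All names here are hypothetical.

// Tiny hand-made lexicon; a real tagger would be trained on a corpus.
const TOY_LEXICON = new Map([
  ["visited", "VBD"],
  ["in", "IN"],
  ["the", "DT"],
  ["last", "JJ"],
  ["year", "NN"],
]);

// Step 1: split the text into word tokens and single punctuation marks.
function tokenize(text) {
  return text.match(/\w+|[^\w\s]/g) || [];
}

// Step 2: look up each token; unknown capitalized words are guessed to be
// proper nouns (NNP), everything else unknown defaults to NN.
function posTag(tokens) {
  return tokens.map((token) => {
    const known = TOY_LEXICON.get(token.toLowerCase());
    if (known) return { token, tag: known };
    return { token, tag: /^[A-Z]/.test(token) ? "NNP" : "NN" };
  });
}

// Step 3: stand-in for a CRF: group runs of consecutive NNP tokens into one entity.
function recognize(tagged) {
  const found = [];
  let current = [];
  for (const { token, tag } of tagged) {
    if (tag === "NNP") {
      current.push(token);
    } else if (current.length > 0) {
      found.push(current.join(" "));
      current = [];
    }
  }
  if (current.length > 0) found.push(current.join(" "));
  return found;
}

const entities = recognize(posTag(tokenize("Ada Lovelace visited London last year.")));
console.log(entities);
```

A real sequence model would replace step 3 and use the tokens and POS tags from steps 1 and 2 as features, which is why labelled training data is the hard part.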

liwenzhu avatar Dec 16 '14 00:12 liwenzhu

Any news on this feature?

hbakhtiyor avatar Apr 03 '16 14:04 hbakhtiyor

+1

gagan-bansal avatar Aug 17 '17 11:08 gagan-bansal

A detailed approach to NER extraction is given in the NLTK documentation.

gagan-bansal avatar Aug 17 '17 11:08 gagan-bansal

Hi there everyone, I was just studying this subject and found some really interesting stuff about NER that I want to share:

There are several ways of implementing this feature. CharWNN seems to be the one with the best results, but not by much. The others seem to need a specific training corpus. To me it looks pretty similar to the PoS tagger. I have not yet been able to reproduce the algorithm detailed in those papers, and I haven't found anything in JavaScript, only a few examples in Python. I hope this helps inspire a talented developer here =)

diegodorgam avatar Dec 12 '17 04:12 diegodorgam

I'm working on named entity recognition for natural. I'm working on three ways of recognition:

  • regular expressions: can be used for times, dates, URIs, currency, etc.
  • vocabulary of named entities
  • a (trained) model that classifies named entities

It will be possible to combine these approaches into a hybrid approach. The methods return a list of edges of the form (recognised string, start index, end index, category, score). The score only makes sense for the trained model. I'm thinking of using a maximum entropy model. Is that a viable route? Any ideas on useful feature functions?
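
A minimal sketch of what the first two recognisers and the edge format could look like. This is not code from natural: the names PATTERNS, VOCABULARY, recognizeWithRegex and recognizeWithVocabulary are hypothetical, the end index is assumed to be exclusive, and rule-based matches are assumed to get a fixed score of 1.

```javascript
// Hypothetical edge format: { text, start, end, category, score }.
// Assumptions: `end` is exclusive; rule-based matches get a fixed score of 1.

// Regular-expression recogniser: one global pattern per category.
const PATTERNS = [
  { category: "date", regex: /\b\d{4}-\d{2}-\d{2}\b/g },
  { category: "currency", regex: /[$€£]\d+(?:\.\d{2})?/g },
];

function recognizeWithRegex(text) {
  const found = [];
  for (const { category, regex } of PATTERNS) {
    for (const match of text.matchAll(regex)) {
      found.push({
        text: match[0],
        start: match.index,
        end: match.index + match[0].length,
        category,
        score: 1,
      });
    }
  }
  return found;
}

// Vocabulary recogniser: case-sensitive substring lookup of known entity names.
const VOCABULARY = new Map([
  ["Amsterdam", "location"],
  ["Hugo", "person"],
]);

function recognizeWithVocabulary(text) {
  const found = [];
  for (const [name, category] of VOCABULARY) {
    let from = text.indexOf(name);
    while (from !== -1) {
      found.push({ text: name, start: from, end: from + name.length, category, score: 1 });
      from = text.indexOf(name, from + name.length);
    }
  }
  return found;
}

const sample = "Hugo paid €12.50 in Amsterdam on 2018-04-28.";
const edges = [...recognizeWithRegex(sample), ...recognizeWithVocabulary(sample)]
  .sort((a, b) => a.start - b.start);
console.log(edges);
```

Keeping the same edge shape for every recogniser means a trained model could later emit edges with real scores without changing the calling code.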

Hugo

Hugo-ter-Doest avatar Apr 28 '18 08:04 Hugo-ter-Doest

How is this going, @Hugo-ter-Doest? Have you made any progress on this?

diegodorgam avatar Jun 12 '18 04:06 diegodorgam

Yes, I did some work on this: https://github.com/Hugo-ter-Doest/natural/tree/NER

I am trying to make a hybrid approach: first find the entities that are easy to define and match with regular expressions and lexicons, then apply a statistical model for more advanced detection.
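
One way the two stages might be combined, as a sketch only and not the code in the branch linked above: rule-based edges are accepted first, and model edges are added only where they do not overlap an accepted edge. The names overlaps and combineEdges, the minScore threshold, and the edge shape { text, start, end, category, score } with an exclusive end are all assumptions.

```javascript
// Hypothetical hybrid step: rule-based edges win, model edges fill the gaps.
// Assumption: edges are { text, start, end, category, score } with exclusive `end`.

function overlaps(a, b) {
  return a.start < b.end && b.start < a.end;
}

function combineEdges(ruleEdges, modelEdges, minScore = 0.5) {
  const accepted = [...ruleEdges];
  // Consider the most confident model predictions first.
  const candidates = modelEdges
    .filter((edge) => edge.score >= minScore)
    .sort((a, b) => b.score - a.score);
  for (const candidate of candidates) {
    if (!accepted.some((edge) => overlaps(edge, candidate))) {
      accepted.push(candidate);
    }
  }
  return accepted.sort((a, b) => a.start - b.start);
}

// Made-up inputs for illustration; the scores are not output from any real model.
const ruleBased = [
  { text: "2018-06-12", start: 22, end: 32, category: "date", score: 1 },
];
const fromModel = [
  { text: "Hugo", start: 0, end: 4, category: "person", score: 0.9 },
  { text: "2018", start: 22, end: 26, category: "number", score: 0.8 }, // overlaps the date
  { text: "branch", start: 12, end: 18, category: "organization", score: 0.2 }, // below threshold
];

const combined = combineEdges(ruleBased, fromModel);
console.log(combined.map((edge) => edge.category));
```

Letting the rule-based edges take precedence is a design choice; an alternative would be to resolve every overlap purely by score.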

Hugo

Hugo-ter-Doest avatar Jun 12 '18 09:06 Hugo-ter-Doest

Hi! Perhaps I can help with that. I built an NER, but only for "enumerated" entities, using similarity search, and my next step was to add regular expression entities (I see that you already have them! Great job!).

jseijas avatar Jul 31 '18 19:07 jseijas

Are there any plans to incorporate this?

GeorgeNance avatar Jun 01 '20 20:06 GeorgeNance

Any update on this?

dorgan avatar Sep 01 '20 01:09 dorgan