probablepeople
not recognizing full middle names
ORIGINAL STRING: Randie John Mcdonald
PARSED TOKENS: [('Randie', 'Surname'), ('John', 'GivenName'), ('Mcdonald', 'Surname')]
UNCERTAIN LABEL: Surname
Perhaps this is intentional, since there may be some name format in which a full middle name is ambiguous...
I'm having issues with some full middle names not being correctly parsed either.
>>> probablepeople.parse('Dominic James LoBue')
[('Dominic', 'GivenName'), ('James', 'Surname'), ('LoBue', 'Surname')]
James should be MiddleName.
What's crazy is that if I change the given name to something else, like John, it parses correctly:
>>> probablepeople.parse('John James LoBue')
[('John', 'GivenName'), ('James', 'MiddleName'), ('LoBue', 'Surname')]
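In case it helps anyone triaging these, here is a minimal sketch for flagging suspicious results from the token lists that `parse()` returns. It assumes (I have not confirmed this against the source) that the "UNCERTAIN LABEL" report in the first post is triggered when a label comes back after a different label has intervened, as with `Surname` in the Randie example. The helper name `noncontiguous_repeats` is made up for illustration and is not part of probablepeople. The token lists are copied from this thread, so the snippet runs without the library installed.

```python
def noncontiguous_repeats(tokens):
    """Return labels that reappear after a different label has intervened.

    `tokens` is a list of (token, label) pairs, in the shape shown for
    probablepeople.parse() output in this thread.
    """
    seen = set()      # every label encountered so far
    repeated = []     # labels that came back after a gap, in order found
    prev = None       # label of the previous token
    for _, label in tokens:
        # A label equal to the previous one just continues the same run.
        if label != prev and label in seen and label not in repeated:
            repeated.append(label)
        seen.add(label)
        prev = label
    return repeated


# Token lists copied from the posts above.
randie = [('Randie', 'Surname'), ('John', 'GivenName'), ('Mcdonald', 'Surname')]
dominic = [('Dominic', 'GivenName'), ('James', 'Surname'), ('LoBue', 'Surname')]

print(noncontiguous_repeats(randie))   # ['Surname']
print(noncontiguous_repeats(dominic))  # []
```

Note that the Dominic case is not flagged: `James` and `LoBue` are adjacent `Surname` tokens, so a check like this cannot tell a mislabeled middle name from a genuine two-word surname. It only catches the split-label pattern from the first post.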