python-nameparser
Strange parsing of name with lastname prefix and title before and after
The combination of a lastname prefix and a title repeated before and after the name seems to break the parser's logic around middle name handling.
Here's a test that fails:
from nameparser import HumanName

# Inside a unittest.TestCase method:
hn = HumanName("dr Vincent van Gogh dr")
self.assertEqual("Vincent", hn.first)
self.assertEqual("van", hn.middle)
self.assertEqual("Gogh", hn.last)
For some reason, the middle name comes out as "dr Vincent van" instead of the expected "van".
Current master does this:
% python tests.py "dr Vincent van Gogh dr"
<HumanName : [
title: 'dr'
first: 'Vincent'
middle: ' dr Vincent van'
last: 'Gogh'
suffix: 'dr'
nickname: ''
]>
Actually, this is what I would expect, because "van" is a prefix and should attach itself to the following piece:
% python tests.py "dr Vincent van Gogh dr"
<HumanName : [
title: 'dr'
first: 'Vincent'
middle: ''
last: 'van Gogh'
suffix: 'dr'
nickname: ''
]>
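To make the "prefix attaches to the following piece" idea concrete, here is a minimal standalone sketch. It is not nameparser's actual implementation: the `PREFIXES` set and the `join_prefixes` helper are hypothetical names made up for illustration, and the real parser may handle this quite differently.

```python
# Hypothetical prefix set for illustration only; not nameparser's configuration.
PREFIXES = {"van", "von", "de"}


def join_prefixes(pieces, prefixes=PREFIXES):
    """Attach each run of prefixes to the non-prefix piece that follows it."""
    joined = []
    pending = []  # prefixes still waiting for a following piece
    for piece in pieces:
        if piece.lower() in prefixes:
            pending.append(piece)
        else:
            joined.append(" ".join(pending + [piece]))
            pending = []
    # Trailing prefixes have nothing to attach to, so keep them as separate pieces.
    joined.extend(pending)
    return joined


print(join_prefixes(["Vincent", "van", "Gogh"]))
```

Under this rule, "van Gogh" becomes a single piece, so nothing is left over for the middle name. Note that the same rule would also merge a leading "Van" into the next piece, which is a problem whenever the word is actually a first name.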
I think this is a valid bug but I'm not sure what the issue is.
Actually, "van" is not a prefix, because it is sometimes a first name. So you're right, this should be the expected output:
% python tests.py "dr Vincent van Gogh dr"
<HumanName : [
title: 'dr'
first: 'Vincent'
middle: 'van'
last: 'Gogh'
suffix: 'dr'
nickname: ''
]>