Feature vectors for allophones that aren't phonemes
Some segments appear in the PHOIBLE data as allophones, but not as phonemes in any language.
Examples:
- tʃː is an allophone for t̠ʃ in kuna1268
- tʂ is an allophone for t̠ʃ in yuch1247
- tʂʼ is an allophone for t̠ʃʼ in yuch1247
phoible.csv doesn't seem to have feature vectors for these allophones.
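For anyone who wants to reproduce the observation, here is a minimal sketch of how such segments could be listed. It runs on three toy rows built from the examples above, not on the real file. The column names `Phoneme` and `Allophones`, the space-separated allophone list, and the function name are assumptions made for illustration.

```python
# Toy rows built from the three examples in this issue.
# Column names and the space-separated layout are assumptions.
rows = [
    {"Glottocode": "kuna1268", "Phoneme": "t\u0320ʃ", "Allophones": "t\u0320ʃ tʃː"},
    {"Glottocode": "yuch1247", "Phoneme": "t\u0320ʃ", "Allophones": "t\u0320ʃ tʂ"},
    {"Glottocode": "yuch1247", "Phoneme": "t\u0320ʃʼ", "Allophones": "t\u0320ʃʼ tʂʼ"},
]


def allophones_without_phoneme_row(rows):
    """Return allophones that never occur as a phoneme in any row."""
    phonemes = {row["Phoneme"] for row in rows}
    missing = set()
    for row in rows:
        for allophone in row["Allophones"].split():
            if allophone not in phonemes:
                missing.add(allophone)
    return missing


missing = allophones_without_phoneme_row(rows)
print(sorted(missing))
```

With the real data, the same function could be fed rows from `csv.DictReader`, provided the assumed column names match the file.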
That's correct. We have a student working on this right now. But we're not sure yet how to provide them; they can't be part of phoible.csv because it has one row per phoneme (not one per allophone). Can you tell us about your use case / what would be the best format from your perspective?
I was only looking to compare the features of tʂ with ʈʂ. Phoible uses both symbols (possibly to represent different sounds), but Wikipedia says they represent the same sound.
tʂ looks like a mistake to me; we try to enforce that affricates have place-matching between the stop part and the fricative part. Such mistakes are more likely in the allophones because they aren't run through the same validation code as the phonemes. As I said, though, we have a student working on this right now, so hopefully many of these allophone errors will be corrected soon.
cc @Alessioryan
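To make the place-matching rule concrete, here is a rough sketch of what such a check could look like. This is not PHOIBLE's validation code. The place tables, the handling of the retraction diacritic (U+0320), and the function name `places_match` are all assumptions made for illustration, and the tables cover only a handful of segments.

```python
# Hypothetical, deliberately tiny place-of-articulation tables.
STOP_PLACE = {
    "p": "labial", "b": "labial",
    "t": "alveolar", "d": "alveolar",
    "ʈ": "retroflex", "ɖ": "retroflex",
    "k": "velar", "ɡ": "velar",
}
FRICATIVE_PLACE = {
    "s": "alveolar", "z": "alveolar",
    "ʃ": "postalveolar", "ʒ": "postalveolar",
    "ʂ": "retroflex", "ʐ": "retroflex",
}
RETRACTED = "\u0320"  # combining minus sign below, as in t̠


def places_match(affricate):
    """Return True/False for place agreement, or None if unrecognized."""
    if not affricate or affricate[0] not in STOP_PLACE:
        return None
    place = STOP_PLACE[affricate[0]]
    rest = affricate[1:]
    # Assumption: a retracted alveolar stop counts as postalveolar.
    if rest.startswith(RETRACTED):
        if place == "alveolar":
            place = "postalveolar"
        rest = rest[1:]
    if not rest or rest[0] not in FRICATIVE_PLACE:
        return None
    # Trailing modifiers (length mark, ejective mark) are ignored.
    return place == FRICATIVE_PLACE[rest[0]]


for segment in ["tʂ", "ʈʂ", "t\u0320ʃ", "tʂʼ"]:
    print(segment, places_match(segment))
```

Under these assumed tables, tʂ and tʂʼ fail the check (alveolar stop, retroflex fricative), while ʈʂ and t̠ʃ pass.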
@drammock Would you be able to send me the validation code for the phonemes? I'd love to take a look at this issue; I hadn't noticed it before.