LEMLAT3
Grouping identical analyses into a single entry
Create a filter/function to group identical analyses into a single entry. For example, analyses 18 and 19 of forma (Du Cange) are identical:
============================ANALYSIS 18==================================
SEGMENTATION: form -a
---------------------morphological feats 1 ----------------------------
--bfs--
Case: Ablative
Gender: Feminine
Number: Singular
---------------------morphological feats 2 ----------------------------
--nfs--
Case: Nominative
Gender: Feminine
Number: Singular
---------------------morphological feats 3 ----------------------------
--vfs--
Case: Vocative
Gender: Feminine
Number: Singular
============================LEMMA =================================
forma N1 D68HA f
-----------------------morphological feats-------------------------
NcA
PoS: Noun
Type: Common
Inflexional Category: I decl
-----------------------derivational info---------------------------
IS DERIVED: NO
============================ANALYSIS 19==================================
SEGMENTATION: form -a
---------------------morphological feats 1 ----------------------------
--bfs--
Case: Ablative
Gender: Feminine
Number: Singular
---------------------morphological feats 2 ----------------------------
--nfs--
Case: Nominative
Gender: Feminine
Number: Singular
---------------------morphological feats 3 ----------------------------
--vfs--
Case: Vocative
Gender: Feminine
Number: Singular
============================LEMMA =================================
forma N1 D68HB f
-----------------------morphological feats-------------------------
NcA
PoS: Noun
Type: Common
Inflexional Category: I decl
-----------------------derivational info---------------------------
IS DERIVED: NO
LemLat does have a grouping problem, but only for word forms that have both an 'ordinary' and an exceptional lemmatization.
That is NOT the case in your example, where there are actually two distinct lemmas (note the different ids).
What is arguable, in cases like this, is the choice of using two different entries in the support database.
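For anyone who still wants the requested grouping at display level, here is a minimal post-processing sketch in Python. It is not part of LemLat. It assumes the plain-text output format shown above, and it assumes the lemma id is the third whitespace-separated field of the line that follows the LEMMA header (`D68HA` / `D68HB` in the example). The function names `split_analyses`, `grouping_key` and `group_identical` are made up for illustration.

```python
import re

# Header lines as they appear in the text output quoted in this thread.
ANALYSIS_HEADER = re.compile(r"^=+ANALYSIS\s+(\d+)=+$")
LEMMA_HEADER = re.compile(r"^=+LEMMA\s*=+$")


def split_analyses(text):
    """Return a list of (analysis_number, body_lines) pairs."""
    analyses = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        match = ANALYSIS_HEADER.match(line)
        if match:
            current = (int(match.group(1)), [])
            analyses.append(current)
        elif current is not None:
            # Lines before the first ANALYSIS header are ignored.
            current[1].append(line)
    return analyses


def grouping_key(body_lines, id_field=2):
    """Build a hashable key for an analysis that ignores the lemma id."""
    key = []
    after_lemma_header = False
    for line in body_lines:
        if after_lemma_header:
            fields = line.split()
            # Assumption: the id is the third field of the lemma line.
            if len(fields) > id_field:
                del fields[id_field]
            line = " ".join(fields)
        after_lemma_header = bool(LEMMA_HEADER.match(line))
        key.append(line)
    return tuple(key)


def group_identical(text):
    """Group analysis numbers whose content matches apart from the lemma id."""
    groups = {}
    for number, body in split_analyses(text):
        groups.setdefault(grouping_key(body), []).append(number)
    return list(groups.values())
```

Applied to the two analyses quoted above, `group_identical` should put 18 and 19 in the same group. Note that this merges entries the support database treats as distinct lemmas, so it is only a presentation filter and loses the distinction between the two ids.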