udify
Prediction of multi-word expression
Is it possible to predict multi-word expressions (MWEs, which UD calls multiword tokens) from raw text?
I ran predict.py with the --raw_text option and found that MWEs are not predicted.
For example, in Italian, "della" is a contraction of "di la", and UD annotates such a token as follows:
31-32 della _ _ _ _ _ _ _ _
31 di di ADP E _ 35 case 35:case _
32 la il DET RD Definite=Def|Gender=Fem|Number=Sing|PronType=Art 35 det 35:det _
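To make the expected format concrete, here is a minimal sketch of how the range line (`31-32`) can be detected in CoNLL-U. It is not part of UDify; the function name `find_multiword_tokens` and the `sample` list are my own, for illustration only, and the sample assumes tab-separated columns as in standard CoNLL-U.

```python
import re

# In CoNLL-U, an ID of the form "start-end" (e.g. "31-32") marks a multiword token.
RANGE_ID = re.compile(r"^(\d+)-(\d+)$")

def find_multiword_tokens(lines):
    """Return (start, end, surface_form) for each multiword token line."""
    spans = []
    for line in lines:
        line = line.strip()
        # Skip blank lines and sentence-level comments.
        if not line or line.startswith("#"):
            continue
        # Prefer tabs (standard CoNLL-U); fall back to whitespace for pasted text.
        cols = line.split("\t") if "\t" in line else line.split()
        match = RANGE_ID.match(cols[0])
        if match:
            spans.append((int(match.group(1)), int(match.group(2)), cols[1]))
    return spans

# Hypothetical sample built from the three lines quoted above.
sample = [
    "31-32\tdella\t_\t_\t_\t_\t_\t_\t_\t_",
    "31\tdi\tdi\tADP\tE\t_\t35\tcase\t35:case\t_",
    "32\tla\til\tDET\tRD\tDefinite=Def|Gender=Fem|Number=Sing|PronType=Art\t35\tdet\t35:det\t_",
]

print(find_multiword_tokens(sample))  # [(31, 32, 'della')]
```

Running the same check on the UDify prediction returns an empty list, because "della" is emitted as a single token with a plain integer ID.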
However, the output of UDify is something like this:
31 della della ADP _ _ 3 case _ _
I would like to obtain CoNLL-U output with proper MWE annotation. Is there any way to achieve this?