sweater
👚 Speedy Word Embedding Association Test & Extras using R
In general, cosine is not a good distance measure for all-zero vectors, but we can't change that. https://github.com/chainsawriot/sweater/blob/6aebf710d813033c6d07f0268f12bd3e6badaee5/src/weat.cpp#L14 This will cause a "divide by zero" problem because `deno_*` is zero, the...
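One possible guard against the zero denominator, as a minimal sketch only: `safe_cosine` is a hypothetical standalone helper, not the function in `src/weat.cpp`, and returning NaN for an undefined cosine is an assumption about the desired behavior (an R-facing version might return `NA` instead).

```cpp
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical helper for illustration; not the code in src/weat.cpp.
// Cosine similarity is undefined when either vector has zero norm,
// so return NaN in that case instead of dividing by zero.
double safe_cosine(const std::vector<double>& x, const std::vector<double>& y) {
    // Assumption: vectors of different lengths are also treated as undefined.
    if (x.size() != y.size()) {
        return std::numeric_limits<double>::quiet_NaN();
    }
    double dot = 0.0, norm_x = 0.0, norm_y = 0.0;
    for (std::size_t i = 0; i < x.size(); ++i) {
        dot += x[i] * y[i];
        norm_x += x[i] * x[i];
        norm_y += y[i] * y[i];
    }
    // Denominator is zero exactly when at least one vector is all zeros.
    const double deno = std::sqrt(norm_x) * std::sqrt(norm_y);
    if (deno == 0.0) {
        return std::numeric_limits<double>::quiet_NaN();
    }
    return dot / deno;
}
```

Callers would then need to decide how to treat the NaN, for example by dropping the affected word or reporting it to the user.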
To show how fast or slow this basic operation is (compared with Python or other implementations).
For `3CosAdd`, `3CosMul`, `LRCor`, etc., see [this paper](https://drops.dagstuhl.de/opus/volltexte/2020/13022/pdf/OASIcs-SLATE-2020-9.pdf). Probably include the data as well. See also the [ACLwiki](https://aclweb.org/aclwiki/Analogy_(State_of_the_art))
https://webis.de/downloads/publications/papers/spliethoever_2021.pdf
It is like a variant of SemAxis. https://arxiv.org/pdf/1901.07656.pdf Embedding Quality Test is also possible. And the concept is fun. But it is quite difficult to reproduce because the procedure involves...
This paper claims that one can hack WEAT by cherry-picking the words in `A` and `B`, and that RIPA can protect against such hacking. The RIPA method does not appear to be...