artitw

Results: 90 comments of artitw

Can we store the model somewhere like Google Drive and only download it when the Identifier is used? This approach would follow the existing convention to keep the core...
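The lazy-download idea above can be sketched as a small cache check: fetch the model file the first time the Identifier is used, then reuse the local copy. The URL, cache path, and helper name below are hypothetical placeholders, not the project's actual values.

```python
import os
import urllib.request

# Hypothetical model location and cache path (not the project's real values).
MODEL_URL = "https://drive.google.com/uc?id=PLACEHOLDER"
MODEL_PATH = os.path.expanduser("~/.cache/text2text/identifier_model.bin")

def ensure_model(url=MODEL_URL, path=MODEL_PATH, download=urllib.request.urlretrieve):
    """Download the model only on first use; reuse the cached file afterwards."""
    if not os.path.exists(path):
        os.makedirs(os.path.dirname(path), exist_ok=True)
        download(url, path)  # network fetch happens only once
    return path
```

This keeps the core package lightweight: importing the library costs nothing, and the download cost is paid only when the Identifier is actually invoked.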

@Mofetoluwa thanks for the updates and the pull request. I added some comments there. Regarding the third point you raised, when I tested the model, it returned "hy"...

I think training with shorter texts and approach 2 would address the issue. Another approach is to use 2D embeddings. Currently we are using 1D embeddings, which are calculated by...
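The 1D-vs-2D distinction above can be illustrated with a toy example: a 1D embedding collapses per-token vectors into a single vector (e.g. by mean pooling, where long texts wash out detail), while a 2D embedding keeps the full token-by-dimension matrix. The numbers below are illustrative; this is not text2text's actual implementation.

```python
import numpy as np

# Toy token embeddings: 4 tokens, 3 dimensions each (values are made up).
token_vectors = np.array([
    [0.1, 0.2, 0.3],
    [0.4, 0.5, 0.6],
    [0.7, 0.8, 0.9],
    [1.0, 1.1, 1.2],
])

def embed_1d(tokens):
    """Collapse per-token vectors into one vector via mean pooling.
    On long texts this averages away detail, which can hurt downstream tasks."""
    return tokens.mean(axis=0)

def embed_2d(tokens):
    """Keep the full (num_tokens, dim) matrix so per-token information survives."""
    return tokens

print(embed_1d(token_vectors).shape)  # (3,)
print(embed_2d(token_vectors).shape)  # (4, 3)
```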

@Mofetoluwa yes, we can do `vectorize(output_dimension=2)` as specified in the [latest version](https://github.com/artitw/text2text/blob/77b548d5f9855088db149a94bbbfa310b7c0e3e1/text2text/vectorizer.py#L7). Also note that the default 1D output should be improved now compared to the version you used most...

Yes, a comparison of both would be useful. Thanks so much for checking the shorter texts. It will help to confirm the fix for the way 1D embeddings are calculated.

Hi Mofe, 1. Are we sampling the data so that each class is balanced when training? 2. Could we update the README so that users have some documentation to...
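Point 1 above (class-balanced sampling for training) can be sketched as follows; the helper name and signature are hypothetical, not from the project's code.

```python
import random
from collections import defaultdict

def balanced_sample(examples, per_class, seed=0):
    """Draw an equal number of examples per class.

    `examples` is a list of (text, label) pairs. Classes with fewer than
    `per_class` items are oversampled with replacement so every class
    contributes the same count to the training set.
    """
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for text, label in examples:
        by_label[label].append((text, label))
    sample = []
    for label, items in by_label.items():
        if len(items) >= per_class:
            sample.extend(rng.sample(items, per_class))  # without replacement
        else:
            sample.extend(rng.choices(items, k=per_class))  # oversample
    rng.shuffle(sample)
    return sample
```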

Could we also add the `Identifier` in the README's [class diagram](https://github.com/artitw/text2text#class-diagram)?

@Mofetoluwa, what do you think about using the TFIDF embeddings to perform the language prediction? I think that might be better than the neural embeddings currently used, as it won't...
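The TF-IDF idea above can be sketched with character-trigram profiles: build a TF-IDF vector per language and classify new text by cosine similarity. Everything here (the tiny training corpus, function names) is a hypothetical toy, not text2text's implementation.

```python
import math
from collections import Counter

def trigrams(text):
    """Character trigrams with padding so word boundaries are captured."""
    text = f"  {text.lower()}  "
    return [text[i:i + 3] for i in range(len(text) - 2)]

def tfidf_profiles(corpus):
    """Build one TF-IDF vector per language from char-trigram counts."""
    tf = {lang: Counter(trigrams(t)) for lang, t in corpus.items()}
    df = Counter(g for counts in tf.values() for g in counts)  # doc frequency
    n = len(corpus)
    return {
        lang: {g: cnt * math.log((1 + n) / (1 + df[g])) for g, cnt in counts.items()}
        for lang, counts in tf.items()
    }

def cosine(a, b):
    dot = sum(a[g] * b.get(g, 0.0) for g in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def predict_language(text, profiles):
    """Score the query's trigram counts against each language profile."""
    query = Counter(trigrams(text))
    return max(profiles, key=lambda lang: cosine(query, profiles[lang]))

# Toy training snippets (illustrative only).
corpus = {
    "en": "the quick brown fox jumps over the lazy dog and then the cat",
    "fr": "le renard brun rapide saute par dessus le chien paresseux",
    "es": "el rapido zorro marron salta sobre el perro perezoso",
}
profiles = tfidf_profiles(corpus)
print(predict_language("the dog and the cat", profiles))
```

Because trigram counts are deterministic, this kind of predictor gives the same answer every run, which addresses the non-determinism concern with neural embeddings.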

No, it should work on just CPU. Give it a try and let us know if you have any issues.

This is possible. You would want to control the answer using the [SEP] token. We could also consider implementing the functionality directly into the code base if that’s what you...
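The `[SEP]` technique above amounts to appending the desired answer after a separator token so generation is conditioned on it. The helper below is a hypothetical sketch of that concatenation; consult the library's documentation for its exact input convention.

```python
def with_answer(context, answer, sep="[SEP]"):
    """Join a context and a desired answer with a separator token.
    Hypothetical helper illustrating the [SEP] conditioning pattern."""
    return f"{context} {sep} {answer}"

# Conditioning question generation on the answer "Madrid":
print(with_answer("Madrid is the capital of Spain.", "Madrid"))
```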