Gokul NC
**Is your feature request related to a problem? Please describe.**
Annotation for scene text detection and scene text recognition.

**Describe the solution you'd like**
Bounding boxes for marking text regions,...
The npm package seems to be almost 6 months old now: https://www.npmjs.com/package/@heartexlabs/label-studio

Is it possible to bump the version and publish the latest code from this repo to the NPM...
Hi, I have created a web extension to romanize Arabic text with diacritics, using the [ALA-LC scheme](https://www.loc.gov/catdir/cpso/romanization/arabic.pdf):

- Extension: https://chrome.google.com/webstore/detail/arabic-romanizer/lnpmajkeelelndcfjaaajpppefpjndoe
- Source code: https://github.com/GokulNC/Arabic-Romanizer-Web-Extension
- Python library: https://github.com/GokulNC/Arabic-Romanizer

Please add this if you feel...
Sometimes, a single unicode character `ﷲ` is used to denote `اللہ`. Please normalize this as part of the Arabic->Urdu conversion.
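The requested normalization could look something like the minimal sketch below. It is not taken from any existing library: the function name `expand_allah_ligature` is hypothetical, and the target spelling is assumed to be alef, lam, lam, heh goal (U+06C1), which is the usual Urdu form.

```python
# Sketch only: names here are illustrative, not an existing library API.
ALLAH_LIGATURE = "\uFDF2"  # ARABIC LIGATURE ALLAH ISOLATED FORM

# Assumed Urdu spelling: alef (U+0627), lam (U+0644), lam (U+0644),
# heh goal (U+06C1).
URDU_ALLAH = "\u0627\u0644\u0644\u06C1"


def expand_allah_ligature(text: str) -> str:
    """Replace every U+FDF2 ligature with the assumed Urdu letter sequence."""
    return text.replace(ALLAH_LIGATURE, URDU_ALLAH)
```

An explicit replacement is used here instead of Unicode NFKC normalization because the compatibility decomposition of U+FDF2 ends in the Arabic heh (U+0647), so the result would still depend on a later heh mapping step in the conversion.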
## Feature description

Would it be possible to extend the Standard Urdu script to add support for ShahMukhi (Punjabi) and Sindhi's additional characters as well? That is, it would be...
- Paper: http://www.apsipa.org/proceedings/2021/pdfs/0000511.pdf
- Duration: 519 hours
- Dataset link: Not yet made public?
Massively multilingual abusive comment identification across Indian languages in code-mixed text: Hindi, Telugu, Marathi, Tamil, Malayalam, Bengali, Kannada, Odia, Gujarati, Haryanvi, Bhojpuri, Rajasthani, Assamese

https://www.kaggle.com/c/iiitd-abuse-detection-challenge/data
- Paper: https://arxiv.org/abs/2101.00204
- Data: https://github.com/csebuetnlp/banglabert#datasets (possibly not yet released)
- Paper: https://arxiv.org/abs/2201.03180
- Datasets: Not yet public. (Probably available on request)
- Dataset: https://catalist-2021.github.io/
- Langs: Hindi, Marathi (incl. English)