text-processing topic
stringx
Drop-in replacements for base R string functions powered by stringi
bloatectomy
A Python package for removing duplicate text in clinical notes or other documents
chr
🔤 Lightweight R package for manipulating [string] characters
NLP-tools
Useful Python NLP tools (evaluation, GUI, tokenization)
atarashi
Atarashi scans for license statements in open source software, focusing on text statistics. Designed to work stand-alone and with FOSSology.
Text2Summary-Android
A library for Text Summarization on Android applications.
nlpo3
Thai Natural Language Processing library in Rust, with Python and Node bindings.
Russian_subtitles_dataset
Preprocessing of a dataset of 347 TV series subtitles (thanks to Taiga Corpus) for building a word2vec model or a JamSpell model, training neural networks or chatbots, or any other NLP task.
sciteco
Advanced TECO dialect and interactive screen editor based on Scintilla
sova-tts-tps
NLP preprocessor for the SOVA-TTS project