Mirusan


A PDF collection reader with a built-in full-text search engine
Written in Python / Electron / Elm / JavaScript
Features
- Simple UI
- Local database (you have 100% control of your data)
- Easy installation (no need to install external databases)
- Multiplatform (Linux, Mac, Windows)
Installation
Prerequisites
Instructions
git clone https://github.com/mknz/mirusan.git
cd ./mirusan
cd ./search
pip install -r requirements.txt
cd ../electron
npm install
npm run compile
npm start
Language support
Mirusan automatically detects the input language using Google's language-detection library. The tokenizer or analyzer used for indexing is chosen according to the detected language.
For the following languages, Whoosh's built-in LanguageAnalyzer, or StandardAnalyzer for English, is used (though it currently does not work properly for Arabic):
Arabic
Danish
Dutch
English
Finnish
French
German
Hungarian
Italian
Norwegian
Portuguese
Romanian
Russian
Spanish
Swedish
Turkish
For other languages, an N-gram tokenizer (minsize=1, maxsize=2) is used.
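The selection rule described above can be illustrated with a small, dependency-free sketch. It is not Mirusan's actual code: the function names `choose_analyzer` and `ngrams` are hypothetical, the ISO 639-1 language codes are an assumption (this README lists only language names), and `ngrams` merely imitates what an N-gram tokenizer with minsize=1 and maxsize=2 would be expected to emit.

```python
# ISO 639-1 codes for the non-English languages listed above.
# The codes are an assumption; the README only gives language names.
STEMMED_LANGUAGES = {
    "ar", "da", "nl", "fi", "fr", "de", "hu", "it",
    "no", "pt", "ro", "ru", "es", "sv", "tr",
}


def choose_analyzer(lang_code):
    """Return the name of the analyzer family for a detected language code."""
    if lang_code == "en":
        return "StandardAnalyzer"
    if lang_code in STEMMED_LANGUAGES:
        return "LanguageAnalyzer"
    # Any language not in the list falls back to N-gram tokenization.
    return "NgramTokenizer"


def ngrams(text, minsize=1, maxsize=2):
    """Return every substring of length minsize..maxsize, scanning left to right."""
    tokens = []
    for start in range(len(text) - minsize + 1):
        for size in range(minsize, maxsize + 1):
            end = start + size
            if end <= len(text):
                tokens.append(text[start:end])
    return tokens


print(choose_analyzer("ja"))
print(ngrams("検索"))
```

Because every single character and every adjacent pair is indexed, text in languages without whitespace word boundaries (Japanese, for instance) can still be matched by substring queries, at the cost of a larger index.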