deep-scite
:rowboat: A simple recommendation engine (by way of convolutions and embeddings) written in TensorFlow
DeepScite - A Simple Convolutional-based Recommendation Model

Overview
DeepScite takes in papers (titles and abstracts) and emits recommendations on whether or not they should be scited by the particular users whose data was used for training (in the case of this repo, that is me).
As output, it also gives a "goodness" score for each word. When this number is high, the word has contributed strongly to the paper being recommended for sciting; when it is negative, the word has contributed strongly to the paper not being recommended.
Below are some example outputs of the system:


In the example outputs, blue text marks the words that are "good" and red text marks the words that are "bad".
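To make the per-word "goodness" scores concrete, here is a minimal sketch of how such scores could be combined into a paper-level score and rendered as blue and red text. It is an illustration under stated assumptions, not the code in this repository: the function names `paper_score` and `render_words` are hypothetical, the sum-then-sigmoid aggregation is an assumption about how word scores might be combined, and the words and scores are made-up values rather than output from the trained model.

```python
import html
import math


def paper_score(word_scores):
    """Squash the summed per-word scores into a value in (0, 1).

    Assumption: the paper-level score is a sigmoid of the sum of the
    word scores. The real model may aggregate differently.
    """
    return 1.0 / (1.0 + math.exp(-sum(word_scores)))


def render_words(words, word_scores):
    """Wrap each word in a span: blue for positive scores, red for negative."""
    spans = []
    for word, score in zip(words, word_scores):
        if score > 0:
            colour = "blue"
        elif score < 0:
            colour = "red"
        else:
            colour = "black"
        spans.append(
            '<span style="color: {}">{}</span>'.format(colour, html.escape(word))
        )
    return " ".join(spans)


# Made-up example values, not output from the trained model.
words = ["quantum", "error", "correction", "survey"]
scores = [1.2, 0.4, 0.9, -0.7]

print(round(paper_score(scores), 3))
print(render_words(words, scores))
```

In this sketch a paper whose positive word scores outweigh the negative ones ends up with a score above 0.5, which is one plausible way to read "contributed strongly" in the description above.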
Installation
- Clone this repository:
  git clone https://github.com/silky/deep-scite.git
- Use conda (or virtualenv) to create an environment that has Python 3.5:
  conda create -n deep-scite python=3.5
- Activate the environment:
  source activate deep-scite
- Install the requirements:
  pip install -r requirements.txt
- Install the nltk language packs. In order to tokenise strings, we use the nltk package, which requires us to download some data before using it. To do so, run:
  python -c 'import nltk; nltk.download("punkt")'
- Install this library in develop mode:
  python setup.py develop
Usage
From the root directory of this project:
- Activate the deep-scite environment:
  source activate deep-scite
- Train the model on the noon data set, and emit recommendations:
  ./bin/run_model.py
  This will run through the steps defined in model.yaml.
- Open up ./data/noon/report.html in your browser and observe the recommendations.
Misc
You can play around with the embedding by looking at it in TensorBoard. Run TensorBoard with:
tensorboard --logdir /tmp/tf-checkpoints/deepscite-noon
Then click on the "Embedding" tab.
