gemini
Advanced similarity and duplicate detection for source code at scale.
We have [some fixtures](https://github.com/src-d/gemini/tree/master/src/test/resources) to test query/report independently from the hash, but updating them is a pain.
igraph is a C library. An experimental Java binding exists, but we may be better off building our own, because we need only a very small subset of igraph. Problems...
1. Just running `./sbt test` fails.
2. Hash requires running the feature extractor; we need to add docs on how to run it in dev mode.
3. Report requires a correct Python environment,...
1. Currently docker-compose builds the images, which takes forever and sometimes fails. We can use prebuilt images instead.
2. The `k8s/` directory and `gemini-k8s-cluster` are somewhat outdated. We can replace them...
It would be easy to re-implement feature extraction in Scala, which would make running and deploying gemini much easier. It should also drastically reduce the amount of code and the number of dependencies.
Throughout the code, we have some constants that provide the best output for the most common cases. It may make sense to allow a user to change them. Most probably through...
After #95, update the README structure and add:
- "Easy Way": a quickstart for Gemini with containers
- "Hard Way": as discussed in #84, a separate file under `/docs/`, with description...
Currently, it runs in the integration tests job: https://travis-ci.org/src-d/gemini/jobs/486502301
This line in `sbt` does not work when the major version has two digits (e.g. 10): https://github.com/src-d/gemini/blob/d111064b4320ca9860f4f3f6829688a8026baf64/sbt#L195
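One possible way to make such a check robust is sketched below. It assumes the version in question is the Java version string detected by the launcher script, and that the failure comes from reading the major version at a fixed character position; neither is confirmed here. `java_major_version` is a hypothetical helper name, not a function from the existing `sbt` script.

```shell
# Hypothetical helper (not part of the existing sbt script): print the major
# version for both the legacy scheme (1.8.0_201 -> 8) and the newer scheme
# (10.0.1 -> 10), without relying on fixed character positions.
# The body runs in a subshell so the variable does not leak to the caller.
java_major_version() (
  version="$1"
  # Legacy versions start with "1."; drop that prefix so the major comes first.
  case "$version" in
    1.*) version="${version#1.}" ;;
  esac
  # Drop everything from the first non-digit character onward.
  printf '%s\n' "${version%%[!0-9]*}"
)

java_major_version "1.8.0_201"   # prints 8
java_major_version "10.0.1"      # prints 10
java_major_version "11"          # prints 11
```

The sketch uses only POSIX parameter expansion, so it should behave the same in `sh` and `bash`.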