Adding RETRO model Faiss sharding index and KNN sharding index
What does this PR do?
To handle very large datasets, e.g. hundreds of gigabytes to terabytes of compressed raw data, we need multiple nodes to create sharding indexes and combine them together. This PR enhances the KNN index data structure to handle sharding indexes.
This PR creates the GPU Faiss index in 3 stages:
- stage 0, train the Faiss index structure.
- stage 1, create the Faiss sharding indexes.
- stage 2, merge the sharding indexes into one.
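
For stage 1 to run on multiple nodes, each worker needs a disjoint slice of the dataset. The snippet below is only a minimal sketch of one way to split the work, assuming the data can be addressed by a contiguous chunk id range. The function name `plan_shards` and its parameters are hypothetical and are not taken from this PR.

```python
def plan_shards(total_chunks, num_shards):
    """Split [0, total_chunks) into num_shards contiguous, near-equal ranges.

    Hypothetical helper: returns a list of (start, end) pairs with end exclusive.
    Each pair could be handed to one stage 1 worker.
    """
    if num_shards <= 0:
        raise ValueError("num_shards must be positive")
    base, extra = divmod(total_chunks, num_shards)
    ranges = []
    start = 0
    for shard_id in range(num_shards):
        # The first `extra` shards take one additional chunk each.
        size = base + (1 if shard_id < extra else 0)
        ranges.append((start, start + size))
        start += size
    return ranges


if __name__ == "__main__":
    # Example: 10 chunks over 3 workers.
    print(plan_shards(10, 3))
```

Contiguous ranges keep the stage 2 merge simple, because shard outputs can be combined in shard order without remapping ids.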
It creates the KNN index in 2 stages:
- stage 1, create the KNN sharding indexes.
- stage 2, merge the KNN sharding indexes into one.
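
The KNN merge stage can be illustrated with a small sketch. It assumes, as a simplification, that each KNN shard is an array of shape `(num_chunks_in_shard, K)` holding neighbor chunk ids for a contiguous range of query chunks, and that each shard records the id of its first chunk. The function name `merge_knn_shards` and this in-memory layout are hypothetical; the actual on-disk format in this PR may differ.

```python
import numpy as np


def merge_knn_shards(shards):
    """Merge KNN shards into one (total_chunks, K) array.

    shards: list of (start_chunk_id, knn_array) tuples (hypothetical layout).
    Shards may arrive in any order but must cover a gap-free range from 0.
    """
    if not shards:
        raise ValueError("no shards to merge")
    ordered = sorted(shards, key=lambda item: item[0])
    k = ordered[0][1].shape[1]
    expected_start = 0
    for start, knn in ordered:
        if start != expected_start:
            raise ValueError(f"gap or overlap at chunk {expected_start}")
        if knn.shape[1] != k:
            raise ValueError("all shards must use the same K")
        expected_start += knn.shape[0]
    # Row i of the result holds the neighbors of global chunk i.
    return np.concatenate([knn for _, knn in ordered], axis=0)


if __name__ == "__main__":
    shard_b = (3, np.array([[0, 1], [1, 2]], dtype=np.int64))
    shard_a = (0, np.array([[3, 4], [4, 0], [2, 3]], dtype=np.int64))
    merged = merge_knn_shards([shard_b, shard_a])
    print(merged.shape)
```

Checking the start offsets before concatenating catches a missing or duplicated shard early, which matters when shards are produced by separate jobs.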
It includes some unit tests to cover the KNN sharding index, dedup functions, etc.
I have used it to successfully create Faiss and KNN indexes for a 350G dataset in a Slurm cluster environment.
This pull request introduces 4 alerts when merging 21b10abd01ae5e9001c15e4c8c0d8158287a0ef8 into f921ebe0436e55f7547b183ca83a623f6678422d - view on LGTM.com
new alerts:
- 2 for Module is imported with 'import' and 'import from'
- 2 for Redundant assignment