Add a similarity feature for duplicate classifier

Open ashridh opened this issue 6 years ago • 1 comment

Fixes #582

Caveat: StructuredColumnTransformer converts sparse matrices to dense matrices, which increases the memory footprint of the inputs (see https://github.com/mozilla/bugbug/blob/master/bugbug/utils.py#L44). Because of this, I haven't been able to test the performance of this model yet.
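To give a rough sense of why the dense conversion matters, here is a back-of-the-envelope sketch. It is not taken from bugbug: the matrix shape, the number of non-zeros per row, and the element sizes (8-byte floats, 4-byte indices, CSR layout) are all assumptions chosen for illustration, and the helper names are hypothetical.

```python
# Rough memory estimate for a dense matrix versus a CSR sparse matrix.
# All sizes below are illustrative assumptions, not measurements from bugbug.


def dense_bytes(rows, cols, value_size=8):
    """A dense matrix stores every cell, zero or not."""
    return rows * cols * value_size


def csr_bytes(rows, nnz, value_size=8, index_size=4):
    """CSR stores one value and one column index per non-zero,
    plus a row-pointer array of length rows + 1."""
    return nnz * value_size + nnz * index_size + (rows + 1) * index_size


# Hypothetical text-feature matrix: 10,000 bugs, 50,000 vocabulary terms,
# about 100 non-zero terms per bug.
rows, cols, nnz_per_row = 10_000, 50_000, 100
nnz = rows * nnz_per_row

dense = dense_bytes(rows, cols)
sparse = csr_bytes(rows, nnz)

print(f"dense:  {dense / 1e9:.2f} GB")
print(f"sparse: {sparse / 1e6:.2f} MB")
print(f"ratio:  {dense / sparse:.0f}x")
```

Under these assumptions the dense form is a few hundred times larger than the sparse one, which is consistent with the model being hard to test on realistic inputs.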

ashridh, Jun 22 '19 08:06

Codecov Report

Merging #616 into master will decrease coverage by 0.06%. The diff coverage is 51.72%.

@@            Coverage Diff             @@
##           master     #616      +/-   ##
==========================================
- Coverage   58.29%   58.23%   -0.07%     
==========================================
  Files          58       58              
  Lines        3690     3718      +28     
==========================================
+ Hits         2151     2165      +14     
- Misses       1539     1553      +14     
| Impacted Files | Coverage Δ |
| --- | --- |
| bugbug/models/duplicate.py | 35.10% <51.72%> (+6.31%) ↑ |

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data. Powered by Codecov. Last update 938eb29...b38f498.

codecov-io, Jun 22 '19 08:06