Bernd Noll

8 comments by Bernd Noll

What about using max as a default instead of min?

I am talking about the metadata set. Why not make "max" the default metadata set instead of "min"? Or is there any other way to use the "max" set without...

Hi @ParticularMiner, thank you very much for your helpful answer. I managed to implement the merge operation and got this to work according to your advice. Performance is fine with...

@iibarant, thanks for the advice. That's exactly what I implemented today. However, running multiple matches for small subsets actually takes way more time than running one match over a big...

Hi gents, unfortunately not able to share the original file, but the one attached will do for our purposes. My code below. Grouping/blocking can be switched on/off by setting attribute...

@ParticularMiner: Sorry for that, trying one more time...

```python
import sys, csv, json, pandas as pd
import numpy
import string_grouper as sg
import time

print('Dedupe script')
t0 = time.time()
inputfilename...
```

@ParticularMiner Thank you so much for looking into this and for collaborating on it. Very much appreciated! I basically copied and pasted your code to test it before applying to...

Hi @ParticularMiner, my machine specs are similar to yours... Intel(R) Core(TM) i7-8650U CPU @ 2.11 GHz, 16 GB RAM, SSD, Win 10 Prof. Looking at the runtimes I had with my...