TrimSplitLineTrimColumnsToDictionary throws a "key already exists" exception

Open david-clinch opened this issue 6 years ago • 1 comment

Hi guys! First off, great job! SharpLearning is very useful and well built.

I found one small bug while mistakenly creating a dataset from a CSV file without a header line (when calling 'ToF64Matrix()').

In SharpLearning.InputOutput.Csv.CsvParser, the method Dictionary<string, int> TrimSplitLineTrimColumnsToDictionary(string line) iterates over the header line and assumes that all column names are distinct (and that the line really is a header line). When that assumption fails, as with a headerless file whose first row contains repeated values, a "key already exists in dictionary" exception is thrown. I think it should check for duplicates and throw a more explanatory error message in that case.
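To illustrate the kind of check I have in mind, here is a minimal sketch. It is not SharpLearning's actual code: the class name HeaderParsingSketch, the method name HeadersToDictionary, the explicit separator parameter, and the wording of the message are all assumptions made up for this example.

```csharp
using System;
using System.Collections.Generic;

public static class HeaderParsingSketch
{
    // Hypothetical stand-in for the parser's header handling (not the library's real method).
    // Splits the line, trims each column name, and maps it to its column index.
    public static Dictionary<string, int> HeadersToDictionary(string line, char separator)
    {
        var columnNameToIndex = new Dictionary<string, int>();
        var names = line.Split(separator);

        for (int i = 0; i < names.Length; i++)
        {
            var name = names[i].Trim();

            // Check explicitly instead of letting Dictionary.Add throw its generic
            // "key already exists" ArgumentException.
            if (columnNameToIndex.TryGetValue(name, out var firstIndex))
            {
                throw new ArgumentException(
                    $"Duplicate column name '{name}' at column indices {firstIndex} and {i}. " +
                    "Column names must be unique; check that the first line of the CSV file is a header line.");
            }

            columnNameToIndex.Add(name, i);
        }

        return columnNameToIndex;
    }
}
```

With a check like this, a headerless file whose first row is, say, "1.0;2.0;1.0" would fail with a message that names the repeated value and hints at the missing header line, rather than with the bare dictionary exception.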

Let me know if you want me to fix it and add a pull request.

david-clinch, Oct 03 '19 13:10

Hi @david-clinch,

I am glad you find SharpLearning useful - thanks!

You are welcome to create a pull request with a fix and a more explanatory error message. That would be great!

If you go ahead with this, please write a unit test that fails and shows the error before the fix, and passes once the fix has been implemented. This makes it a lot easier to understand and review both the bug and the solution :-).

Best regards, Mads

mdabros, Oct 03 '19 14:10