TrimSplitLineTrimColumnsToDictionary throws a "key already exists" exception
Hi guys! First off, great job! SharpLearning is very useful and well built.
I found one small bug when I mistakenly created a dataset from a CSV file without the header line (when calling 'ToF64Matrix()').
In SharpLearning.InputOutput.Csv.CsvParser, the method Dictionary<string, int> TrimSplitLineTrimColumnsToDictionary(string line) iterates over the header line, but it assumes that all headers are distinct (and that the line really is the header line). When the first line contains a repeated value, as a data row easily can, adding it to the dictionary throws a "key already exists in dictionary" exception. I think the method should check for duplicates and throw a more explanatory error message in that case.
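To make the suggestion concrete, here is a rough sketch of the kind of check I have in mind. It is not the actual SharpLearning code: the class name, the method name ToColumnIndexDictionary, the ';' default separator, and the choice of ArgumentException are all assumptions for illustration.

```csharp
using System;
using System.Collections.Generic;

public static class HeaderParsingSketch
{
    // Hypothetical stand-in for CsvParser.TrimSplitLineTrimColumnsToDictionary.
    // The separator parameter and the exception type are assumptions.
    public static Dictionary<string, int> ToColumnIndexDictionary(string line, char separator = ';')
    {
        var columnNameToIndex = new Dictionary<string, int>();
        var columns = line.Split(separator);

        for (int i = 0; i < columns.Length; i++)
        {
            var name = columns[i].Trim();

            // Detect the duplicate ourselves so the message can name the column,
            // instead of letting Dictionary.Add throw its generic exception.
            if (columnNameToIndex.TryGetValue(name, out int firstIndex))
            {
                throw new ArgumentException(
                    $"Duplicate column name '{name}' at indices {firstIndex} and {i}. " +
                    "Column names must be unique; check that the file starts with a header line.");
            }

            columnNameToIndex.Add(name, i);
        }

        return columnNameToIndex;
    }
}
```

With this sketch, a headerless first row such as "1.0;2.0;1.0" would fail with a message naming the repeated value '1.0', which hints that the header line is missing.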
Let me know if you want me to fix it and open a pull request.
Hi @david-clinch,
I am glad you find SharpLearning useful - thanks!
You are welcome to create a pull request with a fix and a more explanatory error message; that would be great!
If you go ahead with this, please write a unit test that fails (showing the error) before the fix and passes after the fix has been implemented. This makes it a lot easier to understand and review the bug and the solution :-).
Best regards, Mads