strtod is locale-dependent (LC_NUMERIC) and breaks JSON float parsing under non-C locales (e.g. French)
https://github.com/asg017/sqlite-vec/blob/a2dd24f27ec7e4a5743e58f5ab6835deea5db58d/sqlite-vec.c#L754
Currently the code uses strtod to parse floating point numbers from JSON arrays.
However, strtod respects the current process locale (LC_NUMERIC).
This causes incorrect behavior when the user’s locale is not "C":
- In the French locale (fr_FR.UTF-8), "0.1" is rejected because the expected decimal separator is ",".
- JSON always requires a . (dot) as the decimal separator.
- As a result, JSON parsing will fail on systems whose locale uses a decimal separator other than "." (that is, anything unlike C or en_US).
Example:
setlocale(LC_NUMERIC, "fr_FR.UTF-8");
double x = strtod("0.1", NULL); // returns 0.0: parsing stops at the "."
double y = strtod("0,1", NULL); // parses as 0.1
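To see exactly where strtod stops under a given locale, a small diagnostic can report how many characters were consumed and what the locale's decimal separator is. This is only a sketch: strtod_consumed and decimal_point_is_dot are hypothetical helper names, not sqlite-vec functions, and the fr_FR.UTF-8 locale must be installed for setlocale to switch to it.

```c
#include <locale.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical diagnostic: parse `s` with strtod under the current locale.
 * Stores the parsed value in *out (if non-NULL) and returns the number of
 * characters strtod consumed. */
static size_t strtod_consumed(const char *s, double *out) {
    char *end = NULL;
    double v = strtod(s, &end);
    if (out) *out = v;
    return (size_t)(end - s);
}

/* Returns 1 if the current LC_NUMERIC decimal separator is ".", else 0. */
static int decimal_point_is_dot(void) {
    const struct lconv *lc = localeconv();
    return strcmp(lc->decimal_point, ".") == 0;
}
```

In the C locale, strtod_consumed("0.1", &v) consumes all 3 characters. After a successful setlocale(LC_NUMERIC, "fr_FR.UTF-8"), it consumes only the leading "0" and leaves ".1" unparsed, which is what a caller checking the end pointer sees as a parse failure.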
Impact: Parsing JSON arrays of floats will fail depending on the user’s locale, even though the input is valid JSON.
Possible fixes:
- Use strtod_l/newlocale with a C locale, for locale-independent parsing.
- Or replace strtod with a locale-independent float parser (e.g. a custom implementation or a JSON parser library).
Expected behavior:
Parsing JSON should always interpret . as the decimal separator, regardless of the system locale.
Possibly related issue: https://github.com/asg017/sqlite-vec/issues/168
@sohelzerdoumi this has been fixed in my fork. Hope it eventually makes its way upstream:
https://github.com/vlasky/sqlite-vec/
The commit that fixed it:
- https://github.com/vlasky/sqlite-vec/commit/dd13eb5eb5c0865f70f8e50aac0cea1d4026564b