
strtod is locale-dependent (LC_NUMERIC) and breaks JSON float parsing under non-C locales (e.g. French)

Open sohelzerdoumi opened this issue 5 months ago • 1 comments

https://github.com/asg017/sqlite-vec/blob/a2dd24f27ec7e4a5743e58f5ab6835deea5db58d/sqlite-vec.c#L754

Currently the code uses strtod to parse floating-point numbers from JSON arrays. However, strtod respects the current process locale (LC_NUMERIC).

This causes incorrect behavior when the user’s locale is not "C":

  • Under a French locale (fr_FR.UTF-8), strtod expects "," as the decimal separator, so for "0.1" it parses only the leading "0" and stops at the ".".
  • JSON always requires a . (dot) as the decimal separator.
  • As a result, JSON float parsing fails whenever the process's LC_NUMERIC locale uses a decimal separator other than ".".

Example:

setlocale(LC_NUMERIC, "fr_FR.UTF-8");
double x = strtod("0.1", NULL); // parses only "0" and stops at '.', so x is 0.0
double y = strtod("0,1", NULL); // parses as 0.1
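
To make the partial parse visible, here is a minimal sketch that also inspects the end pointer. The helper names strtod_consumed and report_locale are made up for this illustration and are not part of sqlite-vec; the sketch assumes nothing about whether fr_FR.UTF-8 is installed and simply reports when it is not.

```c
#include <assert.h>
#include <locale.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: runs strtod on `text` under the active LC_NUMERIC and
   returns how many characters it consumed; the parsed value goes to *out. */
static size_t strtod_consumed(const char *text, double *out) {
    assert(text != NULL && out != NULL);
    char *end = NULL;
    *out = strtod(text, &end);
    return (size_t)(end - text);
}

/* Prints what strtod does with "0.1" and "0,1" under `locale_name`.
   Returns 0 when the locale is not installed, 1 otherwise. */
static int report_locale(const char *locale_name) {
    if (setlocale(LC_NUMERIC, locale_name) == NULL) {
        printf("%s: locale not installed\n", locale_name);
        return 0;
    }
    const char *inputs[] = {"0.1", "0,1"};
    for (size_t i = 0; i < 2; i++) {
        double value = 0.0;
        size_t used = strtod_consumed(inputs[i], &value);
        printf("%s: \"%s\" -> consumed %zu of 3 chars, value %g\n",
               locale_name, inputs[i], used, value);
    }
    return 1;
}
```

Calling report_locale("C") and then report_locale("fr_FR.UTF-8") from main should show the two inputs swapping roles: whichever string uses the wrong separator for the active locale is consumed only up to its first character.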

Impact: Parsing JSON arrays of floats will fail depending on the user’s locale, even though the input is valid JSON.

Possible fixes:

  • Use strtod_l / newlocale with a C locale, for locale-independent parsing.
  • Or replace strtod with a locale-independent float parser (e.g. a custom implementation or a JSON parser library).
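
A rough sketch of the first option, assuming a POSIX-like platform where newlocale and strtod_l are available (glibc with _GNU_SOURCE, or macOS via <xlocale.h>). The helper name parse_double_c_locale is hypothetical and is not taken from sqlite-vec or from any existing fix.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* exposes strtod_l on glibc */
#endif
#include <assert.h>
#include <locale.h>
#include <stdlib.h>
#ifdef __APPLE__
#include <xlocale.h> /* newlocale and strtod_l are declared here on macOS */
#endif

/* Hypothetical helper: parses a double using the "C" numeric locale,
   regardless of what setlocale() selected for the process.
   Returns 0 on success, -1 if no number was parsed or the locale object
   could not be created. If `end` is non-NULL it receives the stop position. */
static int parse_double_c_locale(const char *text, char **end, double *out) {
    assert(text != NULL && out != NULL);
    locale_t c_locale = newlocale(LC_NUMERIC_MASK, "C", (locale_t)0);
    if (c_locale == (locale_t)0) {
        return -1;
    }
    char *stop = NULL;
    *out = strtod_l(text, &stop, c_locale);
    freelocale(c_locale);
    if (end != NULL) {
        *end = stop;
    }
    return stop == text ? -1 : 0;
}
```

Creating and freeing the locale object on every call keeps the sketch short; real code would more likely create it once and reuse it. On Windows the rough equivalents are _create_locale and _strtod_l, which this sketch does not cover.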

Expected behavior: Parsing JSON should always interpret . as the decimal separator, regardless of the system locale.

sohelzerdoumi, Aug 19 '25 14:08

Possibly related issue: https://github.com/asg017/sqlite-vec/issues/168

sohelzerdoumi, Aug 19 '25 14:08

@sohelzerdoumi this has been fixed in my fork. Hope it eventually makes its way upstream:

https://github.com/vlasky/sqlite-vec/

The commit that fixed it:

  • https://github.com/vlasky/sqlite-vec/commit/dd13eb5eb5c0865f70f8e50aac0cea1d4026564b

vlasky, Dec 06 '25 21:12