cre2_new silently truncates pattern if length is passed incorrectly

Open Tonyyang0606 opened this issue 7 months ago • 1 comment

Steps to reproduce

#include <stdio.h>
#include <string.h>
#include <cre2.h>

int main() {
    const char *pattern = "^admin:[0-9]{4}$";
    // Intended: match only "admin:" followed by exactly four digits, e.g. "admin:1234"

    int wrong_len = 7;  // ❌ Intentionally too short: strlen(pattern) is 16

    cre2_options_t *opt = cre2_opt_new();
    cre2_regexp_t *re = cre2_new(pattern, wrong_len, opt);

    if (cre2_error_code(re) != 0) {
        printf("Compilation failed: %s\n", cre2_error_string(re));
    } else {
        // Actually compiled as "^admin:" (truncated)
        printf("Compiled successfully, but pattern was truncated!\n");

        const char *test = "admin:xxx2312x";
        cre2_string_t match;
        int matched = cre2_match(re, test, (int)strlen(test),
                                 0, (int)strlen(test),
                                 CRE2_UNANCHORED, &match, 1);
        printf("Match result: %d\n", matched);
    }

    cre2_delete(re);
    cre2_opt_delete(opt);
    return 0;
}

Actual behaviour

  • cre2_new compiles only the first len bytes of the pattern and reports success.
  • No error is reported if len < strlen(pattern).
  • This can lead to serious logic bugs: the user thinks the pattern is compiled correctly, but it’s silently truncated.
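The truncation can be reproduced without cre2 at all. The sketch below assumes, as described in this report, that only the first len bytes of the pattern are used. The helper name effective_pattern is hypothetical and is not part of cre2; it only shows which text a length-based API would see.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of cre2): copy the first `len` bytes of
 * `pattern` into `out`, i.e. the text a length-based API would compile.
 * Returns 0 on success, -1 if an argument is invalid or `out` is too small. */
static int effective_pattern(const char *pattern, int len,
                             char *out, size_t out_size)
{
    if (pattern == NULL || out == NULL || len < 0)
        return -1;
    if ((size_t)len >= out_size)
        return -1;               /* need room for the terminating NUL */
    if ((size_t)len > strlen(pattern))
        return -1;               /* claimed length runs past the string */
    memcpy(out, pattern, (size_t)len);
    out[len] = '\0';
    return 0;
}
```

With the pattern from the reproduction and len set to 7, the buffer holds ^admin: and nothing else. That prefix has no $ anchor and no digit requirement, which is why admin:xxx2312x matches.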

Example output

Compiled successfully, but pattern was truncated!
Match result: 1

Expected behaviour

  1. If len < strlen(pattern), the library should:
    • Either fail with an error code, or
    • Emit a clear warning (especially in debug builds).
  2. Provide a safe API for null-terminated strings, e.g.:
     cre2_regexp_t *cre2_new_cstr(const char *pattern, cre2_options_t *opt);

This would automatically use strlen(pattern) and avoid common mistakes.
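A minimal sketch of the length handling such a wrapper might need, assuming the pattern is NUL-terminated. cstr_pattern_len is a hypothetical helper, and cre2_new_cstr is only the name proposed in this issue, not an existing cre2 function. The one subtlety is that strlen returns size_t while cre2_new takes an int length, so oversized strings should be rejected rather than cast blindly.

```c
#include <limits.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: length of a NUL-terminated pattern as the `int`
 * that cre2_new() expects. Returns -1 for NULL or for strings whose
 * length does not fit in an int. */
static int cstr_pattern_len(const char *pattern)
{
    size_t n;

    if (pattern == NULL)
        return -1;
    n = strlen(pattern);
    if (n > (size_t)INT_MAX)
        return -1;
    return (int)n;
}

/* The proposed wrapper could then look roughly like this
 * (shown as a comment because it needs <cre2.h>):
 *
 *     cre2_regexp_t *cre2_new_cstr(const char *pattern,
 *                                  const cre2_options_t *opt)
 *     {
 *         int len = cstr_pattern_len(pattern);
 *         return (len < 0) ? NULL : cre2_new(pattern, len, opt);
 *     }
 */
```

Until something like this exists in the library, callers can get the same effect by passing cstr_pattern_len(pattern) instead of a hand-written length.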

Suggestion

  • Add cre2_new_cstr (safe wrapper).
  • Improve error handling when the provided len does not match the actual string length.
  • At minimum, document this pitfall more explicitly, since it can cause subtle and dangerous bugs in security-sensitive regex use cases.
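For patterns written as string literals, one caller-side way to avoid hand-counted lengths is to let the compiler compute them. The macro below is a hypothetical sketch, not something cre2 provides. It works only for string literals, not for const char * variables.

```c
/* Hypothetical macro: length of a string literal, excluding the
 * terminating NUL. The "" s "" concatenation compiles only when `s` is a
 * string literal, so passing a pointer variable is rejected at compile
 * time instead of silently yielding sizeof(pointer) - 1. */
#define PATTERN_LITERAL_LEN(s) ((int)(sizeof("" s "") - 1))
```

With a pattern defined as a macro, for example #define ADMIN_PATTERN "^admin:[0-9]{4}$", the call becomes cre2_new(ADMIN_PATTERN, PATTERN_LITERAL_LEN(ADMIN_PATTERN), opt). Unlike strlen, sizeof also counts NUL bytes embedded in the literal, so the full pattern is passed even in that case.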

Tonyyang0606, Sep 22 '25 07:09

It is a good idea to support strings that are not nil terminated.

marcomaggi, Sep 22 '25 11:09