nut For the few cases where we use variables as formatting strings, find a way to ensure it is safe

In a few cases we use variables as formatting strings (e.g. coming from some tables, or even constructed at run-time), which is error-prone with regard to how a printf-related method interprets the subsequent arguments on the stack. While this is done only for strings defined in the NUT codebase, it has the potential to regress if someone modifies a table value in some later revision. Currently we use pragmas to hush warnings like "format not a string literal and no format arguments" and a few similar ones, but there has to be a better way.

Some first ideas (more welcome):

create a macro (one with a number argument, or a set of macros for different numbers of args) just to pass a hint about the number of expected formatting arguments, and somehow parse the sources (pycparser as used elsewhere? simpler perl magic?) to statically check that the number of unescaped percent characters in the resolved first argument (at least when it comes from fixed tables) matches the macro hint;
do the same at run-time, to at least throw fatalx() or similar and not do dangerous things: in our own methods like upsdebugx() or dstate_setinfo(), we can at least check the varargs count vs. the number of percents in the actual formatting string.
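To illustrate the run-time variant, here is a minimal sketch of counting the arguments that a formatting string would consume. The name count_format_args() is hypothetical (nothing like it exists in NUT as far as this thread says), and the set of flag and length characters it skips is an assumption about which printf dialect we care about:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not an existing NUT function: count how many
 * variadic arguments a printf-style format string would consume.
 * "%%" is an escaped percent and consumes nothing; a '*' used as
 * width or precision consumes one extra int argument.
 * Returns -1 for NULL input or a conversion cut short by end of string. */
int count_format_args(const char *fmt)
{
    int n = 0;

    if (fmt == NULL)
        return -1;

    while (*fmt != '\0') {
        if (*fmt++ != '%')
            continue;               /* plain text */
        if (*fmt == '%') {
            fmt++;                  /* escaped percent sign */
            continue;
        }
        /* Skip flags, width, precision and length modifiers */
        while (*fmt != '\0' && strchr("-+ #0123456789.*hlqjzLt", *fmt) != NULL) {
            if (*fmt == '*')
                n++;
            fmt++;
        }
        if (*fmt == '\0')
            return -1;              /* dangling '%' */
        n++;                        /* the conversion character itself */
        fmt++;
    }

    return n;
}
```

A caller holding a macro-provided hint could then compare it with count_format_args(variableFormat) and bail out via fatalx() or similar on mismatch, before the string ever reaches a printf-related method.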

Either way, I think in these cases we lose the ability of modern compilers to also statically check the types (that a %i refers to an int-sized number, and not a long or char*, etc.), so some error-proneness remains if the number of args stays the same but their types change.

Maybe the way to get the best of all worlds would in fact be to have the run-time method return a string, so callers would go like dstate_setinfo("ups.model", "%s", checked_format(variableFormat, checkingFormat, ...)); and avoid hushing pragmas altogether. Here checkingFormat would be a real (literal) formatting string like "%s%s%i%"PRIuSIZE"%f" matching the types of the subsequent vararg parameters, and the preceding variableFormat argument would specify the actual formatting string (expected and checked to mention exactly the same set of percent-formats in the same order).

This way we could have compile-time checks that varargs conform in amount and type to some contrived formatting string, and run-time assertions that whatever variable string we actually use to produce the checked_format() string is compatible with those expectations.
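A rough sketch of how that could look, under several assumptions: the extra buf/buflen parameters (to avoid a static buffer), the helper names next_conversion() and formats_compatible(), and the FMT_CHECK macro are all made up for illustration. The comparison here is deliberately simplistic: it only compares conversion letters, so "%d" and "%ld" look alike, and it refuses '*' widths.

```c
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

#if defined(__GNUC__) || defined(__clang__)
# define FMT_CHECK(fmt_idx, arg_idx) __attribute__((format(printf, fmt_idx, arg_idx)))
#else
# define FMT_CHECK(fmt_idx, arg_idx)
#endif

/* Return the next conversion letter in *p ('s', 'd', 'f'...), skipping
 * plain text and "%%"; '\0' at end of string, '?' for a dangling '%'.
 * Simplification: flags, width, precision and length modifiers are
 * skipped, so this does not tell "%d" from "%ld". */
char next_conversion(const char **p)
{
    const char *s = *p;
    char c = '\0';

    while (*s != '\0') {
        if (*s++ != '%')
            continue;
        if (*s == '%') {
            s++;
            continue;
        }
        s += strspn(s, "-+ #0123456789.hlqjzLt");
        c = (*s != '\0') ? *s++ : '?';
        break;
    }
    *p = s;
    return c;
}

/* 1 if both strings mention the same conversions in the same order */
int formats_compatible(const char *dynamic_fmt, const char *reference_fmt)
{
    char a, b;

    if (dynamic_fmt == NULL || reference_fmt == NULL)
        return 0;
    do {
        a = next_conversion(&dynamic_fmt);
        b = next_conversion(&reference_fmt);
        if (a != b || a == '*' || a == '?')
            return 0;
    } while (a != '\0');
    return 1;
}

/* The compiler checks the varargs against checkingFormat (argument 4);
 * at run-time we only print with variableFormat if it is compatible. */
const char *checked_format(char *buf, size_t buflen,
    const char *variableFormat, const char *checkingFormat, ...) FMT_CHECK(4, 5);

const char *checked_format(char *buf, size_t buflen,
    const char *variableFormat, const char *checkingFormat, ...)
{
    va_list ap;

    if (buf == NULL || buflen == 0)
        return NULL;
    if (!formats_compatible(variableFormat, checkingFormat)) {
        buf[0] = '\0';
        return NULL;    /* real code might call fatalx() here instead */
    }
    va_start(ap, checkingFormat);
    vsnprintf(buf, buflen, variableFormat, ap);
    va_end(ap);
    return buf;
}
```

Some compilers may still warn about the non-literal format at the one vsnprintf() call inside the helper; the point of the design is that any remaining suppression would live in that single place rather than at every caller.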

Is there some (library?) method to strip non-formatting characters (plain text, format beautification with sizes/alignments like %.01f => %f or whatever) so we could directly strcmp() the expected pattern vs. the stripped dynamic formatting string? If not, we have a fallback printf() implementation that I guess could be wrangled into such a helper method...
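I am not aware of a standard library call for this, so as a starting point here is a sketch of such a stripping helper. The names normalize_format() and same_conversions() are hypothetical, and the accepted conversion letters and length modifiers are an assumption; a real version would likely borrow the parsing from our fallback printf() implementation.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: reduce a printf-style format to its bare
 * conversions, e.g. "Load: %5.1f%% on %-8s" becomes "%f%s".
 * Plain text, "%%", flags, width and precision are dropped; length
 * modifiers are kept, so "%05ld" becomes "%ld".
 * Returns 0 on success, -1 if the input is malformed, uses a '*'
 * width/precision (not supported here), or does not fit into out. */
int normalize_format(const char *fmt, char *out, size_t outlen)
{
    size_t n = 0, modlen;

    if (fmt == NULL || out == NULL || outlen == 0)
        return -1;

    while (*fmt != '\0') {
        if (*fmt++ != '%')
            continue;                       /* plain text */
        if (*fmt == '%') {
            fmt++;                          /* escaped percent sign */
            continue;
        }
        fmt += strspn(fmt, "-+ #0");        /* flags */
        fmt += strspn(fmt, "0123456789");   /* width */
        if (*fmt == '.') {                  /* precision */
            fmt++;
            fmt += strspn(fmt, "0123456789");
        }
        modlen = strspn(fmt, "hlqjzLt");    /* length modifiers */
        if (fmt[modlen] == '\0'
         || strchr("diouxXeEfFgGaAcspn", fmt[modlen]) == NULL)
            return -1;
        if (n + modlen + 3 > outlen)        /* '%' + modifiers + letter + NUL */
            return -1;
        out[n++] = '%';
        memcpy(out + n, fmt, modlen + 1);
        n += modlen + 1;
        fmt += modlen + 1;
    }
    out[n] = '\0';
    return 0;
}

/* 1 if both formats normalize to the same string (buffer size is arbitrary) */
int same_conversions(const char *a, const char *b)
{
    char na[128], nb[128];

    return normalize_format(a, na, sizeof(na)) == 0
        && normalize_format(b, nb, sizeof(nb)) == 0
        && strcmp(na, nb) == 0;
}
```

Note that a plain strcmp() of the normalized strings is stricter than necessary: "%i" and "%d" take the same argument type but compare as different, so table entries and reference strings would have to agree on spelling.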

May 22 '24 08:05 jimklimov

Not sure if relevant, but on GCC you can add typechecking by adding the following attribute to the function declaration:

__attribute__ ((format (printf, <string-index>, <first-to-check>)));

In GLib it is used extensively through the G_GNUC_PRINTF macro.
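For illustration, a minimal sketch of declaring such a function. The names PRINTF_LIKE and my_logf are invented for this example; the empty fallback definition is an assumption about how one might keep compilers without this attribute happy:

```c
#include <stdarg.h>
#include <stdio.h>

/* Expand to the attribute only where it is known to be supported */
#if defined(__GNUC__) || defined(__clang__)
# define PRINTF_LIKE(fmt_idx, first_arg_idx) \
    __attribute__((format(printf, fmt_idx, first_arg_idx)))
#else
# define PRINTF_LIKE(fmt_idx, first_arg_idx)
#endif

/* Argument 3 is the format string, checking starts at argument 4 */
int my_logf(char *buf, size_t buflen, const char *fmt, ...) PRINTF_LIKE(3, 4);

int my_logf(char *buf, size_t buflen, const char *fmt, ...)
{
    va_list ap;
    int ret;

    va_start(ap, fmt);
    ret = vsnprintf(buf, buflen, fmt, ap);
    va_end(ap);
    return ret;
}

/* With -Wformat enabled, a call like
 *     my_logf(buf, sizeof(buf), "%d", "text");
 * should then be reported at compile time, just as for printf() itself. */
```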

Jun 03 '24 08:06 ntd

Sorry, I just saw that you are already using it.

Jun 03 '24 08:06 ntd

Thanks! Not sure how portable this is, but the attribute is actually used in include/common.h (and some other files with static methods) to mark the varargs support, and at least none of the compilers currently used in the NUT CI farm complain about it.

Probably that helps clang and gcc raise the compile-time warnings, and this is what one part of this solution relies on with the "reference" formatting strings (the fallback being that they can be checked by a human to match any subsequent actual string/numeric/... arguments).

Jun 03 '24 08:06 jimklimov