dyninst icon indicating copy to clipboard operation
dyninst copied to clipboard

Fix race condition in parsing w/ isSyscall()

Open bigtrak opened this issue 2 years ago • 4 comments

There is a static initializer in parseAPI's internal isSyscall().

It fails about 1% of the time, and causes a parse failure and dyninst crash.

Previously this was guaranteed to be thread safe, as it would run prior to thread creation, and there was no possibility of synchronization errors.

After the C/++ 11 changes to "scoped statics" aka statics in a function or method, there is no longer a guarantee of any thread protection.

In fact, it is not possible to have anything thread safe as a scoped static due to the way they are specified in the standard, even if the object is a thread sync object such as a once_flag, which can have a race itself with the new standard. Ouch.

bigtrak avatar Apr 21 '23 16:04 bigtrak

I think you have this backwards. Prior to C++ 11 compiler implementations were not required to initialize static local variables in a thread safe manner (gcc used thread safe initialization, clang did not). For C++ 11 and later static local variables are required to initialized in a thread safe manner; initialized only once by the first thread to reach definition (see https://en.cppreference.com/w/cpp/language/storage_duration#Static_local_variables). If this is happening then it is a compiler bug.

kupsch avatar Apr 21 '23 16:04 kupsch

I think you have this backwards.

Nope.

Prior to C++ 11, scoped statics were treated as they always had been -- they were converted go partially visible file local statics, which are global statics initialized by the runtime before execution.

With C++ 11, scoped statics changed to the "protection" of a non-syncrhonized mechanism to try and prevent them from being executed twice. However, that caused further brain damage.

What happens is that each individual scoped static object basically has an individual guard of

if (!iam_initialized) {
	iam_initialized = 1

	initialize this object
}

/* ps iam_initialized is a misnomer, it it more along the lines of "initialization of this has started, move along" */

However, if you have two static objects that need to be initialized in order, there is no longer any guarantee of correctness.

For example:

While thread1 is initializing the first item, even if the "iam initialized" mechanism is thread safe, the next thread can be initializing the next item. Which, if the first item that needs to be completely initialized yet, is not initialized yet, boom race condition or other failure.

This can cause a throw on the detection of such as multiple simultaneous initializers -- which the gcc/linux runtime does sometimes detect when it happens. Other times it is just memory corruption and subsequent failure.

Or, if the first item is a synchronization method to control access to object #2, well that breaks as well.

Just -- wait -- but it gets better -- a flaw can happen with only one item: For example, thread #2 arrives while object #1 is being initialized. Well since the "initialized flag" is set, it skips right over that, which can goto code which uses that object which is being initialized by thread#1. Voila, object in indeterminate state, crash.

There is no way to guarantee the ordering of that initialization with C/++ 11, which was possible with earlier standards.

Also, if re-entrancy is involved "the behavior is undefined". Really it always was, so not a big change there... except for the same issue happening with a single thread.

Prior to C/++ 11, the order of static initializers, whether scoped or not, was guaranteed to be in-order by the runtime before execution was started, just as it always had been.

Essentially a scoped static was treated as a file 
static with limited scope, and that global mechanism
was used -- as it still is with file scoped statics.

Admittedly, if you had a bad initialization tree, that
could cause an error, but that is not the issue here.

This is a known flaw in C++/11 that needs to be worked around without using scoped statics. They broke existing working code with their change. "Easy" to fix once you find the flaw in the standards. I found the flaw in execution, then found it was a well known issue with C/++ >= 11 to document it with.

As IA_IAPI already had a global initializer, the thing being initialized here was moved to it -- which is protected by a un-scoped static once_flag to "guarantee" correctness. Along with once_call semantics, which keeps any other threads waiting until that hits, and then it works...

Which begs the question, is a once_flag and call_once mechanism really safe, which now I am starting to wonder about after rewriting this description of the scenario to an audience unfamiliar with the issue?

Call once appears to do the right thing, and all callers
to it hang and wait for the owning thing to initialize
the code and then se the once flag.

However, the once_flag can't be a scoped static,it must be
a real static, as it would have the race condition mentioned
earlier.

So yes, our global scoped static once_flag in IA_IAPI
initializer is thread safe.

Bolo -- Josef T. Burger

ps: If the static initializers were treated as a call_once type situation, then you are right, the race might not occur. However the standard does not say anything about the interaction or an implementation choice like that is necessary. Perhaps a newer C/C++ standard will address this issue, or work-arounds like this will need to exist for correctness against the standard. Until that exists and the entire world moves forward , well scoped statics are not as safe or useful as they used to be.

bigtrak avatar Apr 21 '23 18:04 bigtrak

C++ 11 and later do treat static local variable as if there is a call_once function to initialize static local variables and are thread-safe. C++ always allowed static local, static members and global variables to be initialized dynamically; if they can be initialized statically using a constant or an expression that can be evaluated at compilation-time, then they are initialized at compile-time and are not a problem (C only allows the static initialization). In C++ 11 and later the declaration static foo = bar() is thread-safe (your expansion of the initialization is missing the mutex which is required by the standard) and gets turned into:

static int fooGuard = 0;  // initialized at compile-time
if (!fooGuard)  {
    GuardMutex guard(globalRecursiveStaticMutex);   // get the mutex, release on scope exit
                                                    // recursive to allow initialization of called
                                                    // functions statics
    if (!fooGuard)
        foo = bar();
        fooGuard = 1;
    }
}

Prior to C++ 11 implementation were allowed to do the following which is thread-unsafe:

if (!fooGuard)
    foo = bar();        // these two statements can be swapped
    fooGuard = 1;
}

The requirement is stated in the C++ 11 (https://timsong-cpp.github.io/cppwp/n3337/stmt.dcl section 6.7 paragraph 4) and later standards.

Local static variables are the preferred mechanism to deal with order of initialized problems in C++.

What isn't specified is the order of initialization for dynamically initialized global variables. They are initialized in declaration order of each compilation unit. The order across compilation units initializing global variables is unordered and may occur after main is started. The only guarantee is that globals in a compilation unit will be initialized before the ODR-use of any function defined in the compilation unit, so if you use globals defined in another compilation unit that depend on global initialization order this can be a problem with or without threads.

Are there other non-local static variables (global or static member) used by the RegisterAST::Ptr constructor or the RegisterAST constructor that are causing this problem. I don't think the static local variable can cause the problem you think.

kupsch avatar Apr 21 '23 20:04 kupsch

My understanding on this issue is in line with @kupsch but regardless, if you do believe there is an issue, I think the way to prove there is a race condition issue is via evidence with thread sanitizers. Discussion about the standard is absolutely relevant but it is always possible that a particular compiler version isn't conforming to the standard

jrmadsen avatar Apr 25 '23 14:04 jrmadsen

Fixed by #1686.

hainest avatar Feb 27 '24 18:02 hainest