Java: Queries for thread-safe classes
This PR introduces three queries for thread-safe classes, corresponding to three properties that such classes must possess, known as
- P1: No escaping
- P2: Safe publication
- P3: Correct synchronization Each query is introduced in its own commit, P3 first since it comes with some definitions that make P2 and P1 simple to write. And then there is a fourth commit fixing up the query for P3, since its detection in the first version only works when there are no loops in the call graph. I chose to show both versions, since the first is quite simple to understand and the second is basically the same idea but operating on SCCs of the call graph.
Some considerations:
- should (parts of) the module
Monitorsbe moved intoconcurrency.qll? - should the severity be
error? - the queries are being included in the code quality suite
QHelp previews:
java/ql/src/Likely Bugs/Concurrency/Escaping.qhelp
Escaping
In a thread-safe class, non-final fields should generally be private (or possibly volatile) to ensure that they cannot be accessed by other threads in an unsafe manner.
Recommendation
If the field does not change, mark it as final. If the field is mutable, mark it as private and provide properly synchronized accessors.
References
- Java Language Specification, chapter 17: Threads and Locks.
- Java concurrency package: java.util.concurrent.
java/ql/src/Likely Bugs/Concurrency/SafePublication.qhelp
Safe publication
In a thread-safe class, values must be published safely to avoid inconsistent or unexpected behavior caused by visibility issues between threads. If a value is not safely published, one thread may see a stale or partially constructed value written by another thread, leading to subtle concurrency bugs.
In particular, values of primitive types should not be initialised to anything but their default values (which for Object is null) unless this happens in a static context.
Techniques for safe publication include:
- Using synchronized blocks or methods to ensure that a value is fully constructed before it is published.
- Using volatile fields to ensure visibility of changes across threads.
- Using thread-safe collections or classes that provide built-in synchronization, such as are found in
java.util.concurrent. - Using the
finalkeyword to ensure that a reference to an object is safely published when the object is constructed.
Recommendation
Choose a safe publication technique that fits your use case. If the value only needs to be written once, say for a singleton, consider using the final keyword. If the value is mutable and needs to be shared across threads, consider using synchronized blocks or methods, or using a thread-safe collection from java.util.concurrent.
Example
In the following example, the values of value and server_id are not safely published. The constructor creates a new object and assigns it to the field value. However, the field is not declared as volatile or final, and there are no synchronization mechanisms in place to ensure that the value is fully constructed before it is published. A different thread may see the default value null. Similarly, the field server_id may be observed to be 0.
public class UnsafePublication {
private Object value;
private int server_id;
public UnsafePublication() {
value = new Object(); // Not safely published, other threads may see the default value null
server_id = 1; // Not safely published, other threads may see the default value 0
}
public Object getValue() {
return value;
}
public int getServerId() {
return server_id;
}
}
To fix this example, we declare the field value as volatile. This will ensure that all changes to the field are visible to all threads. The field server_id is only meant to be written once, so we only need the write inside the constructor to be visible to other threads; declaring it final guarantees this:
public class SafePublication {
private volatile Object value;
private final int server_id;
public SafePublication() {
value = new Object(); // Safely published as volatile
server_id = 1; // Safely published as final
}
public synchronized Object getValue() {
return value;
}
public int getServerId() {
return server_id;
}
}
References
- Java Language Specification, chapter 17: Threads and Locks.
- Java concurrency package: java.util.concurrent.
java/ql/src/Likely Bugs/Concurrency/ThreadSafe.qhelp
Not thread-safe
In a thread-safe class, all field accesses that can be caused by calls to public methods must be properly synchronized.
Recommendation
Protect the field access with a lock. Alternatively mark the field as volatile if the write operation is atomic. You can also choose to use a data type that guarantees atomic access. If the field is immutable, mark it as final.
References
- Java Language Specification, chapter 17: Threads and Locks.
- Java concurrency package: java.util.concurrent.
Commented offline.
Notes from offline review:
- [x] Everything private
- [x] Better lock detection, look at unreleased lock query
- [x] Monitor abstract toString
- [x] Remove char preds and fields from wrapped IPA types
- [x] Revert to NodeToBeDominated
- [x] locallySynchronizedOn not field? No, see example on page 550 of https://docs.oracle.com/javase/specs/jls/se25/jls25.pdf
- [x] Prefix should be calculated via matches or as
output + _ - [x] getErasure -> getSourceDeclaration (needs to be on the field instead of the type)
- [x] getTypeDescriptor -> getPackage.getName (needs to be on the field instead of the type)
- [x] java.util , Collections, sync_name + sync_name.matches(“synchronized%”)
- [x] Define ExposedField ~- [ ] Hoist lookups out of negations, using exists~
- [x] Pragma[inline] -> pragma[inline_late] + bindingset
- [x] AnnotatedAsThreadSafe, inline char pred
- [ ] Create extension point ~- [ ] If a then b else any is often better than a implies b~
- [x] Safe publication example is wrong.
- [x] Reformulate unsynchronized to find escaping paths from an exposed field access out to public methods.
Is this not what
providesAccessand "the call graph induced bya" do? Sure, but the reformulation is more about avoiding binary predicates and recursion through forall. - [x] Split into self_conflicting and inconsistently locked
- A publicly accessible write that is not locked is self-conflicting and can be reported right away.
- A publicly accessible read that is not locked is conflicting with any publicly accessible write. This can be reported right away, if the definition of exposed field includes that there is a publicly accessible write.
- All other cases involve two different locks.
Code change looks good. If you haven't already, then it's probably good to triage a few results from a MRVA run to check that the latest version still looks good. Also, some basic performance checking is in order - let's do a dca run, but remember that it can be limited to these 3 queries to make it smaller/faster.
Not sure if the failing test is simply unrelated?
Not sure if the failing test is simply unrelated?
I think it's unrelated. I just tried to rerun, so let's see if that helps.
Triage of new results:
java/unreleased-lock, 21 new results
These are all due to lockInterruptibly now being correctly recognized as a locking call. This should in itself not introduce problems as the predicate failedLock handles InterruptedException.
- several FPs in apache/geode: The code is very complex, but ultimately a function wraps a locking call and therefor does not release the lock. The query correctly detects that, but it is probably intended by the developer.
- an FP in geode due to a
ReadWriteLock(link). The reader is taken under certain conditions and the write under others. They are later released under the same conditions. A split control flow graph might have prevented this FP. - some FPs in hadoop where locking is wrapped in methods.
- two FPs in google/guava due to locking being wrapped in methods.
java/safe-publication, 9 results The FP rate for this has been scrutinized already, but with only 9 results I went throuhg them anyway.
- 3 TPs in apache/flink, repeated in the buildless version
- 2 TPs in gradle
- 1 FP in gradle due to the unsafely published value being unreadable (it is an internal factory, I think it could probably be static).
java/not-threadsafe, 75 results The FP rate for this has already been established to be good, but I went through a flock for each repo as a sanity check
- several TPs in apache/flink
SystemResourcesCounter.java(no locking at all in that file), repeated in buildless - several TPs in gradle, little locking used
I will do an extra DCA run on nightly for timing, since I changed the code, but otherwise I am ready to merge this :-)
Timing has actually improved with the code change: palatable/lambda is now slightly slower, but apache/commons now sees no slow-down. So the project with the very many prefixes is the only one where running the new queries is even visible.