While wading into the wide world of high-throughput production Java, I have been enjoying guidance from “Java Concurrency in Practice” by Brian Goetz, as well as from my friend and coworker, David Copeland. In a recent talk, David boiled down the range of concurrency problems to three main issues: atomicity, visibility, and ordering conflicts.
An atomic operation is one that executes as a single, indivisible step: other threads see the state before it or the state after it, never something in between. A variable assignment (int x = 2;) is an example of an atomic operation (mostly*; see footnote). Incrementing a variable, on the other hand, is NOT an atomic operation.
Although count++ appears to be a single action, it is actually three:
- read count,
- modify count, and
- write count.
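If ThreadA and ThreadB share a NonAtomicCounter and both call doWork(), ThreadA can read the value of count while ThreadB is modifying it. Both threads then write back a result computed from the same starting value, so one increment overwrites the other, making count vulnerable to lost updates.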
Be suspicious of any shared state that is involved in read-modify-write or test-then-act sequences of actions.
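As a rough sketch of the lost-update problem, here is one way the counter might look, alongside one possible fix. Only the names NonAtomicCounter, count, and doWork() come from the text above; AtomicCounter, CounterDemo, and the get() accessors are hypothetical names added for illustration, and AtomicInteger is just one of several ways to make the increment atomic.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical reconstruction: only NonAtomicCounter, count, and doWork() are named in the post.
class NonAtomicCounter {
    private int count = 0;

    void doWork() {
        count++; // read, modify, write: three steps that other threads can interleave with
    }

    int get() {
        return count;
    }
}

// One possible fix (an assumption, not the post's code): AtomicInteger performs
// the read-modify-write as a single atomic step.
class AtomicCounter {
    private final AtomicInteger count = new AtomicInteger();

    void doWork() {
        count.incrementAndGet();
    }

    int get() {
        return count.get();
    }
}

class CounterDemo {
    // Runs `task` on two threads, `perThread` times each, and waits for both to finish.
    static void runOnTwoThreads(Runnable task, int perThread) {
        Runnable loop = () -> {
            for (int i = 0; i < perThread; i++) {
                task.run();
            }
        };
        Thread a = new Thread(loop, "ThreadA");
        Thread b = new Thread(loop, "ThreadB");
        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        NonAtomicCounter unsafe = new NonAtomicCounter();
        AtomicCounter safe = new AtomicCounter();
        runOnTwoThreads(unsafe::doWork, 100_000);
        runOnTwoThreads(safe::doWork, 100_000);
        System.out.println("non-atomic: " + unsafe.get()); // may be less than 200000
        System.out.println("atomic:     " + safe.get());   // 200000
    }
}
```

The non-atomic total varies from run to run, which is part of what makes this class of bug hard to reproduce.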
Once a thread has read a value, there is no guarantee that it will ever check to see if another thread has modified that value.
Each thread can cache a local copy of stop, which means it may read the value once and then never check to see if it has been changed.
Be suspicious of loops that are gated by a variable that is visible to other threads.
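A minimal sketch of a loop gated by a shared flag, written so that the flag is safe to share. Only the variable name stop comes from the text; StoppableWorker, requestStop(), and VisibilityDemo are hypothetical names. The sketch assumes that marking the flag volatile is an acceptable fix; removing the keyword would allow the loop to spin forever on some JVMs.

```java
class StoppableWorker implements Runnable {
    // volatile: each read of `stop` in the loop observes the latest write from any thread.
    private volatile boolean stop = false;

    @Override
    public void run() {
        while (!stop) {
            Thread.yield(); // stand-in for real work
        }
    }

    void requestStop() {
        stop = true;
    }
}

class VisibilityDemo {
    // Starts a worker, asks it to stop, and reports whether it exited within the timeout.
    static boolean stopsWithin(long timeoutMillis) {
        StoppableWorker worker = new StoppableWorker();
        Thread thread = new Thread(worker, "worker");
        thread.setDaemon(true); // do not keep the JVM alive if the worker never exits
        thread.start();
        worker.requestStop();
        try {
            thread.join(timeoutMillis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        return !thread.isAlive();
    }

    public static void main(String[] args) {
        System.out.println("worker stopped: " + stopsWithin(1_000));
    }
}
```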
The JVM has a just-in-time compiler that optimizes execution at runtime. The memory model guarantees that each thread behaves as if its own instructions ran in program order, but makes no promises about the order in which other threads observe the effects of those instructions.
How is it possible that t1 sees that A = 2? From the docs:
“The semantics of the Java programming language allow compilers and microprocessors to perform optimizations that can interact with incorrectly synchronized code in ways that can produce behaviors that seem paradoxical.”
What this means is that even though the instructions are written in such a way that localA should not be able to see the reassignment of A, the compiler can reorder instructions to optimize evaluation. As long as the result of the computation is deterministic within a thread, the compiler is free to change the execution order, inline assignments, simplify algebra, etc.
In the above example,
is the same as
When the instructions for thread1 and thread2 are interleaved, thread1 will occasionally see A = 2.
Be suspicious of operations involving multiple variables and ordering requirements.
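To make the ordering requirement concrete, here is a small sketch that is not the post's example: OrderedWrites, OrderingDemo, and the fields a and b are hypothetical. It assumes both the writer and the reader go through the same lock, so the reader can never observe the second write without the first.

```java
// Hypothetical example; none of these names come from the original post.
class OrderedWrites {
    private int a = 0;
    private int b = 0;

    // Program order writes a before b. Without synchronization, another thread
    // could observe b == 2 while still seeing a == 0.
    synchronized void write() {
        a = 1;
        b = 2;
    }

    // Taking the same lock means a reader sees both writes or neither.
    synchronized int[] read() {
        return new int[] {a, b};
    }
}

class OrderingDemo {
    // Returns true if every snapshot taken while a writer runs is {0, 0} or {1, 2}.
    static boolean snapshotsAreConsistent(int rounds) {
        for (int i = 0; i < rounds; i++) {
            OrderedWrites shared = new OrderedWrites();
            Thread writer = new Thread(shared::write);
            writer.start();
            int[] seen = shared.read(); // races with the writer on purpose
            try {
                writer.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
            boolean before = seen[0] == 0 && seen[1] == 0;
            boolean after = seen[0] == 1 && seen[1] == 2;
            if (!before && !after) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println("consistent: " + snapshotsAreConsistent(1_000)); // consistent: true
    }
}
```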
Thread Safety Analysis Guidelines
If you are dealing with a thread safety issue, ask these three questions:
- Do your threads share mutable state? Look for public getters and setters, or methods that return references to state rather than copies or primitive values.
- Do your threads share multi-variable states (invariants)? Can one component change independently from others when they should always be treated as a single unit?
- For any publicly mutable state, is it an atomicity, visibility, or ordering issue?
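The first question, about methods that return references to state rather than copies, can be sketched roughly as follows. Roster and its methods are hypothetical names invented for this illustration; the point is only the difference between handing out the internal list and handing out a copy.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical class for illustration; nothing here comes from the original post.
class Roster {
    private final List<String> names = new ArrayList<>();

    synchronized void add(String name) {
        names.add(name);
    }

    // Risky: callers receive the internal list and can mutate it without holding the lock.
    synchronized List<String> leakyNames() {
        return names;
    }

    // Safer: callers receive a private copy taken while the lock is held.
    synchronized List<String> snapshot() {
        return new ArrayList<>(names);
    }

    synchronized int size() {
        return names.size();
    }
}

class RosterDemo {
    public static void main(String[] args) {
        Roster roster = new Roster();
        roster.add("ada");
        roster.snapshot().add("intruder");   // changes only the copy
        System.out.println(roster.size());   // prints 1
        roster.leakyNames().add("intruder"); // changes the shared state behind the lock's back
        System.out.println(roster.size());   // prints 2
    }
}
```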
Solutions for thread safety issues
The Java language has a few solutions to offer:
Adding the keyword volatile to a variable declaration tells the JVM not to use a cached value for any read operations. This keyword will ensure that writes to a variable are always visible to other threads, and it limits how reads and writes may be reordered around accesses to that variable, but it will not make a compound operation such as count++ atomic.
A synchronized lock around every access to a variable will address all three issues, at the cost of concurrency. If only one thread at a time can access your data, that part of your program is effectively single-threaded. A common lock gotcha is locking writes but not reads!
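Here is a hedged sketch of a test-then-act sequence guarded by a lock. Account, withdraw(), balance(), and AccountDemo are hypothetical names. Note that the read takes the same lock as the write, which is the gotcha mentioned above.

```java
// Hypothetical account class; the names are illustrative, not from the original post.
class Account {
    private long balance;

    Account(long openingBalance) {
        this.balance = openingBalance;
    }

    // Test-then-act: the check and the update run under one lock, so two threads
    // cannot both pass the check against the same balance.
    synchronized boolean withdraw(long amount) {
        if (balance >= amount) {
            balance -= amount;
            return true;
        }
        return false;
    }

    // Reads take the same lock as writes; an unlocked getter could return a stale value.
    synchronized long balance() {
        return balance;
    }
}

class AccountDemo {
    // Four threads each attempt `attempts` withdrawals of 1 from one shared account.
    static long balanceAfterContention(long opening, int attempts) {
        Account account = new Account(opening);
        Thread[] threads = new Thread[4];
        for (int i = 0; i < threads.length; i++) {
            threads[i] = new Thread(() -> {
                for (int j = 0; j < attempts; j++) {
                    account.withdraw(1);
                }
            });
            threads[i].start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return account.balance();
    }

    public static void main(String[] args) {
        // 4000 attempted withdrawals against a balance of 2500: the account is never overdrawn.
        System.out.println(balanceAfterContention(2_500, 1_000)); // prints 0
    }
}
```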
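Declaring fields final addresses all three issues, but requires your state to be immutable. Keep in mind that final only fixes the reference: the object it points to must itself be immutable. This approach is most useful in composition with other techniques.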
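As an illustration of immutability composed with another technique, the sketch below pairs a value class whose fields are all final with an AtomicReference that swaps whole instances. Range, withUpper(), and RangeDemo are hypothetical names, and AtomicReference is only one possible way to publish the replacement object.

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical immutable value class; every field is final and set once in the constructor.
final class Range {
    private final int lower;
    private final int upper;

    Range(int lower, int upper) {
        if (lower > upper) {
            throw new IllegalArgumentException("lower must not exceed upper");
        }
        this.lower = lower;
        this.upper = upper;
    }

    int lower() {
        return lower;
    }

    int upper() {
        return upper;
    }

    // "Mutation" returns a new object, so the constructor re-checks the invariant.
    Range withUpper(int newUpper) {
        return new Range(lower, newUpper);
    }
}

class RangeDemo {
    public static void main(String[] args) {
        // Threads share the reference; each update atomically swaps in a new immutable Range.
        AtomicReference<Range> current = new AtomicReference<>(new Range(0, 10));
        current.updateAndGet(r -> r.withUpper(20));
        Range seen = current.get();
        System.out.println(seen.lower() + ".." + seen.upper()); // prints 0..20
    }
}
```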
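Use Guava libraries
Google has published Guava, a library that includes thread-safe and immutable collections that handle concurrent access for you. ConcurrentHashMap, which ships with the JDK in java.util.concurrent rather than with Guava, is one of my favourites, and you can check out the user guide for more.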
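A short sketch of letting a concurrent collection handle the read-modify-write for you. It uses ConcurrentHashMap from the JDK's java.util.concurrent package; HitCounter, record(), count(), and HitCounterDemo are hypothetical names chosen for this example.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical counter built on ConcurrentHashMap.
class HitCounter {
    private final Map<String, Integer> hits = new ConcurrentHashMap<>();

    // merge() performs the read-modify-write for a key atomically.
    void record(String page) {
        hits.merge(page, 1, Integer::sum);
    }

    int count(String page) {
        return hits.getOrDefault(page, 0);
    }
}

class HitCounterDemo {
    // Several threads record hits for the same key; returns the final count.
    static int countAfter(int threads, int perThread) {
        HitCounter counter = new HitCounter();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < workers.length; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    counter.record("/home");
                }
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return counter.count("/home");
    }

    public static void main(String[] args) {
        System.out.println(countAfter(4, 10_000)); // prints 40000
    }
}
```

The map protects each individual operation; a sequence of separate calls such as get followed by put is still a test-then-act sequence and needs its own care.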
* A variable’s declaration and assignment are really two operations: the allocation of memory to hold the value, and the writing of that value into memory. If a final field is assigned within an object constructor, and the object’s reference does not escape during construction, the object will not be published without that field initialized. However, for non-final fields, or if the variable is declared and assigned anywhere else, there are no such guarantees. For the purpose of this example, I am completely ignoring these implications :D