I was interested in the following questions:
- How much slower is the access to volatile fields?
- What is the cost of synchronization?
- What is the probability of a bad result coming from incorrect multi-threaded code?
- How does synchronization compare to modern alternatives like STM?
To answer all four questions I wrote some very simple test code around a Counter interface with 4 implementations:
- Simple counter
- Synchronized counter
- Counter with volatile field
- STM counter with Multiverse
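For reference, the first three counters might look roughly like the sketch below. Only hit() comes from the test itself; the get() method and the class names are my assumptions for illustration. The Multiverse variant is left out because its API is not shown here.

```java
// Hypothetical interface: hit() is the method the test calls,
// get() is an assumed accessor for reading the final value.
interface Counter {
    void hit();
    long get();
}

// BAD under concurrency: count++ is a read, an add and a write,
// and nothing stops two threads from interleaving those steps.
class SimpleCounter implements Counter {
    private long count;
    public void hit() { count++; }
    public long get() { return count; }
}

// Correct: only one thread at a time can be inside hit() or get().
class SynchronizedCounter implements Counter {
    private long count;
    public synchronized void hit() { count++; }
    public synchronized long get() { return count; }
}

// BAD under concurrency: volatile makes writes visible to other threads,
// but count++ is still a non-atomic read-modify-write.
class VolatileCounter implements Counter {
    private volatile long count;
    public void hit() { count++; }
    public long get() { return count; }
}
```

Used from a single thread, all three behave the same; the differences only show up once several threads share one instance.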
The test code starts a number of threads and shares a single counter between them. Each thread calls hit() on the counter exactly 1 million times and then terminates. The test waits for all the threads to finish and then checks the counter: the value should be exactly 1 million times the number of threads. And of course, two of the implementations here are BAD (numbers 1 and 3); for those I only wanted to know how wrong they are.
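A minimal sketch of such a test loop could look like this. It is not the original test code: the CounterHarness class, the run() helper and the get() accessor are names I made up for the example, and a synchronized counter is included only so that the sketch has something correct to run.

```java
import java.util.ArrayList;
import java.util.List;

class CounterHarness {

    // Hypothetical interface: hit() is from the test description, get() is assumed.
    interface Counter {
        void hit();
        long get();
    }

    // A correct implementation, so the harness can be run as-is.
    static class SynchronizedCounter implements Counter {
        private long count;
        public synchronized void hit() { count++; }
        public synchronized long get() { return count; }
    }

    // Starts the threads, waits for all of them, and returns the final count.
    static long run(final Counter counter, int threads, final int hitsPerThread) {
        List<Thread> workers = new ArrayList<Thread>();
        for (int i = 0; i < threads; i++) {
            Thread t = new Thread(new Runnable() {
                public void run() {
                    for (int n = 0; n < hitsPerThread; n++) {
                        counter.hit();
                    }
                }
            });
            workers.add(t);
            t.start();
        }
        for (Thread t : workers) {
            try {
                t.join(); // after join() returns, this worker's writes are visible here
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for workers", e);
            }
        }
        return counter.get();
    }

    public static void main(String[] args) {
        int threads = 2;
        int hits = 1000000;
        long expected = (long) threads * hits;
        long actual = run(new SynchronizedCounter(), threads, hits);
        System.out.println("expected=" + expected + " actual=" + actual);
    }
}
```

Passing one of the broken counters to run() instead would typically give a total below the expected value, and the size of that gap is the error rate I was curious about.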
Test environment: the usual dual-core AMD nuclear submarine with 2 GB of 1066 MHz memory, Linux and Java 1.6.
Conclusions
- Access to volatile fields is, of course, much slower than access to normal fields. The results say it is about 4 times as slow, but I believe this also depends on your RAM speed. It is actually not much better than a synchronized block.
- The cost of synchronization is high indeed if you just look at those figures, but not high at all once you remember that the "simple" and "volatile" solutions produce wrong results.
- The probability of a bad result coming from wrong concurrency code is huge. If you frequently update something from multiple threads, you need to think about synchronization. Admittedly, this test really is an edge case, but never mind.
- From the first moment I heard of software transactional memory, I have loved the idea. It sounds just great. But in this test it does not perform great, at least not on 2 cores, which is something the wiki page mentioned as well. It would be nice to run it on a 4-core or 8-core computer just to see how the figures change, but my guess is that it would not improve, because it needs to do way too many rollbacks. Optimistic locking should perform better on 4+ cores when the probability of collision is relatively small. This is not actually a fair test for STM; it really needs a smarter one.
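To make the rollback cost a bit more concrete, here is a small sketch of the optimistic idea. It is not Multiverse and not real STM: I am using a plain compare-and-set loop on java.util.concurrent.atomic.AtomicLong as a stand-in, and the OptimisticCounter class and its retries() method are hypothetical names for this example only. Each failed compareAndSet plays the role of a rolled-back attempt.

```java
import java.util.concurrent.atomic.AtomicLong;

// Stand-in for an optimistic counter: read, compute, try to commit, retry on conflict.
class OptimisticCounter {
    private final AtomicLong count = new AtomicLong();
    private final AtomicLong retries = new AtomicLong(); // number of failed commit attempts

    public void hit() {
        while (true) {
            long seen = count.get();
            // The commit succeeds only if nobody changed the value since we read it.
            if (count.compareAndSet(seen, seen + 1)) {
                return;
            }
            retries.incrementAndGet(); // conflict: another thread got in first, try again
        }
    }

    public long get() { return count.get(); }

    public long retries() { return retries.get(); }
}
```

With every thread hammering the same value, conflicts are about as likely as they can get, which is why a test like this one is close to the worst case for any optimistic scheme.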
About the error rates: my first impression was that the volatile solution produces an even higher error rate than the simple one, but now I am not quite sure. Either way, they are both just wrong.
Think before Thread.start()!