Saturday, 14 May 2011

Volatile

Volatile is a rarely used keyword in Java. If you have never used it, don't worry, you are almost certainly right! What it does is force every read of the field to see the most recently written value instead of a stale copy held in a register or a CPU cache, so you can be sure that you got a fresh value, at least at the moment when you read it. It performs somewhat better than a synchronized block since it does not lock. However, you run into trouble if you also want to write the data back, because even a ++ operation is not atomic: it is a read, a calculation and a write, so under contention the probability of a wrong result is high.
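To make the read, calculation and write steps concrete, here is a minimal sketch that interleaves the steps of two imaginary threads by hand in a single thread, so the outcome is deterministic. The class, field and method names are made up for illustration and are not part of the test code.

```java
class LostUpdateDemo {
    // Hypothetical shared field; volatile only guarantees fresh reads.
    static volatile int counter = 0;

    // Replays one unlucky interleaving of two counter++ operations.
    static int interleavedIncrements() {
        counter = 0;
        int readByA = counter;   // thread A reads 0
        int readByB = counter;   // thread B reads 0 before A has written
        counter = readByA + 1;   // thread A writes 1
        counter = readByB + 1;   // thread B also writes 1, A's update is lost
        return counter;
    }

    public static void main(String[] args) {
        // Two increments were attempted, but only one survived.
        System.out.println("counter = " + interleavedIncrements()); // prints counter = 1
    }
}
```

Both reads were perfectly fresh, which is all volatile promises. The update is lost because nothing stops the second thread from reading between the first thread's read and write.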

I was interested in the following questions:

  1. How much slower is the access to volatile fields?
  2. What is the cost of synchronization?
  3. What is the probability of the bad result coming from wrong multi-threading code?
  4. How does the synchronization compare to modern alternatives like STM?
So I wrote some very simple test code to answer all four questions: a Counter interface with 4 implementations:
  1. Simple counter
  2. Synchronized counter
  3. Counter with volatile field
  4. STM counter with Multiverse
The test code starts a number of threads and shares a single counter between them. Each thread calls hit() on the counter exactly 1 million times and then terminates. The test waits for all the threads to finish and then checks the counter. It should be exactly a million times the number of threads. And of course, we have two BAD implementations here (number 1 and 3), where I only wanted to know how wrong they are.
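The setup described above can be sketched roughly as follows. This is not the actual test code from the repository: only the Counter interface name and hit() come from the text, while get(), the class names, the harness and the errorRate formula (missing hits divided by expected hits) are assumptions made for illustration. The Multiverse STM counter is left out because its API is not shown here.

```java
import java.util.ArrayList;
import java.util.List;

interface Counter {
    void hit();
    long get(); // assumed accessor, used to check the result
}

// BAD on purpose: no synchronization at all.
class SimpleCounter implements Counter {
    private long count;
    public void hit() { count++; }
    public long get() { return count; }
}

// Correct: only one thread at a time can run hit().
class SynchronizedCounter implements Counter {
    private long count;
    public synchronized void hit() { count++; }
    public synchronized long get() { return count; }
}

// BAD on purpose: reads are fresh, but ++ is still read, add, write.
class VolatileCounter implements Counter {
    private volatile long count;
    public void hit() { count++; }
    public long get() { return count; }
}

class CounterHarness {
    // Starts the threads, waits for all of them, returns the final count.
    static long run(final Counter counter, int threads, final int hitsPerThread) {
        List<Thread> workers = new ArrayList<Thread>();
        for (int i = 0; i < threads; i++) {
            Thread t = new Thread(new Runnable() {
                public void run() {
                    for (int j = 0; j < hitsPerThread; j++) {
                        counter.hit();
                    }
                }
            });
            workers.add(t);
            t.start();
        }
        for (Thread t : workers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return counter.get();
    }

    // Assumed definition: the fraction of hits that went missing.
    static double errorRate(long expected, long actual) {
        return (expected - actual) / (double) expected;
    }

    public static void main(String[] args) {
        int threads = 2;
        int hits = 1000000;
        long expected = (long) threads * hits;
        Counter[] counters = {
            new SimpleCounter(), new SynchronizedCounter(), new VolatileCounter()
        };
        for (Counter c : counters) {
            long start = System.currentTimeMillis();
            long actual = run(c, threads, hits);
            long elapsed = System.currentTimeMillis() - start;
            System.out.println(c.getClass().getSimpleName() + ": " + elapsed
                    + " ms, error rate " + errorRate(expected, actual));
        }
    }
}
```

Timings and error rates from a sketch like this vary from run to run and between machines, so treat the output as a rough indication only.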

Test environment: the usual dual-core AMD nuclear submarine with 2 GB of 1066 MHz memory, Linux and Java 1.6

Code: https://dummywarhead.googlecode.com/hg/volatilepenalty/




Conclusions

  1. Access to volatile fields is of course much slower than access to normal fields. The results say it is about 4 times as slow, but I believe this also depends on your RAM speed. It is actually not much better than a synchronized block.
  2. The cost of synchronization is high indeed if you just look at the lines on the graphs, but not high at all once you know that the "simple" and "volatile" solutions produce wrong results.
  3. The probability of a bad result coming from wrong concurrency code is huge. If you frequently update something from multiple threads, you need to think about synchronization. Well, this test really is an edge case, but never mind.
  4. From the first moment I heard of software transactional memory, I have loved the idea. It sounds just great. But in this test it does not perform great, at least not on 2 cores, which is something the wiki page mentioned as well. It would be nice to run it on a 4-core or 8-core computer just to see how the figures change, but my guess is that it would not improve, because it needs to do way too many rollbacks. Optimistic locking should perform better on 4+ cores when the probability of collision is relatively small. This is not actually a fair test for STM; it really needs a smarter one.
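The cost of collisions can be illustrated without Multiverse. The sketch below is not STM and not the STM counter from the test: it is a plain optimistic retry loop built on AtomicLong.compareAndSet from the standard library, and it counts how often an update had to be retried because another thread got there first. The class and method names are made up for illustration.

```java
import java.util.concurrent.atomic.AtomicLong;

class OptimisticCounter {
    private final AtomicLong value = new AtomicLong();
    private final AtomicLong retries = new AtomicLong();

    // Optimistic update: read, compute, then commit only if nobody
    // else changed the value in the meantime; otherwise try again.
    void hit() {
        while (true) {
            long seen = value.get();
            if (value.compareAndSet(seen, seen + 1)) {
                return;
            }
            retries.incrementAndGet(); // collision, the attempt is repeated
        }
    }

    long get() { return value.get(); }

    long retries() { return retries.get(); }

    // Hypothetical helper: hammers one counter from several threads.
    static OptimisticCounter run(int threads, final int hits) {
        final OptimisticCounter counter = new OptimisticCounter();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(new Runnable() {
                public void run() {
                    for (int j = 0; j < hits; j++) {
                        counter.hit();
                    }
                }
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return counter;
    }

    public static void main(String[] args) {
        OptimisticCounter c = run(2, 1000000);
        System.out.println("count = " + c.get() + ", retries = " + c.retries());
    }
}
```

The final count is always exact, while the number of retries depends on how often the threads collide. When every thread does nothing but update the same value, as in this test, collisions are about as frequent as they can get, which is why this workload is a worst case for any optimistic scheme.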

About the error rates: my first impression was that the volatile solution produces an even higher error rate than the simple one, but now I am not quite sure. Either way, they are both just wrong.
Think before Thread.start()!

10 comments:

  1. How do you define the error rate?
    The graphs are nice, except there is no scale for 'nr-of-threads' :)
    If you can show the code, I will try to execute it on a 4-core+hyper threaded machine, to compare the results.

  2. Thanks Zsombor, I uploaded the test code here:
    https://dummywarhead.googlecode.com/hg/volatilepenalty/

    When running it, you need to use the Multiverse Java agent:
    http://multiverse.codehaus.org/developconfiguration.html

  3. I think you should try to test it with several JVMs to see how the low-level implementation differences affect performance: HotSpot, Harmony, JRockit, IBM JVM...

  4. Oops, I forgot to mention: you should also test the atomic variables from the java.util.concurrent.atomic package. :)

  5. Interesting article, you have indeed covered the topic in great detail. I have also blogged about my experience with Java in "How Synchronization works in Java". Let me know what you think of it.

  6. Running 'mvn package' emitted the following results:


    Running com.googlecode.dummywarhead.volatilepenalty.CounterTest
    threads, simple, simple-err, synchronized, synchronized-err, volatile, volatile-err, STM, STM-err
    1,19, 0.0,226, 0.0,87, 0.0,54, 0.0
    2,220, 0.485141,1536, 0.0,481, 0.41066474,100, 0.32700694
    3,238, 0.47826597,3241, 0.0,790, 0.5341241,337, 0.6575921
    4,438, 0.7395874,5295, 0.0,1020, 0.6308595,449, 0.7437196
    5,622, 0.49282768,9812, 0.0,1184, 0.5969199,450, 0.53689766
    6,703, 0.6679924,11211, 0.0,1254, 0.58200794,380, 0.42169026
    7,698, 0.4072889,10760, 0.0,1393, 0.5045509,815, 0.59120107
    8,1260, 0.47434774,9920, 0.0,1396, 0.10136391,1320, 0.48243165
    9,1407, 0.4623657,11490, 0.0,1607, 0.3194817,1399, 0.47734383
    10,1516, 0.4697184,12922, 0.0,1788, 0.43953177,1524, 0.5322588
    11,1590, 0.47772938,12091, 0.0,2192, 0.6387634,1575, 0.5541662
    12,1898, 0.46083474,15973, 0.0,2273, 0.4871486,1840, 0.47043145
    13,1993, 0.4676433,14327, 0.0,2343, 0.66943556,1990, 0.46478605
    14,2168, 0.46901262,17599, 0.0,2507, 0.70300525,2131, 0.5092384
    15,2271, 0.48124048,18871, 0.0,2724, 0.71008426,2339, 0.46366623
    16,1615, 0.36094558,21637, 0.0,2497, 0.8415575,2649, 0.4739362
    Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 240.521 sec

  7. Thank you!

    The STM runtime environment did not work... it produced quick and bad results, but the rest is interesting.

  8. https://spreadsheets3.google.com/oimg?key=0ApMkIX1Ygx8zdDUwdTVLRXlDSW1JaTBBMlZCelNLbnc&oid=3&zx=m78oohjop6ds

    Well, in the end it only tells me that on 4 cores things go twice as fast as on 2 cores :-D The exception is synchronization, which is just as slow as on 2 cores or 1 core.

  9. Unfortunately both with OpenJDK 64-Bit Server VM (19.0-b09 mixed mode linux-amd64 compressed oops) and Java HotSpot(TM) 64-Bit Server VM (19.1-b02 mixed mode linux-amd64 compressed oops) it throws SIGSEGV after 9 threads with STM.


    threads, simple, simple-err, synchronized, synchronized-err, volatile, volatile-err, STM, STM-err
    1,23, 0.0,241, 0.0,100, 0.0,1525, 0.0
    2,163, 0.45345715,1572, 0.0,453, 0.4650143,3725, 0.0
    3,197, 0.6395432,3044, 0.0,728, 0.6111076,56952, 0.0
    4,380, 0.4662055,5939, 0.0,963, 0.681162,161091, 0.0
    5,503, 0.56660753,8336, 0.0,1041, 0.70807225,173571, 0.0
    6,699, 0.69340205,10857, 0.0,1330, 0.56910366,232263, 0.0
    7,1048, 0.47411516,11285, 0.0,1285, 0.67546153,247419, 0.0
    8,972, 0.55748,11325, 0.0,1587, 0.43104696,156471, 0.0
    9,1222, 0.54793394,14409, 0.0,1866, 0.58874166,#
    # A fatal error has been detected by the Java Runtime Environment:
    #
    # SIGSEGV (0xb) at pc=0x00007f48d89ec9c4, pid=7699, tid=139950731818752
    #
    # JRE version: 6.0_24-b07
    # Java VM: Java HotSpot(TM) 64-Bit Server VM (19.1-b02 mixed mode linux-amd64 compressed oops)
    # Problematic frame:
    # V [libjvm.so+0x6429c4]



    threads, simple, simple-err, synchronized, synchronized-err, volatile, volatile-err, STM, STM-err
    1,23, 0.0,252, 0.0,105, 0.0,1570, 0.0
    2,79, 0.40747884,1487, 0.0,451, 0.44160494,3594, 0.0
    3,312, 0.6103785,3125, 0.0,747, 0.5681198,45103, 0.0
    4,435, 0.67855424,5192, 0.0,875, 0.5785828,123410, 0.0
    5,540, 0.6425069,7381, 0.0,1040, 0.52961713,172076, 0.0
    6,690, 0.68978775,9456, 0.0,1183, 0.5644597,222214, 0.0
    7,847, 0.57425207,9617, 0.0,1345, 0.70551294,222229, 0.0
    8,1058, 0.5264334,8629, 0.0,1492, 0.3739647,223204, 0.0
    9,1287, 0.52876097,11913, 0.0,1686, 0.5075445,#
    # A fatal error has been detected by the Java Runtime Environment:
    #
    # SIGSEGV (0xb) at pc=0x00007f726c79c90b, pid=4357, tid=140128944121600
    #
    # JRE version: 6.0_20-b20
    # Java VM:
    # Derivative: IcedTea6 1.9.7
    # Distribution: Ubuntu 10.10, package 6b20-1.9.7-0ubuntu1
    # Problematic frame:
    # V [libjvm.so+0x5dd90b]
    #

  10. Java developers often mistake volatile for a replacement for the synchronized keyword, which it is not. The volatile keyword in Java
    can guarantee visibility and ordering and prevents the compiler from reordering statements around the access, but as you mentioned it cannot guarantee atomicity, which can only be achieved either by locking or by using the atomic classes from the java.util.concurrent.atomic package. One more important thing to note is that the behavior of the volatile keyword differs before and after Java 5. The visibility and ordering guarantee is achieved through the happens-before relationship: every write to a volatile variable happens-before every subsequent read of that same variable. Writing to a volatile field has the same memory effect as a monitor release, and reading from a volatile field has the same memory effect as a monitor acquire.

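The happens-before guarantee described in the last comment can be sketched as follows, assuming the Java 5 or later memory model. A plain field is written first and then published through a volatile flag; a reader that sees the flag set is guaranteed to see the plain write as well. The class, field and method names are made up for illustration.

```java
class VolatileFlagDemo {
    static int payload;            // plain, non-volatile field
    static volatile boolean ready; // volatile flag that publishes payload

    // Hypothetical helper: returns the payload as seen by the reader thread.
    static int publishAndRead() {
        payload = 0;
        ready = false;
        final int[] seen = new int[1];
        Thread reader = new Thread(new Runnable() {
            public void run() {
                while (!ready) {      // volatile read, acts like a monitor acquire
                    Thread.yield();
                }
                seen[0] = payload;    // sees 42: the write happened-before the flag
            }
        });
        reader.start();
        payload = 42;  // ordinary write
        ready = true;  // volatile write, acts like a monitor release
        try {
            reader.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen[0];
    }

    public static void main(String[] args) {
        System.out.println("reader saw " + publishAndRead()); // prints reader saw 42
    }
}
```

If ready were not volatile, the reader would be allowed to spin forever or to see a stale payload. This publishing pattern is what volatile is good for; counting is not.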