Test method
All data series are measured with 10 different datasets. The datasets differ from each other in the share of repeating UUIDs: the first one does not have any, the last one has 90 percent repeating UUIDs.
First, let's measure just plain parsing (no cache) so that we have a baseline to compare against. Second, let's measure caching with a HashMap. I have to add that a HashMap is not a cache, and whenever I see a HashMap used as a cache, I have terrible nightmares about OOM exceptions coming like zombies from everywhere.
Third, let's measure the performance of a real cache. I chose Ehcache. You can choose your own pet cache technology (to be honest, I use Infinispan, but for simplicity I just wanted good old Ehcache).
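To make the setup concrete, here is a minimal sketch of how the first two variants could be compared. It is not the original benchmark code: the class and method names, the dataset size, the fixed seed, and the even 10-percent steps between datasets are all assumptions for illustration. The Ehcache variant is left out so that the sketch runs on the JDK alone, and the naive System.nanoTime() timing has no JIT warm-up, so the numbers it prints are only indicative.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Random;
import java.util.UUID;

// Hypothetical harness; names and sizes are illustrative assumptions.
class UuidParseBench {

    // Builds `size` UUID strings; each element repeats an earlier one
    // with probability `repeatRatio`, otherwise it is a fresh UUID.
    static List<String> generate(int size, double repeatRatio, long seed) {
        Random rnd = new Random(seed);
        List<String> data = new ArrayList<>(size);
        for (int i = 0; i < size; i++) {
            if (!data.isEmpty() && rnd.nextDouble() < repeatRatio) {
                data.add(data.get(rnd.nextInt(data.size())));
            } else {
                data.add(new UUID(rnd.nextLong(), rnd.nextLong()).toString());
            }
        }
        return data;
    }

    // Variant 1: parse every string, no cache.
    static long parsePlain(List<String> input) {
        long checksum = 0;
        for (String s : input) {
            checksum += UUID.fromString(s).getLeastSignificantBits();
        }
        return checksum;
    }

    // Variant 2: an unbounded HashMap used as a "cache".
    static long parseWithHashMap(List<String> input) {
        Map<String, UUID> cache = new HashMap<>();
        long checksum = 0;
        for (String s : input) {
            UUID parsed = cache.computeIfAbsent(s, UUID::fromString);
            checksum += parsed.getLeastSignificantBits();
        }
        return checksum;
    }

    public static void main(String[] args) {
        // 0%, 10%, ..., 90% repeats: ten datasets (even spacing is assumed).
        for (int pct = 0; pct <= 90; pct += 10) {
            List<String> data = generate(100_000, pct / 100.0, 42L);
            long t0 = System.nanoTime();
            long plain = parsePlain(data);
            long t1 = System.nanoTime();
            long cached = parseWithHashMap(data);
            long t2 = System.nanoTime();
            System.out.printf("%d%% repeats: plain %d us, HashMap %d us, same result: %b%n",
                    pct, (t1 - t0) / 1000, (t2 - t1) / 1000, plain == cached);
        }
    }
}
```

The checksum is only there so the JIT cannot throw the parsing away as dead code. For numbers you want to trust, a proper harness such as JMH would be a better fit than this loop.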
Results
Ok, let's see what we got.
- As expected, no cache performs more or less the same every time.
- HashMap "caching" adds a little speed above 50 percent repeating input. It is a small compensation for the OOMs you will get, or for the code you will write to avoid them :)
- The Ehcache implementation has some difficulty keeping up: it only beats the "no cache" solution when the percentage of repeating UUIDs is over 90%, and even then the gain is small.
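As for "the code you will write to avoid it": one common way to keep a map-based cache from growing without limit is an LRU map built on LinkedHashMap. The sketch below is only an illustration of that idea, not something that was measured here; the class name BoundedUuidCache and the maxEntries parameter are made up for the example, and the class is not thread-safe.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.UUID;

// Hypothetical size-bounded LRU cache for parsed UUIDs (not thread-safe).
class BoundedUuidCache {
    private final Map<String, UUID> entries;

    BoundedUuidCache(final int maxEntries) {
        // accessOrder = true keeps the least recently used entry first,
        // so removeEldestEntry evicts it once the limit is exceeded.
        this.entries = new LinkedHashMap<String, UUID>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, UUID> eldest) {
                return size() > maxEntries;
            }
        };
    }

    // Returns the cached UUID for this string, parsing it on a miss.
    UUID parse(String text) {
        UUID cached = entries.get(text);
        if (cached == null) {
            cached = UUID.fromString(text);
            entries.put(text, cached);
        }
        return cached;
    }

    int size() {
        return entries.size();
    }
}
```

This caps memory use, but every lookup now also pays for the access-order bookkeeping, which eats into a gain that was small to begin with.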
So my conclusion is: I would probably not want to cache objects that are this easy to create. I would definitely try this optimization once the database interactions are optimal, the app scales well to multiple processors and even multiple nodes, and so on... but it looks like a small and painful victory.