Showing posts with label parsing. Show all posts

Tuesday, 12 November 2019

Kotlin: weak and threadLocal delegates

A bit of Kotlin today.

Kotlin has a nice and very useful feature called delegated properties, which lets software engineers add special behavior around a property. For example, with the lazy delegate you can delay the calculation of a property's value until it is first read. This is useful when the calculation is expensive and the calculated value can be reused many times.
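To make the lazy case concrete, here is a minimal sketch. The `Report` class and its `computations` counter are made up for illustration; only `lazy` itself comes from the standard library.

```kotlin
// Hypothetical class, used only to illustrate the lazy delegate.
class Report(private val rows: List<Int>) {
    // Counts how often the expensive block actually runs (demo only).
    var computations = 0
        private set

    // Evaluated on first read, then the cached result is returned.
    val total: Int by lazy {
        computations++
        rows.sum()
    }
}

fun main() {
    val report = Report(listOf(1, 2, 3))
    println(report.computations) // nothing has been computed yet
    println(report.total)        // first read triggers the calculation
    println(report.total)        // second read reuses the cached value
    println(report.computations)
}
```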

The standard library comes with a very modest set of delegates and, to be honest, other than lazy I haven't found them really useful. I did write a couple of delegates of my own that I have put to good use. Let's begin with...

threadLocal


Java comes with quite a few utility classes that we would often like to keep as a constant and share between all threads of the code, until we find out that, ohhh... surprise, they are not thread-safe. Probably the most common ones are:
  • SimpleDateFormat
    SimpleDateFormat is indeed the fastest way of parsing dates and times and of formatting strings from Date objects, but it is absolutely not thread-safe, so you may be making a billion-dollar mistake if you share one as a static final field.
  • All of the traditional XML processing APIs: DOM, XPath, etc...
    Quite annoying, because they are also very expensive to initialize. I wonder what the design idea was... but anyway.


The simple solution: you do not share them and you do not reuse them. Each time you need such an object, you create one, your code refers to it only as a local variable, and you leave it to the garbage collector after processing.

This will be safe, but a bit slow... or kind of slow if you need to do this a lot.
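As a rough sketch of that simple approach (the `parseDay` function and the date pattern are assumptions made for the example):

```kotlin
import java.text.SimpleDateFormat
import java.util.Date

// A new formatter on every call: nothing is shared between threads,
// so there is nothing to synchronize. The cost is one allocation per call.
fun parseDay(text: String): Date = SimpleDateFormat("yyyy-MM-dd").parse(text)

fun main() {
    println(parseDay("2019-11-12"))
}
```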

There is a more performant solution: you declare a ThreadLocal.
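Something along these lines, written here in Kotlin to stay in one language; the Java version is analogous. The `Dates` object and the date pattern are assumptions made for the example.

```kotlin
import java.text.SimpleDateFormat
import java.util.Date

// Hypothetical holder object, used only for illustration.
object Dates {
    // One SimpleDateFormat per thread, created the first time that thread asks for it.
    private val format: ThreadLocal<SimpleDateFormat> =
        ThreadLocal.withInitial { SimpleDateFormat("yyyy-MM-dd") }

    // Every single use has to go through get().
    fun parse(text: String): Date = format.get().parse(text)

    fun print(date: Date): String = format.get().format(date)
}

fun main() {
    println(Dates.print(Dates.parse("2019-11-12")))
}
```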



Now it is safe for multiple threads, but it doesn't look pretty, does it? No, it looks crap.

Let's see what Kotlin can do for us:
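The snippet below is a minimal sketch of how such a delegate can be built, not the actual kroki-delegates source. `ThreadLocalDelegate`, the `threadLocal` function as written here, and the `Formats` object are assumptions made for illustration.

```kotlin
import java.text.SimpleDateFormat
import java.util.Date
import kotlin.reflect.KProperty

// Hypothetical delegate: wraps a ThreadLocal and hides the get() call.
class ThreadLocalDelegate<T>(initializer: () -> T) {
    private val local: ThreadLocal<T> = ThreadLocal.withInitial { initializer() }

    // Called on every property read; returns the current thread's own instance.
    operator fun getValue(thisRef: Any?, property: KProperty<*>): T = local.get()
}

// Hypothetical factory function so the declaration reads like `by lazy { ... }`.
fun <T> threadLocal(initializer: () -> T) = ThreadLocalDelegate(initializer)

object Formats {
    // Reads like a plain property, but each thread gets its own formatter.
    val day by threadLocal { SimpleDateFormat("yyyy-MM-dd") }

    fun parse(text: String): Date = day.parse(text)
}

fun main() {
    println(Formats.parse("2019-11-12"))
}
```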

Doesn't this look a lot more human-readable? And it is just as safe as the Java counterpart.

weak

Weak is almost like lazy. It is actually lazy, because the expression is evaluated at first use. The difference is that the reference to the stored value is weak, so the garbage collector is free to throw the value away once nothing else holds a strong reference to it, which helps when the JVM is running low on memory. Once the result of the calculation is dropped, it is recalculated when next needed, so we balance between good performance and memory limits.
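A minimal sketch of the idea, assuming a `WeakReference` underneath; this is not the actual kroki-delegates source. `WeakLazy`, the `weak` function as written here, and the `Catalog` class are made up for illustration.

```kotlin
import java.lang.ref.WeakReference
import kotlin.reflect.KProperty

// Hypothetical delegate: lazy evaluation, but the result is only weakly held.
class WeakLazy<T : Any>(private val initializer: () -> T) {
    private var ref: WeakReference<T>? = null

    @Synchronized
    operator fun getValue(thisRef: Any?, property: KProperty<*>): T {
        // Reuse the cached value if the garbage collector has not cleared it.
        ref?.get()?.let { return it }
        // Otherwise (first use, or value collected) calculate it again.
        val fresh = initializer()
        ref = WeakReference(fresh)
        return fresh
    }
}

// Hypothetical factory function so the declaration reads like `by lazy { ... }`.
fun <T : Any> weak(initializer: () -> T) = WeakLazy(initializer)

class Catalog {
    // Counts how often the index is rebuilt (demo only).
    var loads = 0
        private set

    val index: Map<String, Int> by weak {
        loads++
        (1..1000).associate { "key$it" to it }
    }
}

fun main() {
    val catalog = Catalog()
    val index = catalog.index     // built on first use
    println(index["key7"])
    println(catalog.index.size)   // reused while `index` is still strongly referenced
    println(catalog.loads)
}
```

How often the value is actually recalculated depends on the garbage collector, so the number of rebuilds is not predictable once no strong reference remains.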



I have found this very useful when working with large data structures. Of course, accessing the stored value is a little slower than with the lazy delegate; that is the overhead of using and checking the weak reference.

Give it a try and tell me what you think.


The delegates are packaged in the kroki-delegates package, deployed to Maven Central, and the code is shared on GitHub.

Use it for good, never for evil.

Sunday, 6 November 2011

JSON vs XML

A few days ago I asked a few guys at work whether, given the choice, they would choose JSON or XML as the format for a REST API. Everyone responded in favor of JSON. What is surprising is that they were all Java developers, not a single JavaScript guy. Maybe it is too early to say this, but it seems that after its initial success in the previous decade and stagnation in the last few years, XML as a format is slowly declining, while more compact and simple schemaless formats are on the rise: JSON and YAML.
JSON and YAML implementations are just as simple as their formats, and there are very many to choose from.

Question: How do JSON parsers compare to XML parsers from a performance perspective?

I wrote sample inputs in XML and JSON. The same data, basically. There are three input files: the first is 'traditional' XML with no attributes at all, the second is XML that uses some attributes, and the third is the JSON version. I used very few XML APIs, only the ones packaged into the JDK. This is a bit unfair, because alternative XML APIs may be a little faster than the default, but a test I ran a few years ago showed them to be not much different, so I was more interested in JSON. For JSON I used five different parsers: json.org, Jackson, jsonlib, jsonj and gson.
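For a rough idea of what one such measurement can look like, here is a minimal sketch that times only the JDK DOM parser. It is not the test code linked from this post: the sample document, `parseXml` and `averageNanos` are made up for illustration, and a JSON parser would plug into the same `averageNanos` call.

```kotlin
import java.io.ByteArrayInputStream
import javax.xml.parsers.DocumentBuilderFactory

// Hypothetical sample input; the real test files are not reproduced here.
val sampleXml: String = "<people>" +
    (1..100).joinToString("") { "<person><id>$it</id><name>name$it</name></person>" } +
    "</people>"

// Parses the document with the JDK DOM parser and returns the number of <person> elements.
fun parseXml(bytes: ByteArray): Int {
    val builder = DocumentBuilderFactory.newInstance().newDocumentBuilder()
    val document = builder.parse(ByteArrayInputStream(bytes))
    return document.getElementsByTagName("person").length
}

// Runs the task `warmup` times untimed, then returns the average time of `runs` timed calls.
fun averageNanos(warmup: Int, runs: Int, task: () -> Unit): Long {
    repeat(warmup) { task() }
    val start = System.nanoTime()
    repeat(runs) { task() }
    return (System.nanoTime() - start) / runs
}

fun main() {
    val bytes = sampleXml.toByteArray()
    val nanos = averageNanos(warmup = 1_000, runs = 1_000) { parseXml(bytes) }
    println("JDK DOM: $nanos ns per parse")
}
```

The warm-up loop is there so that the JIT compiler has a chance to optimize the parser before timing starts; a simple loop like this still gives only a coarse number.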



Even the slowest JSON parser is faster than the JDK's XML parsers, and the format is also more compact, so it looks like a win. Jackson is much faster than the rest. Its website is right: it is very fast.

Test source code: here.