Thursday 4 July 2013

String.split versus Pattern.split

I believe in java most people use String.split() to split a String to pieces. That method is there for ages (java 1.4), everyone knows it and it just works.
The alternative for this is to use a Pattern instance, which is immutable, therefore you only need a single instance of the pattern created only once and it can serve your applications forever. The guys who started to use it are in my opinion smarter, because they knew that the String.split actually needs to create a Pattern object, which is a relatively heavy operation and they can save it.

However, this is not the end of the story. It would be too short and would not make a blog post :-)

String.split() is smart, and it has a special case when the pattern is only one or two characters long and it does not contain special regexp characters. In this case it will not create a Pattern object, but simply process the whole thing in place. That special piece of code is not just accidentally there.

Let's see the speed comparison.
As you see, String.split performs better when the character you use for splitting meets the above requirements. When you need to split with several different characters - I believe this might be the less frequent case - you'd be much better using a Pattern constant.