Thursday, 8 August 2013

Wikipedia loads

A bit off-topic for today. I wrote an app in Kotlin and MongoDB to follow Wikipedia loads. It was quite an interesting experience and I learned a lot from the experiment, enough for a post of its own, but what I really want to show you is interesting from another perspective: it is a visualization of some facts about internet users, languages and cultures.

On each chart, the black line represents traffic in requests/hour, and the red line shows the average traffic for that hour of the day, taken over several days.
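For the curious, the red line boils down to a group-by-hour average. Here is a minimal Kotlin sketch of that idea. It is not the code of my app: the `HourlySample` type, the `averageByHour` function and the numbers are made up for illustration, and I assume the samples are already aggregated into one request count per hour.

```kotlin
// Hypothetical sample type: total requests measured in one hour of one day.
data class HourlySample(val day: String, val hourUtc: Int, val requests: Long)

// Average the request counts per hour of day (0..23) over all days in the input.
// The result is sorted by hour so it can be drawn as a curve.
fun averageByHour(samples: List<HourlySample>): Map<Int, Double> =
    samples
        .groupBy { it.hourUtc }
        .mapValues { (_, sameHour) -> sameHour.map { it.requests }.average() }
        .toSortedMap()

fun main() {
    // Made-up numbers, only to show the shape of the calculation.
    val samples = listOf(
        HourlySample("2013-08-05", 18, 14_000_000),
        HourlySample("2013-08-06", 18, 16_000_000),
        HourlySample("2013-08-05", 7, 8_000_000),
        HourlySample("2013-08-06", 7, 9_000_000)
    )
    for ((hour, avg) in averageByHour(samples)) {
        println("$hour:00 UTC -> $avg requests/hour")
    }
}
```

The black line would then be the raw `requests` values of a single day, plotted over the same hours.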

English: around the clock

Usage graph of the English Wikipedia

The English Wikipedia is a wiki built from over 6 million articles, maintained by a very big and very active community. What is interesting for me about the English language is that the sun never sets on it. English is the main language of the United States with more than 300 million, Canada with 20 million, Australia with 21 million and the United Kingdom with 60 million native English speakers. It is also an official language in India, in smaller Asian countries and in several African countries.
This gives the curve its interesting shape, with several smaller peaks.
  • the big peak is at 18:00 UTC with roughly 14 million requests/hour
  • the second peak is at 2:00 UTC with 12 million requests/hour
  • the load never seems to go lower than 8 million requests/hour (even that is huge), at around 7:00 UTC
  • the top load that I have seen is about 18 million requests/hour

German: day use


German Wikipedia usage

Let's see my favorite industrial nation. Unlike English, German is spoken almost exclusively in Europe. This may be the reason why we see bigger ups and downs in the curve. The peak of the average load is 2.1 million requests/hour, but it also changes from day to day: the top activity you see is 4 million requests/hour. That is huge activity from the 120 million native German speakers.

Hebrew: Sunday wiki

Hebrew Wikipedia usage

I chose Hebrew from the small languages. While it is spoken by very small minorities in many countries, it is an official and majority language only in Israel. These folks have a very strange habit: Friday is not a working day, but they work on Sunday. Saturday is the most sacred day for religious Jewish people, and they do not work.
Actually, the low traffic that you see on the chart is not a Saturday. Saturdays are totally average days on the Hebrew Wikipedia, and the top day is Sunday. Sunday is always above average.

Hungarian: The Two Towers

Hungarian Wikipedia usage

The other small language that I chose is Hungarian, my native language. (Did you notice my grammar mistakes?) The interesting thing in this curve is the two peaks at lunchtime and dinner (19:00 GMT, which is 6 PM in Hungary). I can't explain it. Do most people spend a little time checking mail, googling some stuff and reading Wikipedia at dinner? Anyway, usage after dinner falls dramatically.

Russian: Siberia


Russian Wikipedia usage graph

The last example is Russian. I wanted to see a language that is spoken in 10 different time zones all across Europe and Asia. It does not show in the curve, very likely because of the population distribution of Russia: most Russians live in the European part of Russia, while Siberia is almost uninhabited. Nice rivers, forests, mountains.

That's it for today, thanks for reading! I took the very last picture over beautiful Siberia, and I hope one day I will have a chance to see it up close. I mean without having to build a railway :-)