Java Leads Programming Language Popularity

May 29, 2012

Java leads other programming languages in popularity when measured by book sales.

This is a good article by Joe Brockmeier from ReadWriteHack that looks at programming language popularity from another point of view: book sales.

How do you calculate the popularity of various programming languages? The TIOBE folks try to rank programming language popularity by searching the Web. The RedMonk team pulls data from GitHub and Stack Overflow. But O'Reilly has a unique method: It measures book sales as an indicator of technology trends. By that measure, at least, Java and JavaScript come out on top.

Mike Hendrickson, O'Reilly's vice president of content strategy, has been taking a deep dive into the state of the computer book market in a series of posts beginning on March 29th. The most recent is a look specifically at programming language popularity related to sales.


Note that the figures come from Bookscan's reports on the top 3,000 titles sold, not just O'Reilly's sales data. Thus, the data includes sales for books by publishers other than O'Reilly as well.

The RedMonk figures from GitHub and Stack Overflow provide a lot of insight because they reflect real-world usage. But they're also limited. Measuring GitHub language popularity includes both older and newer projects, so if a language was really popular two years ago, it's going to continue to weigh more heavily even if those projects are no longer actively developed and fewer new projects are started in that language.

Likewise, Stack Overflow is a decent indicator of the overall market, but perhaps not a perfect mirror of the larger developer population.

Measuring book sales is not perfect either, but at least it's relatively current. It's not measuring all mentions or projects over time, but sales over a one-year period. Additionally, it measures a broad population of developers, not just the users of GitHub, Stack Overflow, etc. It's also an indicator of what people are willing to spend money on.

One disadvantage is that book sales miss developers who plunge into a language without buying a book. They also produce false positives, because you can't assume that everyone who buys a book about a programming language will actually use it on a real project.

Java and JavaScript Lead

All that said, let's look at the numbers. According to O'Reilly's figures, Java is the top dog in unit sales and in overall dollar sales. In total, Java book sales were up by more than 13%, selling more than 250,000 units in 2011. Java's market share comes in at 14.45% of the overall programming book market.

JavaScript doubled Java's growth rate, though. Last year, JavaScript titles were a trifle more than 10% of the market; now they're more than 13%. More than 228,000 JavaScript books were sold last year, up from nearly 166,000 in 2010. That's a growth rate upward of 27%.

Objective-C and C# dropped in 2011, with C# moving to the third spot and selling 23% fewer units than in 2010. Objective-C sold 7% fewer units last year (140,679 in 2011, down from 151,229 in 2010).
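The year-over-year percentages quoted here are ordinary percent changes measured against the earlier year. As a minimal sketch in Python, using only unit counts given in this article (the function name `pct_change` is just an illustrative choice, not anything from O'Reilly's analysis):

```python
def pct_change(old, new):
    """Percent change from old to new, relative to old."""
    return (new - old) / old * 100.0

# Objective-C unit sales quoted above: 151,229 (2010) -> 140,679 (2011)
objc = pct_change(151_229, 140_679)
print(f"Objective-C: {objc:.1f}%")

# Rounded JavaScript figures quoted above: ~166,000 (2010) -> ~228,000 (2011)
js = pct_change(166_000, 228_000)
print(f"JavaScript: {js:.1f}%")
```

The Objective-C result matches the roughly 7% drop reported. With the rounded JavaScript unit counts, the same calculation comes out closer to 37% than to the 27% quoted, so those rounded counts are best read as approximate rather than as the exact basis for the stated growth rate.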

It's worth noting that the lower sales of Objective-C and C# come despite the fact that each of those languages had more individual titles on the market in 2011. There were 271 C# titles in 2010 and 305 in 2011. Objective-C jumped from 89 titles in 2010 to 138 in 2011. Java and JavaScript also fielded more individual titles in 2011. JavaScript had 180 in 2010 and 243 in 2011. Java had 374 in 2010 and a whopping 420 in 2011.

Despite the drop in Objective-C sales, the Apress book "Beginning iPhone 4 Development" was the fourth top-selling book in 2010.

O'Reilly breaks the languages into several tiers by sales. "Large" languages are those with between 50,000 and 200,000 units; "major" have between 10,000 and 49,999 units. This goes all the way down to languages that sold fewer than 100 units in 2011. This last category includes things like Prolog, Pascal and Octave.
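As a rough illustration of that bucketing, here is a small Python sketch. It encodes only the two tier names and ranges mentioned above; the function name `sales_tier` and the choice to return `None` for every other range are assumptions made for the example, since the remaining tier names are not given here.

```python
def sales_tier(units):
    """Return the tier name for a 2011 unit count, or None when the
    article does not name a tier for that range."""
    if 50_000 <= units <= 200_000:
        return "large"
    if 10_000 <= units <= 49_999:
        return "major"
    return None  # other tiers exist but are not named in this article

# Objective-C's 2011 unit sales, as quoted earlier
print(sales_tier(140_679))

# A hypothetical language selling 25,000 units
print(sales_tier(25_000))
```

Under these ranges, Objective-C's 140,679 units would land in the "large" tier.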

Hendrickson also breaks out languages that sold no titles at all, a group that includes Squeak and Google's Go. No surprise on Go, as the language doesn't have any titles on the market yet. However, given the release of Go 1, we should be seeing a book any day now.

Looking at the top 20, though, the gap between the most popular languages and the bottom of the list is pretty staggering. The growth in Java book sales from 2010 to 2011 alone exceeds total Perl book sales. That is, Java sales increased by about 30,000 books, but Perl didn't even move 17,000 books.

By the way, if you look at the latest TIOBE index for April, you'll see a lot of disagreement with O'Reilly's numbers. TIOBE ranks Python above JavaScript, C above Java and Perl at a respectable number 9 instead of 19.


So what do all these rankings tell us?

For one, the folks who thought that Java was in trouble after Oracle's acquisition of Sun (including me) called it wrong. Despite Oracle's mishandling of the Java Community Process, developers are not fleeing Java. The success of Android also plays into Java's continuing popularity, but the trend seems larger than that. Java is doing better than ever.

JavaScript is also doing quite well, which should come as no surprise. The rankings may not agree on exactly where JavaScript stands, but all indicators point to "up."

PHP, though, is down considerably. So are C# and ".Net languages." Microsoft doesn't seem to be driving developer interest quite so much as it used to.

The fact that SQL is holding steady indicates that, despite all the NoSQL talk, there's plenty of interest in traditional SQL skills.

Though Ruby doesn't make the top 10, its growth year-over-year is impressive. It seems likely that Ruby is being used in more Web development projects. It might also be gaining in popularity thanks to Puppet.

Even with all these different measurements, we don't have a perfect picture of language interest and use, but combining the different sources provides a more detailed picture.

What do you think of the figures from the book sales? Is this an accurate picture of the market, or is something missing? What are you using, primarily? Voice your opinion in the comments below.