A Long Look at JVM Languages

October 30, 2012

This article was originally posted by Eric Bruno over at Dr. Dobb's.

Exploring the remarkably fertile Java Virtual Machine development platform

For the last decade, the Java Virtual Machine (JVM) has been a popular platform to host languages other than Java. Language developers have continually relied upon the performance and "write-once-run-anywhere" portability of the JVM, as well as other advanced features, to provide the runtime environment for their languages. In fact, porting language interpreters to the JVM became so popular that Sun hired some high-profile language developers, such as Charles Nutter (JRuby), Ted Leung (Jython), and others, to accelerate the porting efforts.

Why have so many languages, including Ruby, Groovy, and Python, been brought to the JVM? Mainly because it's much easier to target one platform (Java bytecode, in this case) and rely on the multiplatform JVM to host it than it is to write interpreters for each operating system. Additionally, with the JVM's advanced just-in-time (JIT) compilation, the resulting bytecode is compiled and optimized at runtime and will typically run with equal or better performance than native interpreters. Further, the JIT compiler continues to optimize code well after it's first compiled, depending upon changes in code execution and branching. The cost and effort of building this into each independent interpreter implementation would be prohibitive, so it makes sense to leverage the JVM's existing multiplatform implementation of this optimization.

Additionally, the JVM brings along other useful features, such as a huge set of well-tested libraries, garbage collection, and excellent built-in tools and debugging interfaces that a language developer can easily plug into.

invokedynamic: Bridging the Gap

Although writing language interpreters in Java offers huge benefits in terms of performance and multiplatform support, the biggest win is the interoperability it provides. For instance, Ruby code executing on the JVM is readily callable from a Java application, and vice versa. As a result, entire libraries of Java code, in source or JAR form, are instantly available to developers building applications in these other languages. Conversely, Java developers can experiment with code written in these other languages as well.

However, prior to Java SE 7 in 2011, there was a small penalty in doing so. These languages weren't first-class citizens, and there was a small performance hit when crossing the barrier to and from Java code, such as when calling Ruby code from a Java application, or Ruby code calling into Java classes in a JAR file. This was mainly because the JVM was not initially built to support dynamically typed languages. To fix this and eliminate the performance penalty, Java Specification Request (JSR) 292 was developed with the goal of creating a new bytecode instruction to resolve the issue.

The result is the invokedynamic bytecode, which is an enhancement to the JVM that allows dynamic — and direct — linkage between the calling code and the receiving code at runtime. With Java 7, invoking code in other languages is now equivalent to straight Java method calls. This applies to Java code and any of the languages executing on top of the JVM.
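To make the idea of runtime linkage concrete, here is a minimal Java sketch built on the java.lang.invoke API that shipped with Java SE 7 as part of JSR 292. It does not use the bytecode instruction itself. Instead, it uses a MutableCallSite, the kind of object a language runtime can bind to an invokedynamic instruction, to show a call being linked to one target and then relinked to another. The class name DynamicLinkSketch and the greetFormal and greetCasual methods are hypothetical, chosen only for illustration.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.invoke.MutableCallSite;

public class DynamicLinkSketch {

    // Two hypothetical targets a language runtime might choose between at runtime.
    static String greetFormal(String name) { return "Good day, " + name; }
    static String greetCasual(String name) { return "Hey, " + name; }

    // Resolves one of the greet* methods by name and returns a direct handle to it.
    static MethodHandle lookup(String methodName) throws ReflectiveOperationException {
        MethodType type = MethodType.methodType(String.class, String.class);
        return MethodHandles.lookup().findStatic(DynamicLinkSketch.class, methodName, type);
    }

    // Links a call site to one target, calls it, relinks it, and calls it again.
    static String[] relinkDemo() {
        try {
            MutableCallSite site = new MutableCallSite(lookup("greetFormal"));
            MethodHandle invoker = site.dynamicInvoker(); // always calls the current target
            String first = (String) invoker.invokeExact("Duke");
            site.setTarget(lookup("greetCasual"));        // relink; the invoker is unchanged
            String second = (String) invoker.invokeExact("Duke");
            return new String[] { first, second };
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        for (String line : relinkDemo()) {
            System.out.println(line);
        }
        // Prints:
        // Good day, Duke
        // Hey, Duke
    }
}
```

The calling code holds only the invoker. Which method actually runs is decided, and can be changed, at runtime, which is the flexibility a dynamically typed language needs.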

Many of the existing dynamic languages on the JVM are now upgrading their implementations to use invokedynamic. Perhaps the most aggressive in this regard is JRuby, which was one of the earliest adopters of the technology and has reaped the rewards. According to the Alioth benchmarks, it outperforms the native Ruby 1.9 implementation on many tests.

Oracle, too, is getting in on the invokedynamic action, somewhat of a change for a company that has mostly avoided active development of JVM alternatives to Java. At JavaOne 2011, Oracle announced Project Nashorn (German for "rhino"), an advanced JavaScript engine built using invokedynamic. Planned for JDK 8, the Nashorn JavaScript engine will work closely with the Oracle HotSpot JVM to provide good performance and allow seamless Java-to-JavaScript (and reverse) method calls. Plans for JDK 9 include further optimization of Java-to-native calls (JNI), a unified type system, and a meta-object protocol. The goal is to provide a more generic set of rules and descriptors for the code being written, regardless of language, and achieve symbolic freedom (non-Java-specific types) throughout the JVM.

And of course, as improvements are continually made to the JVM runtime performance, JIT optimization, and platform support (including embedded), as they have been with each Java release, all of the dynamic languages built on top of the JVM will benefit as well.

JVM Language Profiles

The following section takes a quick look at popular JVM languages and their implementation details. These languages fall into three broad categories: those that were created to be an improved alternative to Java, those that are ports of other languages, and a miscellaneous category of languages that have other aims. Where relevant, we include links to Dr. Dobb's features on these languages.

Let's begin by looking at the JVM languages that were created to extend Java, or to make up for some of the perceived shortcomings in the Java language or platform. These include Groovy, Scala, Gosu, and Kotlin, among others.

Language: Groovy
Project inception: Created by James Strachan, around 2003
Licensing: Apache License, v2.0
Commercial support: Community-based
Binding: Late, reflective
Language type system: Strong, supports both static and dynamic typing
Compilation: Bytecode, and JIT compiled
Description: Groovy is an object-oriented programming language much like Java, but meant to be used as a scripting language for the Java platform with a dynamic-language feature set. Features include a more compact, less verbose programming syntax, dynamic typing, native support for DOM-based structures (that is, XML), closures, operator overloading, and many other features that Java does not or will not support. Today, the Groovy ecosystem includes an extension called Groovy++ that was designed to simplify static typing so as to improve Groovy's performance.

Language: Scala
Project inception: Designed by Martin Odersky around 2001; released 2003
Licensing: BSD
Commercial support: Typesafe Inc. / Scala Solutions
Binding: Late, reflective
Language type system: Strong
Compilation: Bytecode, and JIT compiled
Description: Designed to be a better Java, yet built on top of Java, and meant to interoperate with Java code at every level. Odersky built Scala to clean up many of what he perceived as Java's shortcomings. These included a non-unified type system (primitives vs. objects), type erasure (generics), and checked exceptions. Scala brings functional programming concepts to the Java platform, such as type inference, anonymous functions (closures), lazy initialization, and many others. The name itself implies a more "scalable" version of Java (even though the language name is pronounced "scah-lah") that is meant to continually improve upon its parent language's shortcomings. Regarding the design of the language and its future, Dr. Dobb's interviewed Odersky recently.
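For readers who want to see the three Java behaviors named above, here is a small sketch in plain Java. It illustrates only the Java side and says nothing about how Scala handles each case. The class and method names (JavaShortcomingsSketch, readConfig, and so on) are hypothetical.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

public class JavaShortcomingsSketch {

    // Type erasure: both lists share one runtime class, so the element type
    // (String vs. Integer) is not visible at runtime.
    static boolean sameRuntimeClass() {
        List<String> names = new ArrayList<String>();
        List<Integer> counts = new ArrayList<Integer>();
        return names.getClass() == counts.getClass();
    }

    // Non-unified type system: an int is not an object, so it has to be boxed
    // into an Integer before it can be stored in a collection.
    static boolean boxedIsObject() {
        int primitive = 42;
        List<Integer> boxed = new ArrayList<Integer>();
        boxed.add(primitive); // autoboxing: int -> Integer
        Object first = boxed.get(0);
        return first instanceof Integer;
    }

    // Checked exceptions: IOException must appear in the signature...
    static String readConfig(boolean fail) throws IOException {
        if (fail) {
            throw new IOException("config missing");
        }
        return "ok";
    }

    // ...and every caller must either catch it or declare it in turn.
    static String readConfigOrDefault(boolean fail) {
        try {
            return readConfig(fail);
        } catch (IOException e) {
            return "default";
        }
    }

    public static void main(String[] args) {
        System.out.println(sameRuntimeClass());         // true
        System.out.println(boxedIsObject());            // true
        System.out.println(readConfigOrDefault(true));  // default
        System.out.println(readConfigOrDefault(false)); // ok
    }
}
```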

Language: Kotlin
Project inception: Created by JetBrains in 2011
Licensing: Apache License, v2.0
Commercial support: JetBrains Inc.
Binding: Early
Language type system: Strong, static
Compilation: Compiles to JavaScript or Java bytecode, JIT compiled
Description: Kotlin is the codename for a statically typed programming language that compiles to JVM bytecode and JavaScript. The main design goals for Kotlin are to compile as quickly as Java; to be safer than Java (that is, to statically check for common pitfalls such as null pointer dereference); to be more concise than Java via local type inference, closures, extension functions, mixins, and first-class delegation; and to be much simpler than other, similar JVM languages.

Language: Gosu
Project inception: Began as the GScript scripting language in 2002
Licensing: Apache License, v2.0.
Commercial support: Community-based; Guidewire Software Inc.
Binding: Early
Language type system: Strong, static
Compilation: Compiles to Java bytecode, JIT compiled
Description: Gosu borrows concepts from other languages, such as Ruby, Java, and C#, and is mainly used as a scripting language within other JVM software systems. However, Gosu has some innovative and very interesting features, such as the Open Type System, which allows it to be easily extended for compile-time type checking, and its use of XML and XSL as native types. Its syntax is compact and concise, which contributes to its simplicity.

Other languages in this group include Fantom, which compiles to Java bytecode, JavaScript, and .NET, and Ceylon, which is under development at Red Hat.

Ports of Other Languages

The following languages are ports of existing languages to the JVM.

Language: JRuby
Project inception: Created by Jan Arne Petersen in 2001
Licensing: Free software via a triple CPL/GPL/LGPL license.
Commercial support: Engine Yard Inc.
Binding: Late, reflective
Language type system: Strong, object-oriented
Compilation: Mixed mode — code can be interpreted, JIT compiled, or AOT compiled
Description: JRuby is a Java implementation of the Ruby interpreter, yet tightly integrated with the JVM to allow for efficient and fast calls to and from Java application code. JRuby was a big reason the invokedynamic bytecode was added to Java with the Java SE 7 release. This allows all Java-to-Ruby and reverse calls to be treated the same as Java-to-Java calls. Additionally, JRuby inspired Sun's Multi-VM JVM research effort, where one JVM can act as a separate sandbox to each of multiple applications, allowing different versions of the JRuby runtime to be loaded and active at the same time.

Language: Jython
Project inception: Created by Jim Hugunin in 1997
Licensing: Python Software Foundation License (permissive, GPL-compatible)
Commercial support: Community-based
Binding: Late, reflective
Language type system: Fully dynamic
Compilation: Mixed mode — code can be interpreted, JIT compiled, or AOT compiled.
Description: Jython, one of the very first languages ported to the JVM, is a Java implementation of the Python interpreter to enable high-performance, compiled, Python application execution. Written on the JVM, Jython fully supports calls into Java code and libraries, and all code is JIT compiled at runtime. Jython's use of invokedynamic is described in this video.

Language: Clojure
Project inception: Created by Rich Hickey in 2007
Licensing: Eclipse Public License.
Commercial support: Community-based
Binding: Late, reflective
Language type system: Strong, dynamic
Compilation: Bytecode, and JIT compiled
Description: Clojure is a modern-day Lisp for functional programming, interoperable with object-oriented code, and built for concurrency. It's built on and tightly integrated with the JVM, treats all code as data, views functions as objects, and emphasizes recursion over looping. The main feature of Clojure is its immutable state system that provides for much easier, less error-prone, parallel code execution.

Other languages in this group include Fortress, a now-abandoned language from Sun that was conceived as a successor to Fortran rather than a direct port; Mirah, a Ruby-inspired language by the principal developer of JRuby; and NetRexx, an IBM-sponsored port of Rexx, which was the first scripting language for Java but is now mostly inactive.

Miscellaneous Languages

Finally, let's take a quick look at the miscellaneous group, whose most prominent member is Rhino. While technically it is a port of JavaScript, it began life as a stand-alone interpreter blessed by Sun for scripting within Java apps and is now bundled with the JVM for server-side execution.

Language: Rhino
Project inception: Created at Netscape in 1997 and later maintained by Mozilla; bundled with Java SE since version 6 in 2006
Licensing: Mozilla Public License v1.1 / GPL v2.0
Commercial support: Oracle; Mozilla; Community-based
Binding: Late, reflective
Language type system: Dynamic, weak
Compilation: Bytecode, and JIT compiled
Description: Rhino is a JavaScript engine, bundled as part of Java SE, which executes JavaScript code and allows for interoperability with Java code. The engine works in either interpreted or compiled mode, and is meant to execute server-side JavaScript code as part of an enterprise application solution.
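As a rough illustration of that interoperability, the sketch below uses the standard javax.script API to define a JavaScript function and call it from Java. It is written under the assumption that some JavaScript engine is registered with the runtime. Which engine that is depends on the JDK in use, and some JDKs ship none, so the code returns null in that case instead of failing. The class name ScriptInteropSketch and the add function are hypothetical.

```java
import javax.script.Invocable;
import javax.script.ScriptEngine;
import javax.script.ScriptEngineManager;
import javax.script.ScriptException;

public class ScriptInteropSketch {

    // Defines a JavaScript function and calls it from Java.
    // Returns null when no JavaScript engine is available, or when the
    // engine does not support the optional Invocable interface.
    static Integer addInScript(int a, int b) {
        ScriptEngine engine = new ScriptEngineManager().getEngineByName("JavaScript");
        if (!(engine instanceof Invocable)) {
            return null; // also covers engine == null
        }
        try {
            engine.eval("function add(x, y) { return x + y; }");
            Object result = ((Invocable) engine).invokeFunction("add", a, b);
            // Engines differ in the numeric type they hand back, so go through Number.
            return ((Number) result).intValue();
        } catch (ScriptException | NoSuchMethodException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Integer sum = addInScript(40, 2);
        System.out.println(sum == null ? "no JavaScript engine available" : "sum = " + sum);
    }
}
```

Looking the engine up by name rather than instantiating a specific class keeps the Java side independent of the particular engine that backs it.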

Other languages in this group include the scripting language of Adobe's ColdFusion Web application server (CFML). It can be compiled to Java bytecode and run on the JVM.

The number of these languages under very active development makes it clear that the JVM continues to be a remarkably fertile development platform for new languages.