This is a repost of an article that DZone had as part of its Top Articles of 2011. Solid read.
Java has many areas which can be slow. However for every problem there is a solution. Many solutions/hacks require working around Java's protections but if you need low level performance it is still possible.
Java makes high level programming simpler and easier at the cost of making low level programming much harder. Fortunately most applications follow the rule of thumb that you spend 90% of your time in 10% of the code. This implies you are better off 90% of the time, worse off 10% of the time. 😉
It makes me wonder why you would write more than 10% of your code in C/C++ for most projects. There will be some projects where C/C++ is the only sensible solution, but I suspect most C/C++ projects would be more productive with the use of higher level languages like Java.
One way to get C-like performance is to use C via JNI for key sections of code. If you want to avoid using C or JNI there are still ways you can get the performance you want.
Note: Most of these suggestions only work for standalone applications rather than applets.
Note 2: Use at your own risk. You are likely to need to test edge cases which you wouldn't normally need to worry about when using low level Java.
Fast array access
One area where Java can be slower is array access. This is because it implicitly does bounds checking. The JVM is smart enough to optimise the checks in loops by checking only the first and last element; however, this doesn't always apply.
One workaround is to use the Unsafe class (which is only available on some JVMs; OpenJDK JVMs have it). This class has getXxx() and putXxx() methods for each primitive type and gives you direct access to an object, array or direct memory, where you have to do the bounds checking yourself. In native code, these compile to a single machine code instruction. There are also getObject() and putObject() methods; however, I suspect that they don't provide as much of a performance improvement (by the time you access the Object as well).
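As a rough illustration of the idea above, here is a minimal sketch that sums part of a long[] through Unsafe. The class name UnsafeArraySum and the sum() helper are made up for this example, and it assumes a JVM that still ships sun.misc.Unsafe and lets you read its theUnsafe field by reflection. The single range check at the top stands in for the per-element checks you are skipping; get it wrong and you read memory you don't own.

```java
import java.lang.reflect.Field;
import sun.misc.Unsafe;

final class UnsafeArraySum {
    private static final Unsafe UNSAFE = loadUnsafe();
    // Byte offset of element 0 inside a long[] and the size of one element.
    private static final long BASE = UNSAFE.arrayBaseOffset(long[].class);
    private static final long SCALE = UNSAFE.arrayIndexScale(long[].class);

    private static Unsafe loadUnsafe() {
        try {
            // Unsafe.getUnsafe() rejects application code, so read the singleton field instead.
            Field f = Unsafe.class.getDeclaredField("theUnsafe");
            f.setAccessible(true);
            return (Unsafe) f.get(null);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("sun.misc.Unsafe is not available on this JVM", e);
        }
    }

    /** Sums values[from..to) with one explicit range check instead of one per element. */
    static long sum(long[] values, int from, int to) {
        if (from < 0 || to > values.length || from > to) {
            throw new IndexOutOfBoundsException(from + ".." + to);
        }
        long total = 0;
        for (int i = from; i < to; i++) {
            // Raw read at base + index * scale; nothing checks the index here.
            total += UNSAFE.getLong(values, BASE + i * SCALE);
        }
        return total;
    }

    public static void main(String[] args) {
        long[] data = {1, 2, 3, 4, 5};
        System.out.println(sum(data, 1, 4)); // 2 + 3 + 4
    }
}
```

Whether this actually beats a plain loop depends on the JVM and the code around it, so measure before committing to it.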
You can check the native code generated for a method by downloading the debug version of the OpenJDK and getting it to print the compiled native code.
Arbitrary memory access
You can use the Unsafe class again for arbitrary access; however, a "friendlier" way is to use a DirectByteBuffer and change its address and limit as desired (via reflection or via JNI). This will give you a Buffer which points to an arbitrary area of memory, such as a device buffer.
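A minimal sketch of the Unsafe route, assuming sun.misc.Unsafe is reachable through reflection. It works on memory it allocates itself rather than a device buffer, and the names OffHeapLongs and roundTrip() are invented for the example. The DirectByteBuffer trick is not shown because recent JDKs refuse reflective access to the buffer's internal fields unless the package is opened explicitly on the command line.

```java
import java.lang.reflect.Field;
import sun.misc.Unsafe;

final class OffHeapLongs {
    private static final Unsafe UNSAFE = loadUnsafe();

    private static Unsafe loadUnsafe() {
        try {
            Field f = Unsafe.class.getDeclaredField("theUnsafe");
            f.setAccessible(true);
            return (Unsafe) f.get(null);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("sun.misc.Unsafe is not available on this JVM", e);
        }
    }

    /** Writes count longs (0, 10, 20, ...) to native memory, reads them back and returns their sum. */
    static long roundTrip(int count) {
        long bytes = (long) count * Long.BYTES;
        long address = UNSAFE.allocateMemory(bytes); // a raw native address, not a Java object
        try {
            for (int i = 0; i < count; i++) {
                UNSAFE.putLong(address + (long) i * Long.BYTES, i * 10L);
            }
            long total = 0;
            for (int i = 0; i < count; i++) {
                total += UNSAFE.getLong(address + (long) i * Long.BYTES);
            }
            return total;
        } finally {
            UNSAFE.freeMemory(address); // the GC will never reclaim this for you
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip(4)); // 0 + 10 + 20 + 30
    }
}
```

The same getLong(address) and putLong(address, value) calls work on any address you are allowed to touch; pass a bad one and the JVM crashes rather than throwing.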
Using less memory
This is not as much of an issue as it used to be. A 16 GB server costs $1000 and a 1 TB server costs about $70K.
However, cache memory is still at a premium, and for some applications it's worth cutting memory consumption. A simple thing to do is to use Trove, which supports primitives in collections efficiently. If you have a large table of data, you can store data by column instead of by row (if you have lots of rows of data, and a few columns). This can improve caching behaviour if you are scanning data by field but don't need all fields.
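To make the column idea concrete, here is a small sketch with a made-up TradeColumns table. It uses plain primitive arrays rather than Trove, and the field names are only for illustration.

```java
final class TradeColumns {
    // One primitive array per field (column) instead of one object per row.
    private final long[] timestamps;
    private final double[] prices;
    private final int[] quantities;
    private int size;

    TradeColumns(int capacity) {
        timestamps = new long[capacity];
        prices = new double[capacity];
        quantities = new int[capacity];
    }

    void add(long timestamp, double price, int quantity) {
        if (size == timestamps.length) {
            throw new IllegalStateException("table is full");
        }
        timestamps[size] = timestamp;
        prices[size] = price;
        quantities[size] = quantity;
        size++;
    }

    /** Scans only the price column; the other columns are never touched. */
    double averagePrice() {
        if (size == 0) {
            return 0.0;
        }
        double sum = 0;
        for (int i = 0; i < size; i++) {
            sum += prices[i];
        }
        return sum / size;
    }

    public static void main(String[] args) {
        TradeColumns table = new TradeColumns(1024);
        table.add(1L, 10.0, 5);
        table.add(2L, 20.0, 7);
        System.out.println(table.averagePrice()); // (10.0 + 20.0) / 2
    }
}
```

Compared with an array of row objects, this also drops one object header and one reference per row.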
You can also use Direct memory to store data how you wish. This is what the BigMemory library uses.
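One way this can look with only the standard library is a direct ByteBuffer holding fixed-size records. The record layout (an 8-byte id followed by an 8-byte price) and the DirectRecordStore name are assumptions for this sketch; it is not a description of how BigMemory is implemented.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

final class DirectRecordStore {
    // Hypothetical fixed layout: 8-byte id, then 8-byte price.
    private static final int ID_OFFSET = 0;
    private static final int PRICE_OFFSET = 8;
    private static final int RECORD_SIZE = 16;

    private final ByteBuffer buffer;
    private final int capacity;

    DirectRecordStore(int capacity) {
        this.capacity = capacity;
        // Allocated outside the Java heap, so the GC never has to scan or copy the data.
        this.buffer = ByteBuffer.allocateDirect(capacity * RECORD_SIZE)
                                .order(ByteOrder.nativeOrder());
    }

    void put(int index, long id, double price) {
        int base = base(index);
        buffer.putLong(base + ID_OFFSET, id);
        buffer.putDouble(base + PRICE_OFFSET, price);
    }

    long id(int index) {
        return buffer.getLong(base(index) + ID_OFFSET);
    }

    double price(int index) {
        return buffer.getDouble(base(index) + PRICE_OFFSET);
    }

    private int base(int index) {
        if (index < 0 || index >= capacity) {
            throw new IndexOutOfBoundsException("record " + index);
        }
        return index * RECORD_SIZE;
    }

    public static void main(String[] args) {
        DirectRecordStore store = new DirectRecordStore(1_000);
        store.put(3, 42L, 1.5);
        System.out.println(store.id(3) + " @ " + store.price(3));
    }
}
```

A million records stored this way are one buffer as far as the collector is concerned, instead of a million objects.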
Stream based IO is slow and NIO is a pain to use
How can you have the best of both worlds? Use blocking IO in NIO (which is the default for a Channel). Don't use Selectors unless you need them. In many cases, they just add complexity. Most systems can handle 1K-10K threads efficiently. If you need more connections than that, buy another server; a cheap one costs about $500.
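A minimal sketch of blocking NIO, assuming a loopback echo exchange is a fair stand-in for a real protocol. The BlockingNioEcho class and its methods are invented for the example. There is no Selector and no configureBlocking() call; each connection simply gets a thread that blocks in read() and write().

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.nio.charset.StandardCharsets;

final class BlockingNioEcho {
    /** Accepts one connection and echoes bytes back until the peer closes its side. */
    static void serveOne(ServerSocketChannel server) throws IOException {
        try (SocketChannel client = server.accept()) { // blocks; channels are blocking by default
            ByteBuffer buf = ByteBuffer.allocateDirect(4096);
            while (client.read(buf) != -1) {
                buf.flip();
                while (buf.hasRemaining()) {
                    client.write(buf);
                }
                buf.clear();
            }
        }
    }

    /** Starts a one-shot echo server on an ephemeral loopback port and sends it a short message. */
    static String echoOnce(String message) throws IOException, InterruptedException {
        try (ServerSocketChannel server = ServerSocketChannel.open()) {
            server.bind(new InetSocketAddress(InetAddress.getLoopbackAddress(), 0));
            Thread worker = new Thread(() -> {
                try {
                    serveOne(server);
                } catch (IOException e) {
                    e.printStackTrace();
                }
            });
            worker.start();

            byte[] out = message.getBytes(StandardCharsets.UTF_8);
            ByteBuffer reply = ByteBuffer.allocate(out.length);
            try (SocketChannel channel = SocketChannel.open(server.getLocalAddress())) {
                // Writes everything before reading, which is fine for short messages only.
                ByteBuffer request = ByteBuffer.wrap(out);
                while (request.hasRemaining()) {
                    channel.write(request);
                }
                while (reply.hasRemaining() && channel.read(reply) != -1) {
                    // keep reading until the whole echo has arrived
                }
            }
            worker.join();
            return new String(reply.array(), 0, reply.position(), StandardCharsets.UTF_8);
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(echoOnce("hello"));
    }
}
```

A real server would loop on accept() and hand each SocketChannel to its own thread or a thread pool, but the per-connection code would look much like serveOne().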
Fast Efficient Strings
Java 6 update 21 has an option -XX:+UseCompressedStrings which can use byte[] instead of char[] for the strings which don't need 16-bit characters. For many applications this saves memory but is slower (5%-10%).
Instead you can use your own Text type which wraps a byte[], or get your text data from a ByteBuffer or CharBuffer, or use Unsafe.
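Here is a minimal sketch of such a Text type, assuming your text fits in ISO-8859-1 (one byte per character). The name Latin1Text is made up, and a real version would also want equals(), hashCode() and comparison methods.

```java
import java.nio.charset.StandardCharsets;

final class Latin1Text implements CharSequence {
    private final byte[] bytes;
    private final int offset;
    private final int length;

    Latin1Text(String s) {
        // Assumption: every character fits in ISO-8859-1; anything else is encoded as '?'.
        this(s.getBytes(StandardCharsets.ISO_8859_1));
    }

    private Latin1Text(byte[] bytes) {
        this(bytes, 0, bytes.length);
    }

    private Latin1Text(byte[] bytes, int offset, int length) {
        this.bytes = bytes;
        this.offset = offset;
        this.length = length;
    }

    @Override
    public int length() {
        return length;
    }

    @Override
    public char charAt(int index) {
        if (index < 0 || index >= length) {
            throw new IndexOutOfBoundsException("index " + index);
        }
        return (char) (bytes[offset + index] & 0xFF); // widen the unsigned byte to a char
    }

    @Override
    public CharSequence subSequence(int start, int end) {
        if (start < 0 || end > length || start > end) {
            throw new IndexOutOfBoundsException(start + ".." + end);
        }
        return new Latin1Text(bytes, offset + start, end - start); // shares the backing array
    }

    @Override
    public String toString() {
        return new String(bytes, offset, length, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        Latin1Text text = new Latin1Text("hello world");
        System.out.println(text.subSequence(6, 11)); // world
    }
}
```

Because it implements CharSequence, it can be passed to many APIs that would otherwise take a String, at half the memory per character of a char[].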
Faster Startup times
Java tends to have slow startup times when you load in lots of bloated libraries. If this is really a problem for you, load fewer libraries. Keeping them to a minimum is good practice anyway. Do this and your startup times will be a few seconds (not as fast as C, but likely to be fast enough).
Fewer GC pauses
Most Java libraries create objects freely and generally this is not a problem.
However, this doesn't mean you can't pre-allocate your objects and use direct ByteBuffers and object recycling techniques to minimise your object creation. By increasing the Eden size you can have an application which rarely GCs. You may even reduce it to one GC per day (say, as a scheduled overnight job).
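Object recycling can be as simple as the sketch below: a pool of pre-allocated byte[] buffers. The BufferPool class is hypothetical and deliberately not thread-safe; a shared pool would need a concurrent queue or one pool per thread.

```java
import java.util.ArrayDeque;

final class BufferPool {
    private final ArrayDeque<byte[]> free = new ArrayDeque<>();
    private final int bufferSize;

    BufferPool(int count, int bufferSize) {
        this.bufferSize = bufferSize;
        for (int i = 0; i < count; i++) {
            free.push(new byte[bufferSize]); // allocate everything up front
        }
    }

    /** Hands out a pre-allocated buffer; falls back to a fresh one if the pool is empty. */
    byte[] acquire() {
        byte[] buffer = free.poll();
        return buffer != null ? buffer : new byte[bufferSize];
    }

    /** Returns a buffer to the pool so the next acquire() creates no garbage. */
    void release(byte[] buffer) {
        if (buffer.length != bufferSize) {
            throw new IllegalArgumentException("buffer does not belong to this pool");
        }
        free.push(buffer);
    }

    public static void main(String[] args) {
        BufferPool pool = new BufferPool(16, 4096);
        byte[] first = pool.acquire();
        pool.release(first);
        System.out.println(pool.acquire() == first); // true: the same array is reused
    }
}
```

The catch is that you take on manual lifetime management: a buffer released too early, or never released, is now your bug rather than the collector's job.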