Why I'm Worried About Java's Future

Java's future is on my mind lately. Oracle's new ownership of Java prompts a series of "what will Larry do" questions. But more to the point, the research Mike Gualtieri and I have been doing on massively scaled systems makes me worry that Java technology has fallen behind the times.

This is not a "Java is dead" commentary but rather a discussion of issues as I see them. Java technology is alive and vitally important; we all must be concerned if its future direction isn't clear.

For me, Java's 2-gigabyte-per-JVM memory limitation symbolizes this gap. Volumes of application data are rising, but standard Java platforms still have a practical limitation of 2 GB of memory. I spoke with one customer that incorporates a search process into its app that alone requires 20 GB of memory. This customer employs servers with 6 GB of memory each but can only use this memory in 2 GB chunks, each chunk managed by a JVM in a scale-out architecture.
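As a rough illustration (a minimal sketch, not part of the customer example), any Java process can report the heap ceiling it was started with; on a 32-bit JVM that ceiling typically tops out around 1.5-2 GB, which is why a 20 GB working set ends up split across many processes:

    // Prints the maximum heap this JVM is allowed to grow to
    // (the -Xmx value, or the platform default), in megabytes.
    public class HeapCeiling {
        public static void main(String[] args) {
            long maxBytes = Runtime.getRuntime().maxMemory();
            System.out.printf("Max heap for this JVM: ~%d MB%n",
                    maxBytes / (1024 * 1024));
        }
    }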

We've done pretty well with 2 GB JVMs until now. But as data volumes grow, this company (and others like it) is no longer well served by scale-out JVM architectures. Java technology should give shops the choice of scaling up the memory within an individual JVM as well. Why?

The company with the 20 GB search process would love to have the option of large memory pools within a single JVM, primarily to get scale without code changes. Gap: There is no memory scale-up option in standard Java today. I'm not sure how many shops would select a scale-up approach over a scale-out approach (would love to know what you think). But I've seen enough requirements to be persuaded that having the additional choice is important.

This is the problem: The Java community should have an answer for this requirement, but it does not. The recently announced Managed Runtime Initiative, backed by Azul Systems, seeks to drive a scale-up JVM solution (based on a new JVM combining OpenJDK and Azul's software) at a future date. None of the Java leadership has yet signed on to that initiative. We can only hope that: a) MRI comes up with a standards-aligned scale-up JDK and/or b) MRI inspires the Java community leadership to solve this problem.

Here is what worries me about Java's future:

  • Requirements for a scale-up JVM exist now.
  • Creating a scale-up JVM will take tons of work. The solution must address not only the structure of JVMs but also their interactions with operating systems.
  • Neither Oracle nor IBM -- the leaders of the Java community -- have strong incentives to solve this issue. A massive engineering investment to remake the JVM only to give it away in specs and a reference implementation? I don't think this will happen.
  • It isn't clear who's got the job of solving issues like the 2 GB JVM, aside from the amorphous Java Community Process. Oracle generally embraces strong, direct initiatives but hasn't yet done so with Java's technical direction.

And so the status quo in JVM memory persists, and developers work around that limitation. Makes me worry about the future of Java technology. You? I've added a discussion thread in our Application Development & Delivery Community Site (http://community.forrester.com/thread/2918) for your thoughts.

Comments

Er - what 2Gb JVM limit?

According to http://java.sun.com/docs/hotspot/HotSpotFAQ.html#64bit_heap the 2Gb memory limit applies only to 32-bit JVMs.

The 64-bit JVMs in use on most 64-bit server hardware have no such memory limit.

Best regards,
Darren Hague

64-bit Java

Thanks very much to Darren Hague, Osvaldo Doederlein, and Edward Yavno for bringing 64-bit Java into this conversation. (IBM's Billy Newport also mentioned 64-bit in a tweet.) 64-bit does provide relief, but as Osvaldo points out, with some limitations. I will look further into why 64-bit doesn't come up often in my conversations with clients. That is curious. Edward suggested to me that slow uptake may reflect customer inertia or lack of tools.

Slow adoption

This gets into a different problem: I don't see any technical reason for large server apps not to be using full 64-bit systems today, or for the past few years. Even the Windows Server platform has had good 64-bit support for the last couple of generations. ("Good 64-bit support" includes decent support for vertical scalability -- NUMA architecture, large numbers of CPUs, and high-performance networking; without these, fitting a server with a hundred Gb of RAM is a big waste of money.)

Corporate culture may be one problem. Some companies are really slow to move; it's common for me to have clients that are still dragging their feet with things like Oracle 9i, SQLServer 2005, or WebSphere 6.0, on top of an OS/HW platform that's also a half-decade or more old. In UNIX environments (AIX, Solaris, ...) even conservative clients are all 64-bit, because those platforms moved to 64-bit a long time ago. But for x86 + Windows/Linux shops, the 64-bit generation was available but still new and immature some ~5 years ago, when many current-production systems were developed or planned.

64 bits do not solve Garbage Collection pauses

The practical limit for JVMs on 64-bit platforms today is ~2GB (anywhere from 1GB to 6GB for most apps), and it is driven completely by response time limitations - not by how many bits the CPU or OS can support. JVMs on all major platforms today will occasionally pause to compact the heap. The time leading up to this compaction can be tuned and tweaked, but the compaction itself is unavoidable (a widely acknowledged and recognized fact).

Since compaction is done as a stop-the-world operation in current JVM implementations, useful live data sets are limited by the response time characteristics an application is willing to accept. For everything except batch applications, this means a handful of gigabytes today. It's like a ticking time bomb - do enough work, and compaction must happen before you can proceed with more work.

A 1GB live set will cause a periodic 3-5 second pause on the latest/greatest CMS collector, running on the latest/greatest Xeon cores today. A 20GB live data set (as discussed in the article) will cause a 60-100 second pause. Pauses of that level are considered crashes.

Without solving garbage collection, JVMs will not scale beyond a handful of GB.
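(For readers who want to see what their own collectors are doing: a minimal sketch using the standard java.lang.management API. It reports aggregate collector counts and times, not individual pause lengths; for those you would turn on GC logging, e.g. -verbose:gc.)

    import java.lang.management.GarbageCollectorMXBean;
    import java.lang.management.ManagementFactory;

    // Prints, for each collector in this JVM, how many collections have run
    // and the accumulated time spent in them.
    public class GcTimeProbe {
        public static void main(String[] args) {
            for (GarbageCollectorMXBean gc :
                    ManagementFactory.getGarbageCollectorMXBeans()) {
                System.out.printf("%s: %d collections, %d ms total%n",
                        gc.getName(), gc.getCollectionCount(), gc.getCollectionTime());
            }
        }
    }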

[Disclaimer - I work for Azul Systems - where we solved garbage collection].

Gil, I don't think anybody

Gil,

I don't think anybody argued that 64-bit solves GC pauses, but rather the 2GB memory limit. The GC issues affect both 32- and 64-bit JVMs, albeit perhaps more severely with large heap sizes.

Also, you may not be up to date with non-Azul JVMs.

For example, the JRockit JVM with deterministic GC will never have pause times greater than 5-10 ms under normal (tuned) conditions; nor will it ever do a stop-the-world GC to compact unless there's severe fragmentation - it's usually configured to do partial compaction.

- Ed Y.

"Never... unless" statements

"Never... unless" statements are exactly how apps get in trouble, and how IT pagers end up going off at the worse possible times [that peak, revenue generating load that falls under the "unless" part].

Pauses [not bits] are the root cause of the practical heap size limit, and that limit is in the handful-of-GB range. 64 bits make no difference to pauses, and none to the practical heap limits.

Fragmentation is a fact of life. JRockit's docs are quite up-front about it, and flat out say that compaction *only* happens as a stop-the-world operation. They certainly tune what they can to delay the inevitable fragmentation that will force compaction, but they are also honest about compaction, and what will happen when it occurs:
In fact, here is a direct quote:
“Compaction is done with the application paused. However, it is a necessary evil, because without it, the heap will be useless...” (JRockit RT tuning guide).
[Newer versions use slightly longer language to the same effect].

Don't get me wrong. Real Time JVMs are great for real time things that fit within contained and constrained parameters. But this article is talking about 20GB and larger server applications that would easily fit in today's [and tomorrow's] servers, not about what it takes to run a small real time app with a few 100s of MB of memory.

Read the JRockit specs, their tuning guides, and what they actually claim to do. Their RT stuff is carefully focused; it's just not about using any sizable amount of memory. It's no accident that all the tuning guides use 1GB heaps. When they were actually posting guarantees, they were for <10ms "99% of the time", and <100ms all the time, as long as the heap is no larger than 1GB, no more than 30% populated with actual data, and under conditions of limited throughput (limited allocation rate, limited CPU activity). Break those conditions, and your guarantees are out the window.

Since current commodity servers already hold 12-48 cores and 100s of GB of memory, the target sizes on which JRockit's RT guarantees are focused amount to only ~1% of what a server already holds today.

The reality is that you generally don't find applications [with human response time needs] running with more than a handful of GB in the wild. They don't because things break when they try. Whether it's at 1GB, 2GB or 6GB, it's still a tiny [and shrinking] fraction of what a $10K server holds today.

Gil, I was debating your

Gil, I was debating your statement "... A 1GB live set will cause a periodic 3-5 second pause on the latest/greatest CMS collector ...".
It's an exaggeration. I've given you an example where it's not the case - both from documentation and personal experience, I can assure you that 10 ms max pauses are achievable in real-world apps with 1GB and more.

Low-latency apps are a good example because such delays are not acceptable in that domain, and there are clearly plenty of such apps. JRockit is just one example; there are also real-time JVMs from Sun and IBM that operate within the same predictable millisecond pause range.

I agree with your other points (GC pauses with larger heaps), and Azul's pauseless GC is certainly an important achievement.

I don't see things so bleak...

First off, your comment about the 2Gb limitation is obviously only valid for 32-bit JVMs, but 64-bit JVMs have been available for a long time. The latest JVMs from both Sun/Oracle and IBM even support "compressed references", an optimization that allows a 64-bit VM to have heaps up to 32Gb without the usual footprint cost of doubling the size of every object reference (this cost exists for 64-bit processes written in any language, but the automatic compression trick is not possible in unmanaged languages like C/C++, so the 32->64-bit transition is actually an advantage of the JVM, and potentially of other managed runtimes). The upcoming Oracle JDK 7 will support up to 64Gb in compressed-ref mode, although with a space tradeoff (16-byte allocation alignment; still less waste than doubling all object refs).
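(A HotSpot-specific sketch, assuming a Sun/Oracle 64-bit JVM: the diagnostic MBean reports whether compressed references are actually in effect for the running process. Other vendors expose this differently.)

    import java.lang.management.ManagementFactory;
    import com.sun.management.HotSpotDiagnosticMXBean;

    // Asks HotSpot whether the UseCompressedOops option is enabled.
    public class CompressedOopsCheck {
        public static void main(String[] args) throws Exception {
            HotSpotDiagnosticMXBean hotspot = ManagementFactory.newPlatformMXBeanProxy(
                    ManagementFactory.getPlatformMBeanServer(),
                    "com.sun.management:type=HotSpotDiagnostic",
                    HotSpotDiagnosticMXBean.class);
            System.out.println(hotspot.getVMOption("UseCompressedOops"));
        }
    }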

Granted, 64-bit JVMs are not a solution if their garbage collectors won't deal well with huge heaps. But the latest GC tech in current JVMs is usually good enough for "average huge" heaps. I have production apps on 3Gb heaps where GC pauses are not an issue -- e.g. a ~1s pause every 30s, and that is because the HW and J2EE platform are 5 years old, so we're not using the latest concurrent/parallel GCs; next month the customer is updating the platform and expects a >10X increase in app load, but heap size and GC times are not even on the radar. Of course that's not a response-time-critical system, but very few corporate apps are (telephony, high-frequency trading...). And if I ever need that, current Java tech will surely scale to the ~20Gb zone.

All JVMs are always scrambling to increase heap and GC scalability; e.g. the Oracle JDK will soon have a production-quality release of G1, a new GC that, while not as cool as Azul's, provides another incremental improvement: soft-real-time pauses for heaps of a few Gb, and decently short pauses (good enough for transactional apps) for heaps weighing several tens of Gb. Azul / MRI technology is aimed at another order-of-magnitude boost, toward terabyte-sized heaps... but that is very high-end / niche right now. In fact, I expect that Azul's biggest advantage is for response-time-critical apps with more mundane heap sizes. For things like big distributed cache servers, other optimizations (NUMA-aware allocation and GC, thread-local allocation, etc.) are more critical, and these have been implemented in production JVMs, with no need of Azul's innovations.

I have to disagree with the comment "Neither Oracle nor IBM -- the leaders of the Java community -- have strong incentives to solve this issue." They surely do, as both IBM and Oracle are companies that like to push Big Iron, mainframe/supercomputer-class products. Vertical scaling is their TOP business strategy, because it's where the best margins are. I don't know if Oracle, IBM, and other players will join the MRI, or if they will compete with their own research. But as sure as death and taxes, they have long invested, are investing, and will continue to invest in JVM tech to support mammoth heaps.

2GB is the new 640K

A 256GB commodity server can now be had for ~$18K; a 1TB one for ~$80K. JVMs are limited to using only 2-3% of that capacity at a time now, and the gap is growing. Distributed computing *within* a small computer is what Windows 3.1 did when it hit the 640KB limit on those nice big 32-bit 386 and 486 machines. We're doing it all over again with Java now.

As noted before, this is entirely about pause times -- the aggregate stop-the-world time that JVMs spend compacting the heap without letting the application do any work. Unless compaction is done concurrently (which none of the current JVM collectors do, including the still-in-research-mode G1), a 20GB data set will take tens of seconds of stop-the-world time to compact. With modern Xeon cores generating ~0.5GB/sec of garbage each when doing useful work, and modern systems generating 4-8GB/sec of garbage, heaps in the tens (and 100s) of gigabytes are needed just to keep up with the work and data sets these machines can crunch.

In effect, that's exactly what people do today (they'll run 40 3GB JVMs in a single 128GB server, adding up to 120GB of heap); they just do it in extremely complex, inefficient, and ineffective ways.

Concurrent, compacting GC will solve this problem; it just takes some work on the system stack to make it happen. That's what the Managed Runtime Initiative is hoping to drive. Until then, Azul's Zing Java Virtualization Platform can be used to address the same problem on pretty much any OS today.

Concurrent compaction

While it's true that HotSpot cannot compact the heap concurrently, it can do it in parallel, which is great if you have many cores -- and a server with 100s of Gb typically will. Of course this is best if the server is dedicated to a single container, so the massively parallel GC won't induce a "CPU-starvation pause" in other applications. The process will typically be limited by the bandwidth of the memory hierarchy, so the server gear must be well designed to handle the massive burst of CPU/RAM traffic that happens in a full GC of a very big heap.

Also, you don't really need compaction. JVMs can collect the heap concurrently, and it works pretty well. There are small stop-the-world phases, but they are much smaller than a full mark & sweep. The remaining problem here is fragmentation, but this can typically be managed by heap sizing/tuning and also app design/optimization. (99.99% of all individual Java objects are very small and cause no fragmentation issues... except arrays; so avoid allocating very large temporary arrays, e.g. by preferring a LinkedList over an ArrayList, or a ConcurrentHashMap over a HashMap, or by using pools.) Granted, this is sometimes easier said than done, especially when the GC-unfriendly code is in third-party code, including your J2EE container, as I've already found.

XML is a sure way to fragment your heap.

Fragmentation "only" happens when you have objects of varying sizes coming and going. Unfortunately, XML documents and strings are exactly that - objects of uncontrollable and unpredictable sizes that come and go quickly.

Saying "avoid allocation of large temporary arrays" is the same as saying "avoid using or processing XML". In the real world, you don't have that choice. In the real world, fragmentation happens, and compaction is forced to happen to avoid an out-of-memory crash. In the real world, the compaction takes multiple seconds per live GB.

XML, no big deal

If you follow common-sense programming practices -- e.g. don't load a huge XML document into memory (which needs a huge char[] or byte[] somewhere), just open input streams and feed them to a streaming parser like StAX -- the parser will only use relatively small (and often recyclable) buffers, and the objects it creates are typically small (strings and other objects containing each attribute, qname, etc.). Even good OO/XML mapping libs (like the latest JAXB) aren't bad; they will just additionally allocate big graphs of POJOs. Additionally, JAXP allows you to cache schemas, and you can also recycle/cache/pool the parser objects to reuse their internal buffers.
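(A minimal StAX sketch along those lines; the "order" element and "id" attribute names are made up for illustration. The point is that only small, short-lived objects reach the heap instead of a full DOM.)

    import java.io.FileInputStream;
    import javax.xml.stream.XMLInputFactory;
    import javax.xml.stream.XMLStreamConstants;
    import javax.xml.stream.XMLStreamReader;

    // Streams a potentially huge XML file and pulls out one attribute per
    // element of interest, never materializing the whole document.
    public class StreamingXmlExample {
        public static void main(String[] args) throws Exception {
            XMLStreamReader reader = XMLInputFactory.newInstance()
                    .createXMLStreamReader(new FileInputStream(args[0]));
            try {
                while (reader.hasNext()) {
                    if (reader.next() == XMLStreamConstants.START_ELEMENT
                            && "order".equals(reader.getLocalName())) {
                        System.out.println("order id = "
                                + reader.getAttributeValue(null, "id"));
                    }
                }
            } finally {
                reader.close();
            }
        }
    }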

On the app side, most heavy XML processing has a "streaming" behavior itself, usually ETL-like work (read some XML, store important stuff in tables) or web/transactional; in either case the data extracted from the XML is most often used and discarded in a matter of milliseconds. And GCs love that; if you have this kind of code (even if you have very big concurrency and load) and it's causing major GC issues, you can typically solve it by tuning.

You can fight fragmentation, for example, by increasing the young-gen; apps that have very intense allocation (either short bursts or sustained) need a larger young-gen. The YG does not fragment; it is compacted. It seems I'm suggesting a fix with an obvious gotcha -- making the YG too big means that your generational GCs will be slower, approaching full-GC pause times, so you basically lose the benefit of generational collection, right? No, because in a young-GC the survival ratio is typically very small, and collection time is only proportional to the live set... so a 1Gb young-gen will collect in a blink if it only contains 50Mb of live data. Few applications have individual transactions (or whatever the unit of work: ETL batches, page servers, etc.) that allocate tons of RAM. When increasing the heap due to heavy concurrency, lots of transactions will fit in the YG. But when the young-GC triggers, even in a worst case where each transaction retains all its temp data until completion, most data in the YG will belong to completed transactions. This makes the young-GC cheap (small survivor ratio).
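(A small sketch for checking that generation sizing took effect, using the standard MemoryPoolMXBean API; the heap pool names, such as "Eden Space" or "PS Eden Space", are collector- and vendor-specific, and getMax() may report -1 when a pool has no fixed maximum.)

    import java.lang.management.ManagementFactory;
    import java.lang.management.MemoryPoolMXBean;
    import java.lang.management.MemoryType;

    // Lists each heap memory pool (eden, survivor, old gen, ...) and its
    // configured maximum size.
    public class GenerationSizes {
        public static void main(String[] args) {
            for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
                if (pool.getType() == MemoryType.HEAP) {
                    System.out.printf("%s: max %d MB%n", pool.getName(),
                            pool.getUsage().getMax() / (1024 * 1024));
                }
            }
        }
    }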

On the G1 collector...

G1 is another way to fix the problem, because it does not need to concurrently compact the whole heap! It can collect (including compaction) small pieces of the heap individually (so the max pause time is bounded). This is the whole idea of G1: abandoning the monolithic heap model.
But the jury is still out on G1. The releases that Sun shipped in JDK 6u14+ were very early, as you can judge from the massive number of G1 changes in the JDK 7 project afterwards. I tested those early G1 builds several months ago; they even crashed on me, and I filed a couple of bug reports. My suggestion is to wait for JDK 7 FCS (at least) to evaluate G1's performance on huge heaps. (The G1 team has published a paper demonstrating a better-than-10X reduction in pause times, but only for a couple of benchmarks with relatively modest heap sizes like 200Mb... I'm waiting for some benchmarks in the 20Gb range.)

poorly researched?

As already pointed out, the 2 GB limit does not exist when using a 64-bit JVM on a 64-bit OS. Secondly, even running a 32-bit JVM on 64-bit Linux or Solaris, which is often the case now, can yield ~4 GB.

- Ed Y.

This blog post is not research

Hi Edward; thanks for your comment. Please don't confuse my blog post with published research. I am experimenting with blogs and communities as tools to conduct research (as well as promote it). The final report is yet to come -- after many more conversations and much reading.

John, no confusion here. I

John, no confusion here. I understand it's a blog post with a more fluid discussion, that's why I commented.

Just felt the premise of the post was wrong or at least incomplete without mentioning existing solutions even if they come up short.

Why and if they come up short would be an interesting discussion though.

- Ed Y.

2Gb limit

As many people have pointed out before me, there have been 64-bit address space JVMs for some time now.

The 2Gb limit (originally) comes from the underlying 32-bit operating system's virtual memory addressing, and from how that address space was traditionally partitioned between kernel and user space, instructions, stack, and heap (remember the "break" in UNIX). These problems therefore manifest in any language (not just Java) running on the target OS.

As many have also pointed out, once you have a large heap, many of the lifecycle operations on it (primarily GC) scale similarly and can become problematic.

Also note that Java NIO has had APIs for some time to portably map underlying OS virtual memory into Java, avoiding placing all data on the heap...
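(A minimal sketch of that approach, with "data.bin" as a placeholder file name: a MappedByteBuffer keeps the bulk of the data in OS-managed virtual memory rather than on the garbage-collected heap. A single mapping is limited to 2 GB, so larger files are mapped in slices.)

    import java.io.RandomAccessFile;
    import java.nio.MappedByteBuffer;
    import java.nio.channels.FileChannel;

    // Maps (up to the first 2 GB of) a file into memory and reads one byte,
    // without copying the file contents onto the Java heap.
    public class MappedDataExample {
        public static void main(String[] args) throws Exception {
            RandomAccessFile file = new RandomAccessFile("data.bin", "r");
            try {
                FileChannel channel = file.getChannel();
                long size = Math.min(channel.size(), Integer.MAX_VALUE);
                MappedByteBuffer buffer =
                        channel.map(FileChannel.MapMode.READ_ONLY, 0, size);
                System.out.println("first byte: " + buffer.get(0));
            } finally {
                file.close();
            }
        }
    }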

Finally, there has been considerable research into large-heap management at BEA, Sun, and others to mitigate these problems as heap sizes grow.

Good to "hear your voice" again, Larry

Thank you for chiming in on this discussion, Larry. It has been too long since we've spoken live. We should reconnect -- jrymer@forrester.com.

Hi John...

Love to ... drop me a line!

Lunch is on me!

So what else?

I've learned a lot from the debate over Java memory management, and I think we've identified that an issue does exist. It doesn't seem to be the biggest issue, though. Should I conclude that everyone is satisfied that Java has a strong future?

Has Java a strong future?

I personally believe it has; look at some of the data points:

- Google App Engine's Java platform
- Google Android
- VMWare/SpringSource SpringFramework and their partnership with SalesForce
- the 6m developers worldwide
- the "countless" open source technologies ...

Not only that: with the current "Cloud" phenomenon, think about the JVM/SE itself:

- Portable/ported to multiple OSes & ISAs (ubiquity)
- 32- & 64-bit address space support (large server memory)
- Concurrency primitives (multi-core, parallelism)
- Multiple language support (Java, Scala, Jython, JRuby, Groovy, …)
- Static (JIT) and dynamic code optimization (continuous optimization, code migration)
- Platform for scalable HTTP containers (Tomcat, Jetty)
- Platform for packaging/dependency mgmt (OSGi R4.2 Enterprise)
- Integrated remote management (JMX)
- Integrated remote debugging (JPDA)
- Broad selection of capabilities, utilities, etc. (java.*, javax.*, org.springframework.*, …)
- Excellent IDE integration (Eclipse, NetBeans, …)
- .NET interoperability (Web Services, SOAP, REST)
- Multiple distributed computing solutions (IIOP, JRMP, MOM, HTTP, …)
- I18N/L10N capable

All of these capabilities make it an excellent platform upon which to build a truly elastic cloud.

Also note, for all those budding DSL developers out there: why waste time writing your own VM (Ruby, please take note) when you have the JVM, with several hundred man-years of development in it already... it's a no-brainer.

Java is the platform that will define PaaS for the cloud IMHO

Java is the new Cobol! :) But

Java is the new Cobol! :) But in a good sense: with so much investment in Java infrastructure in the enterprise, it's not going away and will have to progress.

Seriously, what excites me the most is the Java ecosystem: there's a lot of activity in the enterprise systems space (including critical systems), plus all the experimental, innovative, open source stuff going on.

New languages on the JVM, improvements in concurrency to adapt to the growing number of cores, better GC, and new data stores (NoSQL) are just some of the developments that show that the Java platform has a great future.

- Ed Y.

Forgot to mention all the

Forgot to mention all the distributed systems/grid solutions already available and being built on the Java platform. This is an area that will continue to grow rapidly with the increasing amount of data we have to process, and Java is clearly one of the best platforms for that.

another thought in passing ...

One thought that has been at the back of the minds of the JRockit development team for some time, and in particular the "bare metal" LVM team, is the advantage that a JVM coded directly to either the ISA or the hypervisor APIs may have with regard to memory utilization and management...

With a traditional 32- or 64-bit virtual-memory OS "out of the way," a JVM coded directly onto either the instruction set architecture or the hypervisor (HAT) avoids the "overhead" of the OS process and VM models and could potentially take better advantage of the physical memory of the host server platform (and of that exposed via the virtualization layer's memory model).

This could free up more of the virtual address space for the Java heap, allow synergies between the JVM and the hypervisor's memory management, and enable more efficient GC and other memory-related operations...

Sadly, I wonder if any of this will now be realized...
