Why is memory management so visible in Java VM?
I'm playing around with writing some simple Spring-based web apps and deploying them to Tomcat. Almost immediately, I run into the need to customize the Tomcat's JVM settings with -XX:MaxPermSize (and -Xmx and -Xms); without this, the server easily runs out of PermGen space.
Why is this such an issue for Java VMs compared to other garbage collected languages? Comparing counts of "tune X memory usage" for X in Java, Ruby, Perl and Python, shows that Java has easily an order of magnitude more hits in Google than the other languages combined.
I'd also be interested in references to technical papers/blog-posts/etc explaining design choices behind JVM GC implementations, across different JVMs or compared to other interpreted language VMs (e.g. comparing Sun or IBM JVM to Parrot). Are there technical reasons why JVM users still have to deal with non-auto-tuning heap/permgen sizes?
The title of your question is misleading (not on purpose, I know): PermSize issues (and there are a lot of them, I was one of the first one to diagnose a Tomcat/Sun PermGen issue years ago, when there wasn't any knowledge on the issue yet) are not a Java specifity but a Sun VM specifity.
If you use a VM that doesn't use permanent generation (like, say, an IBM VM if I'm not mistaken) you cannot have permgen issues.
So it's is not a "Java" problem, but a Sun VM implementation problem.
Java gives you a bit more control about memory -- strike one for people wanting to apply that control there, vs Ruby, Perl, and Python, which give you less control on that. Java's typical implementation is also very memory hungry (because it has a more advanced garbage collection approach) wrt the typical implementations of the dynamic languages... but if you look at JRuby or Jython you'll find it's not a language issue (when these different languages use the same underlying VM, memory issues are pretty much equalized). I don't know of a widespread "Perl on JVM" implementation, but if there's one I'm willing to bet it wouldn't be measurably different in terms of footprint from JRuby or Jython!
Python/Perl/Ruby allocate their memory with malloc() or an optimization thereof. The limit to the heap space is determined by the operating system rather than the VM, so there's no need for options like -Xmxn. Also, the garbage collection is simpler, based mostly on reference counting. So there's a lot less to fine-tune.
Furthermore, dynamic languages tend to be implemented with bytecode interpreters rather than JIT compilers, so they aren't used for performance-critical code anyway.
The essence of @WizardOfOdds and @Alex-Martelli's answers appear to be correct: Java has an advanced set of GC options, and sometimes you need to tune them. However, I'm still not entirely clear on why you might design a JVM with or without a permanent generation. I have found a bunch of useful links about garbage collection in Java, though not necessarily in comparison to other languages with GC. Briefly:
- The Sun GC evolves very slowly due to the fact that it is deployed everywhere and people may rely on quirks in its implementation.
- Sun has detailed white papers on GC design and options, such as Tuning Garbage Collection with the 5.0 Java[tm] Virtual Machine.
- There is a new GC in the wings, called the G1 GC. Alex Miller has a good summary of relevant blog posts and a link to the technical paper. But it still has a permanent generation (and doesn't necessarily do a great job with it).
- Jon Masamitsu has (had?) an interesting blog at Sun various details of garbage collection.
Happy to update this answer with more details if anyone has them.
This is because Tomcat is running in the Java Virtual Machine, while other languages are either compiled or interpreted and run against your actual machine. When you set -Xmx and -Xms you are saying that you want to JVM to run like a computer with am amount of ram somewhere in the set range.
I think the reason so many people run in to this is that the default values are relatively low and people end up hitting the default ceiling pretty quickly (instead of waiting until you run out of actual ram as you would with other languages).