Why do I get an OutOfMemoryError when inserting 50,000 objects into HashMap?
I am trying to insert about 50,000 objects (and therefore 50,000 keys) into a java.util.HashMap<java.awt.Point, Segment>. However, I keep getting an OutOfMemory exception. (Segment is my own class - very light weight - one String field, and 3 int fields).
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space at java.util.HashMap.resize(HashMap.java:508) at java.util.HashMap.addEntry(HashMap.java:799) at java.util.HashMap.put(HashMap.java:431) at bus.tools.UpdateMap.putSegment(UpdateMap.java:168)
This seems quite ridiculous since I see that there is plenty of memory available on the machine - both in free RAM and HD space for virtual memory.
Is it possible Java is running with some stringent memory requirements? Can I increase these?
Is there some weird limitation with HashMap? Am I going to have to implement my own? Are there any other classes worth looking at?
(I am running Java 5 under OS X 10.5 on an Intel machine with 2GB RAM.)
You can increase the maximum heap size by passing -Xmx128m (where 128 is the number of megabytes) to java. I can't remember the default size, but it strikes me that it was something rather small.
You can programmatically check how much memory is available by using the Runtime class.
// Get current size of heap in bytes long heapSize = Runtime.getRuntime().totalMemory(); // Get maximum size of heap in bytes. The heap cannot grow beyond this size. // Any attempt will result in an OutOfMemoryException. long heapMaxSize = Runtime.getRuntime().maxMemory(); // Get amount of free memory within the heap in bytes. This size will increase // after garbage collection and decrease as new objects are created. long heapFreeSize = Runtime.getRuntime().freeMemory();
(Example from Java Developers Almanac)
This is also partially addressed in Frequently Asked Questions About the Java HotSpot VM, and in the Java 6 GC Tuning page.
Some people are suggesting changing the parameters of the HashMap to tighten up the memory requirements. I would suggest to measure instead of guessing; it might be something else causing the OOME. In particular, I'd suggest using either NetBeans Profiler or VisualVM (which comes with Java 6, but I see you're stuck with Java 5).
Another thing to try if you know the number of objects beforehand is to use the HashMap(int capacity,double loadfactor) constructor instead of the default no-arg one which uses defaults of (16,0.75). If the number of elements in your HashMap exceeds (capacity * loadfactor) then the underlying array in the HashMap will be resized to the next power of 2 and the table will be rehashed. This array also requires a contiguous area of memory so for example if you're doubling from a 32768 to a 65536 size array you'll need a 256kB chunk of memory free. To avoid the extra allocation and rehashing penalties, just use a larger hash table from the start. It'll also decrease the chance that you won't have a contiguous area of memory large enough to fit the map.
The implementations are backed by arrays usually. Arrays are fixed size blocks of memory. The hashmap implementation starts by storing data in one of these arrays at a given capacity, say 100 objects.
If it fills up the array and you keep adding objects the map needs to secretly increase its array size. Since arrays are fixed, it does this by creating an entirely new array, in memory, along with the current array, that is slightly larger. This is referred to as growing the array. Then all the items from the old array are copied into the new array and the old array is dereferenced with the hope it will be garbage collected and the memory freed at some point.
Usually the code that increases the capacity of the map by copying items into a larger array is the cause of such a problem. There are "dumb" implementations and smart ones that use a growth or load factor that determines the size of the new array based on the size of the old array. Some implementations hide these parameters and some do not so you cannot always set them. The problem is that when you cannot set it, it chooses some default load factor, like 2. So the new array is twice the size of the old. Now your supposedly 50k map has a backing array of 100k.
Look to see if you can reduce the load factor down to 0.25 or something. this causes more hash map collisions which hurts performance but you are hitting a memory bottleneck and need to do so.
Use this constructor:
You probably need to set the flag -Xmx512m or some larger number when starting java. I think 64mb is the default.
Edited to add: After you figure out how much memory your objects are actually using with a profiler, you may want to look into weak references or soft references to make sure you're not accidentally holding some of your memory hostage from the garbage collector when you're no longer using them.
Also might want to take a look at this:
Implicit in these answers it that Java has a fixed size for memory and doesn't grow beyond the configured maximum heap size. This is unlike, say, C, where it's constrained only by the machine on which it's being run.
By default, the JVM uses a limited heap space. The limit is JVM implementation-dependent, and it's not clear what JVM you are using. On OS's other than Windows, a 32-bit Sun JVM on a machine with 2 Gb or more will use a default maximum heap size of 1/4 of the physical memory, or 512 Mb in your case. However, the default for a "client" mode JVM is only 64 Mb maximum heap size, which may be what you've run into. Other vendor's JVM's may select different defaults.
Of course, you can specify the heap limit explicitly with the -Xmx<NN>m option to java, where <NN> is the number of megabytes for the heap.
As a rough guess, your hash table should only be using about 16 Mb, so there must be some other large objects on the heap. If you could use a Comparable key in a TreeMap, that would save some memory.
See "Ergonomics in the 5.0 JVM" for more details.
The Java heap space is limited by default, but that still sounds extreme (though how big are your 50000 segments?)
I am suspecting that you have some other problem, like the arrays in the set growing too big because everything gets assigned into the same "slot" (also affects performance, of course). However, that seems unlikely if your points are uniformly distributed.
I'm wondering though why you're using a HashMap rather than a TreeMap? Even though points are two dimensional, you could subclass them with a compare function and then do log(n) lookups.
Random thought: The hash buckets associated with HashMap are not particularly memory efficient. You may want to try out TreeMap as an alternative and see if it still provide sufficient performance.