How to avoid OutOfMemoryError in a memory-intensive Java application?

We have developed a Java application whose primary objective is to read an input file, process it, and convert it into a set of output files.

(I have given a generic description of our solution, to avoid irrelevant details).

This program works perfectly fine when the input file is 4 GB, with memory settings of -Xms4096m -Xmx16384m on a machine with 32 GB of RAM.

Now we need to run our application with an input file of 130 GB.

We used a Linux box with 250 GB of RAM and memory settings of -Xms40g -Xmx200g (we also tried a couple of other variations) to run the application, and hit an OutOfMemoryError.

At this stage of our project it's very hard to consider redesigning the code to accommodate Hadoop (or some other large-scale data processing framework). Also, the most we can afford in terms of hardware is the current 250 GB of RAM.

Can you please suggest ways to avoid OutOfMemoryError? What is the general practice when developing this kind of application?

Thanks in advance


The most obvious thing to try is to not keep the whole file in memory (if possible). You may be able to process it in chunks, keeping just one or a few chunks in memory at any given moment rather than the whole file.
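As a rough illustration of the chunked approach, here is a minimal sketch. It assumes the input is line-oriented text and that each record can be processed independently, which may not match your actual format. The names `ChunkedProcessor`, `processRecord` and `chunkSize` are made up for this example, and the upper-casing is only a placeholder for the real processing.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

class ChunkedProcessor {

    // Hypothetical per-record transformation; a placeholder for the real processing step.
    static String processRecord(String line) {
        return line.toUpperCase(Locale.ROOT);
    }

    // Reads the input in chunks of at most chunkSize lines, so that only one
    // chunk is held in memory at a time. Returns the number of records written.
    static long convert(Path input, Path output, int chunkSize) throws IOException {
        if (chunkSize < 1) {
            throw new IllegalArgumentException("chunkSize must be at least 1");
        }
        long written = 0;
        try (BufferedReader reader = Files.newBufferedReader(input, StandardCharsets.UTF_8);
             BufferedWriter writer = Files.newBufferedWriter(output, StandardCharsets.UTF_8)) {
            List<String> chunk = new ArrayList<>(chunkSize);
            String line;
            while ((line = reader.readLine()) != null) {
                chunk.add(line);
                if (chunk.size() == chunkSize) {
                    written += flush(chunk, writer);
                }
            }
            written += flush(chunk, writer); // last, possibly partial, chunk
        }
        return written;
    }

    // Processes and writes one chunk, then clears it so its records can be garbage-collected.
    private static int flush(List<String> chunk, BufferedWriter writer) throws IOException {
        int n = chunk.size();
        for (String record : chunk) {
            writer.write(processRecord(record));
            writer.newLine();
        }
        chunk.clear();
        return n;
    }
}
```

With this shape, heap usage depends on `chunkSize` and the record length rather than on the size of the input file. If the processing needs state that spans the whole file, that state is what has to be kept small or moved to disk.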

Just try to use as little memory as you can. For example, don't keep the whole file in memory; write intermediate data to disk instead.

Hadoop HDFS, for example, does that for you. Also check that you don't have any memory leaks, using a good profiler or a heap dump analyser.

A custom solution could be to still use plain files, but to organise access in a page-like manner. For example, Java has MappedByteBuffer, which lets you map a certain section of a file into memory for faster access (it had certain problems before Java 7 that caused unpredictable unmapping, but as far as I know they have since been fixed).
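A minimal sketch of that page-like access, assuming a read-only pass over the file. `MappedWindowReader`, `countByte` and `windowSize` are names invented for this example, and counting a byte merely stands in for real work on each window.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

class MappedWindowReader {

    // Walks the file one mapped window at a time and counts occurrences of target.
    // windowSize is a hypothetical tuning parameter; a single mapping cannot be
    // larger than Integer.MAX_VALUE bytes, so an int is enough here.
    static long countByte(Path file, byte target, int windowSize) throws IOException {
        if (windowSize < 1) {
            throw new IllegalArgumentException("windowSize must be at least 1");
        }
        long count = 0;
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            long fileSize = channel.size();
            for (long position = 0; position < fileSize; position += windowSize) {
                // The last window may be shorter than windowSize.
                long length = Math.min(windowSize, fileSize - position);
                MappedByteBuffer window =
                        channel.map(FileChannel.MapMode.READ_ONLY, position, length);
                while (window.hasRemaining()) {
                    if (window.get() == target) {
                        count++;
                    }
                }
            }
        }
        return count;
    }
}
```

For example, `MappedWindowReader.countByte(path, (byte) '\n', 64 * 1024 * 1024)` would count newlines using 64 MB windows. The mapped contents live outside the normal Java heap, so they do not count against -Xmx, but a mapping is only released once its buffer has been garbage-collected. Records that straddle a window boundary need extra handling that this sketch leaves out.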

