What's the difference between generational and incremental garbage collection?
I think that both (generational and incremental) are different approaches to make the garbage collection pauses faster. But what are the differences between generational and incremental? How do they work? And wich one is better for real time software / produces less long pauses?
Also, the Boehm GC is any of those?
A generational GC is always incremental, because it does not collect all unreachable objects during a cycle. Conversely, an incremental GC does not necessarily employ a generation scheme to decide which unreachable objects to collect or not.
A generational GC divides the unreachable objects into different sets, roughly according to their last use - their age, so to speak. The basic theory is that objects that are most recently created, would become unreachable quickly. So the set with 'young' objects is collected in an early stage.
An incremental GC may be implemented with above generational scheme, but different methods can be employed to decide which group of objects should be sweeped.
One might look at this wikipedia page and further downward, for more information on both GC methods.
According to Boehm's website, his GC is incremental and generational:
The collector uses a mark-sweep algorithm. It provides incremental and generational collection under operating systems which provide the right kind of virtual memory support.
As far as a real time environment is concerned, there are several academic research papers describing new and ingenious ways to do garbage collection:
- Nonblocking Real-Time Garbage Collection
- Real-time garbage collection by IBM has good explanation on differences.
An incremental garbage collector is any garbage-collector that can run incrementally (meaning that it can do a little work, then some more work, then some more work), instead of having to run the whole collection without interruption. This stands in contrast to old stop-the-world garbage collectors that did e.g. a mark&sweep without any other code being able to work on the objects. But to be clear: Whether an incremental garbage collector actually runs in parallel to other code executing on the same objects is not important as long as it is interruptable (for which it has to e.g. distinguish between dirty and clean objects).
A generational garbage collector differentiates between old, medium and new objects. It can then do copying GC on the new objects (keyword "Eden"), mark&sweep for the old objects and different possibilities (depending on implementation) on the medium objects. Depending on implementation the way the generations of objects are distinguished is either by region occupied in memory or by flags. The challenge of generational GC is to keep lists of objects that refer from one generation to the other up to date.
Boem is an incremental generational GC as cited here: http://en.wikipedia.org/wiki/Boehm_garbage_collector
Some tracing garbage collection algorithms can pause in the middle of a collection cycle while the mutator continues, without ending up with inconsistent data. Such collectors can operate incrementally and are suitable for use in an interactive system.
Primitive garbage collectors(1), once they start a collection cycle, must either finish the task, or abandon all their work so far. This is often an appropriate restriction, but is unacceptable when the system must guarantee response times; for example, in systems with a user interface and in real-time hardware control systems. Such systems might use incremental garbage collection so that the time-critical processing and the garbage collection can proceed effectively in parallel, without wasted effort.
Generational garbage collection is tracing garbage collection that makes use of the generational hypothesis. Objects are gathered together in generations. New objects are allocated in the youngest or nursery generation, and promoted to older generations if they survive. Objects in older generations are condemned less frequently, saving CPU time.
It is typically rare for an object to refer to a younger object. Hence, objects in one generation typically have few references to objects in younger generations. This means that the scanning of old generations in the course of collecting younger generations can be done more efficiently by means of remembered sets.
In some purely functional languages (that is, without update), all references are backwards in time, in which case remembered sets are unnecessary.
The Boehm-Demers-Weiser has an incremental mode that you can enable by calling GC_enable_incremental. See http://www.hpl.hp.com/personal/Hans_Boehm/gc/gcinterface.html