Here is a log-log plot of garbage collection counts as a function of heap size. The green line is the predicted number of collections, assuming that the product of the heap size and the GC count is constant. The red points are the actual GC counts for MIT/GNU Scheme compiling itself.
As you can see, the line matches pretty well in the middle, but not so much at the ends. It took me a while, but I figured out that the discrepancy at the lower right of the curve (where the heap size is very large and the number of collections quite small) can be attributed to a few extra garbage collections that are triggered for the purpose of compaction. If we take these into account, our graph looks more like this:
That takes care of the lower end, but it is the discrepancy at the upper end of the curve (smaller heaps and more frequent GCs) that is more interesting.
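Before digging into that, here is a small sketch of the prediction so far in Scheme. It models the green line and the compaction fix, not the code used to produce the plots, and the number of extra compaction collections is an illustrative constant rather than a measured value:

    ;; If (heap size) x (GC count) is constant, the predicted count is the
    ;; total allocation divided by the heap size (both in words).
    (define (predicted-gc-count total-allocation heap-size)
      (/ total-allocation heap-size))

    ;; The extra collections triggered for compaction shift the whole curve
    ;; up by a small constant.  EXTRA-GCS is illustrative, not measured.
    (define (predicted-gc-count/compaction total-allocation heap-size extra-gcs)
      (+ (predicted-gc-count total-allocation heap-size) extra-gcs))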
Dividing the total amount of memory used by the size of the heap gives us a pretty good estimate of the number of garbage collections, provided that each garbage collection frees the entire heap. This is generally not the case in a real program. We should instead be dividing the total amount of memory used by the average amount of storage freed at each garbage collection. In this next chart, we adjust for both the extra collections and the space in use that cannot be collected.
In this chart, we can see the effect of not reclaiming 100% of the storage on each collection. When the heap size is small, the memory that survives garbage collection becomes a larger fraction of that heap. This causes the GC count to increase faster than it would if each collection could free the entire heap.
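In model terms, each collection frees only the heap size minus the retained storage, so the divisor shrinks as the heap shrinks. A sketch of the adjusted prediction (the function name and the extra-GC parameter are mine):

    ;; Each collection frees only (heap-size - retained) words, so the
    ;; predicted count grows rapidly as the heap size approaches the
    ;; averaged amount of storage retained at each GC.
    (define (predicted-gc-count/retained total-allocation heap-size retained extra-gcs)
      (+ (/ total-allocation (- heap-size retained)) extra-gcs))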
The line has a vertical asymptote where the heap size is equal to the averaged amount of storage retained at each GC. Basically, when the heap is only a little larger than the amount of storage retained, you have to do a lot of extra collections. This next chart is simply a close-up of the previous one, showing the divergence more clearly.
It is interesting to see that retaining a modest amount of storage (empirically, about 400K words) can have a noticeable effect even when the heap is tens of times larger. (Remember, the vertical scale is logarithmic, so differences that appear small are actually quite a bit larger than they look.)
If it is possible to use a heap ten, twenty, fifty, or even a hundred times larger than the averaged amount of storage retained, then we'll be beyond the worst part of the curve and down in the linear region. Another way to state this is that you want the amount of storage retained at each GC to be only a few percent of the entire heap.
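To put rough numbers on that using the sketch above (the 10^9 words of total allocation here is an illustrative figure, not a measurement from these runs):

    ;; 400K words retained, no compaction GCs, 10^9 words allocated.
    (predicted-gc-count/retained 1e9 (*  2 400000) 400000 0)  ; => 2500.
    (predicted-gc-count/retained 1e9 (* 50 400000) 400000 0)  ; => about 51.

With a heap only twice the retained storage you do twice as many collections as the ideal prediction of 1250, but with a heap fifty times the retained storage you are within a couple of percent of the ideal 50.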
It looks like you've pretty thoroughly analyzed the effect of heap size on GC.
But your 400K word working set fits comfortably in many contemporary processors' caches. Is it possible to measure an effect on runtime as the heap's size grows past the size of the L2 cache?
I guess the first thing to check would be to plot run time versus number of GCs and look for nonlinearity.