MIT/GNU Scheme has a command-line option, `--heap`, that allows one to specify the size of the heap in blocks of 1024 words. On a 32-bit machine, the heap size is limited because MIT/GNU Scheme stores the type tag in the upper bits of the word. On a 64-bit machine the word is wider, so it is possible to specify extremely large heaps without worrying that an address will overflow into the type tag.

For kicks, I decided to vary the heap size and see what effect that had on the task of MIT/GNU Scheme compiling itself.
Heap (1024-word blocks) | GC Count | Cumulative GC Time (ms) |
---|---|---|
6000 | 1056 | 25800 |
8192 (64 MiB) | 764 | 19120 |
10000 | 623 | 17910 |
16384 (128 MiB) | 377 | 10980 |
32768 (256 MiB) | 188 | 7350 |
50000 | 124 | 5560 |
65536 (512 MiB) | 94 | 5000 |
100000 | 63 | 4410 |
131072 (1 GiB) | 48 | 3910 |
150000 | 42 | 4160 |
200000 | 32 | 3800 |
250000 | 26 | 3560 |
262144 (2 GiB) | 25 | 3050 |
300000 | 22 | 3360 |
350000 | 19 | 3260 |
393216 (3 GiB) | 18 | 2740 |
Collecting this data was tedious, so I spent a fair amount of time thinking about a model that could explain the results. At first glance, everything looks about right: the more memory, the fewer GCs and the less total time is spent garbage collecting. The problem is that the curve predicted by my model doesn't fit the empirical data at all. Of course this means that my model is either wrong or incomplete.
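For concreteness, here is one naive first-order model — my own sketch, not necessarily the model I had in mind at the time. Assume the compile allocates some fixed total number of words, and that each collection reclaims the whole heap (i.e., ignore live data surviving each GC). Then the number of collections should be inversely proportional to the heap size. The names `predicted_gc_count` and `total_alloc` are illustrative, and calibrating from the first table row is an arbitrary choice:

```python
# Naive model: if a fixed total of `total_alloc` words is allocated over the
# whole compile, and every collection frees the entire heap of `heap_words`
# words (no surviving live data), then the number of collections is roughly
# total_alloc / heap_words.
def predicted_gc_count(heap_words, total_alloc):
    """Predicted number of collections needed to allocate total_alloc words."""
    return total_alloc / heap_words

# Calibrate total_alloc from the first row of the table:
# heap = 6000 blocks of 1024 words, 1056 collections observed.
total_alloc = 1056 * 6000 * 1024   # roughly 6.5e9 words allocated

# Compare the model's predictions against a few observed rows.
for heap_blocks, observed in [(8192, 764), (16384, 377), (65536, 94)]:
    predicted = predicted_gc_count(heap_blocks * 1024, total_alloc)
    print(heap_blocks, round(predicted), observed)
```

With these assumptions the predicted counts land reasonably close to the observed ones, so the interesting mismatch is in what this simple inverse-proportional picture predicts for the cumulative GC *time* curve.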
I have since figured out the problem, but I thought I'd let people puzzle over it themselves for a bit, first.