Abstract Heresies: November 2011

Wednesday, November 23, 2011

Imagine that we have a boatload of memory and we're nowhere near the limit. Suppose, too, that we allocate memory simply by bumping the free pointer. If you think about it, you can see that the age of a recently allocated object, measured in allocations, is simply the distance between that object's location and the current value of the free pointer. Little's Law tells us that the mean age of a heap allocated object is equal to the mean size of the reachable heap, so it must be the case that the mean distance from the reachable objects to the free pointer is also equal to the mean size of the heap (blah, blah, satisfy boundary conditions, blah, in the limit, etc.).
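Here is a minimal sketch of that bookkeeping in Python. It is not MIT/GNU Scheme's allocator; the class and method names are made up for illustration. It assumes objects are never moved and that age is counted in allocation units (words).

```python
class BumpHeap:
    """Toy heap that allocates by bumping a free pointer (hypothetical names)."""

    def __init__(self):
        self.free = 0  # address of the next free word

    def allocate(self, size=1):
        """Reserve `size` words and return the address of the new object."""
        address = self.free
        self.free += size
        return address

    def age(self, address):
        """Age of the object at `address`: its distance from the free pointer."""
        return self.free - address


heap = BumpHeap()
first = heap.allocate()
second = heap.allocate()
third = heap.allocate()

# The oldest object is the farthest from the free pointer.
print(heap.age(first), heap.age(second), heap.age(third))
```

Nothing is stored per object here: the address alone tells you the age, which is the point of the observation above.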

A few weeks ago, someone asked about the improvement in locality that one may get from the garbage collector. The vast majority of objects don't survive even a single garbage collection, so the vast majority of objects are laid out in memory exactly where they were consed and cannot benefit from any improvement in locality. This is not to say that there is no effect at all, but that the effect doesn't apply to the vast majority of objects.

Someone also asked what the GC pauses were. With a large heap (131072 blocks), the total time for MIT/GNU Scheme compiling itself was 167.7 seconds runtime, 1.06 seconds GC time. This comes out to about 22 milliseconds per GC.
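As a rough sanity check on those figures, the arithmetic can be sketched as follows. The number of collections is not reported above; the value computed here is only what the quoted times would imply, and the variable names are assumptions for illustration.

```python
runtime_seconds = 167.7   # reported runtime
gc_seconds = 1.06         # reported total GC time
pause_seconds = 0.022     # reported approximate time per GC

# Ratio of GC time to runtime.
gc_ratio = gc_seconds / runtime_seconds

# Number of collections implied by the figures (an inference, not a measurement).
implied_collections = round(gc_seconds / pause_seconds)

print(f"GC time / runtime: {gc_ratio:.2%}")
print(f"implied number of collections: {implied_collections}")
```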

Friday, November 18, 2011

Little's Law and the weak generational hypothesis

Theorem: The mean lifetime of a heap allocated object is equal to the mean amount of reachable storage.
Proof: This is Little's Law with λ=1.

This turns out to be a pretty handy theorem. Here's an example.

In a prior post, I mentioned that a prerequisite for effective garbage collection is that the amount of reachable storage must at all times be less than the amount of memory available for the heap. We apply Little's Law and get this: The mean lifetime of a heap allocated object must be less than the time it takes to exhaust the heap.

That naturally follows: in order for garbage collection to “work”, we need something to become garbage before we completely run out of memory. But Little's Law allows us to make a much stronger statement. The mean lifetime of all objects (where lifetime is measured in subsequent allocations) is, by Little's Law, equal to the mean size of reachable storage. The mean size of the reachable storage has to be less than the amount of memory available for the heap. Since the mean lifetime of an object must therefore be small, and we have the occasional very long-lived object, it must be the case that the vast majority of objects have very short lifetimes. Or in other words,

Most allocated objects die young.

This statement is often called the weak generational hypothesis. It is usually presented as an empirical observation. By Little's Law, it is equivalent to saying that the size of the reachable objects is small, which is not a hypothesis, but a strong pre-condition of garbage collection.
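One way to put a number on “vast majority” is Markov's inequality, which is not part of the argument above but follows from the same premise: if the mean lifetime is m, then at most m/a of the objects can live for a allocations or longer. The sketch below checks the bound against made-up lifetimes; the function name and the sample data are hypothetical.

```python
def max_fraction_living_at_least(mean_lifetime, age):
    """Markov bound on the fraction of objects whose lifetime is >= age."""
    return min(1.0, mean_lifetime / age)


# Made-up sample: 990 short-lived objects and 10 very long-lived ones.
lifetimes = [1] * 990 + [100_000] * 10
mean_lifetime = sum(lifetimes) / len(lifetimes)

age = 10_000
actual = sum(1 for t in lifetimes if t >= age) / len(lifetimes)
bound = max_fraction_living_at_least(mean_lifetime, age)

print(f"mean lifetime: {mean_lifetime}")
print(f"fraction living at least {age} allocations: {actual} (bound {bound:.4f})")
```

The bound is loose, but it only needs the mean: the smaller the reachable heap is relative to the age you care about, the smaller the fraction of objects that can reach that age.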

Wednesday, November 16, 2011

Little's Law and Lisp

We model the Lisp heap as a queuing system.

A “queuing system” consists of discrete objects we shall call “items” that “arrive” at some rate to the “system.” Within the system the items may form one or more queues and eventually receive “service” and exit.
-- Graves and Little in Building Intuition: Insights from Basic Operations Management Models and Principles, edited by D. Chhajed and T. J. Lowe, Springer Science+Business Media, LLC, New York, 2008

Our discrete objects are the objects returned by cons, make-vector, make-string, and the other primitive constructors. The heap is where these objects wait, and the objects “exit” the system when they become unreachable.
We can now apply Little's Law:

L=λW

We will set λ, the average number of items arriving per unit time, to 1. In other words, we normalize time so that one unit of time corresponds to one allocation. Little's Law simplifies to:

L/λ=W (when λ is 1)
L=W

We let L, the average number of items in the queuing system, be the size of the reachable heap in allocation units. W, the average waiting time in the system for an item, is therefore the mean lifetime of an object. Provided that we meet the necessary boundary conditions: The mean lifetime of a heap allocated object is equal to the mean size of the reachable heap.
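A small simulation makes the identity concrete. This is a sketch under stated assumptions, not a measurement of any real heap: one object is allocated per tick (so λ=1), lifetimes come from a made-up distribution, and anything still alive when the run ends is treated as dying at that moment, so the heap starts and ends empty. The function name and the distribution are hypothetical.

```python
import random


def simulate(n_allocations, seed=0):
    """Return (mean reachable heap size, mean object lifetime) for a toy heap."""
    rng = random.Random(seed)
    # deaths[t] = number of objects that become unreachable at tick t
    deaths = [0] * (n_allocations + 1)
    lifetimes = []
    live = 0
    heap_size_sum = 0

    for t in range(n_allocations):
        live -= deaths[t]  # objects whose lifetime ended at this tick

        # Made-up distribution: mostly short-lived, occasionally long-lived.
        wanted = 1000 if rng.random() < 0.01 else rng.randint(1, 5)
        # The heap is discarded when the run ends, which caps every lifetime.
        lifetime = min(wanted, n_allocations - t)

        lifetimes.append(lifetime)
        deaths[t + lifetime] += 1
        live += 1               # exactly one allocation per tick
        heap_size_sum += live   # sample the reachable heap once per tick

    mean_heap = heap_size_sum / n_allocations
    mean_lifetime = sum(lifetimes) / n_allocations
    return mean_heap, mean_lifetime


mean_heap, mean_lifetime = simulate(100_000)
print(f"mean reachable heap size: {mean_heap}")
print(f"mean object lifetime:     {mean_lifetime}")
```

Each object is counted in the heap once for every tick it is alive, so the two sums are the same number arrived at two ways. That is all Little's Law is saying here.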

We are measuring object lifetime in “allocations”. If desired, we could measure the allocation rate in allocations per unit of time, and then we would get the object lifetime in units of time as well. Instead, we are determining the average number of word allocations that occur before an object becomes unreachable.

The boundary conditions for Little's Law are quite lax. If the heap begins and ends empty we can satisfy them. The heap naturally starts empty, and when a program ends, we discard the heap, so we can pretend it is empty again. (This puts an upper limit on the lifetime of all objects.) As I noted in a previous post, the mean heap size for MIT/GNU Scheme compiling itself is about 300,000 words. Therefore, the mean object lifetime is about 300,000 word allocations.

I will expand on this in the next post...

Tuesday, November 15, 2011

Little's Law

In Chapter 5 of Building Intuition: Insights from Basic Operations Management Models and Principles, edited by D. Chhajed and T. J. Lowe, Springer Science+Business Media, LLC, New York, 2008, Steven Graves and John Little discuss Little's Law:

A “queuing system” consists of discrete objects we shall call “items” that “arrive” at some rate to the “system.” Within the system the items may form one or more queues and eventually receive “service” and exit.

Little's Law says that, under steady state conditions, the average number of items in a queuing system equals the average rate at which items arrive multiplied by the average time that an item spends in the system. Letting

L = average number of items in the queuing system,
W = average waiting time in the system for an item, and
λ = average number of items arriving per unit time, the law is

L=λW

Graves and Little give several examples, and I'll paraphrase one here.

Caroline is a wine buff. Her wine rack in the cellar holds 240 bottles. She seldom fills the rack to the top but sometimes after a good party the rack is empty. On average it seems to be about 2/3rds full. She buys, on average, about eight bottles per month. How long, on average, does a bottle languish in the cellar before being consumed?

Using Little's Law, we let L be the average number of bottles in the wine rack, 2/3×240=160. λ is the average number of items arriving per unit time, 12×8=96 bottles/year. Divide L by λ and we get W≅1.67 years per bottle.
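The same arithmetic as a short Python sketch. The function and variable names are made up for illustration; the numbers are the ones from the wine rack example.

```python
def mean_wait(mean_items, arrival_rate):
    """Little's Law solved for W: W = L / lambda."""
    return mean_items / arrival_rate


rack_capacity = 240                    # bottles
mean_bottles = rack_capacity * 2 / 3   # L: the rack is about 2/3 full on average
bottles_per_year = 8 * 12              # lambda: eight bottles per month

years_in_cellar = mean_wait(mean_bottles, bottles_per_year)
print(f"L = {mean_bottles} bottles, lambda = {bottles_per_year} bottles/year")
print(f"W = {years_in_cellar:.2f} years per bottle")
```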

(It seems to me that Little's Law is related to Stokes' Theorem. The size of the queue is the integral of the “flux” across the queue boundaries.)

more to come...