Imagine that we have a boatload of memory and we're nowhere near the
limit. Suppose, too, that we allocate memory simply by bumping the
free pointer. If you think about it, you can see that the age of an
allocated object is simply the distance between the object's location
and the current value of the free pointer. Little's Law tells us that
the mean age of a heap-allocated object is equal to the mean size of
the reachable heap, so it must be the case that the mean distance from
the reachable objects to the free pointer is also equal to the mean
size of the reachable heap (blah, blah, satisfy boundary conditions,
blah, in the limit, etc.)
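Here's a tiny sketch of that bump-allocator picture (in Python rather than Scheme, purely for illustration; the allocator and its word counts are hypothetical). An object's address is just the value of the free pointer at allocation time, so its age in words is the distance from its address back to the current free pointer:

```python
# Hypothetical bump allocator: the heap is just a free pointer that
# moves forward.  An object's "age", measured in words allocated since
# its creation, is the distance from its address to the free pointer.

free_pointer = 0

def allocate(n_words):
    """Allocate n_words by bumping the free pointer; return the address."""
    global free_pointer
    address = free_pointer
    free_pointer += n_words
    return address

def age(address):
    """Words allocated since the object at this address was created."""
    return free_pointer - address

a = allocate(2)   # address 0
b = allocate(3)   # address 2
c = allocate(1)   # address 5; free pointer is now 6
```

After these three allocations, age(a) is 6, age(b) is 4, and age(c) is 1: exactly the distances from each object to the free pointer.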
A few weeks ago, someone asked about the improvement in locality that one may get from the garbage collector. The vast majority of objects don't survive even a single garbage collection, so the vast majority of objects are laid out in memory exactly where they were consed and cannot benefit from any improvement in locality. This is not to say that there is no effect at all, but that the effect doesn't apply to the vast majority of objects.
Someone also asked what the GC pauses were. With a large heap (131072 blocks), the total time for MIT/GNU Scheme compiling itself was 167.7 seconds runtime, 1.06 second GC time. This comes out to about 22 milliseconds per GC.
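The per-collection figure above is just the total GC time divided by the GC count (48 collections for the 131072-block heap, per the table in an earlier post):

```python
# Average GC pause = total GC time / number of collections.
gc_count = 48          # collections with a 131072-block heap
total_gc_time = 1.06   # seconds of GC for the whole compilation
pause = total_gc_time / gc_count   # ≈ 0.022 s, i.e. about 22 ms per GC
```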
Wednesday, November 23, 2011
Friday, November 18, 2011
Little's Law and the weak generational hypothesis
Theorem: The mean lifetime of a heap allocated object is equal to the mean amount of reachable storage.
Proof: This is Little's Law with λ=1.
This turns out to be a pretty handy theorem. Here's an example.
In a prior post, I mentioned that a prerequisite for effective garbage collection is that the amount of reachable storage must at all times be less than the amount of memory available for the heap. We apply Little's Law and get this: The mean lifetime of a heap allocated object must be less than the time it takes to exhaust the heap.
That naturally follows: in order for garbage collection to “work”, we need something to become garbage before we completely run out of memory. But Little's Law allows us to make a much stronger statement. The mean lifetime of all objects (where lifetime is measured in subsequent allocations), is, by Little's Law, equal to the mean size of reachable storage. The mean size of the reachable storage has to be less than the amount of memory available for the heap. Since the mean lifetime of an object must be small, and we have the occasional very long-lived object, it must be the case that the vast majority of objects have very short lifetimes. Or in other words,
Most allocated objects die young.
This statement is often called the weak generational hypothesis. It is usually presented as an empirical observation. By Little's Law, it is equivalent to saying that the size of the reachable objects is small, which is not a hypothesis, but a strong pre-condition of garbage collection.
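A toy simulation (Python, with a made-up lifetime distribution; the specific numbers are illustrative assumptions, not measurements) shows the shape of the argument: if the mean lifetime is pinned near the mean reachable-heap size, but a few objects live very long, then the bulk of objects must die well before the mean.

```python
import random
random.seed(42)

# Assume 1% of objects are long-lived (10,000 allocations) and the rest
# are exponentially distributed with mean 100, so the overall mean
# lifetime stays near 200 — small, as Little's Law demands.
lifetimes = []
for _ in range(100_000):
    if random.random() < 0.01:
        lifetimes.append(10_000)                       # occasional long-lived object
    else:
        lifetimes.append(random.expovariate(1 / 100))  # short-lived object

mean = sum(lifetimes) / len(lifetimes)
young = sum(1 for t in lifetimes if t < mean) / len(lifetimes)
# mean is near 200, yet well over 80% of objects die before the mean:
# the long-lived tail forces the majority to die young.
```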
Wednesday, November 16, 2011
Little's Law and Lisp
We model the Lisp heap as a queuing system.
A “queuing system” consists of discrete objects we shall call “items” that “arrive” at some rate to the “system.” Within the system the items may form one or more queues and eventually receive “service” and exit.
-- Graves and Little in Building Intuition: Insights from Basic Operations Management Models and Principles, edited by D. Chhajed and T. J. Lowe, Springer Science+Business Media, LLC, New York, 2008
Our discrete objects are the objects returned by cons, make-vector, make-string, and the other primitive constructors. The heap is where these objects wait, and the objects “exit” the system when they become unreachable. We can now apply Little's Law:
L=λW
We set λ, the average number of items arriving per unit time, to 1. In other words, we measure time in units of allocations, so the arrival rate is one object per unit time by definition. Little's Law simplifies to:
L/λ=W
L=W (when λ is 1)
We let L, the average number of items in the queuing system, be the size of the heap in allocation units. W, the average waiting time in the system for an item, is therefore the mean lifetime of an object. Provided that we meet the necessary boundary conditions:
The mean lifetime of a heap allocated object is equal to the mean size of the reachable heap.
We are measuring object lifetime in `allocations'. If desired, we could measure the allocation rate in units of time, and then we would get the object lifetime also in units of time. Instead, we are determining the average number of word allocations that occur before an object becomes unreachable.
The boundary conditions for Little's Law are quite lax. If the heap begins and ends empty we can satisfy them. The heap naturally starts empty, and when a program ends, we discard the heap, so we can pretend it is empty again. (This puts an upper limit on the lifetime of all objects.) As I noted in a previous post, the mean heap size for MIT/GNU Scheme compiling itself is about 300,000 words. Therefore, the mean object lifetime is 300,000 word allocations.
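A small Monte-Carlo sketch of L=W (Python, with an assumed exponential lifetime distribution; nothing here is measured from Scheme). One object arrives per tick (λ=1), each with a random lifetime measured in ticks, and the time-averaged number of live objects comes out equal to the mean lifetime:

```python
import random
random.seed(0)

# λ = 1: one object allocated per tick, each with a random lifetime
# (mean 50 ticks).  Little's Law predicts the time-averaged number of
# live objects is also 50.
N = 50_000
mean_lifetime = 50
alive = []          # death times of currently live objects
live_total = 0      # sum over ticks of the live-object count

for t in range(N):
    alive.append(t + random.expovariate(1 / mean_lifetime))
    alive = [d for d in alive if d > t]   # drop objects that have died
    live_total += len(alive)

mean_live = live_total / N   # ≈ mean_lifetime, as L = λW predicts
```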
I will expand on this in the next post...
Tuesday, November 15, 2011
Little's Law
In Chapter 5 of Building Intuition: Insights from Basic Operations Management Models and Principles, edited by D. Chhajed and T. J. Lowe, Springer Science+Business Media, LLC, New York, 2008, Steven Graves and John Little discuss Little's Law:
A “queuing system” consists of discrete objects we shall call “items” that “arrive” at some rate to the “system.” Within the system the items may form one or more queues and eventually receive “service” and exit.
Little's Law says that, under steady state conditions, the average number of items in a queuing system equals the average rate at which items arrive multiplied by the average time that an item spends in the system. Letting
L = average number of items in the queuing system,
W = average waiting time in the system for an item, and
λ = average number of items arriving per unit time, the law is
L=λW
Graves and Little give several examples, and I'll paraphrase one here.
Caroline is a wine buff. Her wine rack in the cellar holds 240 bottles. She seldom fills the rack to the top but sometimes after a good party the rack is empty. On average it seems to be about 2/3rds full. She buys, on average, about eight bottles per month. How long, on average, does a bottle languish in the cellar before being consumed?
Using Little's Law, we let L be the average number of bottles in the wine rack, 2/3 × 240 = 160. λ is the average number of items arriving per unit time, 12 × 8 = 96 bottles/year. Divide L by λ and we get W ≈ 1.67 years per bottle.
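The same arithmetic as code, just to make the rearrangement W = L/λ concrete:

```python
# Caroline's wine rack, via Little's Law rearranged to W = L / λ.
L = (2 / 3) * 240   # average bottles in the rack: 160
lam = 8 * 12        # arrival rate: 96 bottles per year
W = L / lam         # average wait ≈ 1.67 years per bottle
```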
(It seems to me that Little's Law is related to Stokes' theorem. The size of the queue is the integral of the “flux” across the queue boundaries.)
more to come...
Thursday, October 27, 2011
Another chart
Here is a chart of the number of words in use after each garbage collection (for MIT/GNU Scheme compiling itself with a heap of 1935360 words):
This is the smallest heap that is big enough for the compilation to complete. The limiting factor seems to be the big spike on the left at about 0.9×10⁹. We obviously wouldn't want to run with the minimum size heap; 3935 garbage collections are needed and almost a third of the total time is spent in GC. But the more frequently you collect, the better resolution you get.
Wednesday, October 26, 2011
Garbage collection gives us the illusion of nearly unlimited memory for temporary storage. To pull off this illusion, we must be able to satisfy every allocation request (at least up to the point where we throw up our hands and raise an “out of memory” condition). In other words, there must be free or unused memory available whenever we need some. Since the point of automatic memory management is to free the programmer from having to think about it, we expect to be able to allocate temporary storage at any time we wish during program execution.
A data structure can be considered “in use” if modifications to it can change the future behavior or result of a program. Since we cannot in general statically determine the future behavior, we compute the transitive closure of the data structures under the data access primitives. This is a superset of the data “in use”. That is, if you can't access the data, it cannot be “in use”.
In order for garbage collection to succeed as a storage management strategy, there must be at all times (anytime we want to allocate) some storage that is either unallocated or unreachable. The size of the heap must always exceed the size of the reachable data. Or we could say that the amount of reachable data must at all times be less than the size of the heap. (By ‘size of the heap’ I mean the total amount of storage both in use and free. For example, I could have a five megabyte heap, but only have 3 words in use.)
Any allocated data structure has but one of two ultimate fates: it remains reachable for the remainder of the run of the program or it becomes unreachable at some point (and we may re-use the storage). Assuming the heap starts empty, the size of reachable storage at any point in time is the difference between the total amount of all allocation and the total amount of all storage that has become unreachable up to that point. For example, if my program allocates sixty megabytes over its lifetime and it abandons fifty-nine megabytes, then the heap must contain the remaining megabyte.
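The bookkeeping identity in that paragraph is simple enough to state as code (a trivial sketch; the 60/59 megabyte figures are the example from the text, not measurements):

```python
# With an initially empty heap:
#   reachable storage = total allocated − total abandoned.
MiB = 2 ** 20
total_allocated = 60 * MiB   # program allocates 60 MiB over its lifetime
total_abandoned = 59 * MiB   # and abandons 59 MiB of it
reachable = total_allocated - total_abandoned
# The heap must still hold the remaining megabyte.
```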
We now consider the differential case, which is to say, the rates of allocation, collection, and heap growth or shrinkage. If the rate of allocation exceeds the rate of collection, the amount of storage in use in the heap must grow to meet the demand. Conversely, if the amount of reachable storage in the heap is decreasing, then we are abandoning storage at a faster rate than we are allocating it. But there are hard boundaries on the size of the heap. We can never have a negative amount of storage in use, and we have the requirement (see above) that the heap must be larger than the reachable storage at all times. The heap size is limited, so the rate of increase in reachable storage must be quite small over the long term if we are not to run out of heap.
The rate of storage allocation must be nearly equal to the rate of deallocation (or abandonment) almost all the time.
(This is only slightly different from the “steady-state” assumption that the rates are essentially equal. We're simply admitting the possibility that the reachable storage grows indefinitely and that the program would eventually crash if it doesn't finish first.)
Monday, October 24, 2011
Boring charts
Here is a chart that shows the amount of free space remaining after each garbage collection (MIT/GNU Scheme compiling itself, heap size 1935360 words):
Compare that chart with this one, which shows the amount of free space remaining after each GC with heaps of sizes 8388608, 16777216, 33554432, and 100663296 words. Note that the y axis is now displayed in log scale in order to get the wide variation in heap size on one chart.
As we move to larger and larger heap sizes, the proportion of memory that is freed after each GC increases. This makes sense because the program allocates memory in the same way regardless of how much heap is present (a program that adjusts its allocation profile in response to the heap size would look much, much different). When the heap is small, large allocations of temporary storage (like the one at about 8.5×10⁸) make a significant difference in the amount of free space, but when the heap is big, these differences virtually disappear.
It is almost disappointing that the detail is lost when we use larger heap sizes. The charts for the large heap sizes are essentially flat. At large heap sizes, the variations in temporary storage usage become insignificant. In other words, the number of garbage collections becomes less dependent upon the moment by moment allocation of the program being run and approaches the simple limit of the total allocation divided by the heap size.
Friday, October 21, 2011
Scheming and plotting
Another interesting thing to plot is the amount of space that remains in use after the garbage collector runs. This is simply the total size of the heap minus the amount the collector frees (on average). For example, if we start MIT/GNU Scheme with --heap 4096, we'll get a heap of (* 4096 1024) = 4194304 words. With this size heap, the garbage collector will run 1589 times and recycle 6204751035 words, or 3904815 words per collection. This gives us (- 4194304 3904815) = 289489 words that are still in use (averaging over all 1589 collections, of course). If we do this calculation for various heap sizes and plot the results, we get this:
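The same calculation in Python (the figures are the ones quoted above for the 4096-block run):

```python
# Average words still in use after each GC:
#   heap size − (total words recycled / number of collections).
heap_words = 4096 * 1024        # --heap 4096 gives 4194304 words
gc_count = 1589
total_recycled = 6204751035     # words freed over all collections
freed_per_gc = total_recycled // gc_count   # 3904815 words (divides exactly)
in_use = heap_words - freed_per_gc          # 289489 words still in use
```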
Sunday, October 16, 2011
Adjusting for extra collections
Here is a log-log plot of garbage collection counts as function of heap size. The green line is the predicted number of collections assuming that the product of the heap size and GC count is constant. The red points are actual GC counts for MIT/GNU Scheme compiling itself.
As you can see, the line matches pretty well in the middle, but not so much at the ends. It took me a while, but I figured out that the discrepancy at the lower right of the curve (where the heap size is very large and the number of collections quite small) can be attributed to a few extra garbage collections that are triggered for the purposes of compaction. If we take these into account, our graph looks more like this:
That takes care of the lower end, but it is the discrepancy at the upper end of the curve (smaller memory and more frequent GC) that is more interesting.
Dividing the total amount of memory used by the size of the heap gives us a pretty good estimate of the number of garbage collections, provided that each garbage collection frees the entire heap. This is generally not the case in a real program. We should be dividing the total amount of memory used by the average amount of storage freed by the garbage collector. In this next chart, we adjust for both the extra collections and the space in use that cannot be collected.
In this chart, we can see the effect of not reclaiming 100% of the storage on each collection. When the heap size is small, the memory that survives garbage collection becomes a larger fraction of that heap. This causes the GC count to increase faster than it would if it could free all the storage. There is a vertical asymptote to the line. This is where the heap size is equal to the averaged amount of storage retained at each GC. Basically, when the heap is only a little larger than the amount of storage retained, you have to do a lot of extra collection.
This next chart is simply a close-up of the previous chart showing the divergence better. It is interesting to see that retaining a modest amount of storage (empirically, it is about 400K words) can have a noticeable effect even when the heap is tens of times larger. (Remember, the vertical scale is logarithmic, so differences that appear small are actually quite a bit larger.)
If it is possible to use a heap ten, twenty, fifty, or even a hundred times larger than the averaged amount of storage retained, then we'll be beyond the worst part of the curve and down in the linear region. Another way to state this is that you want the amount of storage retained at each GC to be only a few percent of the entire heap.
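The adjusted model is easy to sketch (Python; the 400K-word retained figure is the empirical value mentioned above, and the ~6.2 billion-word total is the approximate total allocation from the table, so treat the outputs as illustrative, not exact):

```python
# Adjusted GC-count model: each collection frees only (heap − retained)
# words, so the number of collections needed is roughly
#     GCs ≈ total_allocation / (heap − retained),
# which blows up as the heap size approaches the retained storage —
# the vertical asymptote in the chart.
def predicted_gcs(heap_words, total_allocation, retained=400_000):
    assert heap_words > retained, "heap barely exceeds retained storage"
    return total_allocation / (heap_words - retained)

total = 6_200_000_000   # ≈ total words allocated during the compilation
small = predicted_gcs(2_097_152, total)     # 2048-block heap: thousands of GCs
large = predicted_gcs(134_217_728, total)   # 131072-block heap: a few dozen
```

Note how shrinking the heap toward the retained 400K words makes the predicted count diverge, which is the close-up chart's divergence.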
Thursday, October 13, 2011
Demystified
I discovered why the total amount of memory used does not seem approximately constant across the different heap sizes. I had forgotten to take into account that a garbage collection cycle can be run for reasons other than reclaiming storage. In particular, it can be run as part of a compaction phase that precedes dumping a binary object or moving an object out of the collected heap into “pure” storage. These are not particularly interesting operations; they contribute little to the total resource usage or total time. They just cause an overestimate of the amount of storage used in very large heaps. I could correct for this, but I'm far more interested in what happens on the other end of the curve when you have a relatively small heap. More to come.
Thursday, October 6, 2011
Confusing
I found some of the data in my GC timings to be a little confusing. I can understand how different runs of the compiler might have different space usages even when compiling the same thing (things like timestamps are going to change), but the variation is too large to explain away like this. I think the problem is that by instrumenting the code to measure the timings I have inadvertently changed the behavior of the code beyond simply collecting the data.
Tuesday, October 4, 2011
Some GC stats
I invested some time making MIT/GNU Scheme compile itself with various sizes of heap. I did one set of runs with the variable compiler:preserve-data-structures? set to #f and another with it set to #t. The MIT/GNU Scheme compiler was written in a way that allows the garbage collector to reap the intermediate storage for the compiler after each phase of compilation. When compiler:preserve-data-structures? is set to #t, however, the intermediate data structures are retained until the next compilation. This should not change the total amount of storage used by the compiler, but it should have the effect of reducing the amount of storage recovered by a GC cycle. The number of GC cycles can be seen to be larger for the smaller heap sizes. When the heap is huge, the extra amount of storage in use has little effect on the GC count.
The Total Free column contains the sum of the memory freed after each GC, totaled for the entire compilation.
Some of my thoughts about this (with graphs) in a bit. (It becomes very interesting when the data seriously disagrees with the math, but it is tedious to collect enough data to either support or refute the mathematical model.)
Memory Size (blocks) | Memory Size (words) | Discard: GC Count | Discard: Total Free | Preserve: GC Count | Preserve: Total Free
---|---|---|---|---|---
1890 | 1935360 | 3935 | 6213003931 | - | - |
1900 | 1945600 | 3882 | 6211940217 | - | - |
1920 | 1966080 | 3818 | 6213153480 | - | - |
2048 | 2097152 | 3502 | 6210260535 | - | - |
3072 | 3145728 | 2178 | 6205257571 | - | - |
4096 | 4194304 | 1589 | 6207697668 | 1626 | 6200942665 |
5000 | 5120000 | 1283 | 6206081325 | 1303 | 6198903924 |
6144 | 6291456 | 1033 | 6210675904 | 1043 | 6205324016 |
8192 | 8388608 | 766 | 6212253195 | 771 | 6207156928 |
16384 | 16777216 | 377 | 6222514645 | 379 | 6233468540 |
32768 | 33554432 | 188 | 6257358354 | 188 | 6244416384 |
40000 | 40960000 | 154 | 6266232939 | 154 | 6257300338 |
50000 | 51200000 | 124 | 6316240152 | 124 | 6311996423 |
65536 | 67108864 | 94 | 6283277196 | 94 | 6276744843 |
98304 | 100663296 | 63 | 6324424897 | 63 | 6320187112 |
131072 | 134217728 | 48 | 6430202774 | 48 | 6426705701 |
262144 | 268435456 | 25 | 6704242312 | 25 | 6702940561 |
Friday, September 23, 2011
Knee surgery
As an anonymous person pointed out, the knee in the curve is an illusion. Here is the same chart displayed with varying scales for the x and y axes:
You can see that the position of the “knee” depends upon the scale of the X and Y axes.
The whole chart is a problem. Not only does it show a knee, but the rest of the chart has the data all squished up along either the X or Y axis. It is going to be hard to compare different charts if they all look like this. I want the whole chart to be scale invariant.
(Can you guess where this is going?)
The curved line in the chart represents the solution to x × y = 6100000. This line appears to come pretty close to the actual garbage collection count when you use MIT/GNU Scheme to compile itself. Since the product is a constant, we have a hyperbolic curve. To transform this into something more reasonable, we want to take the logarithm of the curve in both the X and Y directions. This will change our line from x × y = constant to x + y = constant in log coordinates (because logarithms transform multiplication into addition).
This chart is certainly more interesting! First, notice how the measured garbage collection counts fall in almost a straight line. The curve that approximates the count is a straight line, and you can see that the actual garbage collection count deviates from this line slightly at the ends. Second, notice that the lines 10 × x, x, and x/10 become parallel. This is convenient, too. Finally, notice that the knee is gone. This sort of chart won't fool us with a fake knee.
Now that we have something of an interesting baseline, it would be instructive to plot other processes that generate garbage and see how they look. (upcoming post)
A bit of a problem
Once again, here is a plot of the actual garbage collection counts as a function of heap size:
and we mention that we want to be “beyond the knee” in the curve. So where is that “knee”?
The green line is the plot of f(x) = 6100000/x, or, to put it another way, the plot of x × y = 6100000. The knee in the curve ought to be where x = y = √6100000 ≈ 2470. But look at the chart! Just by eyeballing it, we can see that 2470 is nowhere near what anyone would call the knee.
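The "knee ought to be at x = y" figure is just where the hyperbola crosses the diagonal:

```python
import math

# On x * y = 6100000, the point where x = y is x = sqrt(6100000).
knee = math.sqrt(6_100_000)   # ≈ 2470, nowhere near the apparent knee
```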
Thursday, September 22, 2011
More on GC
I had been thinking of improving the garbage collector in MIT/GNU Scheme. I carefully measured the current GC performance so I could quantify any improvements. This uncovered a bug (see earlier post), but it also uncovered something more interesting. Here is a plot of the number of garbage collections as a function of heap size for the problem of compiling all the Scheme code in MIT/GNU Scheme:
This graph looks an awful lot like the graph of minimal collections.
In fact, if we plot the minimum number of garbage collections necessary for about 6100000 blocks, we get this:
That appears to be a really good fit. This isn't too surprising, because a GC that collects only when it is almost out of memory ought not to be doing much more work than the minimum necessary.
What makes this interesting is this: We can clearly see the “knee” of the curve, and we have enough memory to be “well beyond the knee” when we compile all of Scheme. This means that the total number of garbage collections will be quite low. If the garbage collector is fast enough, the total GC time will also be quite low. For example, when given 40000 blocks of memory, MIT/GNU Scheme will compile itself in about 3 minutes of which only 2.6 seconds total are spent in garbage collection. And as I noted previously, with sufficient memory, there need not be any garbage collection at all.
Notice that the details of the collector are unimportant. MIT/GNU Scheme uses a variation of the Minsky-Fenichel-Yochelson-Cheney-Arnborg algorithm, but it could just as easily use a mark-sweep collector. The point is that when you are only doing a few dozen collections, it doesn't matter much what algorithm you use because changing the algorithm will make little impact on the total run time. (Suppose I had the world's greatest garbage collector that just blows away the current one. Then I could reduce the total time from 182.6 seconds to exactly 180 seconds. This is less than a 2% improvement and not worth pursuing.) (A reference count GC, however, would be a problem because it traces garbage rather than live storage.)
So my original plan is pointless. There is no advantage to improving the garbage collector if it doesn't take more than a small amount of time, and it looks like you can achieve this by simply throwing memory at the process. So long as you have sufficient memory, you can make the cost of a collection be close to (or even equal to) zero.
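The bound on the possible improvement is worth spelling out: even a zero-cost collector can only recover the GC time itself.

```python
# Best case for a "perfect" GC: eliminate all GC time.
total_time = 182.6   # seconds to compile with 40000 blocks
gc_time = 2.6        # seconds spent in GC
best_case = total_time - gc_time   # 180 seconds with a zero-cost GC
improvement = gc_time / total_time # ≈ 0.014, i.e. under 2% — not worth it
```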
At least it looks that way...
Monday, September 19, 2011
Thoughts about GC
Suppose I have a program that, when run, allocates a total of 10 GiB (10×2³⁰ bytes). If it happens that I have more than 10 GiB of heap, then I don't need to worry about running out. If my heap is smaller than 10 GiB, I'm going to have a problem.
It is often possible to re-use storage by overwriting the previously written information. If the result computed by a program (including the side effects as part of the ‘result’) is completely invariant with respect to the value of a location in memory, then that memory location could be re-used. Depending upon the program and the runtime system, re-use can be the responsibility of the programmer, or there can be an automatic means of storage management (e.g. a garbage collector). Regardless of how re-usable storage is identified and made available to the program, we can use the pigeonhole principle to determine the minimum number of times that we must re-use some part of memory. For example, if my program uses 10×2³⁰ bytes and I have 10×2³⁰−1 bytes of heap, then at least one byte somewhere will have to be re-used.
So let us suppose that our program which allocates 10 GiB is to be run with a 2 GiB heap and our runtime system provides a garbage collector so we don't have to manage our storage manually. Suppose further that our garbage collector is not continuous, but has some sort of identifiable ‘cycle’ during which it locates re-usable memory. On each cycle the garbage collector can reclaim at most 2 GiB, so we know the runtime system must perform at least five garbage collection cycles. It may perform many, many more than five, but it cannot perform fewer because it would have to recycle more than 100% of the heap at some point.
Another way to state this is that the number of garbage collections times the size of the heap must be equal to or greater than the total amount of storage allocated by a program. Here is a graph that shows the minimum number of collections needed for a program that allocates 50 bytes of storage.
The discrete jumps in the graph are because we aren't considering partial allocations (less than 1 byte). When working with realistic heap sizes, we can approximate this graph by dividing the total allocation by the heap size. The green line in this chart plots 50/x.
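The lower bound in the last two paragraphs is a one-liner to compute. A sketch in Python (the function name is mine):

```python
def min_collections(total_allocation, heap_size):
    """Pigeonhole lower bound: collections x heap size must cover
    everything the program allocates, so at least ceil(total/heap)."""
    return -(-total_allocation // heap_size)  # ceiling division

# The 10 GiB program run in a 2 GiB heap needs at least five cycles:
print(min_collections(10 * 2**30, 2 * 2**30))  # 5

# The discrete jumps in the 50-byte graph:
print([min_collections(50, heap) for heap in (10, 17, 25, 50)])  # [5, 3, 2, 1]
```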
More to come...
Tuesday, August 30, 2011
Measuring GC times
MIT/GNU Scheme has a command line parameter, --heap, that allows one to specify the size of the heap in 1024-word blocks. On a 32-bit machine, the heap size is limited because MIT/GNU Scheme stores the type tag in the upper bits of the word. On a 64-bit machine the word size is wider, so it is possible to specify extremely large heaps without worrying that the address will overflow into the type tag.
For kicks, I decided to vary the value of the heap size and see what effect that had on the task of MIT/GNU Scheme compiling itself.
Heap | GC Count | Cumulative GC Time (ms)
---|---|---
6000 | 1056 | 25800
8192 (64 MiB) | 764 | 19120
10000 | 623 | 17910
16384 (128 MiB) | 377 | 10980
32768 (256 MiB) | 188 | 7350
50000 | 124 | 5560
65536 (512 MiB) | 94 | 5000
100000 | 63 | 4410
131072 (1 GiB) | 48 | 3910
150000 | 42 | 4160
200000 | 32 | 3800
250000 | 26 | 3560
262144 (2 GiB) | 25 | 3050
300000 | 22 | 3360
350000 | 19 | 3260
393216 (3 GiB) | 18 | 2740
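As a sanity check on these figures: the number of collections times the heap size must be at least the total allocation of the compile, and if most collections trigger on a nearly full heap the product should be roughly constant across rows. It is:

```python
# (heap size in blocks, GC count) pairs from the table above
runs = [(6000, 1056), (8192, 764), (10000, 623), (16384, 377),
        (32768, 188), (50000, 124), (65536, 94), (100000, 63),
        (131072, 48), (150000, 42), (200000, 32), (250000, 26),
        (262144, 25), (300000, 22), (350000, 19), (393216, 18)]

products = [heap * count for heap, count in runs]
# Every product lands in a narrow band around ~6.3 million blocks,
# which estimates the total allocation of the compile.
print(min(products), max(products))  # 6160384 7077888
```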
Collecting this data was tedious, so I spent a fair amount of time thinking about a model that could explain the results. At first glance, everything looks about right: the more memory, the fewer GCs and the less total time is spent garbage collecting. The problem is that the curve predicted by my model doesn't fit the empirical data at all. Of course this means that my model is either wrong or incomplete.
I have since figured out the problem, but I thought I'd let people puzzle over it themselves for a bit, first.
Thursday, April 28, 2011
50 years ago — April 28, 1961
This is an excerpt from the forthcoming LISP 1.5 Programmer's Manual.
Provision is made in LISP 1.5 for allocating blocks of storage for data. The data may consist of list structure or data words. Arrays may have up to three indicies.
To declare an array, use the function array. Its argument is a list of arrays to be declared. Each one must be for list structure or non-list structure.
Suppose ALPHA is to be a 10 x 10 array of list structure and BETA a 5 x 5 x 5 array of non-list structure. The array function is
ARRAY (( (ALPHA (10,10) LIST) (BETA (5,5,5) NONLIST) ))
To find the value of B3,4,2: (BETA 3,4,2) or (BETA I J K) with a pair list.
To set ALPHA3,4 to "YES": (ALPHA (QUOTE SET) (QUOTE YES) 3 4)
Array uses marginal indexing for maximum speed. The total number of words used by an array whose size is D1 x D2 x D3 is 4 + D1 + D1D2 + D1D2D3. If the array is 2 dimensional, D1 = 1. If the array is 1 dimensional, D1 and D2 = 1.
To save space, specify the dimensions in increasing order.
ALPHA (3,4,5) takes less words than ALPHA (5,3,4).
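The word-count formula above makes the space-saving claim easy to verify. A sketch in Python (the function name is mine):

```python
def array_words(d1, d2, d3):
    # Total words for a D1 x D2 x D3 array, per the formula quoted
    # in the memo: 4 + D1 + D1*D2 + D1*D2*D3
    return 4 + d1 + d1 * d2 + d1 * d2 * d3

# ALPHA (3,4,5) vs ALPHA (5,3,4): same 60 elements, different overhead.
print(array_words(3, 4, 5), array_words(5, 3, 4))  # 79 84
```

Listing the dimensions in increasing order minimizes the D1 and D1·D2 terms, hence the advice.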
Compatability of LISP 1 and LISP 1.5.
- EVALQUOTE has two arguments while APPLY has three. To change a LISP 1 program for LISP 1.5 simply eliminate the p-list if it is null. If the p-list is needed, then the function apply is available in LISP 1.5.
- Arithmetic in LISP 1.5 is new, improved, and generally incompatible with LISP 1.
- LISP 1.5 has many extra features. There [sic, these] are being written up for the new LISP 1 Programmer's Manual. Until it comes, check in Room 26-265.
— Michael Levin
Artificial Intelligence Project
RLE and MIT Computation Center Memo 24
Arithmetic in Lisp 1.5
Friday, April 22, 2011
Fun with scammers
No Scheme or Lisp content today. I was amusing myself for a while with this email exchange. The strings of digits are courtesy www.random.org. I even found a site that creates fake Western Union receipts.
Hello,
I'm writing this with tears in my eyes,my family and I came down here to Edinburgh Scotland for a short vacation unfortunately we were mugged at the park of the hotel where we stayed,all cash,credit card and cell were all stolen from us but luckily for us we still have our passports with us.
We've been to the embassy and the Police here but they're not helping issues at all and our flight leaves in few hrs from now but we're having problems settling the hotel bills and the hotel manager won't let us leave until we settle the bills the amount needed now is just $2,700..I am so confused right now and thank God i wasn't injured because I complied immediately.
Waiting to hear from you.
Robert
I sent this message to Bob's other account:
I just got this message that supposedly came from you. It is plausible enough that your friends might fall for it! You should email your friends and assure them that you are OK and to NOT SEND MONEY
~jrm
Thanks for checking on me, well the message is from me and not a hoax. I'm presently stuck in UK and really in a bad situation.....I have been talking to the lazy embassy but it's taking them time process me help, and it's hard to get hold of a phone, that was why I had to send out email for help....I need a financial help with getting back home...all i need is $2,700.,but anything you could assist with now be helpful ..let me if can you have the money wire to me through western union
Thanks.
Robert
I'm glad to hear your email is ok, it would be terrible to be mugged and hacked. Where do you need it sent?
~jrm
Glad you replied back.
I am full of panic now and the police only asked me to write a statement about the incident and directed me to the embassy,i have spoken to the Consulate here but they are not responding to the matter effectively,I really need your help to get myself out of this place.
Well all i need now is just $2,700 but any amount you can afford will be appreciated, you can have it wired to my name via Western Union i'll have to show my passport as ID to pick it up here and i promise to pay you back as soon as i get back home.
Here's the info you need at western union location below
Name: Robert G
Address: 48 Princes Street, Edinburgh, Scotland
Kindly email me the transfer details as soon as you have it done.Please let me know if you are heading out to western union now.
Thanks
Robert
What's going on?
I was only able to wire $300 (I hit the limit on my ATM). I can send a lot more tomorrow, but will you have time to get it?
~jrm
Please Let me have the western union confirmation number # M.T.C.N, so i can be able to pick up the Money.
I'm freaked out at the moment please keep me posted.
Robert.
Robert, did you receive the money? Are you still in trouble? Is there anything more I can do?
Thanks for your effort,Let me have the western union confirmation number # M.T.C.N.
Awaiting for your reply.
Robert
I left him dangling on this one.
What's going on?
Robert .
My apologies, Robert, (I know you are in trouble, but I do have to sleep at times!) The MTCN is 2665592734
HOWEVER!!! IMPORTANT!!!!
I didn't want to broadcast that over email because it could be read by someone unscrupulous. So I added in your wife's birth date. Just subtract that out to get the real tracking number. I hope you can figure that out. Please let me know. If you want, I have an additional $500 I can send right away.
What's going on,I just got back from Western Union and i was told there was no record for the transaction,kindly check the receipt and get back to me with the correct confirmation #
Awaiting to reply.
Robert .
Trip number 1. I'm shocked it didn't work.
What?! Did you remember to subtract your wife's birth date from the number I gave you?
Omg!! I am full of panic now,Let me have the scanned receipt For Payment giving to you by the Western Union Agent.
Uh oh. I'm busted... maybe I can vamp a bit....
How can I do that? You know I don't have a scanner. I've got an idea! What's the phone number of the hotel that is giving you problems? I'll call the manager myself and put it on my credit card.
No reply. I was going to give up but it occurred to me to check for an on-line receipt generator (they have everything on line, don't they?) They do. I sent him a nice receipt.
Good news! I found a scanner. Please let me know if everything is ok.
Thanks for your help.. i just got the money but there seems to be a problem with the conversion rate, i noticed the money came out here in $300but all i need is $2,700 as i requested, don't know that the conversion is that low, please i will need your further assistance so i will be needing $2,400.
Please let me know when you will be heading out to western union.
Waiting to read from you.
Hmm, he read the receipt, but didn't make a Western Union run. I'll string him along a bit.
I'll have to go by the bank (again) to get an additional check. I should be able to do that after work, in a few hours. Hang in there! I'm very worried.
Ok,I will be waiting ,Do e-mail the full Western Union Transfer Details once you have it done.
Keep me posted.
Robert.
I'm trying to get the guy to go `off script'.
URGENT
Robert, I need to know if you want the money in US Dollars, GB Pounds, or Euros, and what the exchange rate is. Can you get back to me as soon as you can? I'm really sorry about this, hope you are surviving.
No reply. Better dangle some cash.
I sent some more money. Did you get it? Is everything all right?
OK...Can i have the full transfer details (MTCN) so i will be able to pick up the money.
Waiting to read from you.
Thanks
Robert.
The MTCN is 4677254413 Please let me know when you get it!
Are you there?
Robert, did you receive the money? Are you still in trouble? Is there anything more I can do?
Let me have the confirmation details # M.T.C.N.
Awaiting for your urgent reply.
Robert
Please do get back to me with the Western Union Control Number M.T.C.N#
Please Keep me posted.
Robert
This guy just has a one-track mind. It seems impossible to get him `off script'.
Robert,
What is going on? Are you all right?
I just checked with Western Union and they say you haven't picked up either of the two wires that I sent! Are you having difficulty getting them?
The Western Union Control Number M.T.C.N # are these:
2665591653 --- amount $300
3207519347 --- amount $500
Please let me know if you have any problems retrieving the money.
~jrm
Are you kidding me or what,cause i was told at western union office the money you sent to me was not there.
Please Let me know what's going on am freaked out here.
Robert.
Yes! Trip number 2. He's probably figured out I'm playing him, though. Let's goad him into one more trip. A bit of an insane rant to allay his suspicions and we'll sweeten the pot just a bit more...
I don't know what to say, Robert. Perhaps they have a different system in Edinburgh and the numbers don't match? Are you sure that you are telling them the right numbers and didn't get them mixed up?
I checked with Western Union, and the agent told me that the money was there and that you didn't pick it up. Why don't you go to the Royal Bank of Scotland? They are at 4 Princess Street, right around the corner from where you are staying. They should be able to straighten this out right away.
I'm surprised you suggest that I am kidding. Why would I be so cruel? I have $800 on the line, and I'm doing this for your benefit. I'm not the victim here, you are. Do you think I would wire you this kind of money and then make a joke about it? Do you think I get a kick out of imagining you walk back and forth to Western Union for no reason? This is a bad situation and you seem to think I'm making it worse! I'm not sure you are taking this as seriously as I am. Are you sure you didn't get injured?
I'm going to try this one more time, and I expect action on your part. I have wired an additional $200 dollars for a total of One Thousand ($1000) dollars. I really cannot afford any more, so I hope this is enough. The details are these:
Sender: Joseph Marshall
Receiver: Robert G, 48 Princess Street, Edinburgh Scotland
Amount: $1000.00 (One Thousand)
Money Transfer Control Number: 1822631749
Transfer Fee: 50.00
Number: 10341731365
And I am enclosing a receipt. Please be sure to get these details correct. I am awaiting to hear that you have collected the cash.
~jrm
The receipt was fake and I included a “security question”: Who's your daddy? I was sure this would tip my hand, but....
Robert, it has been 7 hours since I sent this. Did you get it?
~jrm
No i didn't,I will like to go back to western union and make this complain to them that all the confirmation number they give you was invalid ok..
YES! Three trips! Well, that's good enough for me.
Tuesday, April 12, 2011
Oops
I made a mistake that has led to a lot of confusion. It is a common mistake, so I have a lot of company. To assuage the trauma to my ego, I'll point out that no one who argued with me (or, for that matter, agreed with me) identified it. Allow me to explain it and correct myself.
The Church-Turing thesis is often stated as follows:
Anything that can be computed by some means can also be computed by a Turing machine.
And this is usually coupled with Rosser's observation that λ-calculus and recursion theory are equivalent formulations.
The problem is that this phrasing, while not exactly incorrect, is not exactly right, either. It is vagueness disguised as definition. Because it is a definition, it is “correct, by definition”, but it is vague enough to admit interpretations that are absurd. Thus, if one isn't careful, one can deduce several absurdities that are “correct, by definition”.
So what is the correct statement? The Church-Turing thesis is a statement about “effective calculability”. This is an intuitive concept which tries to generalize on the idea of an algorithm, or “plug and chug” math formulas, or “mechanistic” procedures like Newton's method. Many mathematicians have attempted to formalize the notion of “effective calculability”, among them Church and Turing, but also Gödel, Kleene, Rosser, and Post (and many others after them, but these are the big names). A correct statement of the Church-Turing thesis is more like this:
The formal terms “computable by a Turing machine” and “λ-definable” fully capture the intuitive notion of “effectively calculable”.
Naturally, the bulk of what a modern digital computer does is in fact an “effective calculation” and thus a Turing machine or λ-calculus is a very good model for what a computer can calculate. However, there are things a modern digital computer can do other than calculate. These activities fall outside the scope of the Church-Turing thesis and there is no reason to believe that they would be subject to the limitations of computability.
Does anybody really know what time it is?
— Robert Lamm of Chicago
A good example of a non-computable, yet common, task that a digital computer performs is to determine the current time. There is no algorithm or calculation that you can perform that will yield the current time. You simply must check a clock. Even if you somehow know the elapsed time since some event, you nonetheless have to check a clock in order to determine the beginning of the ‘epoch’ (baseline time).
A whole set of non-computable quantities can be made available via reflection. Recall the definition of a Turing machine. The next state of a Turing machine is a function of the current state and the symbol on the tape under the head. It is not a function of, for example, the size of the state transition table. This is not to say that these quantities are unknowable, but that they cannot be calculated without introspection (certainly bounds can be calculated, and certainly any machine can be instrumented or even simulated, but consider this: any Turing machine can be augmented by any number of inaccessible states. There is no way to calculate the size of the state table without knowing the number of these states). Also note that there is nothing magic about these quantities; you can calculate to your heart's content with them after you obtain them. It is obtaining them in the first place that involves something other than computation.
My mistake is this: I didn't specifically and explicitly state when I was referring to the informal “effective calculation” and the superset that is the informal “computing” that one can do on a modern digital computer. In other words, I applied the Church-Turing thesis to too broad a class.
Given that, I'll rephrase my Exercise 1 and see how people react:
Exercise 1: Write a program for a class of universal Turing machines that, when run, calculates whether the machine it is on is imposing additional space complexity.
When I phrase it that way, it sort of sounds obvious that there are going to be problems.
On the other hand, this is a far more satisfying rewording than
Write a portable program that is non-portable.
Friday, April 8, 2011
Too much exercise
I've spent the past few days thinking about how to answer the exercises in my previous post. I think I finally came up with a good analogy. (Bear with me.)
These days, it is not uncommon to run programs on virtual hardware. Programs like VMWare or Virtual Box provide nearly undetectable emulation of an x86 PC. In fact, modern x86 CPUs actually emulate the x86 instruction set through hardware-accelerated interpretation or JIT compiling. Although it can be tricky to emulate the entire body of computer hardware and peripherals that might go into a PC, it is straightforward to emulate the bare-bones PC hardware with close to perfect fidelity, even down to the CPUID.
So could one write a portable x86 program — one that can run without modification on a real PC with real hardware or on a real PC with virtual hardware (like a Transmeta CPU) or on completely virtual hardware — that can reliably and predictably detect whether it is being “run directly” or “just being emulated”? The entire raison d'être of virtualization rests on the fact that, given enough effort, you can fool all of the programs all of the time. (Bugs notwithstanding.)
So let's reconsider Exercise 1: Write a program that returns 1 if run on any tail-recursive implementation of a language, but returns 0 if run on any non tail-recursive implementation of that same language.
Restate this as: Write a portable x86 program — one that can run without modification on a real PC with real hardware or on a real PC with virtual hardware (like a Transmeta CPU) or on completely virtual hardware — that can NOT ONLY reliably and predictably detect whether it is being “run directly” or “just being emulated”, BUT FURTHERMORE can reliably and predictably detect whether the underlying execution mechanism is “recursive” or “iterative”?
I assert that this question is barely worth thinking about (unless you work at VMWare) and that the answer could hardly be anything but “No.”
Exercise 1a (extra credit): Write a program that crashes or doesn't return if run on any tail-recursive implementation of a language, but returns 0 if run on any non tail-recursive implementation of that same language.
If a program cannot detect whether or not it is being emulated, perhaps it can “half” detect. If it could reliably crash the emulator, then you could at least demonstrate that some sort of emulation is being done even if you couldn't perform a conditional branch in this situation. Suppose the program were being emulated. Suppose further that the emulator is recursive (for whatever reason). One could, in theory, crash the emulator via a stack overflow or out-of-memory error (maybe that emulator is recursive, but it uses heap-allocated stack frames). Alas, this isn't what we asked for. We are looking for a program that can reliably crash an iterative emulator. We're out of luck.
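The stack-overflow half of that argument is easy to demonstrate concretely. Python makes a convenient stand-in for a non-tail-recursive implementation, since it never eliminates tail calls: the same loop crashes when written recursively but runs fine when written iteratively. (A sketch; the function names are mine.)

```python
import sys

def count_recursive(n, acc=0):
    # A tail call in form, but Python does not eliminate tail calls,
    # so each step consumes a stack frame.
    if n == 0:
        return acc
    return count_recursive(n - 1, acc + 1)

def count_iterative(n):
    # The same loop written iteratively: constant stack space.
    acc = 0
    while n > 0:
        n, acc = n - 1, acc + 1
    return acc

depth = sys.getrecursionlimit() * 2  # comfortably past the stack limit

try:
    count_recursive(depth)
    crashed = False
except RecursionError:
    crashed = True

print(crashed, count_iterative(depth) == depth)  # True True
```

A tail-recursive (or iterative) execution mechanism would run both forms in constant stack space, so the crash only distinguishes the implementations in one direction, just as the exercise suggests.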
Exercise 1b, essay (extra extra credit): Discuss the implications of your answers to exercises 1 and 1a to the semantics of the programs (woah, my apologies, programs don't have semantics, languages do. Pretend I wrote “languages”). In particular, briefly outline what changes to the various semantic models — denotational, operational, and/or axiomatic — take place upon introducing tail recursion.
The implication is this: the semantics of a programming language are logically independent of the implementation of the language. (If they were not, virtual machines would be impossible.)
Exercise 2: What trivial change can Louis make to his code for
Exercise 3: What trivial change can Louis make to his code for
Exercise 4: Implement
These days, it is not uncommon to run programs on virtual hardware. Programs like VMware or VirtualBox provide nearly undetectable emulation of an x86 PC. In fact, modern x86 CPUs actually emulate the x86 instruction set through hardware-accelerated interpretation or JIT compilation. Although it can be tricky to emulate the entire body of computer hardware and peripherals that might go into a PC, it is straightforward to emulate the bare-bones PC hardware with close to perfect fidelity, even down to the CPUID.
So could one write a portable x86 program — one that can run without modification on a real PC with real hardware or on a real PC with virtual hardware (like a Transmeta CPU) or on completely virtual hardware — that can reliably and predictably detect whether it is being “run directly” or “just being emulated”? The entire raison d'être of virtualization rests on the fact that, given enough effort, you can fool all of the programs all of the time. (Bugs notwithstanding.)
So let's reconsider Exercise 1: Write a program that returns 1 if run on any tail-recursive implementation of a language, but returns 0 if run on any non tail-recursive implementation of that same language.
Restate this as: Write a portable x86 program — one that can run without modification on a real PC with real hardware or on a real PC with virtual hardware (like a Transmeta CPU) or on completely virtual hardware — that can NOT ONLY reliably and predictably detect whether it is being “run directly” or “just being emulated”, BUT FURTHERMORE can reliably and predictably detect whether the underlying execution mechanism is “recursive” or “iterative”?
I assert that this question is barely worth thinking about (unless you work at VMware) and that the answer could hardly be anything but “No.”
Exercise 1a (extra credit): Write a program that crashes or doesn't return if run on any tail-recursive implementation of a language, but returns 0 if run on any non tail-recursive implementation of that same language.
If a program cannot detect whether or not it is being emulated, perhaps it can “half” detect. If it could reliably crash the emulator, then you could at least demonstrate that some sort of emulation is being done even if you couldn't perform a conditional branch in this situation. Suppose the program were being emulated. Suppose further that the emulator is recursive (for whatever reason). One could, in theory, crash the emulator via a stack overflow or out-of-memory error (maybe that emulator is recursive, but it uses heap-allocated stack frames). Alas, this isn't what we asked for. We are looking for a program that can reliably crash an iterative emulator. We're out of luck.
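The recursive-versus-iterative distinction above can be made concrete. Here is a minimal Python sketch (Python is used purely for illustration; it performs no tail-call elimination, so its own call stack plays the role of the recursive emulator's stack — the function names are hypothetical):

```python
import sys

def run_recursive(n):
    # A "recursive emulator" of a countdown: each emulated step
    # consumes a real stack frame, even though the call is in tail
    # position, because Python never eliminates tail calls.
    if n == 0:
        return 0
    return run_recursive(n - 1)

def run_iterative(n):
    # An "iterative emulator" of the same countdown: constant stack.
    while n != 0:
        n -= 1
    return 0

sys.setrecursionlimit(1000)
try:
    run_recursive(100_000)
    print("recursive emulator survived")
except RecursionError:
    print("recursive emulator crashed")

print("iterative emulator returned", run_iterative(100_000))
```

A deep enough computation reliably exhausts the recursive emulator's stack but leaves the iterative one untouched — which is exactly why the program can at best “half” detect its host.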
Exercise 1b, essay (extra extra credit): Discuss the implications of your answers to exercises 1 and 1a for the semantics of the programs (woah, my apologies, programs don't have semantics, languages do. Pretend I wrote “languages”). In particular, briefly outline what changes to the various semantic models — denotational, operational, and/or axiomatic — take place upon introducing tail recursion.
The implication is this: the semantics of a programming language are logically independent of the implementation of the language. (If they were not, virtual machines would be impossible.)
Exercise 2: What trivial change can Louis make to his code for smallest that will disable tail recursion?
(define (smallest list predicate)
  ;; Returns the smallest element of list.
  (if (null? list)
      #f
      (let ((answer (if (predicate (car list) (cadr list))
                        (smallest (cons (car list) (cddr list)) predicate)
                        (smallest (cdr list) predicate))))
        answer)))
There is no need for a flag to enable or disable tail recursion. Tail recursion only “works” when the caller does nothing with the return value but immediately return it. A LET expression is syntactic sugar:
(define (smallest list predicate)
;; Returns the smallest element of list.
(if (null? list)
#f
((lambda (answer) answer)
(if (predicate (car list) (cadr list))
(smallest (cons (car list) (cddr list)) predicate)
(smallest (cdr list) predicate)))))
You can now see that the value of the conditional expression is not immediately returned, but rather passed as an argument to a (rather trivial) lambda expression. (Yes, a smart compiler could notice that the lambda expression is simply the identity function in this case. If that is a problem, simply make sure that the function you use to disable the tail recursion is late-bound so that the compiler cannot prove that it can be elided.)
Exercise 3: What trivial change can Louis make to his code for smallest that will enable tail recursion on all but the initial call to smallest?
(define (smallest list predicate)
  ;; Returns the smallest element of list.
  (define (smallest list predicate)
    (if (null? list)
        #f
        (if (predicate (car list) (cadr list))
            (smallest (cons (car list) (cddr list)) predicate)
            (smallest (cdr list) predicate))))
  (let ((answer (smallest list predicate)))
    answer))
Again, we don't need special flags to have fine control over tail recursion.
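The same shape can be sketched outside Scheme. Here is a hypothetical Python rendering (Python eliminates no tail calls, so the inner tail-recursive procedure is expressed as a loop; a guard for the one-element case is also added, which Louis's original code lacks — both the names and the guard are this sketch's assumptions, not the post's):

```python
def smallest(lst, predicate):
    # The outer frame corresponds to the initial call and stays on
    # the stack for backtraces; the loop plays the role of the
    # tail-recursive inner smallest.
    def smallest_loop(lst):
        while True:
            if not lst:
                return None
            if len(lst) == 1:
                # Guard for the one-element case (a hypothetical fix;
                # Louis's code errors here by taking cadr).
                return lst[0]
            if predicate(lst[0], lst[1]):
                lst = [lst[0]] + lst[2:]   # keep first, drop second
            else:
                lst = lst[1:]              # drop first
    answer = smallest_loop(lst)
    return answer

print(smallest([3, 1, 4, 1, 5], lambda a, b: a < b))  # → 1
```

Only the outer `smallest` frame would appear in a traceback; every “recursive call” runs in constant stack, mirroring the tail-recursion-on-all-but-the-initial-call behavior.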
Exercise 4: Implement disable-tail-recursion as a macro.
;;; For example:
(define-syntax disable-tail-recursion
  (syntax-rules ()
    ((disable-tail-recursion form)
     (let ((answer form))
       answer))))
You could be more sophisticated and invoke a late-bound function, but this is, frankly, overkill. You hardly need macros, special forms, flags, decorations, etc. to control tail recursion. This is especially true if all you want is a pretty stack trace for debugging. Tail recursion is as easy to turn on and off as a printf.
Friday, April 1, 2011
Exercises
Apparently I was not as clear with my questions as I thought I was. I've edited the questions to make them more precise. The additional text is in this alternative color.
Exercise 1: Write a program that returns 1 if run on any tail-recursive implementation of a language, but returns 0 if run on any non tail-recursive implementation of that same language. The goal is not to find a program and language implementation pair that exhibit the behavior, but to design a program that can correctly infer whether or not any supplied language implementation supports tail recursion by examining its own behavior.
Exercise 1a (extra credit): Write a program that crashes or doesn't return if run on any tail-recursive implementation of a language, but returns 0 if run on any non tail-recursive implementation of that same language. The goal is not to find a program and language implementation pair that exhibit the behavior, but to design a program that can correctly infer that any supplied implementation does not support tail recursion (this is a weaker condition because we do not require that the program correctly infer support of tail recursion, but only lack of support).
Exercise 1b, essay (extra extra credit): Discuss the implications of your answers to exercises 1 and 1a for the semantics of the programs. In particular, briefly outline what changes to the various semantic models — denotational, operational, and/or axiomatic — take place upon introducing tail recursion.
Louis Reasoner wrote this program to sort numbers:
(define (smallest list predicate)
  ;; Returns the smallest element of list.
  (if (null? list)
      #f
      (if (predicate (car list) (cadr list))
          (smallest (cons (car list) (cddr list)) predicate)
          (smallest (cdr list) predicate))))

(define (louis-sort list predicate)
  (do ((remainder list (cdr remainder))
       (answer '() (append answer (cons (smallest remainder predicate) '()))))
      ((null? remainder) answer)))

(define test-list
  (do ((i 0 (+ i 1))
       (answer '() (cons i answer)))
      ((>= i 100) answer)))
He complains “I am unable to debug this because smallest tail calls itself and leaves no trace on the stack. Look!
1 ]=> (louis-sort test-list <)
;The object (), passed as an argument to safe-car, is not a pair.
;To continue, call RESTART with an option number:
; (RESTART 1) => Return to read-eval-print level 1.

2 error> (debug)
There are 9 subproblems on the stack.
Subproblem level: 0 (this is the lowest subproblem level)
Expression (from stack):
    (predicate (car list) ###)
 subproblem being executed (marked by ###):
    (cadr list)
Environment created by the procedure: SMALLEST
 applied to: ((0) #[arity-dispatched-procedure 39])
The execution history for this subproblem contains 1 reduction.
You are now in the debugger.  Type q to quit, ? for commands.

3 debug> H
SL#  Procedure-name    Expression
0    smallest          (predicate (car list) (cadr list))
1    smallest          (if (predicate (car list) (cadr list)) (smalle ...
2    do-loop           (cons (smallest remainder predicate) (quote ()))
3    do-loop           (append answer (cons (smallest remainder predi ...
4    do-loop           (do-loop (cdr remainder) (append answer (cons ...
5    %repl-eval        (let ((value (hook/repl-eval s-expression envi ...
6    %repl-eval/write  (hook/repl-write (%repl-eval s-expression envi ...
7    do-loop           (begin (if (queue-empty? queue) (let ((environ ...
8    loop              (loop (bind-abort-restart cmdl (lambda () (der ...
“I was expecting to see pending stack frames as the program recursively called smallest, but since they are all tail-recursive calls, I don't have a way of knowing how deep it went. I wish I could selectively enable or disable tail recursion for this one call...”
Exercise 2: What trivial change can Louis make to his code for smallest that will disable tail recursion?
Louis again has a problem. “I have a stack trace, but, but, well, just LOOK at it!
1 ]=> (louis-sort test-list <)
;The object (), passed as an argument to safe-car, is not a pair.
;To continue, call RESTART with an option number:
; (RESTART 1) => Return to read-eval-print level 1.

2 error> (debug)
There are more than 50 subproblems on the stack.
Subproblem level: 0 (this is the lowest subproblem level)
Expression (from stack):
    (predicate (car list) ###)
 subproblem being executed (marked by ###):
    (cadr list)
Environment created by the procedure: SMALLEST
 applied to: ((0) #[arity-dispatched-procedure 39])
The execution history for this subproblem contains 1 reduction.
You are now in the debugger.  Type q to quit, ? for commands.

3 debug> H
SL#  Procedure-name    Expression
0    smallest          (predicate (car list) (cadr list))
1    smallest          (if (predicate (car list) (cadr list)) (smalle ...
2    smallest          (let ((answer (if (predicate (car list) (cadr ...
3    smallest          (let ((answer (if (predicate (car list) (cadr ...
4    smallest          (let ((answer (if (predicate (car list) (cadr ...
5    smallest          (let ((answer (if (predicate (car list) (cadr ...
6    smallest          (let ((answer (if (predicate (car list) (cadr ...
7    smallest          (let ((answer (if (predicate (car list) (cadr ...
8    smallest          (let ((answer (if (predicate (car list) (cadr ...
9    smallest          (let ((answer (if (predicate (car list) (cadr ...
10   smallest          (let ((answer (if (predicate (car list) (cadr ...
11   smallest          (let ((answer (if (predicate (car list) (cadr ...
12   smallest          (let ((answer (if (predicate (car list) (cadr ...
13   smallest          (let ((answer (if (predicate (car list) (cadr ...
14   smallest          (let ((answer (if (predicate (car list) (cadr ...
15   smallest          (let ((answer (if (predicate (car list) (cadr ...
16   smallest          (let ((answer (if (predicate (car list) (cadr ...
17   smallest          (let ((answer (if (predicate (car list) (cadr ...
18   smallest          (let ((answer (if (predicate (car list) (cadr ...
19   smallest          (let ((answer (if (predicate (car list) (cadr ...
20   smallest          (let ((answer (if (predicate (car list) (cadr ...
21   smallest          (let ((answer (if (predicate (car list) (cadr ...
22   smallest          (let ((answer (if (predicate (car list) (cadr ...
23   smallest          (let ((answer (if (predicate (car list) (cadr ...
24   smallest          (let ((answer (if (predicate (car list) (cadr ...
25   smallest          (let ((answer (if (predicate (car list) (cadr ...
26   smallest          (let ((answer (if (predicate (car list) (cadr ...
27   smallest          (let ((answer (if (predicate (car list) (cadr ...
28   smallest          (let ((answer (if (predicate (car list) (cadr ...
29   smallest          (let ((answer (if (predicate (car list) (cadr ...
30   smallest          (let ((answer (if (predicate (car list) (cadr ...
31   smallest          (let ((answer (if (predicate (car list) (cadr ...
32   smallest          (let ((answer (if (predicate (car list) (cadr ...
33   smallest          (let ((answer (if (predicate (car list) (cadr ...
34   smallest          (let ((answer (if (predicate (car list) (cadr ...
35   smallest          (let ((answer (if (predicate (car list) (cadr ...
36   smallest          (let ((answer (if (predicate (car list) (cadr ...
37   smallest          (let ((answer (if (predicate (car list) (cadr ...
38   smallest          (let ((answer (if (predicate (car list) (cadr ...
39   smallest          (let ((answer (if (predicate (car list) (cadr ...
40   smallest          (let ((answer (if (predicate (car list) (cadr ...
41   smallest          (let ((answer (if (predicate (car list) (cadr ...
42   smallest          (let ((answer (if (predicate (car list) (cadr ...
43   smallest          (let ((answer (if (predicate (car list) (cadr ...
44   smallest          (let ((answer (if (predicate (car list) (cadr ...
45   smallest          (let ((answer (if (predicate (car list) (cadr ...
46   smallest          (let ((answer (if (predicate (car list) (cadr ...
47   smallest          (let ((answer (if (predicate (car list) (cadr ...
48   smallest          (let ((answer (if (predicate (car list) (cadr ...
49   smallest          (let ((answer (if (predicate (car list) (cadr ...
50   smallest          (let ((answer (if (predicate (car list) (cadr ...
51   smallest          (let ((answer (if (predicate (car list) (cadr ...
52   smallest          (let ((answer (if (predicate (car list) (cadr ...
53   smallest          (let ((answer (if (predicate (car list) (cadr ...
54   smallest          (let ((answer (if (predicate (car list) (cadr ...
55   smallest          (let ((answer (if (predicate (car list) (cadr ...
56   smallest          (let ((answer (if (predicate (car list) (cadr ...
57   smallest          (let ((answer (if (predicate (car list) (cadr ...
58   smallest          (let ((answer (if (predicate (car list) (cadr ...
59   smallest          (let ((answer (if (predicate (car list) (cadr ...
60   smallest          (let ((answer (if (predicate (car list) (cadr ...
61   smallest          (let ((answer (if (predicate (car list) (cadr ...
62   smallest          (let ((answer (if (predicate (car list) (cadr ...
63   smallest          (let ((answer (if (predicate (car list) (cadr ...
64   smallest          (let ((answer (if (predicate (car list) (cadr ...
65   smallest          (let ((answer (if (predicate (car list) (cadr ...
66   smallest          (let ((answer (if (predicate (car list) (cadr ...
67   smallest          (let ((answer (if (predicate (car list) (cadr ...
68   smallest          (let ((answer (if (predicate (car list) (cadr ...
69   smallest          (let ((answer (if (predicate (car list) (cadr ...
70   smallest          (let ((answer (if (predicate (car list) (cadr ...
71   smallest          (let ((answer (if (predicate (car list) (cadr ...
72   smallest          (let ((answer (if (predicate (car list) (cadr ...
73   smallest          (let ((answer (if (predicate (car list) (cadr ...
74   smallest          (let ((answer (if (predicate (car list) (cadr ...
75   smallest          (let ((answer (if (predicate (car list) (cadr ...
76   smallest          (let ((answer (if (predicate (car list) (cadr ...
77   smallest          (let ((answer (if (predicate (car list) (cadr ...
78   smallest          (let ((answer (if (predicate (car list) (cadr ...
79   smallest          (let ((answer (if (predicate (car list) (cadr ...
80   smallest          (let ((answer (if (predicate (car list) (cadr ...
81   smallest          (let ((answer (if (predicate (car list) (cadr ...
82   smallest          (let ((answer (if (predicate (car list) (cadr ...
83   smallest          (let ((answer (if (predicate (car list) (cadr ...
84   smallest          (let ((answer (if (predicate (car list) (cadr ...
85   smallest          (let ((answer (if (predicate (car list) (cadr ...
86   smallest          (let ((answer (if (predicate (car list) (cadr ...
87   smallest          (let ((answer (if (predicate (car list) (cadr ...
88   smallest          (let ((answer (if (predicate (car list) (cadr ...
89   smallest          (let ((answer (if (predicate (car list) (cadr ...
90   smallest          (let ((answer (if (predicate (car list) (cadr ...
91   smallest          (let ((answer (if (predicate (car list) (cadr ...
92   smallest          (let ((answer (if (predicate (car list) (cadr ...
93   smallest          (let ((answer (if (predicate (car list) (cadr ...
94   smallest          (let ((answer (if (predicate (car list) (cadr ...
95   smallest          (let ((answer (if (predicate (car list) (cadr ...
96   smallest          (let ((answer (if (predicate (car list) (cadr ...
97   smallest          (let ((answer (if (predicate (car list) (cadr ...
98   smallest          (let ((answer (if (predicate (car list) (cadr ...
99   smallest          (let ((answer (if (predicate (car list) (cadr ...
100  smallest          (let ((answer (if (predicate (car list) (cadr ...
101  smallest          (let ((answer (if (predicate (car list) (cadr ...
102  do-loop           (cons (smallest remainder predicate) (quote ()))
103  do-loop           (append answer (cons (smallest remainder predi ...
104  do-loop           (do-loop (cdr remainder) (append answer (cons ...
105  %repl-eval        (let ((value (hook/repl-eval s-expression envi ...
106  %repl-eval/write  (hook/repl-write (%repl-eval s-expression envi ...
107  do-loop           (begin (if (queue-empty? queue) (let ((environ ...
108  loop              (loop (bind-abort-restart cmdl (lambda () (der ...
“I wish I could start eliminating the tail call from the second call onwards...
Exercise 3: What trivial change can Louis make to his code for smallest that will enable tail recursion on all but the initial call to smallest? (Hint: the bold text in the following backtrace shows how this differs from the first backtrace.)
1 ]=> (louis-sort test-list <)
;The object (), passed as an argument to safe-car, is not a pair.
;To continue, call RESTART with an option number:
; (RESTART 1) => Return to read-eval-print level 1.

2 error> (debug)
There are 10 subproblems on the stack.
Subproblem level: 0 (this is the lowest subproblem level)
Expression (from stack):
    (predicate (car list) ###)
 subproblem being executed (marked by ###):
    (cadr list)
Environment created by the procedure: SMALLEST
 applied to: ((0) #[arity-dispatched-procedure 39])
The execution history for this subproblem contains 1 reduction.
You are now in the debugger.  Type q to quit, ? for commands.

3 debug> H
SL#  Procedure-name    Expression
0    smallest          (predicate (car list) (cadr list))
1    smallest          (if (predicate (car list) (cadr list)) (smalle ...
2    smallest          (let ((answer (smallest list predicate))) answer)
3    do-loop           (cons (smallest remainder predicate) (quote ()))
4    do-loop           (append answer (cons (smallest remainder predi ...
5    do-loop           (do-loop (cdr remainder) (append answer (cons ...
6    %repl-eval        (let ((value (hook/repl-eval s-expression envi ...
7    %repl-eval/write  (hook/repl-write (%repl-eval s-expression envi ...
8    do-loop           (begin (if (queue-empty? queue) (let ((environ ...
9    loop              (loop (bind-abort-restart cmdl (lambda () (der ...
Louis seems to have gotten his wish and is currently making progress despite the existence of tail recursion. Cy D. Fect, however, is unimpressed. He complains “Sure, it is trivial to disable tail recursion whenever you desire, but I don't like guessing whether the compiler is going to emit a tail call, and I'd simply rather not learn. I'd prefer some sort of declaration or decoration so I can explicitly tell the compiler to not emit a tail call. Something like this:
(define (smallest list predicate)
  ;; Returns the smallest element of list.
  (if (null? list)
      #f
      (if (predicate (car list) (cadr list))
          ;; This branch should leave a backtrace - CDF
          (disable-tail-recursion
           (smallest (cons (car list) (cddr list)) predicate))
          (smallest (cdr list) predicate))))
Exercise 4: Implement disable-tail-recursion as a macro.
Homework: Fix Louis's code.
Wednesday, March 30, 2011
Tail recursion and debugging
java.lang.NullPointerException
        at java.io.FilterInputStream.close(FilterInputStream.java:172)
        at sun.net.www.protocol.jar.JarURLConnection$JarURLInputStream.close(JarURLConnection.java:108)
        at com.alba.vware.ResourceServlet.doGet(ResourceServlet.java:396)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:617)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:717)
        at com.alba.vware.ServletDefinition.doService(ServletDefinition.java:261)
        at com.alba.vware.ServletDefinition.service(ServletDefinition.java:175)
        at com.alba.vware.ManagedServletPipeline.service(ManagedServletPipeline.java:91)
        at com.alba.vware.FilterChainInvoke.doFilter(FilterChainInvoke.java:62)
        at com.alba.vware.FilterDefinition.doFilter(FilterDefinition.java:167)
        at com.alba.vware.FilterChainInvoke.doFilter(FilterChainInvoke.java:58)
        at com.alba.vware.FilterDefinition.doFilter(FilterDefinition.java:167)
        at com.alba.vware.FilterChainInvoke.doFilter(FilterChainInvoke.java:58)
        at com.alba.vware.FilterDefinition.doFilter(FilterDefinition.java:167)
        at com.alba.vware.FilterChainInvoke.doFilter(FilterChainInvoke.java:58)
        at com.alba.vware.FilterDefinition.doFilter(FilterDefinition.java:167)
        at com.alba.vware.FilterChainInvoke.doFilter(FilterChainInvoke.java:58)
        at com.alba.vware.FilterDefinition.doFilter(FilterDefinition.java:167)
[Continued on next page]
When a tail recursion is eliminated, there's no stack frame left to use to print a traceback when something goes wrong later.
— Guido van Rossum
[Continued from previous page]
        at com.alba.vware.FilterChainInvoke.doFilter(FilterChainInvoke.java:58)
        at com.alba.vware.FilterDefinition.doFilter(FilterDefinition.java:167)
        at com.alba.vware.FilterChainInvoke.doFilter(FilterChainInvoke.java:58)
        at com.alba.vware.FilterDefinition.doFilter(FilterDefinition.java:167)
        at com.alba.vware.FilterChainInvoke.doFilter(FilterChainInvoke.java:58)
        at com.alba.vware.FilterDefinition.doFilter(FilterDefinition.java:167)
        at com.alba.vware.FilterChainInvoke.doFilter(FilterChainInvoke.java:58)
        at com.alba.vware.FilterDefinition.doFilter(FilterDefinition.java:167)
        at com.alba.vware.FilterChainInvoke.doFilter(FilterChainInvoke.java:58)
        at com.alba.vware.FilterDefinition.doFilter(FilterDefinition.java:167)
        at com.alba.vware.FilterChainInvoke.doFilter(FilterChainInvoke.java:58)
        at com.alba.vware.FilterDefinition.doFilter(FilterDefinition.java:167)
        at com.alba.vware.FilterChainInvoke.doFilter(FilterChainInvoke.java:58)
        at com.alba.vware.FilterDefinition.doFilter(FilterDefinition.java:167)
        at com.alba.vware.FilterChainInvoke.doFilter(FilterChainInvoke.java:58)
        at com.alba.vware.FilterDefinition.doFilter(FilterDefinition.java:167)
        at com.alba.vware.FilterChainInvoke.doFilter(FilterChainInvoke.java:58)
        at com.alba.vware.FilterDefinition.doFilter(FilterDefinition.java:167)
        at com.alba.vware.FilterChainInvoke.doFilter(FilterChainInvoke.java:58)
        at com.alba.vware.FilterDefinition.doFilter(FilterDefinition.java:167)
        at com.alba.vware.FilterChainInvoke.doFilter(FilterChainInvoke.java:58)
        at com.alba.vware.FilterDefinition.doFilter(FilterDefinition.java:167)
        at com.alba.vware.FilterChainInvoke.doFilter(FilterChainInvoke.java:58)
        at com.alba.vware.FilterDefinition.doFilter(FilterDefinition.java:167)
        at com.alba.vware.FilterChainInvoke.doFilter(FilterChainInvoke.java:58)
[Continued on next page]
The elimination of stack frames doesn't do anything for the algorithmic complexity of the code, but it does make debugging harder.
— Guido van Rossum
[Continued from previous page]
        at com.alba.vware.FilterDefinition.doFilter(FilterDefinition.java:167)
        at com.alba.vware.FilterChainInvoke.doFilter(FilterChainInvoke.java:58)
        at com.alba.vware.FilterDefinition.doFilter(FilterDefinition.java:167)
        at com.alba.vware.FilterChainInvoke.doFilter(FilterChainInvoke.java:58)
        at com.alba.vware.FilterDefinition.doFilter(FilterDefinition.java:167)
        at com.alba.vware.FilterChainInvoke.doFilter(FilterChainInvoke.java:58)
        at com.alba.vware.FilterDefinition.doFilter(FilterDefinition.java:167)
        at com.alba.vware.FilterChainInvoke.doFilter(FilterChainInvoke.java:58)
        at com.alba.vware.FilterDefinition.doFilter(FilterDefinition.java:167)
        at com.alba.vware.FilterChainInvoke.doFilter(FilterChainInvoke.java:58)
        at com.alba.vware.FilterDefinition.doFilter(FilterDefinition.java:167)
        at com.alba.vware.FilterChainInvoke.doFilter(FilterChainInvoke.java:58)
        at com.alba.vware.FilterDefinition.doFilter(FilterDefinition.java:167)
        at com.alba.vware.FilterChainInvoke.doFilter(FilterChainInvoke.java:58)
        at com.alba.vware.FilterDefinition.doFilter(FilterDefinition.java:167)
        at com.alba.vware.FilterChainInvoke.doFilter(FilterChainInvoke.java:58)
        at com.alba.vware.FilterDefinition.doFilter(FilterDefinition.java:167)
        at com.alba.vware.FilterChainInvoke.doFilter(FilterChainInvoke.java:58)
        at com.alba.vware.FilterDefinition.doFilter(FilterDefinition.java:167)
        at com.alba.vware.FilterChainInvoke.doFilter(FilterChainInvoke.java:58)
        at com.alba.vware.FilterDefinition.doFilter(FilterDefinition.java:167)
        at com.alba.vware.FilterChainInvoke.doFilter(FilterChainInvoke.java:58)
        at com.alba.vware.FilterDefinition.doFilter(FilterDefinition.java:167)
        at com.alba.vware.FilterChainInvoke.doFilter(FilterChainInvoke.java:58)
        at com.alba.vware.FilterDefinition.doFilter(FilterDefinition.java:167)
[Continued on next page]
My issue with this optimization is that you lose debugging information.
— Ian Bicking
[Continued from previous page]
        at com.alba.vware.FilterChainInvoke.doFilter(FilterChainInvoke.java:58)
        at com.alba.vware.FilterDefinition.doFilter(FilterDefinition.java:167)
        at com.alba.vware.FilterChainInvoke.doFilter(FilterChainInvoke.java:58)
        at com.alba.vware.FilterDefinition.doFilter(FilterDefinition.java:167)
        at com.alba.vware.FilterChainInvoke.doFilter(FilterChainInvoke.java:58)
        at com.alba.vware.FilterDefinition.doFilter(FilterDefinition.java:167)
        at com.alba.vware.FilterChainInvoke.doFilter(FilterChainInvoke.java:58)
        at com.alba.vware.FilterDefinition.doFilter(FilterDefinition.java:167)
        at com.alba.vware.FilterChainInvoke.doFilter(FilterChainInvoke.java:58)
        at com.alba.vware.FilterDefinition.doFilter(FilterDefinition.java:167)
        at com.alba.vware.FilterChainInvoke.doFilter(FilterChainInvoke.java:58)
        at com.alba.vware.FilterDefinition.doFilter(FilterDefinition.java:167)
        at com.alba.vware.FilterChainInvoke.doFilter(FilterChainInvoke.java:58)
        at com.alba.vware.FilterDefinition.doFilter(FilterDefinition.java:167)
        at com.alba.vware.FilterChainInvoke.doFilter(FilterChainInvoke.java:58)
        at com.alba.vware.FilterDefinition.doFilter(FilterDefinition.java:167)
        at com.alba.vware.FilterChainInvoke.doFilter(FilterChainInvoke.java:58)
        at com.alba.vware.FilterDefinition.doFilter(FilterDefinition.java:167)
        at com.alba.vware.FilterChainInvoke.doFilter(FilterChainInvoke.java:58)
        at com.alba.vware.FilterDefinition.doFilter(FilterDefinition.java:167)
        at com.alba.vware.FilterChainInvoke.doFilter(FilterChainInvoke.java:58)
        at com.alba.vware.FilterDefinition.doFilter(FilterDefinition.java:167)
        at com.alba.vware.FilterChainInvoke.doFilter(FilterChainInvoke.java:58)
        at com.alba.vware.FilterDefinition.doFilter(FilterDefinition.java:167)
        at com.alba.vware.FilterChainInvoke.doFilter(FilterChainInvoke.java:58)
[Continued on next page]
Tail recursion can be easily inlined, however it is my understanding that the Java creators specifically chose not to implement this, as it makes resultant code hard to debug (if not impossible).
— Andrew Monkhouse
[Continued from previous page]
        at com.alba.vware.FilterDefinition.doFilter(FilterDefinition.java:167)
        at com.alba.vware.FilterChainInvoke.doFilter(FilterChainInvoke.java:58)
        at com.alba.vware.FilterDefinition.doFilter(FilterDefinition.java:167)
        at com.alba.vware.FilterChainInvoke.doFilter(FilterChainInvoke.java:58)
        at com.alba.vware.FilterDefinition.doFilter(FilterDefinition.java:167)
        at com.alba.vware.FilterChainInvoke.doFilter(FilterChainInvoke.java:58)
        at com.alba.vware.FilterDefinition.doFilter(FilterDefinition.java:167)
        at com.alba.vware.FilterChainInvoke.doFilter(FilterChainInvoke.java:58)
        at com.alba.vware.FilterDefinition.doFilter(FilterDefinition.java:167)
        at com.alba.vware.FilterChainInvoke.doFilter(FilterChainInvoke.java:58)
        at com.alba.vware.FilterDefinition.doFilter(FilterDefinition.java:167)
        at com.alba.vware.FilterChainInvoke.doFilter(FilterChainInvoke.java:58)
        at com.alba.vware.FilterDefinition.doFilter(FilterDefinition.java:167)
        at com.alba.vware.FilterChainInvoke.doFilter(FilterChainInvoke.java:58)
        at com.alba.vware.FilterDefinition.doFilter(FilterDefinition.java:167)
        at com.alba.vware.FilterChainInvoke.doFilter(FilterChainInvoke.java:58)
        at com.alba.vware.FilterDefinition.doFilter(FilterDefinition.java:167)
        at com.alba.vware.FilterChainInvoke.doFilter(FilterChainInvoke.java:58)
        at com.alba.vware.FilterDefinition.doFilter(FilterDefinition.java:167)
        at com.alba.vware.FilterChainInvoke.doFilter(FilterChainInvoke.java:58)
        at com.alba.vware.FilterDefinition.doFilter(FilterDefinition.java:167)
        at com.alba.vware.FilterChainInvoke.doFilter(FilterChainInvoke.java:58)
        at com.alba.vware.FilterDefinition.doFilter(FilterDefinition.java:167)
        at com.alba.vware.FilterChainInvoke.doFilter(FilterChainInvoke.java:58)
        at com.alba.vware.FilterDefinition.doFilter(FilterDefinition.java:167)
[Continued on next page]
This will confuse users who inadvertently wrote something recursive (the recursion isn't obvious in the stack trace printed), and makes debugging hard.
— Guido van Rossum
[Continued from previous page]
        at com.alba.vware.FilterChainInvoke.doFilter(FilterChainInvoke.java:58)
        at com.alba.vware.FilterDefinition.doFilter(FilterDefinition.java:167)
        at com.alba.vware.FilterChainInvoke.doFilter(FilterChainInvoke.java:58)
        at com.alba.vware.FilterDefinition.doFilter(FilterDefinition.java:167)
        at com.alba.vware.FilterChainInvoke.doFilter(FilterChainInvoke.java:58)
        at com.alba.vware.FilterDefinition.doFilter(FilterDefinition.java:167)
        at com.alba.vware.FilterChainInvoke.doFilter(FilterChainInvoke.java:58)
        at com.alba.vware.FilterDefinition.doFilter(FilterDefinition.java:167)
        at com.alba.vware.FilterChainInvoke.doFilter(FilterChainInvoke.java:58)
        at com.alba.vware.FilterDefinition.doFilter(FilterDefinition.java:167)
        at com.alba.vware.FilterChainInvoke.doFilter(FilterChainInvoke.java:58)
        at com.alba.vware.FilterDefinition.doFilter(FilterDefinition.java:167)
        at com.alba.vware.FilterChainInvoke.doFilter(FilterChainInvoke.java:58)
        at com.alba.vware.FilterDefinition.doFilter(FilterDefinition.java:167)
        at com.alba.vware.FilterChainInvoke.doFilter(FilterChainInvoke.java:58)
        at com.alba.vware.FilterDefinition.doFilter(FilterDefinition.java:167)
        at com.alba.vware.FilterChainInvoke.doFilter(FilterChainInvoke.java:58)
        at com.alba.vware.FilterDefinition.doFilter(FilterDefinition.java:167)
        at com.alba.vware.FilterChainInvoke.doFilter(FilterChainInvoke.java:58)
        at com.alba.vware.FilterDefinition.doFilter(FilterDefinition.java:167)
        at com.alba.vware.FilterChainInvoke.doFilter(FilterChainInvoke.java:58)
        at com.alba.vware.FilterDefinition.doFilter(FilterDefinition.java:167)
        at com.alba.vware.FilterChainInvoke.doFilter(FilterChainInvoke.java:58)
        at com.alba.vware.FilterDefinition.doFilter(FilterDefinition.java:167)
        at com.alba.vware.FilterChainInvoke.doFilter(FilterChainInvoke.java:58)
[Continued on next page]
I am a big fan of being able to get stack backtraces when I have a problem to debug. But tail call recursion silently throws away stack frames. You get great performance benefits from doing so, but I'd like an option when debugging to get at the information that has been thrown away.
— btilly
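The trade-off btilly describes is easy to demonstrate in CPython, which deliberately does not eliminate tail calls. In the sketch below (a hypothetical `countdown` function, not from the post), every pending tail call still occupies a stack frame: the frames are all there for a traceback, but deep recursion exhausts the stack.

```python
import sys

def countdown(n):
    """Tail-recursive countdown; CPython pushes a new frame per call."""
    if n == 0:
        return "done"
    return countdown(n - 1)  # a tail call, but the caller's frame is kept

sys.setrecursionlimit(100)   # shrink the stack so the failure comes quickly

caught = None
try:
    countdown(1000)
except RecursionError as e:
    # The traceback at this point is ~100 identical countdown frames --
    # the same shape as the FilterChainInvoke/FilterDefinition trace above.
    caught = e
    print("stack exhausted")
```

The frames btilly wants for debugging are exactly the frames that make the call fail.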
[Continued from previous page]
    at com.alba.vware.FilterDefinition.doFilter(FilterDefinition.java:167)
    at com.alba.vware.FilterChainInvoke.doFilter(FilterChainInvoke.java:58)
    at com.alba.vware.FilterDefinition.doFilter(FilterDefinition.java:167)
    at com.alba.vware.FilterChainInvoke.doFilter(FilterChainInvoke.java:58)
    at com.alba.vware.FilterDefinition.doFilter(FilterDefinition.java:167)
    at com.alba.vware.FilterChainInvoke.doFilter(FilterChainInvoke.java:58)
    at com.alba.vware.FilterDefinition.doFilter(FilterDefinition.java:167)
    at com.alba.vware.FilterChainInvoke.doFilter(FilterChainInvoke.java:58)
    at com.alba.vware.FilterDefinition.doFilter(FilterDefinition.java:167)
    at com.alba.vware.FilterChainInvoke.doFilter(FilterChainInvoke.java:58)
    at com.alba.vware.FilterDefinition.doFilter(FilterDefinition.java:167)
    at com.alba.vware.FilterChainInvoke.doFilter(FilterChainInvoke.java:58)
    at com.alba.vware.FilterDefinition.doFilter(FilterDefinition.java:167)
    at com.alba.vware.FilterChainInvoke.doFilter(FilterChainInvoke.java:58)
    at com.alba.vware.FilterDefinition.doFilter(FilterDefinition.java:167)
    at com.alba.vware.FilterChainInvoke.doFilter(FilterChainInvoke.java:58)
    at com.alba.vware.FilterDefinition.doFilter(FilterDefinition.java:167)
    at com.alba.vware.FilterChainInvoke.doFilter(FilterChainInvoke.java:58)
    at com.alba.vware.FilterDefinition.doFilter(FilterDefinition.java:167)
    at com.alba.vware.FilterChainInvoke.doFilter(FilterChainInvoke.java:58)
    at com.alba.vware.FilterDefinition.doFilter(FilterDefinition.java:167)
    at com.alba.vware.FilterChainInvoke.doFilter(FilterChainInvoke.java:58)
    at com.alba.vware.FilterDefinition.doFilter(FilterDefinition.java:167)
    at com.alba.vware.FilterChainInvoke.doFilter(FilterChainInvoke.java:58)
    at com.alba.vware.FilterDefinition.doFilter(FilterDefinition.java:167)
[Continued on next page]
I'm uninterested in optimizing tail recursion.
— Guido van Rossum
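Tail-call elimination, which Guido declines to implement, amounts to compiling a self tail call into a jump that reuses the current frame. A minimal sketch of the equivalent loop (a hypothetical `countdown_loop`, not from the post) shows both the benefit and the cost: constant stack space, and no frame history left to inspect.

```python
def countdown_loop(n):
    # What a tail-call-optimizing implementation effectively executes:
    # the self tail call becomes a jump back to the top, reusing one frame.
    while n > 0:
        n -= 1
    return "done"

print(countdown_loop(1_000_000))  # no stack growth at any depth
```

A debugger stopped inside the loop sees one frame, not a million, which is precisely the information loss the quotes above are arguing over.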
[Continued from previous page]
    at com.alba.vware.FilterChainInvoke.doFilter(FilterChainInvoke.java:58)
    at com.alba.vware.FilterDefinition.doFilter(FilterDefinition.java:167)
    at com.alba.vware.FilterChainInvoke.doFilter(FilterChainInvoke.java:58)
    at com.alba.vware.FilterDefinition.doFilter(FilterDefinition.java:167)
    at com.alba.vware.FilterChainInvoke.doFilter(FilterChainInvoke.java:58)
    at com.alba.vware.FilterDefinition.doFilter(FilterDefinition.java:167)
    at com.alba.vware.FilterChainInvoke.doFilter(FilterChainInvoke.java:58)
    at com.alba.vware.FilterDefinition.doFilter(FilterDefinition.java:167)
    at com.alba.vware.FilterChainInvoke.doFilter(FilterChainInvoke.java:58)
    at com.alba.vware.ManagedFilterPipeline.dispatch(ManagedFilterPipeline.java:118)
    at com.alba.vware.Filter.doFilter(Filter.java:113)
    at com.alba.vware.FilteredServlet$Chain.doFilter(FilteredServlet.java:176)
    at com.alba.vware.FilteredServlet.service(FilteredServlet.java:145)
    at com.alba.vware.HttpConnection.runServletFromWithinSpan(HttpConnection.java:933)
    at com.alba.vware.HttpConnection.access$000(HttpConnection.java:71)
    at com.alba.vware.HttpConnection$1.runServletFromWithinSpan(HttpConnection.java:854)
    at com.alba.vware.TraceHelper$TraceableServletRunnable$2.run(TraceHelper.java:467)
    at com.alba.vware.LocalTraceSpanRunnable.run(LocalTraceSpanRunnable.java:56)
    at com.alba.vware.LocalTraceSpanBuilder.run(LocalTraceSpanBuilder.java:518)
    at com.alba.vware.TraceHelper$TraceableServletRunnable.beginNewTrace(TraceHelper.java:411)
    at com.alba.vware.TraceHelper$TraceableServletRunnable.runWithTracingEnabled(TraceHelper.java:377)
    at com.alba.vware.TraceHelper$TraceableServletRunnable.run(TraceHelper.java:339)
    at com.alba.vware.HttpConnection.runServlet(HttpConnection.java:850)
    at com.alba.vware.HttpConnection.run(HttpConnection.java:815)
    at com.alba.vware.DispatchQueue$WorkerThread.run(DispatchQueue.java:379)
The default value for thread stack size is 128K for 32-bit server and 256K for 64-bit server.
— Java manual