Friday, May 1, 2009

You knew I'd say something, Part III

Earlier I said that tail recursion allows a program access to a more powerful computation model. I'm referring to Hewitt's Actor model. At its core, the Actor model is extremely simple. Computation agents called ‘Actors’ interact only through message-passing. The full Actor model allows for multiple asynchronous agents, each acting independently. On a single processor computer, it can be much more convenient to use a synchronous, serial Actor model in which only one Actor at a time is running and control of the processing is passed along with the messages.

One important feature of the Actor model is that messages are unidirectional. When an actor sends a message it does not require a reply as part of the protocol. If the actor needs a reply or wishes to handle a reply, it sets up a callback. The callback can have a well-known name that any actor can use in a message, or it can be an unnamed piece of code (a lambda expression) that shares the context of the sender.

Since messages are unidirectional, they don't require a ‘return address’. The message may have a return address as part of a callback, but that's a protocol decision, it isn't built-in to the message itself. The recipient of the message might simply ignore the callback (it is allowed to).

The recipient of a message can pretty much do anything it wants. Of course we expect the program to be written such that something useful happens, but the recipient of a message is allowed to do things like
  • compute something
  • change its state
  • send a message
  • create new actors
Or any combination of the above. For example, upon receiving a message, an actor could compute something based on the message contents, create a new actor, and resend the message (with the newly computed value) on to the newly created actor. This kind of arbitrary delegation is a key feature of the actor model. Once a message is sent, the sender is completely out of the picture (although he could put his return address in a callback). The message recipient is allowed to delegate all or any part of its action to other actors. Control structure emerges from the patterns of message passing.

Function calling = message passing (almost)

A procedure is like an actor. Nothing happens until you invoke it (send it a message), it can compute something, change its state (if it is a closure or has access to something mutable), and call other procedures (create new actors and send them messages). But a procedure will be a limited form of actor if it cannot fully delegate its computation to another procedure.

In virtually every computer language, a procedure call has an implicit continuation (informally, the caller). ‘Returning a value’ is ‘invoking the continuation’. Whether a procedure can fully delegate to another depends on how the language manages the implicit continuation. If the language supports proper tail recursion, then when a procedure tail calls another, the implicit continuation is passed directly along to the callee. If the language does not support proper tail recursion, then the implicit continuation is not passed to the callee. Instead, a new implicit continuation is allocated that contains (wraps around) the old one. The wrapper doesn't actually do anything in the tail call situation, it just takes up space and possibly retains pointers to reclaimable storage, but you can't get rid of it because it is built in to the language.

If you cannot pass the implicit continuation on to the callee, then you cannot fully delegate the computation. If you cannot fully delegate, then you cannot fully model message passing with function calls. On the other hand, if you can fully delegate, then a function call is exactly the same passing a message (in our synchronous, serial actor model).

Tail recursion unifies message passing and function calling. This is the reason tail recursion is cool. The fact that this allows you to write a loop as a tail call is sort of a side benefit. (It directly follows because all control structure can be modeled as message passing, but as I mentioned repeatedly, I have nothing against looping constructs per se.)

With tail recursion, you have message passing embedded in your language. By embedded I mean that you don't have to directly manipulate message objects and inject and extract values from them. That happens automatically as part of the function call syntax.

Without tail recursion, you have to implement message passing as a separate, ‘bolted on’, feature. You will have explicit message objects to allocate, initialize, enqueue, poll for, destructure, and deallocate. The control structure that emerges from the message passing patterns will interfere with the control structure in your code (you'll have to structure all your code around the ‘driver loop’).

20 comments:

  1. I posted in part II, complaining that you need to show something that isn't better written another way. (I still disagree with your later reply, because it isn't fair to say you need to globally rewrite code you specifically chose for the fact nobody would write it that way; it would have been iterators from day one.)

    Honor compels me to say that this strikes me as a much better argument. You can of course hack something into current Python, but it'll be dirty and weird, as you say, and this opens an entire paradigm up that Python has no good story for right now. This is the best argument I've heard on this topic yet.

    (BTW, to anyone who might reply, I've seen a couple of different libraries, some based on generators and some not (Candygram, for instance), and none of them are as clean as a tail-call based system could be.)

    ReplyDelete
  2. Fantastic. You gave me a much better understanding of the motivation behind tail calls beyond "simply" to save stack space. Thank you.

    ReplyDelete
  3. I'd like to add, that Erlang is a good practical example of this. Essentially tail-recursion is what empowers it to completely decouple different parts of the system and make them as lightweight, as possible.

    ReplyDelete
  4. This is how I first ran into continuations. I was 'bolting' on the Actor Model in C++. To me, its the epitome of Object Orientation, because every method must be a 'tell, don't ask' method. You can't ask (without sending a continuation message). You are also forced to be decoupled in time, because you can't depend on when another Actor might execute his message. This model really makes it easy to see when code 'wants' to be decoupled, and makes it easy to do so.

    ReplyDelete
  5. That's what I was thinking, Vsevolod.

    ReplyDelete
  6. I find this example more persuasive but I would like to see code for the Python with TCO versus with trampolines. One advantage with trampolines is that you cannot accidentally create a stack by misunderstanding the details of the optimization. Explicit TCO would be a better fit for Python, but only if the TCO code is much cleaner than the trampoline version.

    ReplyDelete
  7. You say:

    "Without tail recursion, you have to implement message passing as a separate, ‘bolted on’, feature. You will have explicit message objects to allocate, initialize, enqueue, poll for, destructure, and deallocate."

    Someone on reddit says:

    def actor(f,*x):
    try:
    while 1:f,x=f(*x)
    except StopIteration:pass


    One of you is incorrect. Which one?

    ReplyDelete
  8. I see a lot of words, but no actual code to back it up.

    ReplyDelete
  9. Oh, here is some code:

    > One of the innovations provided by the asynchronous pattern is that the caller decides whether a particular call should be asynchronous. It is not necessary for a called object to do additional programming for supporting asynchronous behavior by its clients; asynchronous delegates provide for this in the pattern. The common language runtime handles the difference between the caller and called object views.

    http://msdn.microsoft.com/en-us/library/aa719595(VS.71).aspx

    Oh dear, it seems to be written for a language that doesn't support TCO.

    Well better luck in part IV.

    ReplyDelete
  10. Paul Prescod said...
    Someone on reddit says:

    def actor(f,*x):
    try:
    while 1:f,x=f(*x)
    except StopIteration:pass

    This approach won't work in the asynchronous case, when each f should be executed in it's own thread. (And that's where the Actor model shines)

    ReplyDelete
  11. Paul Prescod/Vsevold:

    And what happens when one message/actor needs to "return" several other actions to be taken?

    One thing about bolt-on-messaging is that it breaks the 1-to-1 mapping.

    There is more to this than just state machines. What if I want to pass my continuation on to multiple "forward" recipients?

    ReplyDelete
  12. I think that there may be some confusion about the things being compared here. The two things ARE:

    1. Python with TCO.

    2. Python without TCO.

    We are NOT comparing Python with Erlang or E or something like that. That's a totally different conversation.

    So the last two posters should demonstrate -- in python or scheme code -- how one automatically gets async delivery, threading or 1 to many messaging by adding TCO to a language that does not have it.

    ReplyDelete
  13. D'gou, allowing dunctions to return a list of recipients would add two lines of code. Python's list is very easy to use as a command queue. Instead of assigning the result of the function calls to a variable, you would enqueue them. Instead of calling the variable, you pop and then call.

    It isn't clear to me how TCO makes this easier. Every function has at most one "tail call" just as every list ha at most one tail. That's the definition of "tail".

    ReplyDelete
  14. Yes Paul, there's a confusion, but, i think, it's on your part. If you read carefully from the beginning, the series is not about Python, but about TCO benefits in general. :) You won't automatically get threading, because it's orthogonal to TCO, but with lightweight threading and TCO you'll get easy async delivery and 1-to-many messaging. For a good demonstration see Joe Armstrong "Programming Erlang" (chapter 8).

    ReplyDelete
  15. Vsevlod: "You won't automatically get threading, because it's orthogonal to TCO, but with lightweight threading and TCO you'll get easy async delivery and 1-to-many messaging. For a good demonstration see Joe Armstrong "Programming Erlang" (chapter 8)."

    Let me paraphrase:

    Barista: "With $1.50 and a coupon, you'll get a cup of coffee"

    Me: "How much is coffee usually?"

    Barista: "$1.50"

    ReplyDelete
  16. Paul, not quite. If you remove TCO you can't get actor model with threading only (without a global dispatcher, which makes a system coupled and bloated).

    ReplyDelete
  17. @JonathanAllen:

    Can't understand words?

    ReplyDelete
  18. @alex: no need to be insulting. Code is more precise. We're technologists: precision is valuable for us.

    ReplyDelete
  19. To implement the Actors model in Python is why the Twisted framework was invented. It works great. I have no doubt that a language with Actors model baked right in, such as the E programming language (http://erights.org ) would be a bit smoother than Twisted, which is Actors "bolted on" to Python. On the other hand, I do have doubt that tail call optimization is either necessary or sufficient to make Python useful as an Actors-model language.

    ReplyDelete
  20. See http://www.ccs.neu.edu/home/will/Research/Lisp50/scheme33.pdf
    for the historical perpective that led from actors to Scheme (and lexical closures and TCO)

    ReplyDelete