Tuesday, February 8, 2011

More about shuffle.

In my last post, I asked why
public static void shuffle(List<?> list) was given a ‘B’ grade. A few readers didn't see the problem, so I thought I'd clarify.

I was writing code to perform some operations on the elements of a collection. Multiple copies of the code will be running. I want to avoid having each copy being synchronized with the others, so it makes sense to iterate over the list elements in a non-deterministic order. This will not prevent multiple instances of the code from occasionally ‘colliding’ on a single object, but it will make this case the exception rather than the rule.

I originally wrote the code using a Set<Foo> rather than a List<Foo> because there is no natural ordering of Foo objects. The code looked like this:
    for (Foo foo : fooSet) {
        operateOnFoo (foo);
    }
where fooSet was an instance variable declared as Set<Foo>.

A Set is an unordered abstraction. Obviously, an iterator on a Set will return the objects is some order, but abstractly, there is no meaningful way to ‘change’ the order of the Set itself. A List, on the other hand, is an ordered abstraction that imposes a external order on the elements. (An array imposes the additional unnecessary constraints of compactedness on the indexes and random access.) Changing the instance variable from Set<Foo> to a List<Foo> was trivial.

The problem is now to ‘iterate pseudorandomly’. Pseudorandom iteration is not an uncommon operation, so I assumed there was a way to do it. Pseudorandom iteration is not common enough that a special iteration construct would be defined for it, so I assumed I would simply use a normal iterator and iterate over a permutation of the list. What immediately came to mind was something like:
    for (Foo foo : fooList.permute()) {
        operateOnFoo (foo);
    }
That doesn't work.

So Java doesn't provide a ‘permute’ method on a list. I vaguely recalled that Java had something called ‘shuffle’ rather than ‘permute’. However, the List class doesn't have a ‘shuffle’ method either. None of the methods on a List will permute it. That is strange because permuting a list is a fairly common operation.

I was sure that I had seen ‘shuffle’ somewhere, so I hauled out grep to search for some code that called it. I found some and chased down the documentation. The shuffle method is a static method in the Collections class. This means that I have to use a different syntax:
    for (Foo foo : Collections.shuffle(fooList)) {
        operateOnFoo (foo);
    }
That doesn't work.

The shuffle method mutates its argument but does not return a value. This means I cannot simply call the method as an expression, but I have to hoist the call to the nearest ‘Command’ level:
    Collections.shuffle(fooList);
    for (Foo foo : fooList) {
        operateOnFoo (foo);
    }
That works, but now I have to revisit my original design: is it OK to directly shuffle the fooList, or should I copy it first? Fortunately, I don't need a copy.

To recap, what I wanted to write was this:
    for (Foo foo : fooList.permute()) {
        operateOnFoo (foo);
    }
what I had to write was this:
    Collections.shuffle(fooList);
    for (Foo foo : fooList) {
        operateOnFoo (foo);
    }
By making shuffle ‘return’ void, the author has relegated the method to command contexts only. I happened to want to use it in an expression context.

I'm sure a lot of readers are wondering “So what?” It is in fact a trivial point, and now that I've discovered how to permute a list I doubt I'll encounter this particular problem in the future. But the point is that Java distinguishes between commands and expressions. You are prohibited from using a command where an expression is called for. However, you are permitted to use an expression where a command is called for. The programmer has to keep track of which things are commands because they require a command context. The programmer doesn't have to remember which things are expressions because they can be used in either context. By making shuffle a command, the author has added yet another item to my mental “remember these aren't expressions” list. If shuffle returned the shuffled list, I wouldn't have to think about it at all.

Arcane Sentiment noted that “ Java uses the return type as a way of declaring that the operation has side effects... ”. I guess I'm weird because I use the return type to declare the type of object that will be returned.
Post a Comment