Sunday, January 3, 2010

Dynamic Dispatch in Scala

Clojure has runtime polymorphism that goes beyond the single, receiver-based dispatch that you find in Java or Scala. You can dispatch based not only on the type of all objects involved (multiple dispatch), but also on the result of an arbitrary function (predicate dispatch).

Why should I care? Let's explore this in Scala using the Bunny/Lion example as shown on Clojure site.


trait Animal {
def encounter(a: Animal) = "Generic animal encounter"
}

class Bunny extends Animal {
override def encounter(a: Animal) = "Generic Bunny encounter"
def encounter(b: Bunny) = "Mate"
def encounter(l: Lion) = "Run away"
}

class Lion extends Animal {
override def encounter(a: Animal) = "Generic Lion encounter"
def encounter(b: Bunny) = "Eat"
def encounter(l: Lion) = "Fight"
}


This does what we want in a straight-forward example:

val bunny = new Bunny
val lion = new Lion
bunny encounter lion // Run away
bunny encounter bunny // Mate
lion encounter lion // Fight
lion encounter bunny // Eat


But, what happens if we are dealing with a List[Animal]? Or, we have some convenience function like this:

def printEncounter(a1: Animal, a2: Animal) = {
def s(a: Animal) = a.getClass.toString.split("\\.").last
println(s(a1) + " vs " + s(a2) + ": " + (a1 encounter a2))
}
val bunny = new Bunny
val lion = new Lion

printEncounter(bunny, lion)
printEncounter(bunny, bunny)
printEncounter(lion, lion)
printEncounter(lion, bunny)

/* Output:
* Bunny vs Lion: Generic Bunny encounter
* Bunny vs Bunny: Generic Bunny encounter
* Lion vs Lion: Generic Lion encounter
* Lion vs Bunny: Generic Lion encounter
*/


Oh, my! We seem to have dynamically determined which method definition to use based on the runtime type of the receiver object, but not on any of the objects in the argument list. This is the standard functionality you get with Java or Scala.

How would we fix this in Scala? A first pass might look like this:

trait Animal {
def encounter(a: Animal) = "Generic animal encounter"
}

class Bunny extends Animal {
override def encounter(a: Animal) = a match {
case b: Bunny => "Mate"
case l: Lion => "Run away"
case _ => "Generic Bunny encounter"
}
}

class Lion extends Animal {
override def encounter(a: Animal) = a match {
case b: Bunny => "Eat"
case l: Lion => "Fight"
case _ => "Generic Lion encounter"
}
}


Of course, like Clojure's multimethods, pattern matching in Scala is not limited to just type information, but can also look at dynamic runtime information. You could easily imagine a lazy rabbit that only ran from lions that had been identified as hungry at the time:

class Bunny extends Animal {
override def encounter(a: Animal) = a match {
case b: Bunny => "Mate"
case l: Lion if (l.isHungry) => "Run away"
case l: Lion => "Ignore"
case _ => "Generic Bunny encounter"
}
}


So, how is this different from the Clojure approach? Rich Hickey commented on this, which I'll quote here:
Multimethods differ from pattern matching in a number of ways. Two that matter (to me) are:

Pattern matching is usually ‘closed’, in the sense that all possibilities must be enumerated in a single scope.

Pattern matches are usually declaration order dependent.

Multimethods, OTOH, are an open set. You can define a new method any time, anywhere – they need not be lexically co-located. Multimethod operation is independent of definition order. They are very dynamic, and thus a good fit for Clojure.

There are other differences, of course, e.g. pattern matching usually has sugar for structural and type matches, and often exhaustiveness checking. I’m still waiting for someone to contribute a pattern-matching macro for Clojure, as I don’t think one makes the other redundant.


What could we do in Scala to address some of the deficiencies that Rich is pointing out? A dispatch mechanism could be defined like this:

object DynDispatch {
class DynMethod[A,R] {
val methods = scala.collection.mutable.ListBuffer[PartialFunction[A,R]]()
def defMethod(m: PartialFunction[A,R]) = {
methods += m
}
def apply(args: A): R = {
methods.reverse.find(_.isDefinedAt(args)) match {
case Some(f) => f(args)
case None => throw new Exception("huh?")
}
}
}
def defMulti[A,R] = new DynMethod[A,R]
}


Class and method definition would then look like this:

trait Animal {
val encounter = DynDispatch.defMulti[Animal, String]
encounter.defMethod { case a: Animal =>
"Generic Encounter"
}
}

class Bunny extends Animal {
encounter.defMethod { case a =>
"Generic Bunny Encounter"
}
encounter.defMethod { case b: Bunny =>
"Mate"
}
encounter.defMethod { case l: Lion =>
"Run away"
}
}

class Lion extends Animal {
encounter.defMethod { case a =>
"Generic Lion encounter"
}
encounter.defMethod { case b: Bunny =>
"Eat"
}
encounter.defMethod { case l: Lion =>
"Fight"
}
}


Calling syntax is the same as with the traditionally-defined methods, but the methods are dispatched dynamically. This seems to address the "closed" issue of needing to enumerate all the possibilities within a single scope. Heck, you can even add or redefine methods for individual instances of objects:

val lion = new Lion
val bunny = new Bunny
bunny.encounter.defMethod { case l: Lion if (!l.isHungry) =>
"Ignore"
}
bunny encounter lion // depends on if the lion is hungry


Note that the methods are still type-safe, as the type of the argument list is required to be declared in the DynDispatch.defMulti[A,R] method. For methods that have differing arities or disjoint types, you lose some type safety, but the syntax is still convenient:

class TigerWoods {
val tryst = DynDispatch.defMulti[AnyRef,Alimony]
tryst.defMethod { case m: Mistress =>
Alimony(1000000)
}
tryst.defMethod { case (m1: Mistress, m2: Mistress) =>
Alimony(5000000)
}
}

// ...

val t = new TigerWoods
t.tryst(Mistress("Diane"))
t.tryst(Mistress("Emily"), Mistress("Sara"))


One big disadvantage of this implementation is the runtime overhead of method definition and queuing with each object instantiation. I don't know to what extent this could be minimized with some magical combination of caching and laziness.

We are also still dependent on order of definition to resolve any ambiguities within the method definitions. This naive implementation takes the easiest path and matches from bottom to top within a scope, with scopes from instance -> subclass -> superclass. A more sophisticated approach would find the "most specific" combination of the receiver and argument list, with ambiguities resolved with runtime exceptions or some other approach (like order dependence).

I like the order dependence of pattern matching within Scala, where the different cases are all present within the same block. However, for this application, with methods defined in multiple places, I can see it as a source of confusion. But, I'm also not a big fan of discovering ambiguities at runtime.

What is your favorite way to dispatch?

4 comments:

Anonymous said...

in clojure you can do whatever the * you want. it's a lisp, the language is like putty in your hands. that's all that need be said really.

Unknown said...

Thanks for the post, really interesting... "Greenspunning" Scala is easier than I thought!

Anonymous said...

Nice dispatch and this fill someone in on helped me alot in my college assignement. Thanks you as your information.

imbunatatire said...

thaaanks...it's been troubling me for 3 weeks now !!

Really thanks! :)