Thursday, April 16, 2015

Dwemthy's Array in Python, With a Lengthy Aside Into Transparent Proxying

Dwemthy's Array, or Since This is Python, Dwemthy's List

Dwemthy's Array is an example that is often [citation needed] used to illustrate the ease of metaprogramming in Ruby. This blog post talks about implementing Dwemthy's Array in Python, to show how some of those metaprogramming techniques would be accomplished.

The Array

In Dwemthy's Array, you play the role of a rather adventuresome rabbit. You head into the Array and fight monsters until either they or you are dead. The general flow is something along the lines of:

from dwemthy import the_array, the_rabbit

the_rabbit ^ the_array # throw your boomerang at the first monster in the Array
the_rabbit * the_array # throw a bomb at the first monster in the Array
the_rabbit / the_array # strike with your sword at the first monster in the Array
the_rabbit % the_array # eat some lettuce...at the first monster in the Array

Each of the actions above results in messages being printed to the console, usually indicating your immediate death but sometimes indicating that you both injured a monster and survived to play another turn.

The original Ruby implementation showed three interesting things:

  • A mechanism using automatically generated properties to easily declare monster classes.
  • The overriding of standard mathematical operators to implement attacks.
  • That battling the Array is equivalent to battling the first monster in the Array.

From what I've seen around the Internet, people seem to focus on the first point (the automatically generated properties) as being the most interesting part, but I personally think that it is the last point that is the most interesting.

Defining Creature Classes

Creatures are all instances of the class Creature. This class provides the hit and fight methods. Invoking fight fights the creature, and hit is invoked whenever the creature is hit:

import random

class Creature:

    name = "Creature"

    def hit(self, damage):
        p_up = random.randint(0, self.charisma)
        if p_up % 9 == 7:
            self.life += p_up // 4
            print("[{} magick powers up {}!]".format(self.name, p_up))

        self.life -= damage
        if self.life <= 0:
            print("[{} has died!]".format(self.name))

    def fight(self, enemy, attack):
        if self.life <= 0:
            print("[{} is too dead to fight!]".format(self.name))
            return

        your_hit = random.randint(0, self.strength + attack)
        print("[You hit with {} points of damage!]".format(your_hit))
        enemy.hit(your_hit)

        if enemy.life > 0:
            enemy_hit = random.randint(0, enemy.strength + enemy.weapon)
            print("[{} hit you with {} points of damage!]".format(enemy.name, enemy_hit))
            self.hit(enemy_hit)

We now want to define a bunch of different subclasses of Creature, to provide lots of different kinds of monsters to fight. The original Ruby implementation made this simple by defining several "traits" that could be defined on Creature subclasses, like life for hitpoints and weapon for weapon strength. There was some thought about adding THAC0, but this was abandoned when no one could figure out what it meant.

In my original implementation of Dwemthy's Array in Python, I used decorators to assign these attributes to creatures:

# implementation of life, strength, etc, elided...
@life(42)
@strength(2)
@charisma(44)
@weapon(4)
@bombs(3)
class Rabbit:
    name = 'A Rather Adventuresome Rabbit'
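
The decorator bodies are elided above. As a rough idea of how such class decorators could be written, here is a minimal sketch. It is not the original implementation: the helper name stat_decorator is hypothetical, and it simply stores each stat as a class-level default.

```python
def stat_decorator(attr):
    """Return a factory of class decorators that set a default for `attr`."""
    def factory(value):
        def decorate(cls):
            # Store the default on the class; instances may shadow it later.
            setattr(cls, attr, value)
            return cls
        return decorate
    return factory

# Hypothetical stand-ins for the elided life, strength, etc.
life = stat_decorator("life")
bombs = stat_decorator("bombs")

@life(42)
@bombs(3)
class Rabbit:
    name = "A Rather Adventuresome Rabbit"

r = Rabbit()
r.life -= 2  # creates an instance attribute; the class default is untouched
print(r.life, Rabbit.life, r.bombs)
```

Decorators are applied bottom-up, so bombs is set before life, though the order makes no difference here.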

This worked fine and is a perfectly valid implementation tactic, but I rewrote my implementation to instead use higher-order mixin classes to define the properties:

import functools

def Stat(name, default=None, doc=None):
    propname = "__stat_{}".format(name)

    def getter(self):
        return getattr(self, propname, default)

    def setter(self, value):
        if not isinstance(value, int) or isinstance(value, bool):
            raise TypeError("stats must be ints")

        setattr(self, propname, value)

    class Inner:
        pass

    setattr(Inner, name, property(getter, setter, None, doc))
    return Inner

Life = functools.partial(Stat, "life")
Strength = functools.partial(Stat, "strength")
Charisma = functools.partial(Stat, "charisma")
Weapon = functools.partial(Stat, "weapon")
Bombs = functools.partial(Stat, "bombs")

This lets us define the creatures that inhabit Dwemthy's Array with default values for their stats:

class IndustrialRaverMonkey(Life(46),
                            Strength(35),
                            Charisma(91),
                            Weapon(2),
                            Creature):
    name = "Industrial Raver Monkey"

class DwarvenAngel(Life(540),
                   Strength(6),
                   Charisma(144),
                   Weapon(50),
                   Creature):
    name = "Dwarven Angel"

class AssistantViceTentacleAndOmbudsman(Life(320),
                                        Strength(6),
                                        Charisma(144),
                                        Weapon(50),
                                        Creature):
    name = "Assistant Vice Tentacle and Ombudsman"

class TeethDeer(Life(655),
                Strength(192),
                Charisma(19),
                Weapon(109),
                Creature):
    name = "Teeth Deer"

class IntrepidDecomposedCyclist(Life(901),
                                Strength(560),
                                Charisma(422),
                                Weapon(105),
                                Creature):
    name = "Intrepid Decomposed Cyclist"

class Dragon(Life(1340),
             Strength(451),
             Charisma(1020),
             Weapon(939),
             Creature):
    name = "A Rather Large Dragon"
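
To see what one of these mixins actually does, here is a small self-contained sketch. It repeats the Stat factory from above and adds a hypothetical Gnat creature purely for illustration. It shows that the default is only a fallback, that each instance stores its own value, and that the setter rejects non-integers.

```python
import functools

def Stat(name, default=None, doc=None):
    propname = "__stat_{}".format(name)

    def getter(self):
        # Fall back to the default until the instance stores its own value.
        return getattr(self, propname, default)

    def setter(self, value):
        if not isinstance(value, int) or isinstance(value, bool):
            raise TypeError("stats must be ints")
        setattr(self, propname, value)

    class Inner:
        pass

    setattr(Inner, name, property(getter, setter, None, doc))
    return Inner

Life = functools.partial(Stat, "life")
Weapon = functools.partial(Stat, "weapon")

# Hypothetical creature, not part of Dwemthy's Array.
class Gnat(Life(3), Weapon(1)):
    name = "Gnat"

a, b = Gnat(), Gnat()
a.life -= 2  # reads the default (3), then stores 1 on this instance only

try:
    a.life = "lots"
    rejected = False
except TypeError:
    rejected = True

print(a.life, b.life, a.weapon, rejected)
```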

You'll notice that all of the various creatures have the same attributes: life, strength, charisma, and weapon. Why didn't we just make those part of the Creature base class, and have their default values specified in a constructor, or even as class variables?

We certainly could have done it that way, but part of what Dwemthy's Array illustrates is that not all of the creatures in its world have the same attributes. This leads us to you, the Rabbit:

class Rabbit(Life(8),
             Strength(2),
             Charisma(44),
             Weapon(4),
             Bombs(3),
             Creature):

    name = "The Adventuring Rabbit"

    def __xor__(self, enemy):
        self.fight(enemy, 13)

    def __truediv__(self, enemy):
        self.fight(enemy, random.randint(0, 4 + (enemy.life % 10) ** 2))

    def __mod__(self, enemy):
        lettuce = random.randint(0, self.charisma)
        print("[Healthy lettuce gives you {} life points!!]".format(lettuce))
        self.life += lettuce
        self.fight(enemy, 0)

    def __mul__(self, enemy):
        if self.bombs <= 0:
            print("[UHN!! You're out of bombs!!]")

        else:
            self.bombs -= 1
            self.fight(enemy, 86)

The Rabbit has an additional attribute, as indicated by its inheriting from the Bombs mixin. The Rabbit (i.e. you) has equipped itself (i.e. yourself) with bombs (i.e. bombs). These were trivially mixed in to the definition of Rabbit using the higher-order mixin technique, as an example of metaprogramming.

Note also that Rabbit overrides the __xor__, __truediv__, __mod__, and __mul__ methods. These are magic methods that are invoked by the Python runtime to implement the ^, /, %, and * operators. This is a feature of the Python data model that allows user-defined data types to participate in programs as first-class types.
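
As a quick illustration of that mapping, here is a sketch with a hypothetical Tally class (not part of the game) that just records which operator was applied to it. The last call shows roughly what the runtime does for t ^ x: it looks the method up on the type and passes the instance in.

```python
class Tally:
    """Hypothetical class that records the operators applied to it."""

    def __init__(self):
        self.seen = []

    def __xor__(self, other):
        self.seen.append("^")
        return self

    def __truediv__(self, other):
        self.seen.append("/")
        return self

t = Tally()
t ^ "monster"                    # dispatches to Tally.__xor__
t / "monster"                    # dispatches to Tally.__truediv__
type(t).__xor__(t, "monster")    # roughly what t ^ "monster" does

print(t.seen)
```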

The simple definition of the various Creature subclasses and the default values for their stats is what seems to be all the rage for a lot of people who read about and implement Dwemthy's Array, and no one bats an eye at the overriding of the mathematical operators. Relatively few people, it seems, are as interested in the other example of metaprogramming in Dwemthy's Array.

An Aside - Transparent Proxy Objects in Python

The most interesting aspect of the original Ruby implementation of Dwemthy's Array is that attacking the Array is equivalent to attacking the first creature in the Array. This is done through Ruby's method_missing mechanism. When a message is sent to an object in Ruby that doesn't implement a method to handle it, the Ruby runtime passes the message and its arguments to the object's method_missing method, should it be defined. method_missing can choose to do whatever it wants with the message. In the Ruby version of Dwemthy's Array, the message is simply forwarded to the first object in the Array after the method_missing routine prints some messages. After the forwarded call returns, its return value is stored, method_missing does some more stuff, and then finally returns the value of the call.

This little implementation technique is something that is very easy in Ruby, and in other languages like Smalltalk-80, Objective-C, and Io. It's not as easy in Python.

Why isn't it as easy in Python? What the Ruby version is doing is essentially making the Array itself a transparent proxy object for the first member in the Array. Transparent proxies in Python are doable, but aren't nearly as simple. This is for two main reasons:

  • Getting a member variable of an object is a distinct operation from invoking a method.
  • The mechanism used to get member variables is inconsistent.

Let me talk about the first point first, and then we'll get to the second one. When we invoke a method on an object in Python, say, like this:

x = "I'm a string!"
x.endswith("!") # returns True

What is really happening under the hood is something like this:

method = getattr(x, "endswith")
method("!") # likewise returns True

This second piece of code can be executed with the same results as the first one. What's happening is that the Python runtime sees an attribute access (the . operator) of object x, asking for attribute endswith. The runtime gets that attribute, which happens to be a bound method. It then sees that the attribute is being called (the () operator), and so it calls it -- but notice that this is a separate operation.

We can easily override the attribute-retrieval part of the process by overriding __getattribute__:

class Proxy:
    """Wrap some object in a transparent proxy."""

    def __init__(self, other):
        self.other = other

    def __getattribute__(self, attr):
        try:
            result = object.__getattribute__(self, attr)
            return result

        except AttributeError:
            pass

        return getattr(self.other, attr)

x = "I'm a string!"
p = Proxy(x)
p.endswith("!") # returns True

Here, __getattribute__ first looks for the attribute on the proxy itself. If we don't find an attribute of the given name there, we then look for it in the proxied object. This method works even if we override methods:

class ReverseStringProxy(Proxy):
    def upper(self):
        """When we uppercase a string, also reverse it."""

        return "".join(reversed(self.other.upper()))

x = "I'm a string!"
p = ReverseStringProxy(x)
p.upper() # returns !GNIRTS A M'I

But what if we wanted the string to be reversed no matter what method we called?

This is where the problem arises. In Ruby, one simply overrides the method_missing method and that method is passed an object representing the invocation. One could simply perform the invocation, store its result locally, then do the reverse operation and return the reversed string, and it doesn't matter what method was invoked.

In Python, because the method is retrieved first, and then executed, we have two options:

  • Override every method that we know will be invoked on the Proxied object.
  • Rewrite methods as they're retrieved.

The first option works well if we know we're only ever going to proxy a certain set of classes, but fails in the general case. The second solution is more robust, if less clear:

class TransparentReverseProxy:
    """Wrap some object in a transparent proxy."""

    def __init__(self, other):
        self.other = other

    def __getattribute__(self, attr):
        try:
            result = object.__getattribute__(self, attr)
            return result

        except AttributeError:
            pass

        other_attr = getattr(self.other, attr)
        if callable(other_attr):
            return lambda *args, **kwargs: "".join(reversed(other_attr(*args, **kwargs)))
        return other_attr

x = "I'm a string!"
p = TransparentReverseProxy(x)
p.lower() # Returns !gnirts a m'i

See what we ended up doing? Whenever an attribute is retrieved from the proxied object, we check to see if it's callable. If so, we wrap it in a lambda that performs our transformation, regardless of what method we end up invoking.

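But wait! Remember, I said there's another problem with proxying objects transparently in Python. Special methods like the ones we discussed above, such as __mul__ and __mod__, aren't looked up through __getattribute__ when Python invokes them implicitly, for example through an operator or a builtin function.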

The reason for this is fairly obscure. Say, for example, that we ran this code:

(1).__hash__() == hash(1) # Returns True

The hash builtin function ends up ultimately invoking the __hash__ method of the object it's testing.

However, if looking up the __hash__ method were done through ordinary attribute lookup on the object being hashed (that is, via __getattribute__), it would fail whenever the object being hashed is itself a class. For example:

int.__hash__() == hash(int)

We grab the __hash__ method out of the int type, and execute it. But it's not a bound method, it's just a function that happens to be in the class. Bound methods belong to object instances, not classes. So we invoke the __hash__ function, not method, and the function expects an argument. Calling the code above results in a TypeError because of this.

The correct way to do this sort of thing is like this:

type(int).__hash__(int) == hash(int)

This works because, while we're still calling a __hash__ function, it's the function from int's metaclass, and we're passing int to it.

Anyway, long story short, the special methods like __hash__ and __len__ are exempt from lookup via __getattribute__.

What's more disappointing, at least to me, is that the Python data model doesn't let you override the __getattribute__ method of a class's metaclass to do special-method lookup. This would've been an elegant way to handle this sort of thing, but Python still bypasses __getattribute__ on special-method lookup for reasons of efficiency.

What all this means is that our Proxy class above has a serious deficiency. Recall:

x = "I'm a string!"
p = Proxy(x)
p.endswith("!")

works. But this, sadly, does not:

x = "I'm a string!"
p = Proxy(x)
len(p) # raises a TypeError, because class Proxy doesn't define __len__

TL;DR

The way around this is simple, but laborious: we define versions of the various special methods inside the Proxy class that manually forward the calls to the other object's methods. It's laborious, but it only needs to be done once, since these special methods are a well-defined, finite set.
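
As a sketch of what that forwarding might look like, here is a proxy that handles a handful of special methods by hand. The class name ForwardingProxy is hypothetical, and only a few container-style methods are covered. A complete version would have to walk through the whole list of special methods in the data model.

```python
class ForwardingProxy:
    """Hypothetical proxy that forwards a few special methods explicitly."""

    def __init__(self, other):
        self.other = other

    def __getattribute__(self, attr):
        # Same approach as the Proxy class above: try ourselves first.
        try:
            return object.__getattribute__(self, attr)
        except AttributeError:
            pass
        return getattr(self.other, attr)

    # Implicit special-method lookup skips __getattribute__, so each of
    # these has to be defined on the class and forwarded manually.
    def __len__(self):
        return len(self.other)

    def __getitem__(self, key):
        return self.other[key]

    def __iter__(self):
        return iter(self.other)

    def __contains__(self, item):
        return item in self.other

p = ForwardingProxy("I'm a string!")
print(len(p), p[0], "string" in p, p.endswith("!"))
```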

However, this part of the discussion is relatively moot for the implementation of Dwemthy's Array in Python, because we don't expect to pass any special method invocations to the Array itself; all of the overridden special methods are in the Rabbit class and while the Rabbit is fighting the Array, it's not actually ever in it.

And Now DUN DUN DUN The Array

So the magic in Dwemthy's Array is that this:

a = DwemthysArray([Dragon()])
r = Rabbit()
r / a

is equivalent to this:

d = Dragon()
r = Rabbit()
r / d

To make this work, we create a subclass of collections.UserList called DwemthysArray and override the __getattribute__ method to grab the attributes from the first object in the array, using the technique described above to transparently rewrite method calls:

import collections

class DwemthysArray(collections.UserList):

    def __getattribute__(self, attr):
        # Fetch our own list via object.__getattribute__; writing self.data
        # in here would re-enter this method and recurse forever.
        data = object.__getattribute__(self, "data")

        # With no monsters left, or for attributes the first monster lacks
        # (like data or append), behave like a plain list.
        if not data or not hasattr(data[0], attr):
            return object.__getattribute__(self, attr)

        value = getattr(data[0], attr)
        if not callable(value):
            return value

        def inner(*args, **kwargs):
            answer = value(*args, **kwargs)
            if data[0].life <= 0:
                data.pop(0)
                if len(data) == 0:
                    print("[Whoa. You defeated Dwemthy's Array!]")

                else:
                    print("[{} has emerged.]".format(data[0].name))

            return answer
        return inner
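
To watch the forwarding and the hand-off between monsters without any random numbers involved, here is a small deterministic sketch. The Dummy class and the ForwardingArray name are hypothetical stand-ins. The array class uses the same idea as above, and reads its own list through object.__getattribute__ so the lookup doesn't recurse.

```python
import collections

class Dummy:
    """Hypothetical stand-in for a Creature that takes fixed damage."""

    def __init__(self, name, life):
        self.name = name
        self.life = life

    def hit(self, damage):
        self.life -= damage
        return self.life

class ForwardingArray(collections.UserList):
    def __getattribute__(self, attr):
        data = object.__getattribute__(self, "data")
        if not data or not hasattr(data[0], attr):
            return object.__getattribute__(self, attr)

        value = getattr(data[0], attr)
        if not callable(value):
            return value

        def inner(*args, **kwargs):
            answer = value(*args, **kwargs)
            if data[0].life <= 0:
                data.pop(0)  # the dead monster leaves the array
                if data:
                    print("[{} has emerged.]".format(data[0].name))
                else:
                    print("[Whoa. You defeated the array!]")
            return answer
        return inner

arr = ForwardingArray([Dummy("Monkey", 5), Dummy("Angel", 7)])
first_name = arr.name    # forwarded to the first monster
arr.hit(5)               # Monkey dies, Angel emerges
second_name = arr.name   # now forwarded to Angel
arr.hit(7)               # Angel dies, the array is empty
remaining = len(arr)     # an empty array behaves like a plain list

print(first_name, second_name, remaining)
```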

Finally, we can define the Array itself, and the Rabbit:

the_array = DwemthysArray([
    IndustrialRaverMonkey(),
    DwarvenAngel(),
    AssistantViceTentacleAndOmbudsman(),
    TeethDeer(),
    IntrepidDecomposedCyclist(),
    Dragon()])

the_rabbit = Rabbit()

And now, at last, to battle!

>>> the_rabbit / the_array
[You hit with 4 points of damage!]
[Industrial Raver Monkey hit you with 35 points of damage!]
[The Adventuring Rabbit has died!]

Hooray!
