Saturday, 20 March 2010

A Crime Against Nature

Every so often, while writing Python, I've found myself wishing I could easily dispatch method calls according to the types of their arguments. The urge usually passes quickly, but... oh, the hell with it, there's no point trying to justify what I've done. Just look:

>>> import bondage
>>>
>>> class C(object):
...     @bondage.discipline(int)
...     def foo(self, arg):
...         print 'int'
...     @foo.discipline(str)
...     def foo(self, arg):
...         print 'str'
...     @foo.discipline(int, str, int)
...     def foo(self, arg1, arg2, arg3):
...         print 'int, str, int'
...
>>> c = C()
>>> c.foo(1)
int
>>> c.foo('a')
str
>>> c.foo(1, 'a', 1)
int, str, int
>>> c.foo([])
Traceback (most recent call last):
File "", line 1, in <module>
File "bondage.py", line 18, in <lambda>
return lambda *args: self._dispatch(obj, *args)
File "bondage.py", line 22, in _dispatch
return self._argspecs[argspec](obj, *args)
KeyError: (<type 'list'>,)
>>>

I'd like to make it clear that there is absolutely no excuse for perpetrating this sort of insanity, ever. With that said, here's how I did it:

class discipline(object):

    def __init__(self, *argspec):
        self._argspecs = {}
        self.discipline(*argspec)

    def discipline(self, *argspec):
        self._argspec = argspec
        return self

    def __call__(self, f):
        self._argspecs[self._argspec] = f
        return self

    def __get__(self, obj, objtype=None):
        return lambda *args: self._dispatch(obj, *args)

    def _dispatch(self, obj, *args):
        argspec = tuple(map(type, args))
        return self._argspecs[argspec](obj, *args)

Obviously it's a stupid implementation, and if you wanted to do this properly you'd have to pay attention to subtypes, and do something clever with numeric types, and... oh, God, what am I saying?
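(If you're curious, a subtype-aware _dispatch might look vaguely like the following hypothetical sketch -- try an exact match first, then fall back to any registered argspec that the arguments are instances of -- but I'm not taking it any further than that.)

    def _dispatch(self, obj, *args):
        # Exact match first, exactly as before...
        exact = tuple(map(type, args))
        if exact in self._argspecs:
            return self._argspecs[exact](obj, *args)
        # ...then settle for any registered argspec whose types the
        # arguments are instances of (in no particular order of preference).
        for argspec, f in self._argspecs.items():
            if len(argspec) == len(args) and all(
                    isinstance(a, t) for a, t in zip(args, argspec)):
                return f(obj, *args)
        raise TypeError('no overload matches %r' % (exact,))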

Enough!

If you really want to do this "properly", use some other language where it's already built in, and begone.

Tuesday, 2 March 2010

The Joy of Self

I have a distant and fuzzy memory, from back in the day... when I was but a wee slip of a lad, sallying forth to do battle with million-line C++ monstrosities (and just barely escaping with my sanity intact purple monkey dishwasher), I came upon a Path class. It was probably called CPath: classes were new and shiny, so far as any of us knew at the time, and absolutely deserved a prefix to underline their special status. Look, ma, I'm programming Object Orientedly!

And, yeah, it was horrible. Big, clunky, confusing... I'd like to say that the blistering speed made up for it all, but that would be entirely untrue. It was nasty.

As a result of this, I had never since felt the slightest urge to create a Path class of my own... until today. I was trying to get an overgrown build script under control, and - despite my scars - it suddenly seemed like a good idea.

So I had a go.

And... well, this sort of thing is why I love Python, and is also the clearest illustration I've yet seen of why explicit self is a Good Thing. The following code is, as usual, hacked up from memory and may therefore contain hilariously deadly bugs; caveat lector.

import os, shutil

class Path(str):

    def __new__(cls, path):
        abs_ = os.path.abspath(path)
        norm = os.path.normpath(abs_)
        return str.__new__(cls, norm)

    exists = property(os.path.exists)
    isfile = property(os.path.isfile)
    isdir = property(os.path.isdir)
    # etc...

    move = shutil.move
    # os.listdir is a builtin, so (unlike shutil.move) it won't bind as a
    # method; it needs a one-line wrapper.
    def listdir(self):
        return os.listdir(self)
    # etc...

    def __getattr__(self, name):
        return self.join(name)

    def join(*args):
        return Path(os.path.join(*args))

    def delete(self):
        if self.isdir:
            shutil.rmtree(self)
        else:
            os.remove(self)
    # etc...

Now, IMO, this was a massive win: I made the client code a lot less verbose, and hence clearer, and I did most of the work by trivially subclassing a builtin type and dropping in a bunch of standard library functions as methods. The real one has many more bells and whistles (in fact, I think I got a bit carried away) but hopefully you get the idea.
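To give a flavour of the client code, here's a hypothetical snippet (the names are invented, not lifted from the real script):

build = Path('build')
if build.exists:
    build.delete()
source = Path('src')
# __getattr__ turns attribute access into path joining...
print source.docs.isdir
# ...and join() takes as many segments as you like:
print source.join('pkg', 'module.py').isfile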

The crucial point is that I couldn't have done it so neatly without explicit self; also, amusingly, most of the explicit selfs in this class are in fact implicit. So there.

Thursday, 14 January 2010

Where's a plumber when you need one?

Assertion: On Win32, there's no point bothering with subprocess.PIPE -- just use tempfile.TemporaryFile instead.

Context: You create a Popen(cmd, stdout=PIPE, stderr=PIPE), and you let it run (with a timeout); sometimes it completes successfully, which is cool, and sometimes it doesn't, in which case you read stdout and stderr and try to figure out what went wrong. This is all fine and dandy until one day you add a *little* bit more logging to the tool you're calling, and it suddenly wedges forever.

Explanation: Of course, this is because you've filled up some buffer, which you should have been periodically emptying. However, you can't just select() and read one byte at a time, because select doesn't work with pipes on Win32; you can't just read() what's there, because it blocks and stops the timeout from working; you don't want to spin off another thread to do your reading, because that involves tedious extra code and feels like killing a fly with a sledgehammer; and you don't want to screw around with readline() because that also involves tedious bookkeeping and extra code.

Solution: So, just do the following (warning, coded from memory):


from subprocess import Popen
from tempfile import TemporaryFile
from time import time, sleep

def assert_runs(cmd, timeout=10):
    out = TemporaryFile()
    err = TemporaryFile()
    end_time = time() + timeout
    process = Popen(cmd, stdout=out, stderr=err)
    while process.poll() is None:
        sleep(0.1)
        if time() > end_time:
            process.terminate()
    if process.returncode != 0:
        # The child has been writing to the temp files, so rewind them
        # before reading them back.
        out.seek(0)
        err.seek(0)
        raise AssertionError('%s FAILED (%s)\nstdout:\n%s\nstderr:\n%s' % (
            cmd, process.returncode, out.read(), err.read()))

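Usage is about what you'd expect -- the command line here is invented:

assert_runs(['some_tool.exe', '--with-extra-logging'], timeout=60)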

Indeed, it's icky to fill up your hard disk rather than some internal buffer, but you can get a lot more done before you run out of HD space. Now, this surely feels nasty, but IMO it's slightly less nasty (or, at least, less code) than anything else I've tried. Hopefully you, gentle reader, have an infinitely superior solution that you will detail in the comments. Surprise me!

Please note that "Don't use Windows, har har", and variants thereof, fail to qualify as "surprising" ;-).

Thursday, 24 December 2009

Spare batteries for IronPython

As we all know, Python comes with batteries included in the form of a rich standard library; and, on top of this, there are many awesome and liberally-licensed packages just an easy_install away.

IronPython, of course, includes *most* of the CPython standard library, but if you're a heavy user you might have noticed a few minor holes: in the course of my work on Ironclad, I certainly have. Happily for you I can vaguely remember what I did in the course of bodging them closed with cow manure and chewing gum; here then, for your edification and delectation, is my personal recipe for a delicious reduced-hassle IronPython install, with access to the best and brightest offered by CPython, on win32.

  • Install IronPython 2.6.

  • Download Jeff Hardy's zlib for IronPython and copy IronPython.Zlib.dll into IronPython's DLLs subdirectory (create it if it doesn't exist).

  • Download Jeff Hardy's subprocess.py for IronPython and copy it into IronPython's site-packages subdirectory.

  • Download Ironclad, and copy the ironclad package into IronPython's site-packages subdirectory. Yeah, maybe I'll sort out an installer one day, but don't hold your breath.

  • Install CPython 2.6.

  • Add CPython's DLLs subdirectory to your IRONPYTHONPATH environment variable.

  • Copy csv.py, gzip.py, and the sqlite3 directory from CPython's Lib subdirectory to IronPython's site-packages subdirectory.

  • Copy xml/sax/expatreader.py from CPython's Lib subdirectory to the corresponding location in IronPython's Lib subdirectory.

  • Download FePy's pyexpat.py, copy it to IronPython's Lib/xml/parsers subdirectory, and rename it to expat.py.

  • Download and install NumPy 1.3.0 and SciPy 0.7.1 for CPython, and copy them from CPython's site-packages subdirectory to IronPython's.


...and you're done. Start your ipy sessions with a snappy 'import ironclad', and enjoy.
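By way of a quick smoke test -- assuming everything landed where it should, and with the caveat that I'm typing this from memory -- an ipy session ought to look something like this:

>>> import ironclad
>>> import numpy
>>> numpy.array([1, 2, 3]).sum()
6
>>> import csv, gzip, sqlite3
>>>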

Incidentally, you could just add CPython's site-packages to your IRONPYTHONPATH, and then you wouldn't have to copy extra packages over; the reason I don't is that having matplotlib on your path currently breaks scipy under Ironclad -- I can't remember exactly why -- and it's nice to have matplotlib installed for CPython.

Oh, and let me know if I've made any mistakes above: I just hacked this post together from slightly aged notes, and I'm too lazy to tear down and rebuild my environment to check that every detail is perfect.

Monday, 7 December 2009

Another snippet

I realised you could do this about a year ago, but I only just found an opportunity to use it.

def merge_dicts(d1, d2):
    return dict(d1, **d2)
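For instance (note that d2 wins where keys collide, and that its keys have to be strings for the trick to work):

>>> merged = merge_dicts({'a': 1, 'b': 2}, {'b': 3, 'c': 4})
>>> sorted(merged.items())
[('a', 1), ('b', 3), ('c', 4)]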

Satisfying :).

Saturday, 5 December 2009

Python return dictifier

I wrote this the other week, and thought it was cute enough to share:


def _dictify(keys, result):
    if len(keys) == 1:
        return { keys[0]: result }
    return dict(zip(keys, result))

def return_dict(keys):
    keys = keys.split()
    def decorator(f):
        def g(*_, **__):
            return _dictify(keys, f(*_, **__))
        return g
    return decorator

Use it as follows:

>>> @return_dict('foo bar baz')
... def f():
...     return 1, 2, 3
...
>>> f()
{'baz': 3, 'foo': 1, 'bar': 2}

Enjoy!

Thursday, 10 September 2009

Silver Bullets Aplenty

Context: We all know that there's no silver bullet that will slay the metaphorical werewolf embodying the fact that Making Software Is Difficult; we're also familiar with the difference between essential complexity and accidental complexity.

Assertion: what looks to us like essential complexity is, in reality, a symptom of inadequate domain understanding; just like "artificial intelligence", the phrase "essential complexity" fundamentally signifies "stuff we can't automate yet".

Support: For example, consider Ruby on Rails, about which I know almost nothing except that it makes for good blog-fodder. As I understand it, it makes the development of a good-sized class of applications relatively trivial: it lets you focus on what's unique to your app, and magically takes care of everything else. To be instructively tautological: it removes most of the complexity inherent in implementing the sort of app that can be easily developed with Rails.

Further: Every time you hand a task off to code that someone else has written, you've carved off a small part of (what might, 10 years ago, have been thought to be) the essential complexity of the task you're coding, and negated the burden of worrying about it.

You don't have to personally understand everything that happens behind the scenes, because someone else has done the hard work of understanding the domain and providing you with effective abstractions. (Assume good-enough abstractions that don't leak in "normal use".)

Further further: Every time you hand a task off to code that *you* already wrote, you're doing the same thing. And as you figure out the right abstractions for your domain, whatever it is, the ugly details (hopefully) migrate away from the high-level get-stuff-done API; eventually, you can allow yourself the luxury of forgetting how the sausages are made[1], and just use your library without worrying about the details.

The common thread should be clear: to return to the original metaphor, software development is in fact the process of building (or borrowing) wooden stakes, silver bullets, phase plasma rifles or rapid-fire bee cannons, as appropriate, and employing them to ruthlessly clear the field of distracting accidental complexity.

My point is that there *are* silver bullets: we use them all the time, they work well, and we're always coming up with still more effective variants. The future's shiny, and smells of cordite.

[1] They're made out of a whole bunch of metaphors, mixed together quite hideously. Eww.