Thursday 10 September 2009

Silver Bullets Aplenty

Context: We all know that there's no silver bullet that will slay the metaphorical werewolf embodying the fact that Making Software Is Difficult; we're also familiar with the difference between essential complexity and accidental complexity.

Assertion: what looks to us like essential complexity is, in reality, a symptom of inadequate domain understanding; just like "artificial intelligence", the phrase "essential complexity" fundamentally signifies "stuff we can't automate yet".

Support: For example, consider Ruby on Rails, about which I know almost nothing except that it makes for good blog-fodder. As I understand it, it makes the development of a good-sized class of applications relatively trivial: it lets you focus on what's unique to your app, and magically takes care of everything else. To be instructively tautological: it removes most of the complexity inherent in implementing the sort of app that can be easily developed with Rails.

Further: Every time you hand a task off to code that someone else has written, you've carved off a small part of (what might, 10 years ago, have been thought to be) the essential complexity of the task you're coding, and negated the burden of worrying about it.

You don't have to personally understand everything that happens behind the scenes, because someone else has done the hard work of understanding the domain and providing you with effective abstractions. (Assume good-enough abstractions that don't leak in "normal use".)

Further further: Every time you hand a task off to code that *you* already wrote, you're doing the same thing. And as you figure out the right abstractions for your domain, whatever it is, the ugly details (hopefully) migrate away from the high-level get-stuff-done API; eventually, you can allow yourself the luxury of forgetting how the sausages are made[1], and just use your library without worrying about the details.

The common thread should be clear: to return to the original metaphor, software development is in fact the process of building (or borrowing) wooden stakes, silver bullets, phase plasma rifles or rapid-fire bee cannons, as appropriate, and employing them to ruthlessly clear the field of distracting accidental complexity.

My point is that there *are* silver bullets: we use them all the time, they work well, and we're always coming up with still more effective variants. The future's shiny, and smells of cordite.

[1] They're made out of a whole bunch of metaphors, mixed together quite hideously. Eww.

.NET Marshalling: a mildly sarcastic Q&A

1) Want to read or write a chunk of unmanaged memory that you know holds a double?

Easy! Just use Marshal.ReadDou... oh. Hmm.

OK, it seems that -- while doubles work fine in arguments, return values and struct fields -- the Marshal.Read/WriteDouble methods presumably got left in some internal backwater and never made it to the public interface. I'm not sure why this would be -- it seems quite the oversight -- but perhaps it's intended as an oblique philosophical statement: that if anyone needs to use unmanaged doubles directly then they are somehow Doing It Wrong.

Regardless, it may be that you really do need to read or write the odd unmanaged double, in which case you can just subvert the existing paths that do work with doubles. The two obvious options are Marshal.Copy (which has a mind-boggling range of overloads, all of which expect arrays); and PtrToStructure/StructureToPtr, which need you to define a struct containing a single double and read/write that.

Alternatively, you could write a couple of trivial functions in C:

double ReadDouble(double* address)
{
    return *address;
}

void WriteDouble(double* address, double value)
{
    *address = value;
}

...and, once you've loaded the resulting dll, and acquired the pointers to those functions, you can use Marshal.GetDelegateForFunctionPointer to make them available to your .NET code.

[DllImport("kernel32.dll")]
public static extern IntPtr LoadLibrary(string _);
[DllImport("kernel32.dll")]
public static extern IntPtr GetProcAddress(IntPtr _, string __);

[UnmanagedFunctionPointer(CallingConvention.Cdecl)]
public delegate double dgt_ReadDouble(IntPtr _);

[UnmanagedFunctionPointer(CallingConvention.Cdecl)]
public delegate void dgt_WriteDouble(IntPtr _, double __);

...

public dgt_ReadDouble ReadDouble;
public dgt_WriteDouble WriteDouble;

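void Init(string path)
{
    // LoadLibrary and GetProcAddress both return IntPtr.Zero on failure.
    IntPtr lib = LoadLibrary(path);
    if (lib == IntPtr.Zero)
        throw new DllNotFoundException(path);
    IntPtr fpRead = GetProcAddress(lib, "ReadDouble");
    IntPtr fpWrite = GetProcAddress(lib, "WriteDouble");
    if (fpRead == IntPtr.Zero || fpWrite == IntPtr.Zero)
        throw new EntryPointNotFoundException("ReadDouble/WriteDouble");
    ReadDouble = (dgt_ReadDouble)Marshal.GetDelegateForFunctionPointer(
        fpRead, typeof(dgt_ReadDouble));
    WriteDouble = (dgt_WriteDouble)Marshal.GetDelegateForFunctionPointer(
        fpWrite, typeof(dgt_WriteDouble));
}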

I should point out that this approach is (1) moderately tedious to implement and (2) somewhat opaque to casual inspection, so I can't really recommend it in normal circumstances. Still, I wrote all that code -- I may as well post it.
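
One detail the snippets above gloss over: GetProcAddress can only find "ReadDouble" if the DLL really exports that name, and CallingConvention.Cdecl on the delegates assumes the C side is cdecl too. Here is a minimal sketch of the C side with both made explicit. EXPORT and CDECL are hypothetical helper macros (they come from no standard header), and the const qualifier and NULL asserts are additions. If the file is compiled as C++, the functions would presumably also need extern "C" to keep the names unmangled.

```c
#include <assert.h>
#include <stddef.h>

/* EXPORT and CDECL are hypothetical helper macros for this sketch. */
#if defined(_WIN32)
#  define EXPORT __declspec(dllexport)
#  define CDECL  __cdecl
#else
#  define EXPORT __attribute__((visibility("default")))
#  define CDECL  /* no explicit convention needed on this platform */
#endif

/* Read the double stored at address. */
EXPORT double CDECL ReadDouble(const double* address)
{
    assert(address != NULL);
    return *address;
}

/* Store value at address. */
EXPORT void CDECL WriteDouble(double* address, double value)
{
    assert(address != NULL);
    *address = value;
}
```

The asserts only fire in debug builds; in a release build a bad pointer from the managed side will still crash, just less politely.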

2) Want to stub out unmanaged code with a managed delegate?

No problem: Marshal.GetFunctionPointerForDelegate returns a perfectly good function pointer, just as expected. Whether you then use it to neatly overwrite another function pointer somewhere, or just to poo a JMP instruction on top of the original implementation, is between you and your conscience.

However, there is at least one subtlety that may cause problems. What happens if unmanaged code somehow passes that function pointer back into managed code when you're not expecting it?

Let's say you're expecting a callback that can be converted to a FooDelegate. If you're given a genuine unmanaged function pointer, there's no problem: you can convert it to any delegate type you like (and, of course, suffer the consequences if you pick the wrong one). However, if you happen to be passed an unmanaged function pointer that was originally converted from some *other* managed delegate type, say a BarDelegate, you're out of luck -- the cast will fail.

[UnmanagedFunctionPointer(CallingConvention.Cdecl)]
public delegate int FooDelegate(IntPtr _, IntPtr __);

[UnmanagedFunctionPointer(CallingConvention.Cdecl)]
public delegate int BarDelegate(IntPtr _, IntPtr __);

No matter that a FooDelegate has the same return type, parameter types, calling convention and star sign as a BarDelegate -- it seems that they are so fundamentally different from one another that any attempt to convert between them, however circuitous the path, must be forbidden. I can understand why; I just don't like it. Goddamn static typing weenies ;).

And, if you hit problems of this nature, there's really nothing you can do about it but to autogenerate a bunch of code to define *one* delegate type per signature, and to *always* use that delegate type for unmanaged functions with that signature. It's stupid and ugly and it sucks, and it has a terrible tendency to combine with nasty unmanaged interfaces to create names like 'dgt_ptr_ptrptrptrptrintintintintptrptrintptr', but it will help you get around this issue.
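
The naming half of such a generator might look something like the sketch below. It assumes a scheme of dgt_<return>_<params>, which is only inferred from the example name above. The function delegate_type_name and the short type tags ("ptr", "int") are hypothetical. Any scripting language would do the job equally well; C is used here only to match the earlier snippet.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: builds "dgt_<ret>_<param tags concatenated>" in out.
   Returns the length of the name, or -1 if out is too small. */
int delegate_type_name(const char* ret, const char* const* params,
                       size_t count, char* out, size_t out_size)
{
    int used = snprintf(out, out_size, "dgt_%s_", ret);
    if (used < 0 || (size_t)used >= out_size)
        return -1;

    for (size_t i = 0; i < count; ++i) {
        size_t len = strlen(params[i]);
        /* Leave room for the terminating NUL. */
        if ((size_t)used + len >= out_size)
            return -1;
        memcpy(out + used, params[i], len + 1);
        used += (int)len;
    }
    return used;
}
```

With a return tag of "int" and two "ptr" parameters (the FooDelegate/BarDelegate signature), this produces a single shared name, so there's only ever one delegate type to convert to.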

3) You've just called across the boundary for the first time, and it didn't crash, and you're feeling on top of the world?

Sweet! But, before you go any further, please make sure that the arguments you passed in actually arrived safely at the other end.

If they aren't exactly as you expect, you probably used the wrong calling convention (and weren't lucky enough to crash immediately). Doing this will mortally wound your stack, but the process will probably limp along for a while, until it finds a suitably misleading moment to explode messily. This explosion will be reproducible, but it will also be utterly bizarre, and the most apparently trivial of changes can lead to new explosions in apparently unrelated locations; you can easily lose a day or two chasing the stack pointer fairies. Not fun.

And, if you'd just checked your data in the first place, you might have noticed you were slinging garbage about *before* you set off on your little jaunt.
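
One cheap way to do that check is to have the unmanaged side report what it actually received. This is a minimal sketch, assuming you can add a diagnostic function to the unmanaged library. LogReceivedArgs is a hypothetical name, and the (int32, int64, double) signature is an arbitrary example to be replaced with the signature you're actually worried about. Export decoration is omitted.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical diagnostic function for this sketch. */
int LogReceivedArgs(int32_t a, int64_t b, double c)
{
    /* Print each argument exactly as the unmanaged side sees it.
       Returns the number of characters written, or a negative value on error. */
    return fprintf(stderr, "a=%" PRId32 " b=%" PRId64 " c=%.17g\n", a, b, c);
}
```

Call it from the managed side with distinctive values (say 1, -2 and 0.5) and compare them with what shows up on stderr. If the numbers differ or arrive in the wrong slots, suspect the calling convention or the parameter types before anything else.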

4) Want to use an unmanaged API that takes a FILE*?

I weep for your soul, but you can indeed do such a thing.

FileStream.SafeFileHandle.DangerousGetHandle will enable you to extract the underlying file handle from a managed file stream, after which you can get a FILE* with _open_osfhandle and _fdopen... but be advised, that 'Dangerous' there ain't just for show. The moment that unmanaged code operates on the file, the .NET stream becomes a massive and deadly liability: simple operations may fail silently, which is bad enough, but sometimes giant rocks fall from the sky and kill everyone.

So, don't do that, if you can possibly avoid it. Ideally, figure out some way to use unmanaged file handles throughout, and wrap them yourself for .NET if you have to.

5) So, you've taken my advice, and now you want to play around with unmanaged streams?

Go ahead. The aforementioned wrapping is completely irrelevant to this topic, so I leave that as an exercise for the reader (ha!). However, you probably need to be aware that you may be working with multiple versions of fopen, fread, fwrite, etc.: one for each of the many versions of the Microsoft C runtime.

So, if and when you see inexplicable crashes when something passes an obviously valid FILE* to (say) fread, you should check whether the producing fopen and the consuming fread come from the same runtime. If not, you've found your problem; the solution is easily stated (use functions from matching runtimes) and might even be easily implemented. Your mileage may vary.
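
One way to keep the runtimes matched is to route every stream operation through a single module, so the FILE* never meets a foreign fread. This is a minimal sketch, assuming you can build these wrappers into the library that consumes the FILE* (or link them against the same C runtime). The stream_* names are hypothetical, and export decoration is omitted.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical wrappers: all four compile into one module, so they all
   use that module's C runtime and agree on what a FILE* is. */

FILE* stream_open(const char* path, const char* mode)
{
    return fopen(path, mode);
}

/* Returns the number of bytes actually read. */
size_t stream_read(FILE* stream, void* buffer, size_t bytes)
{
    return fread(buffer, 1, bytes, stream);
}

/* Returns the number of bytes actually written. */
size_t stream_write(FILE* stream, const void* buffer, size_t bytes)
{
    return fwrite(buffer, 1, bytes, stream);
}

/* Returns 0 on success, EOF on failure (as fclose does). */
int stream_close(FILE* stream)
{
    return fclose(stream);
}
```

Everything else, managed or otherwise, then treats the FILE* as an opaque pointer and only ever hands it back to these functions.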

6) Want to know how to marshal C varargs into a .NET method?


Ideally, don't. The best solution I could come up with was frankly too evil to live, or ever to speak of again; and I say that as a man who has cold-bloodedly perpetrated every technique discussed in this post.

If you have a nice way to do it, I'd be interested to hear :-).