unwrapping wrappers
I started writing this on Thursday and got distracted indulging my inner eukaryote. I was hung up on interface design again, so I dove into coding a working example, then went out to live a little outside the house, all of which resulted in a backlog of thoughts that I'm finally publishing now on Tuesday, because this is a time-traveling blog.
My first idea for building a libnbd IO manager for libext2fs was for pyext2fs to expose an IOManager type that could be subclassed from Python, where I'd implement the IO manager interface through methods (presumably using the libnbd Python bindings). Then I'd have an internal glue structure that exposes the callbacks libext2fs is expecting, each of which would CallFunction() on a stored pointer to a concrete IOManager.
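A minimal sketch of what that subclass might have looked like, assuming a hypothetical IOManager base class in pyext2fs and made-up method names for the io_manager callbacks; only the nbd module and its calls are real:

import nbd
import ext2fs

class NBDIOManager(ext2fs.IOManager):  # hypothetical base class
    def open(self, name, flags):
        # hand the URI straight to libnbd
        self.handle = nbd.NBD()
        self.handle.connect_uri(name)

    def read_blk(self, block, count, blksize):
        return self.handle.pread(count * blksize, block * blksize)

    def write_blk(self, block, data, blksize):
        self.handle.pwrite(data, block * blksize)

    def flush(self):
        self.handle.flush()

    def close(self):
        self.handle.shutdown()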
The first hiccup there is that io_manager.open(), and subsequently ext2fs_open(), doesn't accept user data, which means I have no way of storing that pointer to do the dispatching from within the callbacks. I could still pass an object to the high-level ext2fs.open() that I would use to fiddle with a global pointer and have the callbacks dispatch through there. I've seen this in the pyalpm wrapper, for instance, where callbacks specified by the public portions of the Python module are stored in a global lookup table. In the case of libalpm those callback functions do accept a data pointer, so that design isn't as necessary there as it is here. But even though I could make the argument of necessity here, I'm still shying away from that statefulness. I'm worried that shared state might prevent me from reading from one filesystem backed by one type of storage and writing to another backed by a different type of storage, or at least make it more difficult. Though that's not a high-priority use case at the moment, the approach sounds too limiting from the get-go. I don't think I'm even considering locking access to that pointer as an option; I'm motivated to seek out an alternative.
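Concretely, the global dispatch I'm picturing (and avoiding) would look something like this, with every name hypothetical:

_current_manager = None  # module-global slot the C callbacks dispatch through

def open(uri, io_manager):
    global _current_manager
    _current_manager = io_manager  # last open() wins
    ...

def _dispatch_read_blk(block, count, blksize):
    # conceptually called from the C read_blk callback, which receives
    # no user data and so can only consult the global
    return _current_manager.read_blk(block, count, blksize)

Opening a second filesystem on a different backend would silently re-point that global, and the first filesystem's reads would start flowing through the wrong manager.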
Another thought was to implement an upstream ext2fs_open3() that does accept user data to be handed to io_manager.open(). That would break any extant custom IO managers out in the world, so in that approach I think I'd need to amend that structure to support an additional open2(), and re-implement unix_io_manager.open() to call open2(..., NULL). If I had that, then I could pass a Python object to store in the io_channel that the io_manager.open2() implementation allocates, as I first had in mind.
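Rendered in Python for illustration (the real change would be to the C io_manager struct), the compatibility shim amounts to:

class UnixIOManager:
    def open2(self, name, flags, private_data):
        ...  # real work; stash private_data on the allocated io_channel

    def open(self, name, flags):
        # the old entry point survives unchanged for extant IO managers
        return self.open2(name, flags, None)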
I tossed around ideas of separate library packages, like libext2io and pyext2io, but eventually dove in by adding the IO manager to pyext2fs itself. That means that package now has a dependency on libnbd, but I think I have a rough outline for iterating away from there. I modeled the opening API after the engine URIs in sqlalchemy. By default the library assumes a file scheme, which dispatches to the "unix" IO manager provided by libext2fs.
import ext2fs

# equivalent to un-qualified "path/to/image.ext2fs"
with ext2fs.open("file:path/to/image.ext2fs") as fs:
    with fs.open("/a/file") as fin:
        print(fin.read(1024))
The NBD IO manager uses nbd_connect_uri() under the hood, and any URIs with supported schemes are passed there. So if only the location of the image changes, then only the URI has to change:
import ext2fs

with ext2fs.open("nbd+unix:///?socket=nbd.sock") as fs:
    with fs.open("/a/file") as fin:
        print(fin.read(1024))
From here I think I'll move the NBD parts to a separate pye2nbd. It'll need a dependency on pyext2fs, which will need to expose a factory to wrap the opener that pye2nbd implements. Then pyext2fs can attempt an import of the "engine"-specific IO manager package, and that can be where I dictate and serialize the URI scheme mapping. And then, finally, e2http will depend on both of those packages.
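The registration I have in mind might be sketched like so, with all names and the exact scheme list still up in the air:

def _unix_open(uri):
    ...  # dispatch to the "unix" IO manager from libext2fs

_openers = {"file": _unix_open}  # URI scheme -> opener

try:
    import pye2nbd  # the optional, engine-specific package
except ImportError:
    pass  # NBD URIs simply won't resolve without it
else:
    # pyext2fs dictates the scheme mapping; pye2nbd just supplies the opener
    for scheme in ("nbd", "nbd+unix", "nbds", "nbds+unix"):
        _openers[scheme] = pye2nbd.open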