Global Filesystem

Many people complain about filesystems, and many people are trying to fix them, but they usually do so incrementally. For example, John Cowan has been discussing a new logical filesystem paper that he found, and Gnome Storage is based on a database on top of the current filesystem. But what if we could start from scratch? What would we want from our filesystems then?

The proposition that this note explores is this: that global file identifiers can scale down, if you like, to local hard drives, making things much more organized in many ways.

A Global Identifier Scheme

The obvious requirements for global filesystem identifiers are that they must be short, memorable, and unique in their social delegation. URIs fulfill the last of these requirements very well and can be memorable, but aren't particularly efficient in their size. The idea that I'll be pursuing is, therefore, the Java packaging/JANET path idea of the backwards domain name.

But these identifiers must be free for all to use: if you had to register a domain name to use any computer, even one not connected to the internet, you wouldn't bother getting that computer. So this can't be based on DNS. Since this is a largely social requirement, and not particularly difficult from the technical point of view (DNS has proven that the idea works), it would probably be best placed in the hands of governments, who may want to outsource it to libraries.

One possible idea is to use: <country-ID>.<initials>.<chosenID> as the root for people's spaces. I would, therefore, have my pick of uk.sbp.* domains that haven't previously been taken, and perhaps one might even be assigned a default one, or be allowed to choose a new one every five years, or something like that. It's not much different to the delegation of social security or National Insurance numbers, only they have to be short, memorable, etc.
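As a rough sketch of how such a root might be picked apart, assuming the three leading dot-separated components are exactly the country ID, initials, and chosen ID described above: the names GlobalRoot and parse_global_id are hypothetical, and everything after the root is deliberately left unparsed, since this note doesn't say how deeper names are structured.

```python
from typing import NamedTuple


class GlobalRoot(NamedTuple):
    country: str    # e.g. "uk"
    initials: str   # e.g. "sbp"
    chosen_id: str  # e.g. "inamidst"
    rest: str       # whatever follows the root, left unparsed


def parse_global_id(identifier: str) -> GlobalRoot:
    """Split a dotted global ID into its three-part root and the remainder."""
    parts = identifier.split(".", 3)
    if len(parts) < 3 or not all(parts[:3]):
        raise ValueError(f"not a complete global ID: {identifier!r}")
    rest = parts[3] if len(parts) == 4 else ""
    return GlobalRoot(parts[0], parts[1], parts[2], rest)


root = parse_global_id("uk.sbp.inamidst.web")
```

Here root.country is "uk", root.chosen_id is "inamidst", and root.rest is "web".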

I must stress that this is, of course, a purely hypothetical system. It's technically possible to implement, but again it's the social aspect: there is absolutely no way that anything discussed herein will be implemented and I'm under no illusion of that. It's just nice to wonder what one could do were one given free rein over such matters; there may be valuable conclusions that can be drawn that can then be applied to matters over which one does have some sway.

So, let's say I pick (or am assigned by miracle) uk.sbp.inamidst. That would then take the place, in a way, of / on my filesystem, only it's actually more like it's taking the place of /home/sbp. Note that len('uk.sbp.inamidst.') - len('/home/sbp/') is 6, so we haven't made anything easier yet.
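The length comparison is easy to check in Python. The variable names are only for illustration.

```python
# Compare the hypothetical global root with a conventional home directory prefix.
global_root = "uk.sbp.inamidst."
home_root = "/home/sbp/"

extra_chars = len(global_root) - len(home_root)
print(extra_chars)  # prints 6
```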

Global Packages

The first hint of things starting to become easy is in getting things from, and publishing to, the internet. Usually, when you download something onto your computer, you choose a filename for it. With this system, you could still do that, but often you'll just download it "in place", if you like. So, for example, if the firefox browser is made available at us.corp.mozilla.firefox-1.0.3.tar.gz, then I can just do "cache us.corp.mozilla.firefox-1.0.3.tar.gz" and it'll be downloaded to the local computer. In fact, that filename will be the same on everyone's computer. Now you can see where the consistency starts to creep in.
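A minimal sketch of what "cache" might do, assuming the local store can be modelled as a dictionary keyed by global ID. The function cache, the stand-in fake_fetch, and the store are all hypothetical; this note doesn't specify how the real transfer would work.

```python
from typing import Callable, Dict


def cache(identifier: str,
          fetch: Callable[[str], bytes],
          store: Dict[str, bytes]) -> bytes:
    """Fetch an object once and keep it locally under its unchanged global name."""
    if identifier not in store:
        store[identifier] = fetch(identifier)
    return store[identifier]


def fake_fetch(identifier: str) -> bytes:
    # Stand-in for the network; a real system would retrieve the object here.
    return b"contents of " + identifier.encode("utf-8")


local_store: Dict[str, bytes] = {}
name = "us.corp.mozilla.firefox-1.0.3.tar.gz"
cache(name, fake_fetch, local_store)
```

The point of the sketch is that no local filename is ever chosen: the key in local_store is the same string on every machine that caches the object.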

Next, think about serving some files. All you do is something along the lines of "publish uk.sbp.inamidst.web" and that tree could be made publicly available to everyone in the world. When they want to get something from that hierarchy, they can cp it into their own hierarchy, if they like, but again, more generally they'll probably just use the "cache" operation.

All of a sudden, the very idea of downloading starts to become blurry—the whole computer is much better integrated with the internet, conceptually.

Now, imagine this with the filesystem in general. Normally, on Linux, you have /usr/local/bin, /opt, /usr/bin, etc.; whereas with this system, all of that would disappear: the packages would just live in their global-ID locations. The principle is, thus, that instead of having N filesystem designs, where N is the number of computers in the world, you have 1. You know where everything is, and everything is the same on every other computer in the world. Whether something is available depends on only two things: whether it's cached locally and, if not, whether you have internet access.

Other Ramifications

It's very hard to say what other advantages this would lead to without implementing it, but the chance to redesign MIME types and file metadata from scratch, merging them into one consistent scheme, may also yield a number of tremendous benefits.

There are some very subtle things: for example, when you have /path/filename on filesystems today, /path, you can usually be certain, will be a directory. In this filesystem, on the other hand, uk.sbp.inamidst.path and uk.sbp.inamidst.path.filename could both quite easily be files.
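One way to picture this, purely as an assumption, is a flat mapping from full identifiers to contents, where "being under" a name is just a matter of prefix matching. The names namespace and names_under are hypothetical.

```python
from typing import Dict, List

# A flat namespace: every key is a complete global ID with its own content.
namespace: Dict[str, bytes] = {
    "uk.sbp.inamidst.path": b"I am a file",
    "uk.sbp.inamidst.path.filename": b"So am I",
}


def names_under(store: Dict[str, bytes], prefix: str) -> List[str]:
    """List the identifiers that extend prefix by at least one more component."""
    return sorted(name for name in store if name.startswith(prefix + "."))


is_file = "uk.sbp.inamidst.path" in namespace
extensions = names_under(namespace, "uk.sbp.inamidst.path")
```

In this model uk.sbp.inamidst.path has content of its own and also has a name beneath it, which a conventional directory tree can't express.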

[etc.]

Addenda

Cody notes that the globalids are essentially crap; but note that on your computer you'd still be able to alias, say, / to uk.sbp.inamidst so that then you could use the computer in a very similar manner.
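A small sketch of such an alias, assuming that slash-separated local paths map onto dot-separated global names one component at a time. That mapping and the function expand_alias are assumptions for illustration, not something this note defines.

```python
def expand_alias(local_path: str, root: str = "uk.sbp.inamidst") -> str:
    """Translate a local path under the aliased / into a full global ID."""
    # Drop empty components so "/", "//web" and "web/" all behave sensibly.
    components = [part for part in local_path.split("/") if part]
    return ".".join([root] + components)


home = expand_alias("/")
page = expand_alias("/web/index")
```

With the default root, home is "uk.sbp.inamidst" and page is "uk.sbp.inamidst.web.index", so day-to-day use could look much like it does today.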

When you think about it, the computer is designed to be against you anyway. The system gets the root, whilst you're lumbered with /home/username on Linux, and c:/Documents and Settings/username/My Documents or whatever on Windows. That's really not a good way to go about things, even on multi-user systems. And yes, ~/ and other such tricks do get around that to some extent, but even then we have programs stealing our space with their .config files, and in the end you have to recognize that filesystems are thought of inherently backwards, because designers think first about how the system works and only second about how the user interacts with it.

Another interesting detail is that ucspi-http may already make some of this technologically possible to some extent. (Also: web.archive.org link for ucspi-http).

Sean B. Palmer