Date: Thu, 31 May 2007 14:01:54 +0100
From: "Sean B. Palmer" <sbp@...>
To: "Tav Ino" <tav@...>
Subject: The Stable P2P Fallacy

I've been thinking about whether Stable P2P systems are required
anymore for storage now that drive space is so cheap.

By "Stable P2P", I mean a P2P system into which you inject a file and
can then trust that it exists on the network whether you're connected
or not. Write once, worry not, as in Freenet. That's very different
from ostensible "Filesharing P2P" systems, where the likelihood of
accessing a file is directly proportional to how popular it is.

Stable P2P is only required if its benefits outweigh its
disadvantages. The only benefits of Stable P2P that I can think of are
a) you don't have to host your files, and b) if the system is designed
to accommodate it, you get many redundant copies of your files
automatically.

Since every file has to live on other people's disks, the only way
that a Stable P2P system can work, then, is if there's always more
space available on the network than there is data stored in it.

I did a few thought experiments. In the first, one new user joins the
system every second and immediately starts uploading at 100 kBps.
After a year there'd be only 32 million users in the system, and yet
they'd have uploaded tens of thousands of petabytes! For comparison,
the Internet Archive only has 2 PB, and it archives the whole web many
times over.
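A quick sketch of that arithmetic. One assumption here is mine, since
it's not spelled out above: each user keeps uploading at 100 kB/s from
the second they join until the end of the year.

```python
# Thought experiment one: a new user joins every second, each uploading
# at 100 kB/s from the moment they join (assumed) to the end of the year.

SECONDS_PER_YEAR = 365 * 24 * 60 * 60   # 31,536,000
RATE = 100 * 1000                        # 100 kB/s, in bytes
PETABYTE = 10 ** 15

users = SECONDS_PER_YEAR                 # one per second: ~31.5 million

# The user who joins at second t uploads for (SECONDS_PER_YEAR - t)
# seconds, so total user-seconds form a triangular sum: n * (n + 1) / 2.
user_seconds = SECONDS_PER_YEAR * (SECONDS_PER_YEAR + 1) // 2
total_bytes = user_seconds * RATE

print(f"{users / 1e6:.1f} million users")          # 31.5 million users
print(f"{total_bytes / PETABYTE:,.0f} petabytes")  # 49,726 petabytes
```

However you tweak the assumptions, it comes out thousands of times the
size of the Internet Archive.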

Next, I wondered how many 500 GB accounts you could give people on a
system that has five times more storage space than the Internet
Archive. It turns out to be only 20,000 accounts; that's on a 10 PB
system. Transferring and storing just 1 PB for a month on Amazon S3
costs about $350,000.
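The second sum, plus a rough check of the S3 figure. The S3 rates are
my assumption of the 2007-era pricing ($0.15/GB-month storage,
$0.20/GB inbound transfer), not something stated above.

```python
# How many 500 GB accounts fit on five Internet Archives (5 x 2 PB)?
GB = 10 ** 9
PB = 10 ** 15

system_size = 5 * 2 * PB                 # 10 PB
accounts = system_size // (500 * GB)
print(accounts)                          # 20000

# Rough S3 check, using assumed 2007-era prices.
storage_per_gb = 0.15                    # $/GB-month (assumed)
transfer_per_gb = 0.20                   # $/GB inbound (assumed)
pb_in_gb = PB // GB                      # 1,000,000 GB
cost = pb_in_gb * (storage_per_gb + transfer_per_gb)
print(f"${cost:,.0f}")                   # $350,000
```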

Yet some services are starting to offer unlimited storage. Yahoo!
Mail, in fact, so the rumour goes, is about to open unlimited storage
accounts. Obviously the space isn't physically unlimited; they're
gambling that people won't be able to upload enough to exhaust their
capacity.

It's easy to generate petabytes of information if you're a research
institution; NOAA, for example, hold about 1 PB of climate data. But
for the average person, storing at least a small amount of their most
important data for free is quite possible if you know where to look.

The basic model is that the Web forces us to account for our data in
the form of physical servers; there's no distribution. But some places
now have physical servers so large that anybody can use them for
individual practical services. The question is: how long will storage
capacity keep outpacing polite, ordinary data production? Probably
quite a long time, when you consider that unless we break into a new
medium such as video holograms (oh yeah!), we're not moving that far
away from bytes. This message that I'm writing is now about 2,500
bytes... That means I could store this document about a thousand
billion times over in the Internet Archive.
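The closing sum, spelled out:

```python
# A 2,500-byte message versus the Internet Archive's ~2 PB.
MESSAGE = 2500                # bytes, roughly the size of this email
ARCHIVE = 2 * 10 ** 15        # the Internet Archive's ~2 PB

copies = ARCHIVE // MESSAGE
print(f"{copies:.2e}")        # 8.00e+11: about a thousand billion copies
```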

If people want to produce unreasonable amounts of data, I think it's
only natural that they should have to pay. The barrier for where
pricing starts should be set really high. For the rest of us,
producing reasonable amounts of data, it should be free.

The good news is that, with a little intelligence, the web is already
like that! So I don't see any room for Stable P2P.

-- 
Sean B. Palmer, http://inamidst.com/sbp/