Date: Thu, 31 May 2007 14:01:54 +0100
From: "Sean B. Palmer"
To: "Tav Ino"
Subject: The Stable P2P Fallacy

I've been thinking about whether Stable P2P systems are required anymore for storage now that drive space is so cheap.

By "Stable P2P", I mean a P2P system that you inject a file into and can trust that it exists on the network whether you're connected or not. Write once, worry not, as in Freenet. That's very different to ostensible "Filesharing P2P" systems, where the likelihood of accessing a file is directly proportional to how popular it is.

Stable P2P is only required if its benefits outweigh its disadvantages. The only benefits of Stable P2P that I can think of are a) you don't have to host your files, and b) if the system is designed to accommodate it, you get many redundant copies of your files automatically.

The only way that a Stable P2P system can work, then, is if there's always more space available than there is data stored.

I did a few thought experiments. The first one was one new user joining a system every second, each instantly starting to upload at 100 kBps. There'd be only 32 million users in the system after a year, and yet there'd be over 360,000 petabytes! For comparison, the Internet Archive only has 2 PB, and it archives the whole web many times over.

Next, I wondered how many 500 GB accounts you could give people on a system that has five times more storage space than the Internet Archive. It turns out to be only 20,000 accounts; that's on a 10 PB system. Transferring and storing just 1 PB for a month on Amazon S3 costs about $350,000.

Yet there are places opening up unlimited storage. Yahoo! Mail, in fact, so the rumour goes, are going to be opening up unlimited storage accounts. Obviously the space isn't physically unlimited, but they're gambling that people won't be able to transfer enough to the system to exhaust its storage capacity.

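The thought experiments above are simple enough to check with a few lines of arithmetic. The sketch below is only one possible reading of them: it assumes decimal units, a 365-day year, and that every user keeps uploading nonstop from the second they join. None of that is spelled out in the message, and the function names are made up for illustration. Under that reading the first experiment comes to roughly 50,000 PB rather than the figure quoted above, so the original estimate presumably rested on different assumptions; either way it is tens of thousands of times the Internet Archive's 2 PB. The 20,000-account figure comes out exactly.

```python
# Back-of-envelope check of the storage thought experiments.
# Assumptions (not stated in the message): decimal units, a 365-day year,
# and every user uploading nonstop from the second they join.

KB = 10**3
GB = 10**9
PB = 10**15

SECONDS_PER_YEAR = 365 * 24 * 60 * 60


def users_after(seconds: int) -> int:
    """One new user joins every second, so the count equals the elapsed seconds."""
    return seconds


def total_uploaded_bytes(seconds: int, rate_bytes_per_s: int) -> int:
    """Bytes uploaded if the user joining at second k uploads for (seconds - k) seconds."""
    # Sum of (seconds - k) for k = 1..seconds is seconds * (seconds - 1) / 2.
    upload_seconds = seconds * (seconds - 1) // 2
    return upload_seconds * rate_bytes_per_s


def accounts_that_fit(capacity_bytes: int, account_bytes: int) -> int:
    """Whole fixed-size accounts that fit in a given capacity."""
    return capacity_bytes // account_bytes


if __name__ == "__main__":
    users = users_after(SECONDS_PER_YEAR)
    uploaded_pb = total_uploaded_bytes(SECONDS_PER_YEAR, 100 * KB) / PB
    accounts = accounts_that_fit(10 * PB, 500 * GB)
    print(f"users after one year: {users:,}")
    print(f"data uploaded after one year: {uploaded_pb:,.0f} PB")
    print(f"500 GB accounts on a 10 PB system: {accounts:,}")
```

Changing the upload rate or the assumption about how long each user keeps uploading moves the first total a great deal, which is why it is best read as an order-of-magnitude figure.
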
It's easy to generate petabytes of information if you're a research institution or something like that; NOAA, for example, have 1 PB of climate data. But for the average person, storing just a small amount of the most important data is quite possible to do for free if you know where to look.

The basic model that's going on is that the Web forces us to account for our data in the form of physical servers. There's no distribution. But some places now have physical servers so large that anybody can use them for individual practical services.

The question is: how long will storage capacity stay ahead of the rate of polite and ordinary data production? Probably quite a long time when you consider that unless we break into a new medium such as video holograms (oh yeah!), we're not moving that far away from bytes. This message that I'm writing is now about 2500 bytes... That means I could store this document about a thousand billion times over in the Internet Archive.

If people want to produce unreasonable amounts of data, I think it's only natural that they should have to pay. The barrier for where pricing starts should be set really high. For the rest of us, producing reasonable amounts of data, it should be free. The good news is that, with a little intelligence, the web is already like that!

So I don't think I see any room for Stable P2P.

--
Sean B. Palmer, http://inamidst.com/sbp/