Gallimaufry of Whits (2007-06)

These are quick notes taken by Sean B. Palmer on the Semantic Web, Python and Javascript programming, history and antiquarianism, linguistics and conlanging, typography, and other related matters. To receive these bits of dreck regularly, subscribe to the feed. To browse other months, check the contents.

2007-06-03 12:57 UTC:

The Pamphlet Saga so far: Pamphleteering and More Pamphleteering. Today has been a day of Countryfile, chatting with Terje about Blue Moon, and the like. These sorts of things often make me think about the Pamphleteering idea. I really need to boil the concept down into its most useful constituents and give it at least a preliminary name so that I can start to hype the crap out of it. I'm really stoked about this idea, even half a month since conceiving it.

I've already got 5000+ words written about the idea, and that's just a brief overview of the whole thing. William said that after scribbling down notes to myself I should think about writing them for one or two other people, and that sounds like a pretty good stepping stone.

2007-06-03 14:10 UTC:

I set a Swhack Challenge today: find the original source of Lagrange's famous comment about Newton. Not long after, I found the answer myself: it's from Jean-Baptiste Delambre's eulogy of Lagrange printed in 1816. Translated into English, thanks to Robin Berjon and Daniel Biddle, the phrase reads something like, "also Mr Lagrange, who often quoted him as the greatest genius ever to have existed, added immediately: and the most fortunate; one only finds once a system of the world to establish".

2007-06-06 13:21 UTC:

Charles Goodier, "mediovia", mentioned today in passing that he's reading Whits, which was a nice surprise. Hello Charles! I met him years ago at the WTF1 conference in London, and he gave me a disk of the Wikipedia dump that he'd kindly prepared for me. I still have it... you don't forget an introduction like that. "Oh hello. Here's that huge amount of information you asked for."

Charles is working on the finance side of things with ESP, which I've mentioned once or twice. The current state of affairs with that is that Tav, who started the 24 Weeks project, has bumbled off to Spain until the 18th, and so people have been left to regroup a bit in his wake. I've gone back to playing neutral observer; there isn't a huge amount to observe, but it's an interesting time of mutiny on the high leas.

Of course I'm distracted anyway with the camera that I mentioned back in February. It's always fun when there's a change of routine since you can dispel old habits. BreakBrokenTraditions and all that. The manual mode is a lot easier to use on this camera, so I've been playing about with all the settings and I'm taking some pretty stonkingly good photos. I've got macro down pat, and outdoor shots are getting pretty good. Indoors is a bit hit and miss. There's a slight problem with the centre weighted zoom in that it's not all that... centre weighted, so I might switch to point zoom and see if that's any better. Apart from that, it's proving great fun. I love photography.

2007-06-06 13:35 UTC:

To readers of Lo and Behold!, if there be any: I haven't forgotten. I've got some cracking ideas, but they're not coming together at the moment, so I'll probably have to switch to Plans B meanwhile. I'm still planning to get the first part of the serialisation published this month, after the current busy spell. That spell is reducing my output on fronts such as Lo! and Whits, though my output overall has probably increased from the previous few months, which were themselves quite productive. Anyway, I haven't forgotten about Lo and Behold!

2007-06-07 09:58 UTC:

Schuyler read my thing about Stable P2P. The idea of it was that using the web for storage is easier and better than using a P2P system where all of the files are stored persistently. Schuyler pointed out that for one use case, that of a large non-profit organisation, this isn't true; and that bandwidth is another interesting consideration.

We also came up with the parallel of a SETI@home type project only harvesting disc space rather than CPU cycles. Might be a good way for the large non-profit organisations to get their needed disc space in a distributed but non-peer environment.

2007-06-08 12:34 UTC:

In talking about capacitors, squirrels, and eventually horses with James Arthur today, I remarked elsewhere that it was a right "carnival of thought". Searching about for that almost novel phrase led me to this wonderful sentence from The New Monthly Magazine edited by William Harrison Ainsworth (1855), Volume 104:

"M. Chasles tries to awaken a becoming interest in his character, and curiosity as to his style—that chaos of parentheses, ellipses, and latent meanings, that carnival of thought and language, that labyrinth without an Ariadne's thread, that mingle-mangle of impracticable events, impossible geography, unaccountable characters, of quotations, interjections, exclamations, puns, epigrams, impertinent episodes, abrupt discords, measureless digressions, merciless divarications."

Chasles was talking about Jean Paul, "a German writer, best known for his humorous novels and stories". Might be a fellow worth checking out!

2007-06-08 12:46 UTC:

I've put up some Design Stuff, including some CSS and Javascript tricks, web tests, and some miscellaneous interesting stuff like a Java (the country) Train Timetable for spies that Tufte mentioned somewhere.

2007-06-08 16:07 UTC:

On a MacBook keyboard, en-dash, –, is Alt+-, whereas em-dash, —, is Shift+Alt+-. This is logical, but not huffmanised! Huffmanisation should probably win in this case. Of course, given that the MacBook keyboard has a whole key devoted to the section symbol but # is hidden away as the very annoying Alt+3, there are worse problems that need fixing...

2007-06-09 11:02 UTC:

Gran Paradiso, Firefox 3.0a5, was released today, so I tried it out. It was sorely unimpressive, slow and buggy without any noticeable feature changes over 1.5. This got me on another of my browser migration wonderances, and I tried out Safari, Camino, and eventually Opera with various problems and annoyances along the way.

I didn't settle on migrating to Opera, but I was intrigued by its Speed Dial feature. There's an extension for Firefox that implements Speed Dial, but I thought that I could go one better. My current start page is a list of some 30 links or so, to general services and the like. They're all text links, and I only use about ten of them with any regularity, so I thought it'd be neat to get logos of those sites and put them on a Speed Dial-like page.

The results are quite cool; the principle of having a larger target for more frequently used items is followed in comparison to the old small text links, and it looks great. I had to hunt about a bit for some of the logos: the OED one is a third party logo; the Froogle UK one is the old logo that someone's archived somewhere (I had to resize it, and others). The Gandi one was a transparency, so I had to set the background colour. Apart from that, it was quite trivial.

This got me thinking again about having a start page for creativity, though. This new start page, and indeed my current one, are motivated towards letting me find sites quickly when I have a task in mind already. I figure I want to buy a book, so I go to Amazon. I want to find out where something is, so I go to Get-a-map. The tasks are decided before I go to the page. But what about when I'm trying to generate ideas and be creative? How can the web help me then?

The only page I've thought of so far that could go on a Creativity Start Page is Wikipedia Random Page; but I've tried using that before to spark some creative idea, and to be honest it's not all that good an approach. Does anybody else have any ideas? Perhaps I should link to lots of things that I'm interested in... but random queries of interesting areas (Notes & Queries?) seem to be the way to go. That might require some scripting magics.

2007-06-09 11:15 UTC:

"Prove it in Python!" - The new #esp motto.

2007-06-09 19:38 UTC:

When you move a page from somewhere.html to somewhere/index.html, the server usually puts in a redirect for you automatically (it adds trailing slashes to things for you), but if you do the reverse it won't strip them off. Here's something that I came up with a while ago that will achieve that in Apache:

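RewriteEngine on
# Skip requests that map to a real directory.
RewriteCond %{REQUEST_FILENAME} !-d
# Only redirect if the slashless .html or .txt file actually exists.
RewriteCond %{DOCUMENT_ROOT}/$1.html -f [OR]
RewriteCond %{DOCUMENT_ROOT}/$1.txt -f
# Strip the extension and trailing slash with an external redirect.
RewriteRule ^(.+)\.(html|txt)/$ /$1 [R]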

I implemented it on infomesh.net when I found that I had quite a few things which only consisted of a single index.html file in a directory.
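For anyone who doesn't read mod_rewrite, the decision those rules make can be roughly sketched in Python. This is an illustration only: the function name strip_slash_redirect is made up, the directory check is an approximation of the REQUEST_FILENAME test, and it assumes the rule is handed a path of the form somewhere.html/ (which is what the RewriteRule pattern matches).

```python
import os
import re

# Same pattern as the RewriteRule: name, extension, trailing slash.
PATTERN = re.compile(r"^(.+)\.(html|txt)/$")


def strip_slash_redirect(path, document_root):
    """Return the redirect target for path, or None if no rule applies.

    Hypothetical helper that mirrors the conditions in the Apache rules.
    """
    match = PATTERN.match(path)
    if match is None:
        return None
    stem = match.group(1)
    # Approximates "RewriteCond %{REQUEST_FILENAME} !-d": skip real directories.
    if os.path.isdir(os.path.join(document_root, path.rstrip("/"))):
        return None
    # Mirrors the two "-f" conditions joined by [OR]: either file may exist.
    for extension in (".html", ".txt"):
        if os.path.isfile(os.path.join(document_root, stem + extension)):
            return "/" + stem
    return None
```

Calling strip_slash_redirect("somewhere.html/", root) gives "/somewhere" when root contains somewhere.html, and None when neither somewhere.html nor somewhere.txt is there.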

2007-06-10 12:59 UTC:

Today I took photos of a jellyfish and a swan. They came out pretty well; those were just two from a set of 150 or so. I've used Flickr partly to just try it out, partly to give back to it since I've used it fairly often, and partly because the originals are quite large. On that last point, however, it turns out that Flickr requires you to upgrade your account (£12.65/annum!) to let people download the larger photos, which I'm not going to fork over given that I've only uploaded a couple of photos.

Christopher Schmidt, who just got married (congrats!), says that they'll retroactively enable the hi-res downloads if you upgrade though, so I can use it intermittently before deciding, which is nice. Björn suggests using www-archive which would be funny for one or two photos, but is probably not a good long term strategy!

2007-06-13 14:40 UTC:

I must needs petition my library, for it has erred. I went to browse the Notes and Queries collection, as I often do, only to find that it had been moved to a new section. Okay, I thought... But when I arrived at the new section I found that the pre-1950s editions weren't there! I checked in the catalogue and they had been moved to holding storage.

So I went to the enquiries desk to ask what had happened, and they said that they'd been reorganising to make some more space, and had therefore moved some stuff like the old Notes and Queries collections to storage. I pointed out that I use the collection regularly and they gave me a suggestions form to fill out, by which to make my petition. The Notes and Queries collection was really one of the best things about the library, and it's an awesome library anyway. Let's hope they listen to common sense eventually!

It is possible, of course, to request things from storage, but I think you have to ask in advance, so it rather precludes the whole browsing workflow pattern. And anyway, it's an unnecessary tax on both them and me, so hopefully the quick suggestion will do the trick.

2007-06-14 12:24 UTC:

Reading E.K. Chambers's Facts and Problems II this morning, I came across one of those many delightful little quotes, this one of Dryden recounting Jonson. Oddly, though, it's only reproduced on a single webpage before this one:

"In John Dryden's Essay on Dramatique Poetry of the Last Age appears the following passage: 'In reading some bombast speeches of Macbeth, which are not to be understood, he [Ben Jonson] used to say it was horrour.'"

I wonder when Chambers goes into the Public Domain? Wikipedia says that anything "before January 1, 1923" (or 95 years ago, whichever), or where the "last surviving author died at least 70 years before January 1 of the current year" is generally public domain. Facts and Problems came out in 1930, so it's not been those 95 years yet. E.K. Chambers died in 1954, so it's only been 53 years. That means we have another seventeen years to go... 1954 + 70 = 2024. Disappointing!
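The sums above can be checked with a few lines of Python. This is just the arithmetic, not legal advice; the function names are invented for the purpose, and the "strict" variant is one reading of the rule as quoted from Wikipedia, on which the date slips to the January 1 after the simple sum.

```python
CURRENT_YEAR = 2007  # the year this note was written


def years_since(event_year, current_year=CURRENT_YEAR):
    """How many calendar years have passed since event_year."""
    return current_year - event_year


def life_plus_term(death_year, term=70):
    """The simple sum used above: death year plus the term."""
    return death_year + term


def first_new_year_after_term(death_year, term=70):
    """Strict reading of the quoted rule (an assumption).

    For a death partway through death_year, the first January 1 that
    falls at least `term` years later is in the following year.
    """
    return death_year + term + 1


print(years_since(1954))                    # years since Chambers died
print(life_plus_term(1954))                 # the simple sum
print(life_plus_term(1954) - CURRENT_YEAR)  # years still to wait
print(first_new_year_after_term(1954))      # the strict reading
print(1930 + 95)                            # the 95-year publication rule
```

Either way the answer is the same for practical purposes: a wait of seventeen or eighteen years.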

His Elizabethan Stage came out in 1923, though, so there might be some stuff surrounding that which is publishable.

2007-06-15 20:24 UTC:

"This leads to the obvious desire to define a constraint based grammar for specifying home design" - Anselm, via Chris.

Björn was telling us about his awesome grammar engine, Alexander (a kind of RNG-for-strings), and one of the things that he noted is that he can provide complements of regexps. So for example, a regexp of all the strings which aren't URIs. I guess that sent me over the edge: I know that Anselm probably just meant interior design, and he was probably just joking, but there's something about using regexp to generate homes which seems like a good idea.

So for example, you might specify that the lounge and the kitchen must be adjacent, as a rule. But what constitutes a lounge, and what a kitchen? There you might have further rules: a lounge must be at least 21 feet long, and have at least one window which is over 5ft wide. And so on: obviously there'd be all sorts of rules for adjacency of windows, their height from the floor, and the like.

The aim is to get there to be enough constraints so that you can use a genetic algorithm or something to generate a load of house designs that conform to the constraints, and then to use that to gradually home in on the best possible house that matches your house grammar. This depends on the requirements, i.e. the grammar productions, being correct, but for conservative noddy boxes that's probably not too much of an issue; there are probably quite hard and fast rules about things like how low to the ground you want to put a window. If not, perhaps people might end up sharing them.

One way that this might be worked is to have a mini-language that generates a 3D boxy object, with adjacent walls being a bit like adjacent nodes in a graph or whatever. Windows and doorways could be like attributes... Then you'd just end up writing a schema for that language. It might end up being more like schematron than RNG, because houses don't really have logical roots; they're really going to be more like graph structures than trees. Seems like an interesting idea, though. I'll bet somebody's already done something like this; too good an opportunity to have been missed until now. Perhaps with not quite this amount of compsci flavouring though.
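A very rough Python sketch of the idea, under loud assumptions: rooms are nodes in a graph, adjacency is a set of room pairs, windows are attributes, and the only rules are the lounge and kitchen ones mentioned above. Plain random generate-and-test stands in for the genetic algorithm, and every name and number range here is invented for illustration.

```python
import random


def lounge_ok(room):
    """Rule from above: at least 21 feet long, one window over 5ft wide."""
    long_enough = room["length_ft"] >= 21
    wide_window = any(width > 5 for width in room["window_widths_ft"])
    return long_enough and wide_window


def adjacent(house, first, second):
    """True if the two named rooms share a wall in this house graph."""
    return frozenset((first, second)) in house["adjacent"]


def conforms(house):
    """The whole (tiny) house grammar: lounge rules plus adjacency."""
    return lounge_ok(house["rooms"]["lounge"]) and adjacent(house, "lounge", "kitchen")


def random_house(rng):
    """Make an arbitrary house: three rooms, random sizes, windows, walls."""
    rooms = {
        name: {
            "length_ft": rng.randint(8, 30),
            "window_widths_ft": [rng.randint(2, 8) for _ in range(rng.randint(0, 3))],
        }
        for name in ("lounge", "kitchen", "hall")
    }
    names = sorted(rooms)
    pairs = [frozenset((a, b)) for i, a in enumerate(names) for b in names[i + 1:]]
    return {"rooms": rooms, "adjacent": {pair for pair in pairs if rng.random() < 0.5}}


def generate_conforming(count, seed=0, max_tries=100000):
    """Generate-and-test: keep random houses that satisfy the grammar."""
    rng = random.Random(seed)
    found = []
    for _ in range(max_tries):
        if len(found) == count:
            break
        house = random_house(rng)
        if conforms(house):
            found.append(house)
    return found
```

A real version would want a fitness score on top of the pass/fail rules, so that there is something to home in on, and a graph richer than three boxes.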

2007-06-20 10:41 UTC:

Some more photos for your delectation, this time of a Thistle, Bugs and Buttercups, and a Giant Daisy. All three of these are essentially macro shots, whereas the last two I posted were merely closeups.

I think I've got a little theme going here: common natural phenomena well taken. I mean, a swan, jellyfish, thistle, buttercup, and daisy aren't the sort of thing that you expect people to take photos of these days... it's kinda back to basics.

Behind the scenes, these three were selected from something like 390 photos that I took that day, the majority of which were landscape photos. Generally I'm taking photos from the top 10% but not the top 5% in quality. Going through hundreds of photos selecting ones to post is actually quite a longwinded process; I'm definitely thinking about getting Aperture.

2007-06-20 15:35 UTC:

The Whits Word of the Day: "ombrotrophic" (fed by clouds).

2007-06-20 15:37 UTC:

I have lots of little ideas for fiction. I was wondering if I should collect them all together somewhere and have that be the fiction, with a very murky line between what is canon and what isn't. Structure is the only canon! The problem with fiction I find is coming up with characterisations that are both compelling and useful. Fiction is usually quite an umbrella discipline, allowing you to use the act of sub-creation in order to work on various things that you wouldn't normally think about. I really like umbrella disciplines... though I haven't thought of a great many really handy ones.

The Ghyll wiki was interesting for being creative and yet rigorous: the only way that you could play things out was by describing them fairly dispassionately. Actually there's a lot of scope for bending the rules in fictional encyclopaedia entries.

At least I wrote up the metaschema here!

2007-06-21 19:46 UTC:

Today, a post on Lo and Behold!: On Midsummer's Night. It's kinda ironic that I should start writing the Summer 2007 edition right in the middle of summer, but of course that depends somewhat on how you reckon the seasons, the subject of the second essay there. The cool thing about them so far is that they're actually quite practical and concentrated, for all their density. Typesetting the first 7000 words that comprise the Spring 2007 paper edition is going to be quite a challenge, though.

2007-06-22 13:09 UTC:

Dave Pawson contacted me with a question about modelling confidence in RDF. I'd done some work on this in EARL way back in the day, but the confidence stuff was recently removed from the EARL schema I see.

An insight that Dave came up with is that confidence isn't a property of information, it's the property of the source of the information. So for example, say you're told that Paul Brown was born in Rochester. The source of your information is Paul's friend, Michelle. You'd say { :Paul :birthplace :Rochester } :says :Michelle, and then you can say :Michelle :trust "3".

The problem with this is that you might trust Michelle middlingly on this information, but more on another piece of information. So you'd end up with :Michelle :trust "1", "2", "3" etc., which doesn't make any sense. Clearly, confidence is at least a property of the utterance and the utterer. So you might have { { :Triples :terms "3" } :says :Sean } :confidence "3" . ("I, the writer of this document, kinda believe Sean when he says that triples have three terms") and { { :Triples :terms "4" } :says :ManInTheMoon } :confidence "2" . ("I, the writer of this document, barely believe the Man i' th' Moon when he says that triples have four terms").

Dave appears to believe that it can work on a layered scale. So you might have a default trust level for a person, but then you can add information about a particular utterance by that person later. So say you have { :Triples :terms "3" } :says :Sean . and :Sean :defaultTrustLevel "2" . in a document. You might derive from that that { { :Triples :terms "3" } :says :Sean } :confidence "2" . unless you have something specifically saying { { :Triples :terms "3" } :says :Sean } :confidence "3" .
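The layering can be sketched in a few lines of Python. This is a toy under assumptions, not anything from EARL or from Dave: statements are plain tuples, the two dictionaries stand in for the :confidence and :defaultTrustLevel triples, and the function name is made up.

```python
def confidence(statement, speaker, specific_confidence, default_trust):
    """Look up the confidence in one utterance by one speaker.

    A confidence recorded for this exact (statement, speaker) pair wins;
    otherwise fall back to the speaker's default trust level, if any.
    """
    key = (statement, speaker)
    if key in specific_confidence:
        return specific_confidence[key]
    return default_trust.get(speaker)


# :Sean :defaultTrustLevel "2" .
default_trust = {":Sean": "2"}

# { :Triples :terms "3" } :says :Sean .
three_terms = (":Triples", ":terms", "3")

# Nothing specific is known yet, so the default applies.
specific_confidence = {}
derived = confidence(three_terms, ":Sean", specific_confidence, default_trust)

# { { :Triples :terms "3" } :says :Sean } :confidence "3" .
specific_confidence[(three_terms, ":Sean")] = "3"
overridden = confidence(three_terms, ":Sean", specific_confidence, default_trust)
```

Here derived comes out as the default "2" and overridden as the specific "3". A speaker with no default and no specific entry gets None, which is at least honest.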

Layering the confidence seems like it might work to me, but trust is a rather difficult area when it comes to modelling. When you increase it to the size of a trust network, you're in for issues. The rather interesting discussion surrounding this happened on #swig with me, Dave, Dan Connolly, Jan Grant, and Dan Brickley involved.

2007-06-23 11:49 UTC:

Firefox annoyed me to the point of actually switching today, so I switched to Camino. I'll probably switch back because I'm so used to the Firefox workflow, plus Camino has some, shall we say, idiosyncrasies of its own (no undo-close-tab, and non-displayed-tabs when the tab bar overflows).

What does a browser actually even do? Not very much, if you consider it as just a wrapper for the rendering engine. Rendering engines seem pretty good these days; not perfect, given that they almost all still fail the CSS 3 selectors test, but most of the practical problems that I face don't seem to stem from the rendering engines.

A browser itself seems to provide the following wrapper facilities around the rendering engine: navigation, tabs and windows, bookmarking, history, search widgets, control over Javascript, fonts, and other features of the rendering engine, form filling in and password remembering, and... that's about it. You get some more wacky stuff like RSS readers and so forth, but I tend to think that applications should stick to the unix mentality on this sort of thing. Do one thing, but do it really well.

Well, okay, not quite the unix mentality since there are several really essential features that a browser provides, but it occurs to me that in actual fact many of them are facets of a single thing: navigation. Bookmarking and history, I mean, kinda get subsumed into navigation when you think about it from a higher perspective.

Consider what a browser effectively does as its core task. It shows you pages on the web, from a user specified entry point: clicking on a bookmark, clicking on a history item, entering a URI in the address bar, clicking a link in a webpage, being passed a URI from an external application. When this page renders, the page is saved in the browser's history. The act of navigating to a page in one of these ways and rendering it is an act which is recorded in the history.

But not only is it saved and recorded in the history, if you specify it to do so it also saves it as a tab. That is, you can render documents at parallel times and have easy access to them. I'm pointing out the obvious here to show something that is perhaps not so obvious: tabs are really a kind of strange history. They're a history where basically all of the members have been cached for easy access, and you have a nice navigation bar exposing this particular cached history right in the browser window. They're stacked in the order in which you opened them, but in Firefox 1.5 onwards you can reorder them. This tab paradigm doesn't actually have a persistent history itself, but we do find such a thing useful, so we bolt it on in the form of session savers. I also find myself bookmarking all of my tabs. Tabs are really a slightly scummy (not persistent) and yet also advanced (more easily accessible and cached) form of the browser's history. And yet because it's a separate system from the actual history, it's considered orthogonal. You open a tab and an entry gets put in the history. They work side by side.

Bookmarks are more like the browser history than tabs. Bookmarks are kinda selected members of the history, with a user based structuring; that is to say, basically a hierarchy of folders. It's a tree structure, though some browsers are now allowing keywords. You can have multiple entries in bookmarks just like you can have multiple pages in your history (if you've visited a page two or more times). But the thing to remember is that bookmarks are user controlled in a more flexible way than the history: with bookmarks you can go into the structure and change things. And nothing gets added to the bookmarks as a consequence of another action, unlike history, so it's less automatic in that sense.

But what about navigation itself? When I click a link in a webpage, what I've done is to traverse a link on the internet. But in my history that just shows up as another entry. There's no link between the two items. There's also nothing that tells me the difference between typing a URI in the address bar and having been sent a link via an external application such as X-Chat Aqua. In other words, the browser history is descriptive—it's a record of the pages which are being visited in the browser—but it's selectively descriptive. You might say that it's imperfectly descriptive. Another thing that the history doesn't record is when you use the back and forward buttons in your current tab. It also doesn't record when you switch between tabs. What about if you could have a split screen of tabs? Then the topology to describe would get even more complex.

The history is a descriptive structure, but not descriptive enough. The bookmarks are a prescriptive structure, but perhaps not flexible enough, and not all that tightly coupled to the history. Say I have a bookmark and I want to find out all instances of that bookmark in my history. Can I do that easily in a browser at the moment? As for tabs, they're a kind of strange stunted prescriptive-oriented history. I can go in and change the structure and reorder the tabs, just like with my bookmarks. The main difference is that tabs are more easily accessible (or, actually, it depends where you place your bookmarks toolbar and if you even have one), and that most of all they're caching the recently gotten content of the page and also internal state such as text in forms and any changes to the DOM done by Javascript or whatever. There could be other forms of caching... you might cache just the original text of the page but not its rendering. A history is kinda like a cache of just the URIs and the titles, coupled with the time (but not the mode) of access.

It seems quite clear that these three things, history and bookmarks and tabs, can be improved and unified to some extent, and that other similar paradigms may be possible.
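One way such a unification might look, sketched in Python. Everything here is hypothetical and belongs to no real browser: a single visit log that records the mode of access and the referring page (the things the history currently drops), with bookmarks and tabs treated as views over that log.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Optional


@dataclass
class Visit:
    uri: str
    title: str
    when: datetime
    mode: str                       # e.g. "link", "typed", "bookmark", "external", "back"
    referrer: Optional[str] = None  # the page whose link was followed, if any
    tab: Optional[int] = None       # which tab rendered the page


@dataclass
class Browser:
    visits: list = field(default_factory=list)
    bookmarks: set = field(default_factory=set)

    def visit(self, uri, title, mode, referrer=None, tab=None, when=None):
        """Record every navigation, including how the page was reached."""
        entry = Visit(uri, title, when or datetime.now(), mode, referrer, tab)
        self.visits.append(entry)
        return entry

    def bookmark(self, uri):
        """Bookmarks are user-selected members of the same log."""
        self.bookmarks.add(uri)

    def history_for_bookmark(self, uri):
        """All instances of a bookmark in the history."""
        if uri not in self.bookmarks:
            return []
        return [entry for entry in self.visits if entry.uri == uri]

    def open_tabs(self):
        """Tabs as a view of the history: the latest visit in each tab."""
        latest = {}
        for entry in self.visits:
            if entry.tab is not None:
                latest[entry.tab] = entry
        return latest
```

With this shape, the question of finding every instance of a bookmark in the history is one method call, and the link between two pages survives as the referrer on the second visit.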

As for the other features, navigation and saving form information and the like, these can probably be improved upon significantly too. Camino doesn't let you save multiple input information for a form, which is just terrible. There's some application for OS X that actually lets you save the most common bits of inputted data, which I think is a pretty good idea. Things like that should be available everywhere.

And this is just the structural side of things. When you throw the UI into the equation, you've got a whole new set of ideas that can be explored. The placidity of browser designers these days is horrific. Browsers are engineered terribly, with a bad selection of the best default features, and an overproliferation of options that people don't need to commonly access. I hope that there's another Phoenix-like revolution at some point within the next year or two, but it doesn't seem likely. Even though a browser is mere cruft around a renderer!

2007-06-23 12:31 UTC:

Arnia and I discussed the semantics and behaviour of navigation in browsers somewhat more, with some interesting ideas.

2007-06-23 15:07 UTC:

I'm considering switching Whits to using a more Lo!-etc.-like structure, so that each post has a shortname. Thus, instead of /whits/2007/06#N231507 for this post it'd be merely /whits/shortnames. One drawback to this is that then I'd have to consider that each post must make sense out of context. It does, however, make them easier to link to. I'd aggregate all the month's posts into YYYY/MM pages still, too, so it'd be backwards compatible, not that it needs to be.

2007-06-25 19:42 UTC:

On my happening to mention that "I just fail to get excited at pretty much everything Semantic Web these days", I had a good chat with TimBL about some qualms I have with the Semantic Web, and some interesting hints of answers as well as (and in the form of) Tabulator development breadcrumbs.

One of the biggest things is that Jim Hollenbach has created a Tabulator Firefox extension. Though I link to it, there's not much point in trying to install it yet: I've tried with the help of both TimBL and Joe but there's stuff that needs fixing for it to work in Firefox 1.5 and 3.0a5, and even on 2.0 you need to be able to get the latest development tree of Tabulator and graft it in at the right place. All the same, the extension idea is one I had last year or so and this progress is certainly encouraging.

Another idea that I had for the interface was that rather than simply browsing around by clicking, some kind of text input would be nice too. Some kind of RDF path derivative, in fact, which allowed you to do commands as well. So for example, one use case I have in mind is updating my FOAF file. I might load it, and then want to traverse to some person and get to their data. I might want to then load a friend from their FOAF file who is now also my friend thanks to their introduction, and load some of their details (perhaps just name and mboxsum) into my own FOAF file. There are probably parts of that which are quicker with typing and parts which are quicker with drag 'n' drop and the like. The thing about Tabulator, as TimBL observed, is that there's so much potential with it for things like this.
