Gallimaufry of Whits

Being for the Month of 2008-02

These are quick notes taken by Sean B. Palmer on the Semantic Web, Python and Javascript programming, history and antiquarianism, linguistics and conlanging, typography, and other related matters. To receive these bits of dreck regularly, subscribe to the feed. To browse other months, check the contents.

2008-02-01 10:57 UTC:

My slightly provocative piece yesterday, who uses OWL for the DL?, got more feedback than I expected, so I ought to explain a bit more about where I'm coming from.

You can see my thoughts yesterday, which started off as me just rambling about whatever and then publishing it to go for the feedback-gusto, in a clearer light if you read Pat Hayes's Catching the Dreams article from several years back. Perhaps his opinion's changed now, but back then he stated that "the overhead required by DLs, particularly the conceptual overhead, is now a barrier and an impediment to progress". What I'm saying certainly isn't new, and isn't uninformed.

What isn't true is my implication that nobody uses OWL tools for the DL. Of course there is a large industrial, commercial, enterprise, academic, intranet-style usage of OWL and DL tools. But will these fairly private things trickle over into the public side? It's nice if our government records are stored in a database that's OWL/DL-powered, but I also want to be able to do more mundane things like check cinema times against train times. I can't do that mashup if the cinema and train companies won't release their data for me to remix.

TimBL said the other day that he feels "we may be quite close getting this to gel to the point where it becomes just an essential part of life". My provocative thought is: what part does OWL/DL have to play in that gelling process, of making the Semantic Web essential? Does it count if bits and bytes that only peripherally affect me are the only things OWL/DL has a part in? Surely not.

I don't mean to turn this into a diatribe against OWL. I'm not saying that OWL is a bad technology (it's very well engineered), and I'm not saying that nobody should ever use it (clearly companies are finding it useful). I aim to question whether it's useful for the more grassroots efforts, the FOAFs and the SIOCs and the things as yet uninvented, as those are the things that have the most direct impact on people; the things that will make the Semantic Web a visibly "essential part of life".

Moreover, as a Semantic Web developer, and as a Semantic Web developer who's been working on the whole shaboodle for a particularly long time, I have a kind of vote to cast in the form of which technologies I investigate and work on, and that's a very personal choice to have to make. What I'm saying is that at the moment, I'm thinking that OWL has a very small part to play in My Semantic Web.

2008-02-01 17:21 UTC:

If you're looking to use GRDDL in HTML 5 then you might be disappointed: there's no @profile. I asked about this on #swig yesterday, and DanC said "I think the reason head/@profile isn't in html5 is that it hasn't really crossed hixie's radar; i.e. it's not used in a statistically significant portion of web content". Kjetil told me today that it's been dropped because it's not cut-'n'-paste compatible, which we had a little debate about.

I had suggested that the transformation link type could be added to HTML 5, which would make @profile redundant, but of course that would break old tools. One approach which wouldn't break old tools would be to make a profile link type, such as the one that I proposed on the WHATWG Wiki today, and then make the XHTML namespace be a GRDDL document which bootstraps in the new profile link type.

As nsh says, "when you have discussions like this, the aliens laugh at us".

2008-02-01 17:36 UTC:

Something I've been pondering: how are we to go about creating new myths in a period when science tends to drown myth out? Tolkien failed to have his legendarium thought of as a myth because it got seen instead as one of the archetypes of a new genre, fantasy. If nobody had created a work subsequently that was imitative of it, perhaps it would have been classed as just a very recent version of a myth. But myths do tend to have that quality of being anonymous, and Tolkien's legendarium was really his own.

We do still have relatively new interesting folklores such as urban legends, ufology, and things like that. We don't, however, have myths as they're generally understood, where the once upon a time doesn't matter. With a true myth, you don't think of it as fiction: it's something from the wellsprings of history. That makes it feel ethnographically real, I think, in a way in which modern fantasy simply is not. It's a quality that's a bit hard to put a finger on.

Perhaps the point about it is that it's in some respect like trying to make a Stradivarius. All of the Stradivari that are going to be made have already been made, by definition. The situation mightn't be quite so drastic with mythology but on the other hand it may be. That doesn't stop people from making new violins, but it might take some technical (or, in the case of myth, imaginative) leap before any are made with the same kind of reputation.

2008-02-02 12:34 UTC:

I've published some Notes on Coleridge's Kubla Khan, mainly on ascertaining when and where the poem was written. Many thanks to Terje Bless for his kind and detailed assistance with editing.

2008-02-02 14:35 UTC:

Manually checking HTML5 is tiresome, so I've dusted off my old Validate With Logos project, and made a check referer script that I can link to, and a litmus stylesheet script. The litmus is basically to be referenced as a normal stylesheet, but then it generates one of the following three lines:

The first is if it's being used on a document outside of inamidst.com, the second is if it's being used on a valid HTML5 document, and the third is if it's being used on an invalid HTML5 document. There are some positive and negative tests for it if you want to see it in action, and I've also used it at the bottom of my Coleridge article.
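As a minimal sketch of the three-way decision described above, and not the actual Validate With Logos code: the function below picks one CSS rule based on the referring document. The function name, the `body:after` rule, and the message strings are all hypothetical placeholders, and the validity check itself is assumed to happen elsewhere and be passed in as a boolean.

```python
from urllib.parse import urlparse

SITE = "inamidst.com"  # the only site the litmus is meant to serve


def litmus_css(referer, is_valid):
    """Return one CSS rule describing the referring document's status.

    referer  -- value of the HTTP Referer header, or None if absent
    is_valid -- result of an HTML5 validity check done elsewhere (assumed)
    """
    host = urlparse(referer or "").hostname or ""
    on_site = host == SITE or host.endswith("." + SITE)
    if not on_site:
        message = "outside inamidst.com"  # hypothetical wording
    elif is_valid:
        message = "valid HTML5"           # hypothetical wording
    else:
        message = "invalid HTML5"         # hypothetical wording
    # Hypothetical presentation: append the message to the page body.
    return 'body:after { content: "%s"; }' % message
```

A script serving the stylesheet would call `litmus_css` with the request's Referer header and send the result back as `text/css`.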

So now, using this mechanism, I can tell from a glance whether a document is valid HTML5 or whether it needs fixing.

2008-02-02 15:25 UTC:

Morbus just published his essay Resources not Services. "It has taken me 11 years, I think, to come to a simple conclusion: I prefer building resources over services. I define a 'resource' as content you may be able to interact with; a 'service', on the other hand, is primarily something you interact with, with content as complementary."

His Video Underbelly is just the sort of thing that I should be working on, and am. Whereas Morbus is obsessed with videos and Fort and so on, I'm obsessed with Shakespeare and Fort and so on. It's not so much the object of the obsession as the obsession itself; but it's the object of the obsession inasmuch as it drives the obsession. When you think about single monumental achievements of people like Dr. Johnson, Virginia Woolf, or Sir Thomas Browne, you might think about their largest works (e.g. the Dictionary), but if you know more than a trivial amount about them, it's all of their varied and dazzling output which is interesting. When you focus on what Morbus calls "trickling at content", like Simon Rodia and Alfred Wainwright did so brilliantly, it's possible to create a pretty big lake.

2008-02-04 20:02 UTC:

I wrote a little message to Aaron Swartz today called Printing Made Easier. The general idea, for people who might not be conversant with the particulars described in that message, is to have a very low-barrier method for getting works printed. The benefit is that printed works tend to be more persistent, in terms of their actually physically lasting (we don't know how digital information is going to fare yet, really), and in terms of their being "fixed"—obviating the whole "accessed on ..." problem.

In other words, I figure that if people want to publish their poetry or a little academic note or something like that, they should be able to format their manuscript and send it off somewhere to be printed in a kind of journal-like thing. It should either be free of charge for the person trying to publish their work, or entail a micropayment, which leaves the rather big and possibly unresolvable question of who should pay for all this, especially if it got particularly popular.

Even if just a few libraries got funding to support such a scheme, it would seem to be a fairly good idea; you could still have some kind of entry barrier such as using TeX or following some style guide, or something, as long as it's not made too big a barrier. It'd be handy for all those little publications which are worth citing, but aren't worth getting peer reviewed or going to a traditional publisher for.

2008-02-04 20:11 UTC:

I've published a transcript of The Farmhouse of Kubla Khan by Morchard Bishop from the Times Literary Supplement (1957) online. It complements my notes on Kubla Khan, which I've been researching to a considerable extent these past few days. Coleridge is really fascinating. Cf. my sweetshop comment and mild explanation of Coleridge from the perspective of a Shakespearean researcher.

2008-02-04 20:36 UTC:

The dictionary came about by slow evolution, though Dr. Johnson's Dictionary is considered a milestone. The thesaurus, on the other hand, was more of a leap of imagination as far as I understand it. The chief quality which links Dr. Johnson and Dr. Roget is their tremendous patience in the process of compilation. The main reason nobody attempted huge achievements like these before is perhaps not that they couldn't be imagined, but that they couldn't be comfortably undertaken. And so it is that perhaps someone idly imagined a work like the thesaurus long before Roget got around to actually making one.

What, I wondered yesterday morning, would be a logical successor to these two works? Personally, I'd like to see a work of complex etymology: a book which outlines the etymologies of words in terms of similarities between how they evolved, outlining, for example, words of ecclesiastical Anglo-Saxon origin in one section, and those renaissance borrowings from Latin in another. Then the subdivisions might be based on meaning and phonology and the history of their introduction such as who coined them; things like that.

The aim, unlike that of the dictionary and thesaurus, would not be to be particularly comprehensive, however. Indeed, a dictionary is meant to be comprehensive in terms of etymology. What this work would do is to show the relationships and connections between words. What words, for example, use dis- as an intensifier and why? The explanations are more important than the raw facts, because the raw facts are often very boring; phonology is irritatingly so, for example, and yet you have to have a little knowledge of it to understand some of the more interesting textural characteristics of a language.

At first I suspected that the only class of people that this venture would bring value to is the conlangers, I say a little contemptuously even though I may be classed myself as a conlanger. But actually I think it'd have wider appeal: I know of plenty of people who, to really understand a word, look up its etymology. Just today I figured out what synthesis, the word, was all about, and I was yet again astounded that I hadn't realised a word's make-up through its lexicalisation. Chesterton's example of "holiday" is perhaps my favourite. So anyway, a work like this would be spiffing if it brought out the interesting elements of etymology.

2008-02-04 21:05 UTC:

I asked Daniel Biddle to give the badly-named "etythesaurus" a better name, but he merely suggested that "what you need is a book that tells you how parts of words can be put together to make words". In other words, an etythesaurus...

And he's as obsessed at the mo' with lizards (and moths) as I am with Coleridge.

2008-02-05 21:17 UTC:

I've been thinking about migrating from Equid to Mercurial for inamidst's version control software. The main problem is that my Updates and Changes page is strongly tied to the Equid system, which is quite customised for it and hence Mercurial simply might not be powerful enough to replace it. I'm also a bit annoyed about the .hg/ directories that it'll litter throughout the upload tree; I'll have to put in an rsync ignore for them. It'd be nicer if I could maintain them all in a completely separate shadow directory, but I've been through the Mercurial book and manuals and I can't find any documented way to make that happen.
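A rough sketch of the rsync ignore mentioned above, assuming the upload is driven from a Python script. The function name and the example paths are hypothetical; the only real detail is rsync's `--exclude` option, where a pattern with a trailing slash and no leading slash matches a directory of that name at any depth.

```python
def build_rsync_command(source, destination):
    """Build an rsync argument list that leaves Mercurial metadata behind.

    source and destination are whatever paths the upload normally uses;
    nothing here is specific to inamidst.com.
    """
    return [
        "rsync",
        "-av",             # archive mode, verbose output
        "--exclude=.hg/",  # skip every directory named .hg, at any depth
        source,
        destination,
    ]
```

Passing the resulting list to `subprocess.run` would perform the upload; printing it first is an easy way to check the command before anything is transferred.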

2008-02-05 21:23 UTC:

One of the primary reasons why I'm thinking about switching to Mercurial for inamidst's version control software is a planned revision to Whits. Instead of having just a big old list of entries per month, I'm going to have individual pages for them, and some of them will have titles. I'd rather write things that stand alone than assume their context within a larger list of entries.

I'd also like to make Whits a bit simpler, so this is a chance to make the style and the backend system even simpler than it is now. Cody, who installed pretty much the same software, is going in the other direction and bolting things on and making them more complicated. I'm being careful to resist that.

One structural change would be to have two kinds of posts. In the first month of Whits, a lot of the content was very random, just junk notes of all kinds of things. When I noticed that a lot of friends were reading this, I started to make it more sensible, but I think that inadvertently I made it less interesting. Now instead of it being random bits of interesting scripts and thoughts and design sketches and so on, it's just me monologuing as I do on IRC sometimes. With What Planet is This? I made sure to not use the first person pronoun except on very rare occasions: and I'd like to do a similar sort of thing for the titled posts in the planned Whits redesign.

Not entirely sure it'll be better, but I think it'll entice me to make more output.

The version problem came in when I was thinking about how to preserve all the histories of the files online. Eventually I decided that I'd just not edit the posts except to correct typos, but I realised that if I had considered this a requirement then Equid wouldn't have covered it. Moreover it's getting pretty slow to generate a new changeset delta, so at the very least I should probably look into Equid optimisation.

2008-02-07 15:59 UTC:

A friend was telling me yesterday about his current digital camera. "It's a Sony, but the way that it's shaped reminds me a lot of my old camera, a Kuhsel. Once I was in the Alps mountaineering, based at Zermatt, and I was way up in the mountains taking some pictures with it. As I was taking one nice shot on a very high ridge, I fumbled and saw it fall out of my hands and a couple of thousand feet down a slope to a bergschrund.

"There was no way I could recover it, so I went back to base with my group, and we waited for the other group to arrive back. When they didn't arrive back, we went out looking for them, and found them carrying an injured man. We dashed up to help him and then they told us what had happened: this chap had noticed something glinting in the snow on the glacier, so he went to retrieve it. But just as he got it, there was a huge rockfall and one of the rocks hit him on the head quite badly, and he got a concussion.

"Of course he'd found my camera, and it was almost fine! I got the film out of it and was able to develop it, and the only problem was that the lens had been slightly depressed into the case, which prevented the loading, but otherwise I'd have been able to use it. The chap who found it for me was still feeling pretty bad three days later."

N.B. It wasn't called a Kuhsel, before you go Googling, but I didn't make a note of the actual manufacturer that he said.

Sean B. Palmer, inamidst.com