Gallimaufry of Whits (2008-01)

These are quick notes taken by Sean B. Palmer on the Semantic Web, Python and Javascript programming, history and antiquarianism, linguistics and conlanging, typography, and other related matters. To receive these bits of dreck regularly, subscribe to the feed. To browse other months, check the contents.

2008-01-05 20:22 UTC:

Roman Pottery Reconstructed is what happens when I go into storytelling mode and throw in some history and science and so on. I was going to write more today, but you know—it's Old Christmas Eve (twelfth night), and the Top Gear special is gonna be on in a little while!

2008-01-08 10:15 UTC:

Joe Geldart mentioned 25 Best Free Quality Fonts today, so I've been on a little fonts splurge. The two that I downloaded are Day Roman and Cardo. Cardo is okay, and has a large Unicode repertoire for mediævalists etc. that might come in handy; but it's Day Roman which is exceptional. The beauty of it! Very impressive.

From its documentation: “The two-tiers font included in this archive, Day Roman, is a digitally redrawn version of what has come to be historically known as the 'Two Line Double Pica Roman', a typeface designed by 16th century French punchcutter François Guyot, and used in numerous books between 1535 and 1570, most notable of which are J. Steelsius's printing of The Bible (1541) and Frisius (1551) [etc.]”.

2008-01-08 11:46 UTC:

The licensing issues that I've been studying have come along quite a way. Rather than write my own license, I found that the Eiffel Forum License 2 meets many of my requirements—enough to use as a stopgap or license kludge.

Since this was mainly brought on by Noah Slater offering to make a Debian package from phenny, I asked him what he thought of it and he approved. He also offered to package up Trio, the RDF API I've been working on, for Debian too! I've been asking a fair few people about using the EFL 2, but I'd still like to hear more comments from people if anybody has them. It's a bit of an odd choice in some respects, but exceptionally interesting in others.

2008-01-08 12:17 UTC:

The first www-tag mail of the year is my Resource-Type Revisited which is a note about whether or not a Resource-Type header would be a good idea. Quite possibly I should write an Internet Draft for it... another Internet Draft, indeed, since I already published one for it back in 2002.

One of the neat things described in the email is a kind of 303 container mechanism, which is more fully elucidated further on in the thread; it comes from a discussion that I had on #swig several days ago. The idea is that you can't 303 to one thing one day and then a completely different thing the next: there has to be the same level of consistency as you'd expect between resources and their representations.

2008-01-10 12:04 UTC:

The BBC report today that blasphemy law in England and Wales may be abolished. I'd researched this, by coincidence, just a few months ago, and I thought it was interesting that it should now come up in the news. Perhaps I should get in a bit of covert civil disobedience before they strike the law off the statute books, I wondered, so I went researching some more and found this:

"Although no blasphemy case has been prosecuted in England and Wales since the passage of the Human Rights Act, and what follows is therefore necessarily speculative, it is our view that any prosecution for blasphemy today—even one which met all the criteria described in paragraphs 5-7 above—is likely to fail on grounds either of discrimination or denial of the right to freedom of expression."
― House of Lords - Religious Offences in England and Wales - First Report

So they speculate, if I understand them right, that it is impossible to be successfully prosecuted for publishing blasphemous matter in England and Wales; which implies that, if their speculation is correct, it is not criminal to publish blasphemous material in England and Wales.

Which makes all the news about it rather trumped up and moot.

2008-01-10 13:43 UTC:

Meek Beaver - An 18th Century Puritan Satire.

2008-01-13 16:03 UTC:

“Tom Lampire of Melksham would make as good Scriptures as the Bible.”
― William Bond, quoted in The World Turned Upside Down by Christopher Hill

That's the first time, however, that Tom Lampire has been mentioned on the web; I'd expected there to be at least a hundred mentions of him when I saw this snippet of information in Hill's book. There's only a few mentions of Lampire in books, even. I wonder if there's been a misspelling or something along the way?

2008-01-17 20:48 UTC:

Today, a new topic! New Model Organisation. There was also quite a bit of discussion about this because in fact I wrote it to inform a discussion which I'd already started.

There's quite a few ideas and notes that I didn't yet incorporate into either the article or the associated discussion, such as Jon Hanna's excellent little anecdote about people setting themselves up with grand titles in paganism just for the fake prestige, and how that's rather unsurprisingly frowned upon.

2008-01-18 20:28 UTC:

Today I got some salmiakki in the post, which is liquorice plus ammonium chloride. It tastes like smelling salts, pretty much. I've wanted to try some ever since I read about it in Sonja Elen Kisa's Speak Finnish page; salmiakki is basically a thing that Scandinavians are trying to keep a secret. Actually, as someone else on the web commented, it's more like Marmite: it's just so foul on first taste that it doesn't spread from culture to culture very easily.

On the other hand, personally I love it. Salmiakki is awesome! Many thanks to nsh for shipping it to me from .fi, and to Sonja Elen Kisa for the article that introduced me to it in the first place. It's rather a shame that the only time Sonja's ever spoken to me is to correct my rather desultory article on Toki Pona, the experimental language that she created. Perhaps I should write a more thoughtful commendatory article about salmiakki to make up for it...

2008-01-23 21:27 UTC:

So I suggested to patbam, a long time vi user, that he map Tab to ESC just like it was on the old ADM3A keyboard that Bill Joy actually wrote vi with. This was his response:

<patbam> oh i could never do that now
<patbam> my central nervous system would revolt
<patbam> i'd be like, eating some soup, and my hand would splash lentils in my face and go TAB IS ESCAPE, HUH BITCH?

2008-01-24 21:21 UTC:

This is the story of how I fixed my Firefox environment.

Yesterday I was chatting with Jesse Ruderman about contentEditable support in Firefox. I'd come to the understanding that it wouldn't be ready for Firefox 3, but in fact it had been added in one of the alphas. I tried out a test page in my Firefox 3.0b2 and indeed it's good!

Now I installed Firefox 3.0b2 a while ago and I've been having quite a few problems with it, especially in it being really slow to load and do things, and in the fact that it won't display multiple rows of tabs. I often have between about 100 and 200 tabs open, using it very contextually, so to have only a single row is a major bummer.

I was thinking yesterday evening about cobbling together a browser using WebKit or something, just to try out some ideas that I'd been working on about how browsing should be. But I realised that the main thing that I wanted was to somehow get rid of my reliance on tabs.

I found a document entitled Places, History Service which had the information that I was looking for, and quickly cobbled together a History Page. You'll need to download it locally, and then click okay for it to access your personal information; read the source first to check it's not doing anything bad, of course. What it should be doing is printing out a list of all your history for the past day.

With that written, I set about changing my general Firefox environment. I made a new profile because I suspected there was some database corruption or something caused by the fact that every day I tend to bookmark those 100 or 200 tabs that I create during the day so as to preserve state—Firefox's history mechanism just hasn't been up to scratch for me (I think I've written about the woes of the mork format that it previously used before; now it's migrated to sqlite in Firefox 3).

Then I looked at the keybindings list, and deleted all of the navigational elements from the toolbar. So all I have is the address bar, and the search bar. I've also nuked the bookmarks bar; and there are to be no bookmarks except for keyword bookmarks, and bookmarklets, neither of which is accessed in a direct sense of course.

I also bound my start page to Command+., and changed it from being a search bar and list of pages I frequently visit (a kind of bookmarks standin) to being a list of pages I frequently visit, and the Javascript history mechanism that I made. This means that I have to use the browser's built-in search bar to search quickly now, which is fine because that's easily accessible with Command+K.

The big thing though is my treatment of tabs. I now try to have only three tabs maximum open, and then I rely very heavily on my Javascript history (which I can access from my start page, Command+., and from there a link to a fuller history since I only include the most recent 30 entries on the start page). This has simplified browsing immensely! All I have are two inputs (address and search engine), and three tabs. If I want to find out the context of all the complex threads of browsing that I've been going through for hours, e.g. my tafting and so on, then I press Command+. It's pretty great so far, but of course it's open for evolution. Three tabs might be a bit too strict a limit, or on the other hand I wonder if I can get away with just two.

The upshot is that Firefox 3.0b2 is actually quite fun to use now, though it still has a few bugs that will hopefully be ironed out for the main release.

2008-01-24 21:38 UTC:

A few niggles still in the Firefox system:

I'm trying to use a keyword instead of my Web Archive bookmarklet, so that I can just type "a" in front of a URI to have it be looked up in the Web Archive. Unfortunately this doesn't work because Firefox %HH escapes some of the characters in the URI, and the Web Archive doesn't like that. So I've thought about making a CGI to redirect and take out the %HH escaping, though perhaps there's a better way of doing it.
It still doesn't let me handle, say, Notation3 as text/plain! It's annoying because it gives that option for known media types, but not novel ones. Why not just allow it for any type?
It'd be interesting to have a hard limit on the amount of tabs, say seven or something like that. There are probably times when having quite a few tabs open is really, really useful however. Certainly you need tabs open if you've got form state in them, etc.
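
The redirect idea in the first niggle above might be sketched like this in Python. The function name and the wildcard Web Archive prefix are my assumptions for illustration, and this only shows the unescaping step, not the CGI plumbing around it:

```python
from urllib.parse import unquote

# Assumed lookup prefix for the Web Archive; adjust if the real pattern differs.
ARCHIVE_PREFIX = "http://web.archive.org/web/*/"

def archive_location(escaped_uri):
    """Undo one level of %HH escaping and build the address to redirect to."""
    return ARCHIVE_PREFIX + unquote(escaped_uri)

print(archive_location("http%3A%2F%2Fexample.org%2Fnotes%3Fid%3D1"))
```

A redirecting script would then send the result back in a Location header, so the keyword bookmark would point at the script rather than at the Web Archive directly.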

I'm also wanting to save my daily histories, which is kinda tricky because they're DOM modified, and doing Save As HTML Only only gives the original page rather than any Javascripted modifications. Doing Save As HTML Complete does save the modifications, but then it mangles the crap out of it and saves a directory with it, which is not what I want it to do. There's a bookmarklet that you can use to view the current DOM tree as HTML, but it does it by putting the text into a preformatted section, so when you save it it's actually all entity encoded. You have to copy the text out, put it into a text editor, and then save it that way. So that's something that should be made easier.

2008-01-26 21:15 UTC:

I wrote a (very) little script today called edit.js that tries to replicate some of the functionality of Mimulus, my Firefox 2 XHTML editor, using contentEditable. Sadly, it seems that contentEditable has a rather large bug in 3.0b2: namely, that it just turns itself off after a random amount of input. There's a test page that you can try out if you have a recent Firefox.

It's also impossible, as far as I can tell, to get it to insert a <strong> element; you can only do <b>, and even then you have to turn useCSS off, which you do by setting it to true. In fact, it's even worse than that: you have to pass 'useCSS' as the first argument to execCommand, false as a dummy second argument, and then true as the counter-intuitive third argument. Something tells me that replicating the functionality even of the rather simple Mimulus might not be as straightforward with Firefox's contentEditable implementation as it ought to be.

Also bear in mind the point that I abandoned Mimulus because Firefox 2 had an annoying caret mode bug where the caret just turned itself off randomly. I doubt that the two superficially similar problems are related, but perhaps.

2008-01-31 10:27 UTC:

I've been thinking for quite a while now about making an online wiki dictionary. Yeah, I know about Wiktionary; but Wiktionary is terrible. With Wikipedia, the design is fine for an encyclopaedia: most of the pages have a lot of prose content. With a dictionary, the content is not prose; rather, it's basically nuggets of typed data. Sense one, sense two, etymology, quotation, part of speech, and so on. Wiktionary displays this data as far apart as it possibly can, it seems. Look up a fairly common word like "duck" in a printed dictionary on your shelf, and then look up the same word in Wiktionary. Which is the easier to derive information from? Compare it to another big online dictionary like the OED or MW. Which is the easier to derive information from?

The problem with the OED, of course, is that it's not free. The problem with all the other dictionaries is that they're not very comprehensive, and they don't allow editing (which isn't really a problem with the OED since it's so immensely comprehensive anyway).

So I've been thinking about making my own dictionary site, one which has a very simple design. It wouldn't be too much of a bother for me to code, perhaps, and I wouldn't have to maintain it because hopefully a community would build up on it. But I need some initial impetus to start it... do people think that it's a good idea?

2008-01-31 10:37 UTC:

A slightly better JSON parser than the previous one I published:

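import re

# A JSON string literal, including backslash escapes
r_string = re.compile(r'("(\\.|[^"\\])*")')
# Everything that may remain once string literals are removed: punctuation,
# number characters, whitespace, and the letters of true, false and null
r_json = re.compile(r'^[,:{}\[\]0-9.\-+Eaeflnr-u \n\r\t]+$')
# Names visible to eval: no builtins, just the three JSON constants
env = {'__builtins__': None, 'null': None, 'true': True, 'false': False}

def json(text):
   """Evaluate JSON text safely (we hope)."""
   # Check the text, with its strings stripped out, against the whitelist
   if r_json.match(r_string.sub('', text)):
      # Mark every string as unicode, then evaluate as a Python expression
      text = r_string.sub(lambda m: 'u' + m.group(1), text)
      return eval(text.strip(' \t\r\n'), env, {})
   raise ValueError('Input must be serialised JSON.')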

Again, no guarantees, but I've been using it in my code.
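
For comparison, here's a sketch of the same job done with the json module, a later addition to Python's standard library. This is a swapped-in technique rather than the parser above, and the sample document is made up for illustration:

```python
import json

# A made-up sample document
sample = '{"name": "phenny", "tags": ["irc", "bot"], "ok": true, "n": null}'
data = json.loads(sample)
print(data["tags"], data["ok"], data["n"])

# Anything that is not JSON is rejected with a ValueError
try:
    json.loads('__import__("os")')
except ValueError as err:
    print("rejected:", err)
```

Both approaches raise ValueError on bad input, so swapping one for the other shouldn't disturb the calling code much.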

2008-01-31 10:41 UTC:

Yesterday, I figured that OWL is the SGML of the Semantic Web. Consider what HTML 5 says about SGML:

Some earlier versions of HTML (in particular from HTML2 to HTML4) were based on SGML and used SGML parsing rules. However, few (if any) web browsers ever implemented true SGML parsing for HTML documents; the only user agents to strictly handle HTML as an SGML application have historically been validators. The resulting confusion — with validators claiming documents to have one representation while widely deployed Web browsers interoperably implemented a different representation — has wasted decades of productivity.

Who uses OWL so that they can use DL tools with it?

Arnia challenged my analogy saying that he's "not entirely sure that's fair. SGML's problem was that it was too complicated to use. OWL-DL's problem is almost the opposite, it is too simple to use." We debated decidability a bit as a result, but perhaps we strayed from the point: the analogy is not that OWL is too complicated or otherwise, it's that OWL is not used qua a description logic. People just use it to explain how their terms work, really; they don't then use DL tools to validate or entail or anything.

As a result, we find that FOAF, for example, uses OWL incompatibly: it expresses that some properties are DatatypeProperties which means that they must have a particular datatype (actually it doesn't seem to declare any DatatypeProperty that has a range other than rdfs:Literal, but I thought that it used to do so, so perhaps this has been fixed). What I think the problem is there is that OWL isn't flexible enough, and yet at the same time, from a DL point of view, OWL Full is too flexible. It's so easy to tip OWL schemata into being OWL Full it's unreal. For example, my beautiful typed list pattern is enough to tip an OWL schema into OWL Full. It's just crazy.

So I'm thinking about boycotting OWL, though this isn't the central nucleus of why I'm thinking about this kind of thing. The main problem is that I can't figure out how to design Arcs so as to make browsing the Semantic Web even viable, let alone sensible.

2008-01-31 11:21 UTC:

Don't let the triples, the mechanics, fool you: the Semantic Web is basically a distributed version of ENQUIRE.

With all the ballyhoo about how useful the Semantic Web will be, I actually admit to not being able to think of many use cases. The canonical one is that I should be able to look up the latest train that I can catch to return home in time for The Simpsons; in other words, a simple merge of Train Schedule data and Television Schedule data.
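
Stripped of the triples, that canonical merge is a very small computation. A sketch in Python, where the timetable, the listings and the function name are all invented for illustration:

```python
from datetime import time

# Hypothetical data: in practice each list would come from a different source.
trains = [
    {"departs": time(16, 5), "arrives": time(16, 50)},
    {"departs": time(17, 5), "arrives": time(17, 50)},
    {"departs": time(18, 5), "arrives": time(18, 50)},
]
television = [{"title": "The Simpsons", "starts": time(18, 0)}]

def latest_train(trains, television, title):
    """Return the last train that arrives before the named programme starts."""
    starts = next(p["starts"] for p in television if p["title"] == title)
    in_time = [t for t in trains if t["arrives"] <= starts]
    return max(in_time, key=lambda t: t["departs"]) if in_time else None

print(latest_train(trains, television, "The Simpsons"))
```

The hard part, of course, is not this function but getting the two datasets published in forms that can be merged at all.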

Meteorology is a nice one to factor in too. I want to be able to look at dates in my diary (actually I don't have one; I don't schedule many things, but most people do), and have an overlay of the expected weather. Or to be able to plan more easily based on the expected weather.

Or another good one that I've heard is when you go to a conference page and you want to note down all the details of the conference and don't want to have to transcribe all the fernickety bits of detail into your organiser by hand. But again, I don't have an organiser; I haven't been to a conference in years. So for me, the Semantic Web in general seems to solve problems that are just out of my milieu. It doesn't help; it's irrelevant.

Actually I can think of one way in which the Semantic Web qua the Semantic Web (as opposed to being a thing that's helped me learn lots of design patterns and meet loads of friends and so on) has helped: my old FOAFQ service let me easily look up people's homepage and email addresses from just their nicknames, using FOAF, which is a distributed social networking service. But social networking, apart from that one instance, is very unremarkable. And actually, that instance of use isn't all that remarkable itself.
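
The kind of lookup described there can be sketched over plain triples in Python. This is not the FOAFQ code: the data and the function are hypothetical, and only the FOAF property names (nick, homepage, mbox) are real:

```python
FOAF = "http://xmlns.com/foaf/0.1/"

# Hypothetical triples, as they might look after merging several FOAF files.
triples = [
    ("_:a", FOAF + "nick", "examplenick"),
    ("_:a", FOAF + "homepage", "http://example.org/"),
    ("_:a", FOAF + "mbox", "mailto:someone@example.org"),
    ("_:b", FOAF + "nick", "othernick"),
    ("_:b", FOAF + "homepage", "http://example.net/"),
]

def lookup(triples, nick):
    """Collect homepage and mbox values for every subject with this nick."""
    subjects = {s for s, p, o in triples if p == FOAF + "nick" and o == nick}
    wanted = {FOAF + "homepage": "homepage", FOAF + "mbox": "mbox"}
    found = {}
    for s, p, o in triples:
        if s in subjects and p in wanted:
            found.setdefault(wanted[p], []).append(o)
    return found

print(lookup(triples, "examplenick"))
```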

So, is there anything that would be useful? Especially, is there anything that would be useful which makes use of some of the interesting properties of the Semantic Web, such as the ability to look at the schema for any upgrades to a language?

I've not got any particular conclusions on this yet, just a bunch of rough ideas and directions. I observe that ENQUIRE was created as a kind of personal wiki, and so what we're talking about is oddly similar to Lion Kimbro's OneBigSoup project: a huge wiki merge. But I'm not going to publish information about my dentist or whatever online... that's the whole point about a personal wiki; it's personal, so you can merge stuff in, but not export out. Norm Walsh seems to be using N3 and the Semantic Web quite strongly for this sort of thing, but again, such things don't concern me all that much: I manage things so efficiently that I don't really have a need for a secretary.

At least, I don't need a secretary when it comes to societal things; but on intellectual matters, I really desperately need a secretary. And it's there that I wonder if the Semantic Web could help. When I model the genealogy of Shakespeare, or even my own family tree, why not try to model it fairly formally in Notation3? N3 is a crazy metalanguage, and the more you invest into it, the more you get back. What about trying to link design patterns together, and taftings and strands? I'm not exactly sure how one would do this, nor even why one should do this since, as William Loughborough says, "the clutter is inherent to the organism"; but you know... analysis and synthesis, and all that.

This might sound a bit pie-in-the-sky, but that's the Semantic Web in a nutshell; it's a very high level design. I'm wondering whether I can actually use it, to some actual aim other than the Semantic Web itself; and I'm starting to think that if I can, I'll be using a lot of the high level idea of it, and not much of the actual detail such as OWL.

2008-01-31 12:05 UTC:

I had a bit of discussion with Leigh Dodds on #swig about the "OWL is the SGML of the Semantic Web" and "My Semantic Web" posts above. He's of a similar mind: he uses OWL mainly as a description, and he's looking at datasets to figure out how to apply the Semantic Web to them to eke more value from them.

I also asked DanC what ratio of HTML 5 vs. XHTML 5 documents (or whatever the correct terminology is) he's writing. He's writing 100% XHTML; whereas I'm writing 100% HTML.

2008-01-31 12:13 UTC:

I've written about Content-Rich Design before, but I'm still not very good at doing it. I also wrote up the Style is Content design pattern quite a few months ago, to try to justify my emphasis on form; but I think that my emphasis is actually an overemphasis. It's really hard to just concentrate on the content!

One idea that I came up with yesterday is to be able to select, via HTTP, between a range of structural designs. Structure is the murky area between content and style: it's the idea of content as style, rather than style as content. Perhaps if I concentrated on producing lots of different writeups and arrangements and configurations, that'd make me concentrate less on borders and colours and frills. On the web, the main points of orientation are structure, a consistent style themed to areas rather than individual articles, and pictures to help orient the user; including favicons as a sitewide orientation mechanism.

It's still kinda tricky though, since content-rich design just means doing a lot of good research and writing it up clearly. Ripe for procrastination. So perhaps having a habit of just doing as much as I want, but actually doing some, would be helpful. You may have noticed that the links for Arcs and FOAFQ that I've used in earlier posts today are just to documents that say "@@" at the moment. But at least the pages and the links exist to be potentially filled in!

2008-01-31 13:18 UTC:

DanC wrote a thing called itunekb.py that exports an iTunes database as a Python datastructure. He asked for a bit of help turning the milliseconds field of the song's duration into a minutes and seconds tuple. Here's what I came up with:

>>> from math import floor
>>> from decimal import Decimal
>>> def mss(ms): 
...    s = Decimal(ms) / 1000
...    return Decimal(str(floor(s / 60))), s % 60
... 
>>> mss(135902)
(Decimal("2.0"), Decimal("15.902"))
>>>

It was necessary to use decimal for this, really, because floats are just so infuriating with their imprecision. But the decimal module is, as you can see, a little unwieldy to use in Python. I like languages where they just call such things "Numbers", and build them in.
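
For what it's worth, here's a sketch of another way round the float problem: do the division on the integer milliseconds with divmod, and only reach for Decimal at the end. It assumes the field is a whole number of milliseconds, and it returns the minutes as a plain int rather than a Decimal, so it isn't a drop-in replacement for the function above:

```python
from decimal import Decimal

def mss(ms):
    """Split a duration in milliseconds into (minutes, seconds)."""
    # Integer arithmetic first, so no float ever gets involved
    minutes, rest = divmod(int(ms), 60000)
    # Only the leftover milliseconds need to become a decimal fraction
    return minutes, Decimal(rest) / 1000

print(mss(135902))
```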