Gallimaufry of Whits
Being for the Month of 2008-01
These are quick notes taken by Sean B. Palmer on the Semantic Web, Python
and Javascript programming, history and antiquarianism, linguistics and
conlanging, typography, and other related matters. To receive these bits of
dreck regularly, subscribe to the feed. To browse other
months, check the contents.
Roman Pottery
Reconstructed is what happens when I go into storytelling mode and throw in
some history and science and so on. I was going to write more today, but you
know—it's Old Christmas Eve (twelfth night), and the Top Gear special is
gonna be on in a little while!
Joe Geldart mentioned 25
Best Free Quality Fonts today, so I've been on a little fonts splurge. The
two that I downloaded are Day Roman and Cardo. Cardo is okay, and has a
large Unicode repertoire for mediævalists etc. that might come in handy; but
it's Day Roman which is exceptional. The beauty of it! Very impressive.
From its documentation: “The two-tiers font included in this archive, Day
Roman, is a digitally redrawn version of what has come to be historically known
as the 'Two Line Double Pica Roman', a typeface designed by 16th century French
punchcutter François Guyot, and used in numerous books between 1535 and 1570,
most notable of which are J. Steelsius's printing of The Bible (1541) and
Frisius (1551) [etc.]”.
The licensing issues
that I've been studying have come along quite a way. Rather than write my own
license, I found that the Eiffel
Forum License 2 meets many of my requirements—enough to use as a stopgap
or license kludge.
Since this was mainly brought on by Noah Slater offering to make a Debian
package from phenny, I asked
him what he thought of it and he approved.
He also offered to package up Trio, the RDF API I've been working on, for
Debian too! I've been asking a fair few people about using the EFL 2, but I'd
still like to hear more comments from people if anybody has them. It's a bit of
an odd choice in some respects, but exceptionally interesting in others.
The first www-tag mail of the year is my Resource-Type
Revisited which is a note about whether or not a Resource-Type header would
be a good idea. Quite possibly I should write an Internet Draft for it...
another Internet Draft, indeed, since I already published one for it back in
2002.
One of the neat things described in the email is a kind of 303 container
mechanism, which is more fully
elucidated further on in the thread; it comes from a discussion
that I had on #swig several days ago. The idea is that you can't 303 to one
thing one day and then a completely different thing the next: there
has to be the same level of consistency as you'd expect between resources and
their representations.
The BBC report today that blasphemy law in
England and Wales may be abolished. I'd researched this, by
coincidence, just a few months ago, and I thought it was interesting that it
should now come up in the news. Perhaps I should get in a bit of covert civil
disobedience before they strike the law off of the statute books, I wondered,
so I went researching some more and found this:
"Although no blasphemy case has been prosecuted in England and Wales since the
passage of the Human Rights Act, and what follows is therefore necessarily
speculative, it is our view that any prosecution for blasphemy
today—even one which met all the criteria described in paragraphs 5-7
above—is likely to fail on grounds either of discrimination or denial of
the right to freedom of expression."
― House of Lords - Religious Offences in England and Wales - First
Report
So they speculate, if I understand them right, that it is impossible to be
successfully prosecuted for publishing blasphemous matter in England and Wales;
which implies that, if their speculation is correct, it is not criminal to
publish blasphemous material in England and Wales.
Which makes all the news about it rather trumped up and moot.
“Tom Lampire of Melksham would make as good Scriptures as the
Bible.”
― William Bond, quoted in The World Turned Upside Down by Christopher
Hill
That's the first time, however, that Tom Lampire has been mentioned on the
web; I'd expected there to be at least a hundred mentions of him when I saw
this snippet of information in Hill's book. There's only a few mentions of
Lampire in books, even. I wonder if there's been a misspelling or something
along the way?
Today, a new topic! New
Model Organisation. There was also quite a bit of discussion about this
because in fact I wrote it to inform a discussion which I'd already
started.
There's quite a few ideas and notes that I didn't yet incorporate into
either the article or the associated discussion, such as Jon Hanna's excellent
little anecdote about people setting themselves up with grand titles in
paganism just for the fake prestige, and how that's rather unsurprisingly
frowned upon.
Today I got some salmiakki in the post,
which is liquorice plus ammonium chloride. It tastes like smelling salts,
pretty much. I've wanted to try some ever since I read about it in Sonja Elen
Kisa's Speak Finnish
page; salmiakki is basically a thing that Scandinavians are trying to keep a
secret. Actually, as someone else on the web commented, it's more like Marmite:
it's just so foul on first taste that it doesn't spread from culture to culture
very easily.
On the other hand, personally I love it. Salmiakki is awesome! Many thanks
to nsh for shipping it to me from .fi, and to Sonja Elen Kisa for the article
that introduced me to it in the first place. It's rather a shame that the only
time Sonja's ever spoken to me is to correct my rather desultory article on
Toki Pona, the experimental language that she created. Perhaps I should write a
more thoughtful commendatory article about salmiakki to make up for it...
So I suggested to patbam, a long time vi user, that he map Tab to ESC just
like it was on the old ADM3A keyboard that Bill Joy actually wrote vi with.
This was his response:
<patbam> oh i could never do that now
<patbam> my central nervous system would revolt
<patbam> i'd be like, eating some soup, and my hand would splash lentils in my
face and go TAB IS ESCAPE, HUH BITCH?
This is the story of how I fixed my Firefox environment.
Yesterday I was chatting with Jesse
Ruderman about contentEditable
support in Firefox. I'd come to the understanding that it wouldn't be ready for
Firefox 3, but in fact it had been added in one of the alphas. I tried out a test
page in my Firefox 3.0b2 and indeed it's good!
Now I installed Firefox 3.0b2 a while ago and I've been having quite a few
problems with it, especially in it being really slow to load and do things, and
in the fact that it won't display multiple rows of tabs. I often have between
about 100 and 200 tabs open, using it very contextually, so to have only a
single row is a major bummer.
I was thinking yesterday evening about cobbling together a browser using
WebKit or something, just to try out some ideas that I'd been working on about
how browsing should be. But I realised that the main thing that I wanted was to
somehow get rid of my reliance on tabs.
I found a document entitled Places,
History Service which had the information that I was looking for, and
quickly cobbled together a History Page. You'll need to
download it locally, and then click okay for it to access your personal
information; read the source first to check it's not doing anything bad, of
course. What it should be doing is printing out a list of all your history for
the past day.
With that written, I set about changing my general Firefox environment. I
made a new profile because I suspected there was some database corruption or
something caused by the fact that every day I tend to bookmark those 100 or 200
tabs that I create during the day so as to preserve state—Firefox's history
mechanism just hasn't been up to scratch for me (I think I've written about the
woes of the mork format that it previously used before; now it's migrated to
sqlite in Firefox 3).
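Since Firefox 3 keeps its history in sqlite, a list like the one the History Page prints could in principle also be produced outside the browser. The following is only a rough Python sketch, not what the History Page itself does (that uses the history service from Javascript). It assumes that the Places tables are called moz_places and moz_historyvisits, that visit_date is stored in microseconds since the Unix epoch, and that it is run against a copy of the profile's places.sqlite rather than the live file; the function name recent_history is made up for illustration.

```python
import sqlite3
import time

def recent_history(conn, hours=24, limit=30, now=None):
    """Return (url, title, visit_time) rows for the last `hours` hours.

    Assumed schema: moz_places(id, url, title) and
    moz_historyvisits(place_id, visit_date), with visit_date in
    microseconds since the Unix epoch.
    """
    if now is None:
        now = time.time()
    # Convert the cutoff from seconds to microseconds to match visit_date.
    cutoff = int((now - hours * 3600) * 1000000)
    query = """
        SELECT p.url, p.title, v.visit_date
        FROM moz_historyvisits AS v
        JOIN moz_places AS p ON p.id = v.place_id
        WHERE v.visit_date >= ?
        ORDER BY v.visit_date DESC
        LIMIT ?
    """
    rows = conn.execute(query, (cutoff, limit)).fetchall()
    # Convert microseconds back to seconds for display.
    return [(url, title, visited / 1000000.0) for url, title, visited in rows]
```

To try it, open a copy of the database with sqlite3.connect() and pass the connection in; the default limit of 30 mirrors the number of recent entries kept on the start page.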
Then I looked at the keybindings list, and deleted all of the navigational
elements from the toolbar. So all I have is the address bar, and the search
bar. I've also nuked the bookmarks bar; and there are to be no bookmarks except
for keyword bookmarks, and bookmarklets, both of which aren't accessed in a
direct sense of course.
I also bound my start page to Command+., and changed it from being a search
bar and list of pages I frequently visit (a kind of bookmarks standin) to being
a list of pages I frequently visit, and the Javascript history mechanism that I
made. This means that I have to use the browser's built-in search bar to search
quickly now, which is fine because that's easily accessible with Command+K.
The big thing though is my treatment of tabs. I now try to have only three
tabs maximum open, and then I rely very heavily on my Javascript history (which
I can access from my start page, Command+., and from there a link to a fuller
history since I only include the most recent 30 entries on the start page).
This has simplified browsing immensely! All I have are two inputs (address and
search engine), and three tabs. If I want to find out the context of all the
complex threads of browsing that I've been going through for hours, e.g. my
tafting and so on, then I press Command+. It's pretty great so far, but of
course it's open for evolution. Three tabs might be a bit too strict a limit,
or on the other hand I wonder if I can get away with just two.
The upshot is that Firefox 3.0b2 is actually quite fun to use now, though it
still has a few bugs that will hopefully be ironed out for the main
release.
A few niggles still in the Firefox system:
- I'm trying to use a keyword instead of my Web Archive bookmarklet, so that
I can just type "a" in front of a URI to have it be looked up in the Web
Archive. Unfortunately this doesn't work because Firefox %HH escapes some of
the characters in the URI, and the Web Archive doesn't like that. So I've
thought about making a CGI to redirect and take out the %HH escaping, though
perhaps there's a better way of doing it.
- It still doesn't let me handle, say, Notation3 as text/plain! It's annoying
because it gives that option for known media types, but not novel ones. Why not
just allow it for any type?
- It'd be interesting to have a hard limit on the number of tabs, say seven
or something like that. There are probably times when having quite a few tabs
open is really, really useful however. Certainly you need tabs open if you've
got form state in them, etc.
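The redirect CGI mentioned in the first niggle might look roughly like this in Python. It's a sketch under assumptions: the names archive_url and redirect_response are made up, the http://web.archive.org/web/*/ prefix is an assumed lookup form rather than something taken from these notes, and a real CGI would still have to read the escaped URI from its query string.

```python
from urllib.parse import unquote

# Assumed lookup prefix for the Web Archive; not taken from the notes above.
ARCHIVE_PREFIX = 'http://web.archive.org/web/*/'

def archive_url(escaped_uri):
    """Undo %HH escaping and build a Web Archive lookup URI."""
    uri = unquote(escaped_uri)
    # Refuse control characters so that nothing can be smuggled into a header.
    if any(ord(c) < 0x20 for c in uri):
        raise ValueError('Control characters are not allowed in the URI.')
    return ARCHIVE_PREFIX + uri

def redirect_response(escaped_uri):
    """Return the CGI response text for a 302 redirect to the archive."""
    return 'Status: 302 Found\r\nLocation: %s\r\n\r\n' % archive_url(escaped_uri)
```

The keyword bookmark would then point at the CGI instead of at the Web Archive directly, with %s standing in for the typed URI.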
I'm also wanting to save my daily histories, which is kinda tricky because
they're DOM modified, and doing Save As HTML Only only gives the original page
rather than any Javascripted modifications. Doing Save As HTML Complete does
save the modifications, but then it mangles the crap out of it and saves a
directory with it, which is not what I want it to do. There's a bookmarklet
that you can use to view the current DOM tree as HTML, but it does it by
putting the text into a preformatted section, so when you save it it's actually
all entity encoded. You have to copy the text out, put it into a text editor,
and then save it that way. So that's something that should be made easier.
I wrote a (very) little script today called edit.js that tries to
replicate some of the functionality of Mimulus, my Firefox 2 XHTML editor,
using contentEditable. Sadly, it seems that contentEditable has a rather large
bug in 3.0b2: namely, that it just turns itself off after a random amount of
input. There's a test
page that you can try out if you have a recent Firefox.
It's also impossible, as far as I can tell, to get it to insert a
<strong> element; you can only do <b>, and even then you have to turn
useCSS off, which you do by setting it to true. In fact, it's even worse than
that: you have to pass 'useCSS' as the first argument to execCommand, false as
a dummy second argument, and then true as the counter-intuitive third argument.
Something tells me that replicating the functionality even of the rather simple
Mimulus might not be as straightforward with Firefox's contentEditable
implementation as it ought to be.
Also bear in mind the point that I abandoned Mimulus because Firefox 2 had
an annoying caret mode bug where the caret just turned itself off randomly. I
doubt that the two superficially similar problems are related, but perhaps.
I've been thinking for quite a while now about making an online wiki
dictionary. Yeah, I know about Wiktionary; but Wiktionary is terrible.
With Wikipedia, the design is fine for an encyclopaedia: most of the pages have
a lot of prose content. With a dictionary, the content is not prose; rather,
it's basically nuggets of typed data. Sense one, sense two, etymology,
quotation, part of speech, and so on. Wiktionary displays this data as far
apart as it possibly can, it seems. Look up a fairly common word like "duck" in
a printed dictionary on your shelf, and then look up the same word in
Wiktionary. Which is the easier to derive information from? Compare it to
another big online dictionary like the OED or MW. Which is the easier to derive
information from?
The problem with the OED, of course, is that it's not free. The problem with
all the other dictionaries is that they're not very comprehensive, and they
don't allow editing (which isn't really a problem with the OED since it's so
immensely comprehensive anyway).
So I've been thinking about making my own dictionary site, one which has a
very simple design. It wouldn't be too much of a bother for me to code,
perhaps, and I wouldn't have to maintain it because hopefully a community would
build up on it. But I need some initial impetus to start it... do people think
that it's a good idea?
A slightly better JSON parser than the previous one I
published:
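import re

# Matches a double-quoted JSON string, including backslash escapes.
r_string = re.compile(r'("(\\.|[^"\\])*")')
# Characters allowed once strings are removed: punctuation, digits,
# whitespace, and the letters of true, false, null, and exponents.
r_json = re.compile(r'^[,:{}\[\]0-9.\-+Eaeflnr-u \n\r\t]+$')
# Evaluation environment: no builtins, only the three JSON constants.
env = {'__builtins__': None, 'null': None, 'true': True, 'false': False}

def json(text):
    """Evaluate JSON text safely (we hope)."""
    if r_json.match(r_string.sub('', text)):
        # Prefix each string with u so that it evaluates as unicode.
        text = r_string.sub(lambda m: 'u' + m.group(1), text)
        return eval(text.strip(' \t\r\n'), env, {})
    raise ValueError('Input must be serialised JSON.')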
Again, no guarantees, but I've been using it in my code.
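A quick usage sketch for the parser, repeating the function so that the block runs on its own. The sample inputs are made up for illustration, and it assumes a Python that accepts the u prefix on strings (Python 2, or 3.3 and later). It shows one input that passes the whitelist and one that is refused before eval is ever reached; it is not evidence that the approach is safe in general.

```python
import re

r_string = re.compile(r'("(\\.|[^"\\])*")')
r_json = re.compile(r'^[,:{}\[\]0-9.\-+Eaeflnr-u \n\r\t]+$')
env = {'__builtins__': None, 'null': None, 'true': True, 'false': False}

def json(text):
    """Evaluate JSON text safely (we hope)."""
    if r_json.match(r_string.sub('', text)):
        text = r_string.sub(lambda m: 'u' + m.group(1), text)
        return eval(text.strip(' \t\r\n'), env, {})
    raise ValueError('Input must be serialised JSON.')

# Made-up sample input: an object with a string, a list, and a null.
data = json('{"name": "phenny", "tags": ["irc", "bot"], "owner": null}')

# Once the strings are stripped out, this input still contains characters
# outside the whitelist (underscores, parentheses), so it is refused.
try:
    json('__import__("os").system("ls")')
    rejected = False
except ValueError:
    rejected = True
```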
Yesterday, I figured that OWL is the
SGML of the Semantic Web. Consider what HTML 5 says about
SGML:
Some earlier versions of HTML (in particular from HTML2 to HTML4) were based
on SGML and used SGML parsing rules. However, few (if any) web browsers ever
implemented true SGML parsing for HTML documents; the only user agents to
strictly handle HTML as an SGML application have historically been validators.
The resulting confusion — with validators claiming documents to have one
representation while widely deployed Web browsers interoperably implemented a
different representation — has wasted decades of productivity.
Who uses OWL so that they can use DL tools with it?
Arnia challenged my analogy saying that he's "not entirely sure that's fair.
SGML's problem was that it was too complicated to use. OWL-DL's problem is
almost the opposite, it is too simple to use." We debated decidability a bit as
a result, but perhaps we strayed from the point: the analogy is not that OWL is
too complicated or otherwise, it's that OWL is not used qua a
description logic. People just use it to explain how their terms work, really;
they don't then use DL tools to validate or entail or anything.
As a result, we find that FOAF, for example, uses OWL incompatibly: it
expresses that some properties are DatatypeProperties which means that they
must have a particular datatype (actually it doesn't seem to declare
any DatatypeProperty that has a range other than rdfs:Literal, but I thought
that it used to do so, so perhaps this has been fixed). I think the
problem there is that OWL isn't
flexible enough, and yet at the same time, from a DL point of view, OWL
Full is too flexible. It's so easy to tip OWL schemata into being OWL
Full it's unreal. For example, my beautiful typed list pattern makes
an OWL schema to be OWL Full. It's just crazy.
So I'm thinking about boycotting OWL, though this isn't the central nucleus
of why I'm thinking about this kind of thing. The main problem is that I can't
figure out how to design Arcs
so as to make browsing the Semantic Web even viable, let alone sensible.
Don't let the triples, the mechanics, fool you:
the Semantic Web is basically a distributed version of ENQUIRE.
With all the ballyhoo about how useful the Semantic Web will be, I actually
admit to not being able to think of many use cases. The canonical one is that I
should be able to look up the latest train that I can catch to return home in
time for The Simpsons; in other words, a simple merge of Train Schedule data
and Television Schedule data.
Meteorology is a nice one to factor in too. I want to be able to look at
dates in my diary (actually I don't have one; I don't schedule many things, but
most people do), and have an overlay of the expected weather. Or to be able to
plan more easily based on the expected weather.
Or another good one that I've heard is when you go to a conference page and
you want to note down all the details of the conference and don't want to have
to transcribe all the fernickety bits of detail into your organiser by hand.
But again, I don't have an organiser; I haven't been to a conference in years.
So for me, the Semantic Web in general seems to solve problems that
are just out of my milieu. It doesn't help; it's irrelevant.
Actually I can think of one way in which the Semantic Web qua the
Semantic Web (as opposed to being a thing that's helped me learn lots of design
patterns and meet loads of friends and so on) has helped: my old FOAFQ service let me easily look
up people's homepages and email addresses from just their nicknames, using FOAF,
which is a distributed social networking service. But social networking, apart
from that one instance, is very unremarkable. And actually, that instance of
use isn't all that remarkable itself.
So, is there anything that would be useful? Especially, is there anything
that would be useful which makes use of some of the interesting properties of
the Semantic Web, such as the ability to look at the schema for any upgrades to
a language?
I've not got any particular conclusions on this yet, just a bunch of rough
ideas and directions. I observe that ENQUIRE was created as a kind of personal
wiki, and so what we're talking about is oddly similar to Lion Kimbro's
OneBigSoup project: a huge wiki merge. But I'm not going to publish information
about my dentist or whatever online... that's the whole point about a personal
wiki; it's personal, so you can merge stuff in, but not export out. Norm Walsh
seems to be using N3 and the Semantic Web quite strongly for this sort of
thing, but again, such things don't concern me all that much: I manage things
so efficiently that I don't really have a need for a secretary.
At least, I don't need a secretary when it comes to societal things; but on
intellectual matters, I really desperately need a secretary. And it's there
that I wonder if the Semantic Web could help. When I model the genealogy of
Shakespeare, or even my own family tree, why not try to model it fairly
formally in Notation3? N3 is a crazy metalanguage, and the more you invest into
it, the more you get back. What about trying to link design patterns together,
and taftings and strands? I'm not exactly sure how one would do this, nor even
why one should do this since, as William Loughborough says, "the clutter is
inherent to the organism"; but you know... analysis and
synthesis, and all that.
This might sound a bit pie-in-the-sky, but that's the Semantic Web in a
nutshell; it's a very high level design. I'm wondering whether I can actually
use it, to some actual aim other than the Semantic Web itself; and I'm starting
to think that if I can, I'll be using a lot of the high level idea of it, and
not much of the actual detail such as OWL.
I had a bit of discussion
with Leigh Dodds on #swig about the "OWL is the SGML of the Semantic Web" and
"My Semantic Web" posts above. He's of a similar mind: he uses OWL mainly as a
description, and he's looking at datasets to figure out how to apply the
Semantic Web to them to eke more value from them.
I also asked DanC what ratio of HTML 5 to XHTML 5 documents (or whatever
the correct terminology is) he's writing. He's writing 100% XHTML; whereas
I'm writing 100% HTML.
I've written about Content-Rich
Design before, but I'm still not very good at doing it. I also
wrote up the Style is
Content design pattern quite a few months ago, to try to justify my
emphasis on form; but I think that my emphasis is actually an overemphasis.
It's really hard to just concentrate on the content!
One idea that I
came up with yesterday is to be able to select, via HTTP, between a range of
structural designs. Structure is the murky area between content and style: it's
the idea of content as style, rather than style as content. Perhaps if I
concentrated on producing lots of different writeups and arrangements and
configurations, that'd make me concentrate less on borders and colours and
frills. On the web, the main points of orientation are structure, a
consistent style themed to areas rather than individual articles, and
pictures to help orient the user; including favicons as a sitewide orientation
mechanism.
It's still kinda tricky though, since content-rich design just means doing a
lot of good research and writing it up clearly. Ripe for procrastination. So
perhaps having a habit of just doing as much as I want, but actually doing
some, would be helpful. You may have noticed that the links for Arcs and FOAFQ that I've used in earlier
posts today are just to documents that say "@@" at the moment. But at least the
pages and the links exist to be potentially filled in!
DanC wrote a thing called itunekb.py
that exports an iTunes database as a Python datastructure. He asked
for a bit of help turning the milliseconds field of the song's duration into a
minutes and seconds tuple. Here's what I came up with:
>>> from math import floor
>>> from decimal import Decimal
>>> def mss(ms):
...     s = Decimal(ms) / 1000
...     return Decimal(str(floor(s / 60))), s % 60
...
>>> mss(135902)
(Decimal("2.0"), Decimal("15.902"))
>>>
It was necessary to use decimal for this, really, because floats are just so
infuriating with their imprecision. But the decimal module is, as you can see,
a little unwieldy to use in Python. I like languages where they just call such
things "Numbers", and build them in.
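For comparison, here is a variant that stays in integers for the division and only brings in Decimal for the final seconds value. This is a swapped-in alternative, not the version given to DanC; the name mss_int is made up, and it assumes that the duration arrives as a whole number of milliseconds.

```python
from decimal import Decimal

def mss_int(ms):
    """Split integer milliseconds into a (minutes, seconds) tuple."""
    # divmod on integers is exact, so no float imprecision creeps in.
    minutes, rest = divmod(int(ms), 60 * 1000)
    # The remainder is still in milliseconds; scale it to seconds exactly.
    return minutes, Decimal(rest) / 1000
```

With this, mss_int(135902) gives two minutes and 15.902 seconds, with the minutes as a plain integer rather than a Decimal.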
Sean B. Palmer, inamidst.com