Gallimaufry of Whits

Being for the Month of 2007-08

These are quick notes taken by Sean B. Palmer on the Semantic Web, Python and Javascript programming, history and antiquarianism, linguistics and conlanging, typography, and other related matters. To receive these bits of dreck regularly, subscribe to the feed. To browse other months, check the contents.

2007-08-03 15:00 UTC:

Morbus has released 60 Blank White Cards.

2007-08-08 08:16 UTC:

One power cable replacement later, and I'm back. Well, I wasn't offline at any point, but my script setup would've taken too much effort to set up on another computer just for the sake of a few days. AppleCare are pretty good: they keep you on premium phone lines for way too long, but then they send your parts out by next day UPS, which is good. Also they bent the rules a little bit for me about signing up, because I was a few days past some deadline or something, so that was very nice of them. Overall, I'd say 8/10.

2007-08-08 08:33 UTC:

Cody and I are using Gobby for various things, and we came up with the following notes about why we like and dislike Gobby.

Ways in which Gobby is cool:

But of course:

We are professional whiners though, and the truth is, Gobby works incredibly well, despite all the niggling little nit-picking. The main problem is that Moonedit exists. If Moonedit didn't exist, we probably wouldn't moan half as much about Gobby. Moonedit's UI is just much simpler, and thus slightly easier to do right. Gobby tries to do a lot more, and thus makes it a lot harder to match the pure elegance of our previous tool.

Thank you for reading our pleasures and grievances on this matter. If you are a Gobby developer, please pat yourself on the back and fix these damn nuisances simultaneously.

2007-08-09 08:54 UTC:

Word of the day: acroamatic.

2007-08-09 08:56 UTC:

The beginning of the month is font time in Whits, and this month I'm pondering the typographical dark side. Comic Sans MS, to be precise. If you use it in the right context (for example, a comic site?) and at the right font size, it can be pretty effective. A much maligned font, and rightly so given just how eminently abusable it is, but that doesn't mean that it should be avoided at all costs by those who know how to use it effectively.

It's even available on OS X, would you believe; must be part of the whole "Core Fonts for the Web" deal.

2007-08-10 16:24 UTC:

Kevin Reid's nifty Timer Project got me thinking about simple hardware design. The enticing thing about it is that it's still possible to get some absolute essentials done right, and yet for most people creating hardware of this level of complexity is out of the question.

William Loughborough wanted a watch which only told him the time and date, and nothing else, and it took him ages to find a watch which does that. I choose a watch as a fairly random example of small hardware... For me, the perfect watch would have the following requirements at least:

Those are the really basic, basic things but it's already quite complex. You'd have to make sure it has a radio receiver to pick up the Atomic Radio Signal, and for the automatic timezone switching it'd need a GPS unit. Getting the essentials right is still costly! As for power, watch batteries seem to be getting rather good these days so perhaps a standard one would do, though I guess a GPS watch would take a lot more power than a regular one.

On stopwatches, I just figured today that since you're only ever displaying one stopwatch at a time (right?), a stopwatch can be implemented in memory simply by making a mapping of { stopwatch-number: start-time }. Then the display value is simply current-time minus start-time.
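A minimal sketch of that mapping idea in Python. The class name, the method names and the injectable clock are all made up for illustration; the only thing taken from the note is the { stopwatch-number: start-time } mapping and the subtraction.

```python
import time


class Stopwatches:
    """Several stopwatches stored as {stopwatch-number: start-time}."""

    def __init__(self, clock=time.monotonic):
        # The clock is injectable so the behaviour can be checked without waiting
        self.clock = clock
        self.starts = {}

    def start(self, number):
        # Starting (or restarting) a stopwatch only records the current time
        self.starts[number] = self.clock()

    def display(self, number):
        # Nothing ticks in the background: the value is computed on demand
        return self.clock() - self.starts[number]


# A fake clock (hypothetical) standing in for the watch's own timekeeping
now = [100.0]
watches = Stopwatches(clock=lambda: now[0])
watches.start(1)
now[0] = 102.5
watches.start(2)
now[0] = 110.0
elapsed_1 = watches.display(1)
elapsed_2 = watches.display(2)
```

Pausing isn't covered here; it would presumably need a second value per stopwatch for the time accumulated so far.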

For more advanced functions, moonphase and sunrise times and other ephemeris like that would be nice. You can go crazy with all that kind of stuff though; really a watch is going to have to be fairly simple because it's so small.

2007-08-11 09:21 UTC:

One of these days I should really write an IRC logs normaliser. Because I use dircproxy, I get a lot of duplicate lines in my files. Though it shouldn't be too difficult to filter these out, I'd really want to make sure it's done properly, so it'll have to be a robust and careful program.
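The filtering step on its own might look something like the sketch below. It assumes, and this is only an assumption since the log format isn't described here, that every line carries a timestamp, so that a replayed line is an exact repeat of one seen shortly before. The function name and the window size are made up for illustration, and this is nowhere near the robust and careful program wished for above.

```python
from collections import deque


def drop_recent_duplicates(lines, window=50):
    """Yield lines, skipping any that exactly repeat one of the last `window` kept lines."""
    recent = deque(maxlen=window)
    for line in lines:
        if line in recent:
            # Assumed to be a replayed line; identical untimestamped lines would be lost too
            continue
        recent.append(line)
        yield line


log = [
    "12:00 <a> hi",
    "12:01 <b> hello",
    "12:00 <a> hi",
    "12:02 <a> bye",
]
cleaned = list(drop_recent_duplicates(log))
```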

2007-08-11 14:02 UTC:

Word of the day: "ghæstanlich".

2007-08-13 07:38 UTC:

The day prior to the day o' yester, someone mentioned learning Java and as usual I did my best to dissuade them from the said madness. Back in 2005 the author of jEdit, a popular text editor written in Java, had written an excellent summary of why Java was irritating from the perspective of an experienced Java user. He said he was turning his back on Java, and only feebly maintaining jEdit from then on. I figured I ought to show the Java learner this article.

The problem was that I simply couldn't find it. I found a reference to it that I'd made on IRC, but I hadn't included the URI. Search as I did, I just couldn't find the article. This is extremely unusual because usually I find pretty much everything within ten minutes, but this time the ten was changing into twenty, thirty... and I just had to find this thing.

Morbus, to whom I'd been chatting in the original IRC logs reference, remembered the article and we both almost simultaneously found the site that the article was likely on. It had gone down, and it was blocked out by the Internet Archive, so that explained why it was so hard to find: it wasn't in the search engines' corpus, and nor was it in web.archive.org. This was gonna be tricky.

Eventually, Adam Wendt told me that the author of the article was actually on Freenode, and introduced us. I asked him whether the article still existed, and he said no. I asked him if he had a private copy lying around anywhere, and he said no. He massively didn't care about Java anymore. There was one route left for me to pursue, which was my old HTTP traffic archive that I'd set up using slogger. I thought I was using it around 2005-02, and I turned out to be right! My 2005-02 slogger archive contained the article, and triumphant I mentioned this to the author and asked if he'd like a copy. He said no.

So, thanks to my obsessive information preservation tendencies, for quite some time I've been quite probably the only person in this world with this Java denouncement article. And thanks to my dislike for Java, I now realise this fact against all the odds. Funny how things like this turn out sometimes.

2007-08-13 07:57 UTC:

Word of the day: "mimmiku".

I found this word by chance, having actually coined it independently. I was basically wanting to increase the level of serendipity, so I typed random words into Google. According to my Search History I went for: quank, zorning, waylocks, wayblocks, shandle, grebbing, worge, smate, sheddle, drabe, triggle, quezzle, mimmiku, fwage, pomon, quezzy pezzy, pumbole, and ingdog.

2007-08-13 08:10 UTC:

Saw about two dozen Perseids last night, as well as Endeavour docked to the ISS, numerous other satellites, and two foxes.

2007-08-15 09:00 UTC:

Dave Pawson and I were chatting today about JpegRDF, and Dave noted that most EXIF tools don't seem to output the MakerNotes, the vendor specific metadata added to most photos taken with a modern camera. This can contain important stuff such as ISO settings, so it's worth extracting.

Eventually, he found ExifTool by Phil Harvey, which does just that. It's also got a rather nifty HTML dump feature that gives the hex specifics.

2007-08-15 11:52 UTC:

Dijkstra. Wow.

2007-08-15 13:06 UTC:

After reading the Dijkstra paper I discussed it a bit with Dave Pawson and Kevin Reid. Dijkstra argues that teaching and writing code that can be proved is a good way forward, so I investigated some of the current state-of-the-art for programming languages that are linked with proofs. In doing so I came across programming-by-contract, which is variously intimated to be connected. But it's not. As Wikipedia says in its page about formal methods, "critics note that such semantics never really describe what a system does (merely what is true before and afterwards)".

An example. Programming-by-contract consists of adding pre- and post- conditions to code that are, hopefully, complete. We'll ignore that "hopefully" for a moment and introduce some example code for a post- condition:

post[a]:
    # length of array is unchanged
    len(a) == len(__old__.a)
    # all elements given are still in the array
    forall(__old__.a, lambda e: __old__.a.count(e) == a.count(e))
    # the array is sorted
    forall([a[i] >= a[i-1] for i in range(1, len(a))])

This is from Contracts for Python (pycontract). These checks are for a sort(...) function, but note that they don't establish whether the sort(...) function works or not for all its inputs, which is what a proof would have to do. They merely test the sort(...) function every time you feed it some input. So in other words, they're really no different from a test suite, albeit one which is said to be a complete set of tests. (You would have to verify this on every possible input, though, which is impossible because there are infinitely many.)
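To make the point concrete, here is a sketch in plain Python rather than pycontract itself. The three post-conditions are restated as an ordinary function, and the deliberately broken sort is a made-up example. The checks only notice the breakage when an input that triggers it happens to come along.

```python
from collections import Counter


def sort_postcondition_holds(old, new):
    """The three post-conditions for an in-place sort, as one runtime check."""
    return (
        # length of array is unchanged
        len(new) == len(old)
        # all elements given are still in the array
        and Counter(new) == Counter(old)
        # the array is sorted
        and all(new[i] >= new[i - 1] for i in range(1, len(new)))
    )


def broken_sort(a):
    """A deliberately wrong sort (hypothetical): it only handles two elements."""
    if len(a) == 2 and a[0] > a[1]:
        a[0], a[1] = a[1], a[0]


def checked(sort, a):
    """Run an in-place sort, then report whether its post-conditions held."""
    old = list(a)
    sort(a)
    return sort_postcondition_holds(old, a)


passed_small = checked(broken_sort, [2, 1])     # the contract is satisfied here
passed_large = checked(broken_sort, [3, 1, 2])  # and violated here
```

Every run on a two-element list passes, and that tells you nothing about the three-element case until somebody supplies one.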

Given that Dijkstra said "deal with all elements of a set by ignoring them and working with the set's definition", clearly he wasn't thinking about programming-by-contract.

One question I have about what Dijkstra argues is whether the idea of proof based programming applies to higher level languages. Since there weren't any really high level languages like perl6 around when Dijkstra wrote the paper, in 1988, perhaps he merely didn't think about them, thus rather succumbing to the thing that he warned against. That would seem odd though.

An alternative is that there is a positive answer to the underlying question, i.e. are high-level programming languages compatible with provability? I've been thinking about this with respect to perl6, which is getting very generalised, with its bytecode broken out into Parrot and hence perl6 being viewable as just a dialect for a particular set of core semantics. Surely there are people working on these core semantics, their verifiability at different levels, and so on?

Kevin Reid pointed out that bugs (or errors, as Dijkstra suggests we call them) can manifest at various levels of a system. My power supply might get interrupted, or my OS might crash, or the code might memory leak, or it might assign the empty string to a buffer that had important information, or its interface might coerce me into making a mistake when using it. I'm interested in the high-level bugs because those are ones that I can rectify. I can write a new editor, but I can't write a new OS. I think that it's more critical to fix OSes than editors, but that's not something you can hack on in your garage: that requires social placement and other serendipities. And, indeed, perhaps some kind of trickledown effect may be observed whereby making more robust high-level programs may lead to people in the right places putting more effort into creating robust low-level programs. Which is ironic since usually I think about it from the other way around.

2007-08-15 13:24 UTC:

Björn points out that "you don't really want to know what a system does, so long as it does what you want it to". Kevin grumbles that his point about contracts being more like goals rather than proofs didn't reach Whits, so I'm putting that right here. In other words, he says, "it is true that the contract is not a proof; but it is a formal specification of something to prove". So programming-by-contract is something which is useful to the act of proof, just in an orthogonal direction.

What Björn pointed out seems connected to the fact that if you're setting out to prove the wrong thing in the first place, then your program is going to be Broken By Design. Meeting your requirements is irrelevant if your requirements happen to be wrong or useless in the first place. This is, I suppose, an even higher-level form of bug, but of no lesser importance. I think that fields like interaction design are bound to be more necessarily geared towards solving such things, but I don't know much about them.

Provability appeals to me as a thing to be investigated with respect to programming because it feels, intuitively, like a low-hanging fruit. Something that might be very beneficial for very little input, or at least having a better ratio of benefit to input than alternatives such as studying the latest strategies in test driven development.

2007-08-15 14:27 UTC:

When I wrote Bug Free Programming!, I included a wonderful example from an editor that I'm writing that shows how writing clearer code can reduce bugs. Looking at it again, I think that I can do a little better, so here's what I've come up with:

def move_left(editor): 
   # uses editor.move_up
   # uses editor.go_end_of_line
   
   screen, doc, cursor = editor.components
   
   if not screen.start_of_line(): 
      cursor.move_left()
   
   elif screen.start_of_line() and \
        not doc.first_line() and \
        not screen.hidden_left(): 
      editor.move_up()
      editor.go_end_of_line()
   
   elif screen.start_of_line() and \
        screen.hidden_left(): 
      screen.show_left_section()
      screen.go_end_of_line()
   
   elif screen.start_of_line() and \
        doc.first_line(): 
      pass

Obviously you'll have to compare with the originals to see the improvements. The "uses" comments are to try to weed out dependency loops. A lot of the art of writing an editor is finding out which tasks are composed of other tasks and which ones aren't.

2007-08-16 07:45 UTC:

Word of the day: "Blengigomeneans".

2007-08-16 08:51 UTC:

Inspired by Dan Connolly's latest Notes on GRDDL/Javascript Development, I've been experimenting with the Tabulator API. It's quite easy to use: I got an example script working really easily using it. With just a few lines I was able to query out all the names in my FOAF file. Here's the main part of the code:

var kb = new RDFIndexedFormula(); 
var p = new RDFParser(kb);
p.parse(rdfxml, uri, uri);
var results = kb.statementsMatching(
  undefined, 
  new RDFSymbol('http://xmlns.com/foaf/0.1/name'), 
  undefined
);

Some of the terminology (indexed formula, symbol) is a bit quirky, but based on SWAP/CWM foundations so it's not too bad if you're of that heritage.

2007-08-16 08:57 UTC:

#swig: <dajobe> I am hoisted on my own heuristics

2007-08-17 08:11 UTC:

"Your Lulu Order Has Shipped" - the proof copy of Lo and Behold! is on its way! It already has some errors that I know about, but they're quite minor and I have a new draft of it ready to go which I think will be okay for general publication.

2007-08-17 10:43 UTC:

An idea for a web filesystem: use two uri-shortening (S1, S2) and two free-upload (U1, U2) services. Use the uri-shorteners to redirect to indexes on the free-upload services. Then a write would look like this:

  1. Upload the file to U1 and U2.
  2. Obtain the current index from S1 -> U1.
  3. Check that the index is available from S2/U2 too.
  4. Make a new index, and upload to U1 and U2.
  5. Reset the indexes on S1 and S2.

It should complain if anything goes wrong. To read a file would be a simple case of:

  1. Obtain the current index from S1 -> U1.
  2. Check that the index is available from S2/U2 (optional).
  3. Obtain the file from U1.
  4. Check that the file is available from U2 too (optional).

Setting up the accounts might be the hardest part. I'm also not entirely sure how automatable this would be... the upload places don't seem to be too keen on automatic uploading because those aren't considered commerciable hits.
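The two procedures can be sketched with in-memory stand-ins for the four services. Everything named here (Upload, Shortener, load_index, the JSON index format) is hypothetical. No real shortener or upload service is being modelled, and the awkward parts (accounts, automation) are exactly the parts left out.

```python
import json


class Upload:
    """Stand-in for a free-upload service: stores blobs and hands back a key."""

    def __init__(self, name):
        self.name = name
        self.blobs = {}

    def put(self, data):
        key = "%s/%d" % (self.name, len(self.blobs))
        self.blobs[key] = data
        return key

    def get(self, key):
        return self.blobs[key]


class Shortener:
    """Stand-in for a uri-shortener: one short name pointing at one target."""

    def __init__(self):
        self.target = None


def load_index(shortener, upload):
    # An unset shortener is treated as an empty filesystem
    if shortener.target is None:
        return {}
    return json.loads(upload.get(shortener.target))


def write(name, data, s1, s2, u1, u2):
    # 1. Upload the file to U1 and U2.
    locations = [u1.put(data), u2.put(data)]
    # 2. Obtain the current index from S1 -> U1.
    index = load_index(s1, u1)
    # 3. Check that the index is available from S2/U2 too.
    if load_index(s2, u2) != index:
        raise IOError("indexes on the two services disagree")
    # 4. Make a new index, and upload to U1 and U2.
    index[name] = locations
    serialised = json.dumps(index)
    key1, key2 = u1.put(serialised), u2.put(serialised)
    # 5. Reset the indexes on S1 and S2.
    s1.target, s2.target = key1, key2


def read(name, s1, s2, u1, u2):
    # 1. Obtain the current index from S1 -> U1.
    index = load_index(s1, u1)
    # 2. Check that the index is available from S2/U2 (optional).
    if load_index(s2, u2) != index:
        raise IOError("indexes on the two services disagree")
    # 3. Obtain the file from U1.
    location1, location2 = index[name]
    data = u1.get(location1)
    # 4. Check that the file is available from U2 too (optional).
    if u2.get(location2) != data:
        raise IOError("copies of %r disagree" % name)
    return data


s1, s2 = Shortener(), Shortener()
u1, u2 = Upload("u1"), Upload("u2")
write("notes.txt", "hello", s1, s2, u1, u2)
write("todo.txt", "whits", s1, s2, u1, u2)
fetched = read("notes.txt", s1, s2, u1, u2)
```

Old indexes are never deleted in this sketch, only superseded when the shorteners are repointed, which is roughly what you would expect from upload services that don't allow overwriting.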

2007-08-17 15:20 UTC:

Someone on #swig asked whether you can define the range of a property to be a list of some particular things. Since this is a rather useful recipe, I'm publishing my answer again here:

:property rdfs:range [
   rdfs:subClassOf rdf:List, [ 
      a owl:Restriction; 
      owl:onProperty rdfs:member; 
      owl:allValuesFrom :Something
   ]
] .

Ages ago, someone asked what the difference is between having a list and just having a bunch of objects. So for example:

:Simon :brothers 
   (:Jim :Albert :Bill).

Compared to:

:Simon :brothers
   :Jim, :Albert, :Bill.

The main difference we settled upon was that in the first example, the list of brothers can be said to be exhaustive: those are all of Simon's brothers. In the latter case, there is no such possible bound.
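One way to see where the exhaustiveness comes from is to expand both forms into triples. The parenthesised list is shorthand for a chain of rdf:first and rdf:rest links ending in rdf:nil, and that terminator is what closes it off. The sketch below does the expansion by hand in Python; the function names and blank node labels are made up for illustration.

```python
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"


def list_triples(subject, predicate, items):
    """Expand `subject predicate (items...)` into rdf:first/rdf:rest triples."""
    if not items:
        return [(subject, predicate, RDF + "nil")]
    nodes = ["_:b%d" % i for i in range(len(items))]
    triples = [(subject, predicate, nodes[0])]
    for i, item in enumerate(items):
        # The last cell points at rdf:nil, which is what ends the list
        rest = nodes[i + 1] if i + 1 < len(nodes) else RDF + "nil"
        triples.append((nodes[i], RDF + "first", item))
        triples.append((nodes[i], RDF + "rest", rest))
    return triples


def plain_triples(subject, predicate, items):
    """Expand `subject predicate a, b, c` into one triple per object."""
    return [(subject, predicate, item) for item in items]


brothers = [":Jim", ":Albert", ":Bill"]
as_list = list_triples(":Simon", ":brothers", brothers)
as_plain = plain_triples(":Simon", ":brothers", brothers)
```

With the plain form, anybody can add a fourth :brothers triple and nothing looks out of place. With the list form there is nowhere to add one without altering the chain itself.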

2007-08-22 12:28 UTC:

Sir Patrick Moore was on the news this morning, talking about Google Sky. If Sir Patrick's on the case then it must be good, I reckon, so I went searching about for more details regarding this. After a bit of jiggery pokery I found out that Google Sky is contained within the latest version of Google Earth, though not altogether conspicuously.

Still, it was worth the effort. It's a gloriously easy to use representation of the night skies, and I immediately went flicking about through the constellations at random, checking out interesting nebulae and so on. One of the things that I found was a little nebula at 20h33m49.32s RA, 45°38'24.93" dec. What's that, I wondered? What's that indeed.

Good though Google Sky is, it doesn't have a professional level of catalogue data about deep sky objects, which is a bit annoying because it does have a very good level of deep sky detail. So I went browsing through various deep sky object catalogues online trying to look for this object. I was able to get a good greyscale image of it from the DSS, but that didn't tell me what it was. So I downloaded Cartes du Ciel, a well known star chart program for Windows, and tried that. Nothing in there either. I downloaded a few extra catalogues for it. Still nothing. So I downloaded all the catalogues I could lay my hands on that might be relevant, and... success!

Or, well, kinda success anyway. Cartes du Ciel labelled this patch of sky "LBN337C 181", but Google didn't have anything for LBN337C-181. It did have something for LBN-337-C, though, and that enabled me to deduce that C 181 is a secondary designation and LBN 337 is the primary designation. And indeed, that is what the nebula is called: LBN 337. LBN stands for the Lynds Catalog of Bright Nebulae, and there are a few results for LBN 337 on Google, including a few pictures, but no really decent information like how far away from earth it is, what its real size is, how old it is, and so on. It's a minor object, clearly, but I'd've still thought there would be some more information about it somewhere. Possibly there is but I just don't know the right places to look, especially in printed sources.

The only similar-grade alternative to CdC for OS X seems to be Stellarium, but even though it's newer and prettier and hangs together a little better, its catalogues don't seem to be as comprehensive or as good as CdC's. Moreover, I couldn't find LBN 337 in it. For that matter, nor could I find the Veil Nebula in it. Of course, it wasn't in CdC either under that name, being instead broken into its NGC catalogued constituents, but at least I could find 52 Cyg in there. Couldn't even find 52 Cyg in Stellarium, though there was a star that seemed to be in the right place (I didn't check the exact coördinates) that was labelled 5 Cyg.

Anyway, the point of this anecdote is that it should be much easier to find out what a particular nebula is called and more information about it. I've been thinking about trying to convert whatever CdC catalogue LBN 337 is in to a Google Sky layer, but I'm skeptical as to whether that will be particularly easy to do. Also it'd be nice if Google Sky had an ajaxy counterpart...

2007-08-24 15:17 UTC:

Yesterday I was most concerned with astronomical image processing. There's a nebula near Gamma Cygni (Sadr), and I wondered what it was, so instead of loading up Cartes du Ciel on Windows, I went searching on the web instead, and found an amazing image created by Douglas Finkbeiner of the "Cygnus Nebula". It was said to be Lynds Bright Nebula 258, so I looked up its position and navigated there in Google Sky and... how odd. Not much there. Just a bit of very dark red fuzziness and a few stars.

In actual fact it was the right spot, but Douglas had enhanced the view of the area. I wondered if I could do the same, so I grabbed some data from the DSS form that I found in looking for LBN 337 on my first day of using Google Sky, and got three POSS2 channels for it: blue, red, and infrared. Doug had used SDSS data and different channels, but I was unable to find that data, so I had to use POSS2, which was passable.

After a few attempts, I managed to make a picture of LBN 258 that's not too bad, and almost approaches Doug Finkbeiner's prizewinner. But it wasn't quite there, so I emailed Finkbeiner and he was kind enough to give me a few pointers on how he did it and so on. I'm still kinda wondering how he got the SDSS data, though, so I may write to him again.

Using a similar kind of technique I also made a picture of the Butterfly Nebula, LBN 249, and the Crescent Nebula, also known as NGC 6888, Caldwell 27, or LBN 203.
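For the curious, here is a rough sketch of how three greyscale plates can become one colour picture. It is not the method actually used for these images, and certainly not Finkbeiner's. The assignment of infrared to the red channel, red to green and blue to blue is only one plausible choice, the linear stretch is the simplest available, and the function names and pixel values are made up.

```python
def stretch(channel, low, high):
    """Linearly rescale pixel values so low..high maps to 0..255, clipping outside."""
    span = float(high - low)
    return [
        [int(round(255 * min(max((value - low) / span, 0.0), 1.0))) for value in row]
        for row in channel
    ]


def false_colour(infrared, red, blue, low, high):
    """Combine three greyscale plates into rows of (R, G, B) pixels."""
    r, g, b = (stretch(plate, low, high) for plate in (infrared, red, blue))
    # zip the three plates row by row, then pixel by pixel
    return [list(zip(*rows)) for rows in zip(r, g, b)]


# Tiny one-row, two-pixel plates with made-up values
infrared = [[1000, 3000]]
red = [[1400, 5000]]
blue = [[500, 1000]]
image = false_colour(infrared, red, blue, low=1000, high=3000)
```

Choosing low and high is where most of the fiddling goes: set them close together and faint nebulosity jumps out, along with all the noise.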

In the evening, Adam Wendt and I experimented with some FITS files and stuff for processing those. We found a huge image of M81 taken by Hubble, but for some reason a 74MB JPG just wouldn't load nicely in Preview. Adam prepared a huge 7zipped FITS file of it, but I haven't tried it out in the ds9 FITS viewer yet; I probably need to install 7zip locally first anyway.

The moral of the story today is that the pretty astronomical images that you see on the news and so on have probably undergone a great deal of processing to get them to look that way, but that at the same time the amount of misrepresentation is low because very often you're enabling people to see things in different parts of the electromagnetic spectrum. In other words, if you want people to see intricate things in x-ray or infrared, you're going to have to false colour them! Those structures still exist, even if we can't see them using the naked eye. We can't see many deep sky objects using the naked eye anyway, so the leap to using a telescope is probably a larger one in a way than the leap to representing different parts of the spectrum.

2007-08-29 08:33 UTC:

I've converted the Lynds Catalogue of Bright Nebulae and the Principal Galaxy Catalogue to Google Sky compatible formats: LBN.kml and PGC.kmz. The source for all of the little conversion utilities that I wrote is available too.
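This is not one of those utilities, just a sketch of the core of such a conversion, using the LBN 337 position noted earlier in the month. Sky mode in KML reuses the Earth coordinate fields: latitude is the declination, and longitude is the right ascension in degrees minus 180. The function names are made up, and the namespace shown is the later OGC one, which may well differ from what the 2007 files used.

```python
from xml.sax.saxutils import escape


def ra_to_degrees(hours, minutes, seconds):
    """Right ascension in h/m/s to degrees (one hour of RA is 15 degrees)."""
    return 15.0 * (hours + minutes / 60.0 + seconds / 3600.0)


def dec_to_degrees(degrees, minutes, seconds, south=False):
    """Declination in d/m/s to signed degrees; pass south=True for negative values."""
    value = degrees + minutes / 60.0 + seconds / 3600.0
    return -value if south else value


def sky_placemark(name, ra_deg, dec_deg):
    # Longitude is RA shifted by 180 degrees; latitude is the declination as-is
    return (
        "<Placemark><name>%s</name>"
        "<Point><coordinates>%.5f,%.5f,0</coordinates></Point></Placemark>"
        % (escape(name), ra_deg - 180.0, dec_deg)
    )


def sky_document(placemarks):
    # hint="target=sky" is what tells Google Earth to open the file in Sky mode
    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        '<kml xmlns="http://www.opengis.net/kml/2.2" hint="target=sky">\n'
        "<Document>\n%s\n</Document>\n</kml>\n" % "\n".join(placemarks)
    )


ra = ra_to_degrees(20, 33, 49.32)
dec = dec_to_degrees(45, 38, 24.93)
kml = sky_document([sky_placemark("LBN 337", ra, dec)])
```

A real catalogue converter would spend most of its effort parsing the catalogue's own column format, which isn't attempted here.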

2007-08-29 08:42 UTC:

The Tabulator Extension got released yesterday, and it's pretty nifty. There's even an RDF wiki online you can edit with it. After my experiment with the Tabulator API the other day, I've been thinking about interface idioms that I might like to develop using it and jQuery. In fact, it's really just the same idea that I had way back in November: navigation using real time evaluated path expressions. I've found that it's a bit difficult to figure out how the Tabulator output maps onto the actual graph, and I'd like to make the input datasets more distinctive, so it'd be nice to investigate those areas too.

2007-08-31 10:20 UTC:

This morning I thought, were I to choose which people could have gone on the cover of Sgt. Pepper's Lonely Hearts Club Band, who would I pick? Assuming that the album were made now, here's my list.

Adam had pointed me to Promethea, a graphic novel, one of whose covers is a parody of a parody of Sgt. Pepper's, which started me off on this endeavour.

2007-08-31 16:01 UTC:

I've published not only my Tabulator API example that I mentioned some days ago, but also a new thing called Graphpath. It's a first draft of an RDF path evaluator. You can load graphs by typing in @uri followed by a space, for example "@http://inamidst.com/proj/sdoc/http.rdf ".

Sean B. Palmer, inamidst.com