Gallimaufry of Whits

for 2006-11

These are quick notes taken by Sean B. Palmer on the Semantic Web, Python and Javascript programming, history and antiquarianism, linguistics and conlanging, typography, and other related matters. To receive these bits of dreck regularly, subscribe to the feed. To browse other months, check the contents. This file was generated from plain text source, for convenience of posting, so apologies for all the in-your-face URIs.

2006-11-02 15:20 UTC:

On Web.Pi, protoplex, espia:

<sbp> indexes, events, storage, services, interface, identity, networking
* sbp isn't sure how indexes differ from storage
<jeffarch> oh yeah, was gonna look for that for ya too, sbp
<jeffarch> storage would include local fs storage, remote server storage, dht
   storage, etc...
<sbp> what would indexing be?
<jeffarch> I believe indexing includes tags, tagging tags, plexnames
<sbp> oh, so it's like distributed identifiers
<sbp> and catalogues and so on

2006-11-02 15:28 UTC:

Fixing emacs to work properly...

2006-11-02 15:43 UTC:

15:36 <sbp> awesome: $ ./note
15:36 <sbp>    Fatal error (11)../note: line 12:  6963
Segmentation fault      emacs -f end-of-buffer $NOTE
15:41 <sbp> hmm, I wonder...

- #swhack

2006-11-02 15:45 UTC:

$ cat note
# note - Write Notes
# Author: Sean B. Palmer,
# When found, make a note of!
# - Captain Cuttle

NOTE=$(date +%b%y.txt)
echo >> $NOTE
date -u '+- %Y-%m-%d %H:%M UTC' >> $NOTE
echo >> $NOTE
echo >> $NOTE
emacs +$(wc -l $NOTE | awk '{print $1}') $NOTE


Should get jcowan to work on bae

wc -l $NOTE | bae a

tabulate fields... tab?

SEPR='[ \t]+' - pcre
SKIP='[ \t]+'

Reminds me of my idea for a Unicode stream processor, taking into account lots of the tricks listed in the TRs, such as isolating words and so on. Something that patbam would be interested in.

2006-11-02 15:49 UTC:

I've been working on a few things this morning, and decided I wanted a really simple place to take down notes. It's actually quite hard to stick to a notes file... I decided I wanted month stamped files, with discrete dates inside them, eventually, but first I was thinking of using a specialised line-mode editor, and all kinds of stuff. I even made a script that greps itself, noting that you can do stuff like:


   script stuff


   Some general non-bash content.

And the "script stuff" greps itself. But anyway, that would mean a scheme where one uses the PATH_INFO to search within the script, which causes various problems as to addressability.

I'd like to have plinks for these files, perhaps... but it would also be nice to go back and edit things generally.

2006-11-02 15:52 UTC:

Of course, the beauty of it in general is that you can have any interface you want to it. What you're doing is just adding text and adding text.

So tav was talking about Web.pi, and those core technologies:

indexes, events, storage, services, interface, identity, networking

He figures we can just use HTTP for networking. I want to convert Tabulator into being a better all-purpose Semantic Web engine, and it occurred to me that the things that tav is talking about are key things to consider with respect to that.

Slidy needs fixing.
   - tabulator@w3 and public-semweb-ui@w3 for feedback

(The /latest thing is pretty nifty, hadn't come across that before.)

The Game of Love... interesting tune, haven't heard it for ages:

2006-11-02 16:00 UTC:

I chose the name kajero from here:

   kajero: exercise-book, folder, notebook

For such a wacky language, Esperanto is oddly good for things like this.

The DesignIssues are awesome... reminds me of this:

(Argh, having problems pasting, even when turning the Pelican Minor Mode off; it then indents stuff annoyingly. Let's try pre-processing in TextWrangler.)

          STITLE/ ID=ABC  >What Next?<
            FIG/ X=7 y=67  CAP=>The solution<
      /  IDX;

Now I can parse that and see that I am missing a section end.

Of course real language people might want use a different concrete syntax:

      { section(level=2)
          { stitle(id=abc)
             "What's Next?"
          } stitle
          { idx
              { fig (x=7, y=67, cap="The solution")
              } fig
              } idx


There we go. Stupid tabs.

2006-11-02 16:10 UTC:

whit
    "smallest particle," 12c., in na whit "no amount," from O.E. nan wiht, from
    wiht "amount," originally "person, human being" (see wight).

Nice prices at

2006-11-04 11:37 UTC:

Idea: superscript capitals as footnotes! More mnemonic than using numbers, and you don't have to reorder if you insert them, but just as cool.

So for example: See Candeira<sup>CN</sup>, 2005.

2006-11-04 13:19 UTC:

Okay, modified the script a little to actually work. Javier is talking about pyblosxom and says he has 15 pages of notes. Woah. I told him he should publish them, but I'll bet that they don't get published. If I had 15 pages of notes like that, I probably wouldn't publish them either. This current file isn't even being published yet. All of this is not good!

2006-11-04 16:05 UTC:
- chat about Semantic Wiki URIs

Also about annotation and the Semantic Web...

<timbl> | And the usual sem web value add from semantic wiki is the addition of
        | backlinks.. any page on X aboux also getas nay links to or from x on
        | other pages.
  <sbp> | it was only when I read LinkedData the other day that I realised
        | that: that the Semantic Web has annotea built-in, in a way
<timbl> | really? it doesn't by itself solve the problem of discovering the
        | annotation servers you want to be using. It does solve th eproblem of
        | merging annotations from different sources
  <sbp> | no that's true, but then annotea doesn't solve that either. just
        | provides the annotation servers
  <sbp> | Google solved discovery for the web though. I think a decent Semantic
        | Web search engine could likely do the same for the sw
  <sbp> | (having said that, I often find things in my referer logs that I
        | didn't find via Google. bet I'm missing lots more that don't link)
  <sbp> | (and Google doesn't solve the annotation server problem either,
        | unless the annotation server data is public)

I've been thinking for a while that the best thing for the Semantic Web would be a huge aggregated site, constantly updated and with a really good search interface, basically the Semantic Web in your pocket, just like Google is the Web in your pocket.

2006-11-04 16:19 UTC:

I'd like to have a proper Semantic development area, working on exposing data perhaps more than consuming it and so forth. Really working on building up the instance data and schemata and finding interesting things to achieve with it... of course, that's kinda been the goal since the beginning.

Oh, here's the edit script I'm using now after making it work better:

NOTE=$(date +%b%y.txt)
echo >> $NOTE
date -u '+- %Y-%m-%d %H:%M UTC' >> $NOTE
echo >> $NOTE
emacs +$(wc -l $NOTE | awk '{print $1 + 1}') $NOTE

2006-11-05 10:34 UTC:

It might be nice to implement tab as a C program, with some simple maths and so on; or as a Python/C project to get into that.

I'm also still interested in making the RDF API pool, binding together all the best Python processors of RDF. People should use .nt.gz as the main transfer format for RDF... I wonder how the compression rate compares to RDF/XML and Notation3?

JTab - Tabulator as Firefox extension.

2006-11-05 10:58 UTC:

Yet another script change:

NOTE=$(date +%b%y.txt)
# Add a separating blank line only if the file doesn't already end with one
if [[ -n $(tail -n 1 $NOTE) ]]
then echo >> $NOTE
fi
date -u '+- %Y-%m-%d %H:%M UTC' >> $NOTE
echo >> $NOTE
emacs +$(wc -l $NOTE | awk '{print $1 + 1}') $NOTE

2006-11-05 11:18 UTC:

Filter to canonicalise cwm's N-Triples output:

sed 's/^ *//g; s/ *$//g; /^ *$/d'

Which was for this:

$ wc -m selectors.* | sort -rn
  355206 total
  153213 selectors.nt
  137632 selectors.rdf
   39464 selectors.n3
    7171 selectors.nt.gz
    4972 selectors.nt.bz2
    4610 selectors.rdf.gz
    3124 selectors.rdf.bz2
    2698 selectors.n3.gz
    2322 selectors.n3.bz2
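
On the earlier question of how .nt.gz compares to RDF/XML and Notation3, the ratios can be worked out from the listing above. A rough sketch, treating the wc -m counts as file sizes; the numbers are copied from the listing, and the helper name ratio is only for illustration:

```python
# Character counts copied from the wc -m listing above.
SIZES = {
    "selectors.nt": 153213,
    "selectors.rdf": 137632,
    "selectors.n3": 39464,
    "selectors.nt.gz": 7171,
    "selectors.nt.bz2": 4972,
    "selectors.rdf.gz": 4610,
    "selectors.rdf.bz2": 3124,
    "selectors.n3.gz": 2698,
    "selectors.n3.bz2": 2322,
}


def ratio(compressed, original):
    """Compressed size as a fraction of the original size."""
    return SIZES[compressed] / SIZES[original]


for source in ("nt", "rdf", "n3"):
    for packer in ("gz", "bz2"):
        original = "selectors." + source
        compressed = original + "." + packer
        percent = 100 * ratio(compressed, original)
        print("%-18s %5.1f%% of %s" % (compressed, percent, original))
```

On these figures the gzipped N-Triples file is the largest of the compressed files and the gzipped Notation3 the smallest, even though Notation3 shrinks the least in relative terms.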


2006-11-05 11:54 UTC:

Probably best to grep notes out by day, and then have internal links to the minuted sections.

2006-11-05 12:35 UTC:

Actually, converting to HTML on the fly and having anchors for the minuted sections, so that way even if they break you get the right page, and there don't need to be lots of secondary URIs and the redundancy that causes.

Using IDs like 2006-11#N11-0511, i.e. #N[ote]<day>-<hour><minutes>
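
A minimal sketch of how such an ID might be derived from an entry's timestamp line. The function name note_anchor is made up for illustration, and it assumes the timestamps look like the ones in this file:

```python
from datetime import datetime


def note_anchor(stamp):
    """Turn '2006-11-05 12:35 UTC:' into '2006-11#N05-1235'.

    The result is <year>-<month>#N<day>-<hour><minutes>.
    """
    # Drop surrounding whitespace and the trailing colon, if any.
    cleaned = stamp.strip().rstrip(":")
    when = datetime.strptime(cleaned, "%Y-%m-%d %H:%M UTC")
    return "{:%Y-%m}#N{:%d-%H%M}".format(when, when)


print(note_anchor("2006-11-05 12:35 UTC:"))
```

The part before the # names the monthly page and the part after it names the anchor, so a broken fragment still lands on the right page.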

2006-11-05 12:58 UTC:


The pl. minutes "record of proceedings" developed c.1710, perhaps from L. minuta scriptura "rough notes," lit. "small writing."

And yet...

Therefore I have entreated him along
With us to watch the minutes of this night;

I'd figured it was in the "recording" sense.

Indeed the OED disagrees with that to a degree, perhaps not to the plurality but to the implication, in that even the verb is recorded back to 1601, the same year roughly as Hamlet:

1601 in R. Pitcairn Criminal Trials Scotl. (1833) II. 374 The
relaxatioune..quhilk is minut, produceit and registrate in the sheref bukes of Roxburgh.

And as a noun:

1443 in H. Nicolas Proc. & Ordinances Privy Council (1835) V. 276 Stourton was send to Eltham to þe King wt a minute..of lettres patentes.

2006-11-05 13:15 UTC:

One nice benefit of having the timestamps appear above the content that they're timestamping is that when you link to them, you can see the content immediately, unlike say when handing out loggy: uri? links to people on #swhack: generally you have to go back through the logs yourself to find the exact part being referred to.

2006-11-05 13:36 UTC:

It'd be nice if Google Search History supported export of all searches over specified time periods. Since I started using it, it appears I've made 23,517 searches. The stats ("Trends") could be beefed up too.

2006-11-05 13:45 UTC:

Awesome, somebody picked this quote up:


"the Semantic Web isn't even cool. it should be cool. it should be like Ajax, it should be like Web 2.0: it should be this suite of awesome tools that people can use off the shelf for doing cool data remixing stuff"
-- Sean B. Palmer


The source is

Found via a vanity search on Google Blog Search, which I was led to in investigating whether I could remix search results across web, images, groups, books, scholar, and blogs... Doesn't appear so, sadly. Kinda ironic that I found this quote in trying to achieve that though!

2006-11-05 14:23 UTC:

Since miscoranda is a bit of an insoluble problem, how about another weblog for technological matters? Could have miscoranda's index redirect to the new one for a trial period or something.

I'd like to write a few entries in advance before going ahead with it, perhaps, and I'd like to make sure I do pinging and all that sort of stuff. Would like to use a third party comments site too... but I wonder if there'll be one that's good enough for me. Arnia mentioned one on #swhack, I recall.

And then there's naming and stuff. Should look at the logs for when we came up with "Pushing Buttons", since I guess this is a similar thing, only in weblog rather than site form. Perhaps there's some intersection. I'd also like to use the current development stuff/notes approach, compile-around-edit, for the editing of the weblog, so that it'll be static.

For slogans, I wonder about a hattip to Mischmasch's headline. Perhaps even the title could tip to that too.

Might need to have a different category for RDF; would like to appear on Planet RDF again, after all this time. Would like to cover the Semantic Web, URIs, taxonomisation, organisation, filesystems, small bits of code, Python, Pluvo, web architecture, TAG issues, and all that kind of stuff.

2006-11-07 10:45 UTC:

Short name British organisations tend to be cool for some reason; especially the CPRE of course.

2006-11-08 13:25 UTC:

In Forteana, another mystery explosion...

The more interesting of the TAG Issues:


It'd be nice to push the IFP URN-NID through...

2006-11-08 13:31 UTC:
- Reinventing HTML

Cf. my comments on that:

This links to TagSoupIntegration-54, probably.

"Similarly, I met up with Susan Greenfield from the Royal Institution - not a loony bin for inbred monarchs, but a public forum for the demonstration, discussion and discourse on science that goes back to the days of Humphrey Davy and Michael Faraday." -

(from: "Pretty" is a feature)

- I'd been talking to Javier about this, but this also led to this very notes file, rather oddly, since I was lamenting Thread Mode problems in writing Strange Strands. But this is still Thread Mode, so I guess that's not really the biggest of the problems: I think it's the discreteness, size, and polishedness of the posts.

Cf. current development notes/constraints "Minimising Editing Constraints", just a couple of days before making this notes file.

2006-11-08 13:36 UTC:

Boy spots an Avocet:

I've been working on a format called Avocet, a kind of structured text format along the lines of Markdown and ReST, only smarter. The name originally came from the thread on naming Pluvo.

Development cgi/rootlist might be a good way to show how optimising code like function robots() can really make a huge difference to performance.

2006-11-08 13:44 UTC:

- Compendium

2006-11-08 13:46 UTC:

Ford T.D. ;; 1992 ;; A hitherto unknown account of a late 18th century visit to Speedwell Mine at Castleton by James Plumtree ;; Vol 11 (6) pp 281-282 -

Cf. Titan:

2006-11-08 13:52 UTC:
- On application/xml and RDF Statements vs. Pictures of Statements

A big chat from #swig, not otherwise in the logs.

These notes could be nice for seeding a new technological blog. It'd be nice to write it to about 70% of the quality of What Planet. It'd certainly be freeing it from the overlapness of miscoranda... The whole mix on miscoranda was certainly a problem: there isn't really much overlap between tech and antiquarianism, at least in the sort of things that I work on, so everyone apart from jcowan would be ignoring half the posts.

2006-11-08 14:00 UTC:

@@ Following the patterns in

Google Mashup:

HaloScan, via Arnia:

The Rectory Umbrella:

2006-11-08 14:05 UTC:

That's as far as I got on some GRDDL hacking with DanC the other day; he was working on some slides but appears to have worked more on the spec instead.

2006-11-10 17:48 UTC:
- Why I Don't Use Doctypes

2006-11-11 09:59 UTC:

This has a lovely style:

It also does this cool thing of highlighting the element selected by the fragmentID, if one exists. It doesn't seem to be Javascripted either, though it surely must be unless it's some kind of wacky CSS extension.

Also the email address obfuscator is rather good, though it does sort of rely on utf-8 in the source to provide some of the obfuscation. Makes me wonder about using a real Unicode editor again... does gVim handle it?

2006-11-11 10:00 UTC:

Editing XHTML would be a good post name for the kind of mashup of Amaya, emacs + nxml-mode, templating, and so on.

2006-11-11 10:02 UTC:

Could do a vi edit and then have Jing check the file if it was html; i.e. have some kind of a wrapper for it. Then the error could be fixed in emacs with nxml-mode if it was non-ignorable.

2006-11-11 10:04 UTC:

On the wc -m selectors.* thing from 2006-11-05, I wonder whether something could do a kind of pass to a canonicalised subset of Turtle, and then gzip that... it'd probably be very efficient indeed. Even more so than the cwm output possibly, given how much whitespace it loves to throw around everything; it'd be an awesome display of human readability, understanding, and compression. Cf. the Wikipedia compression project.

2006-11-11 11:09 UTC:

Dynamic tradition. A meritocracy of pageantry. Laurels and so on... questioning the role of traditional organisations both professional and traditional; is it not just consensus? Consensus can be established in many ways, but it's slow and reactionary, not progressive and evolutionary. Cito! (Such as is the motto of the Worshipful Company of Information Technologists).

2006-11-11 11:58 UTC:

@@ attic script for copyarchiving stuff to The Attic?

2006-11-11 12:42 UTC:

Free the Postcode!

It would be nice if the PAF were available freely...

2006-11-11 13:52 UTC:
- only Ajaxy?

2006-11-11 13:54 UTC:

Does Firefox check if you add duplicate bookmarks?

2006-11-11 14:42 UTC:
- Joyful Little Tabulator Source Rant

Summary: the Tabulator source code is slightly gasworksian, but then I end up thinking about prototypes and behaviourality, especially with respect to Pluvo. I was thinking about "indexable!" being a kind of trait which you can pass along to a constructor to get it to enable some kind of behaviour in the instance.

2006-11-11 15:16 UTC:

Me on building intelligent IRC bots: "well I thought about it from the reverse angle | when you have a bot, how can you tell that it's not human? | what properties of its discourse let you know that it's not human, in other words? | and I figured there are two main things: | 1) Response time. | 2) Ability to remember based on context. | there are lots more, actually, but those are the most trivial, the first things that tend to show, I think | solving 1) is really easy: add an automatic thinking and typing delay | solving 2) is way, way harder | I'm not sure how to solve it, but being able to remember stuff that's gone on and act is rather an important question | indeed, a more important question is: why do we want bots that seem intelligent? | we have lots of intelligent agents already in the form of people | I think it'd be an interesting technical challenge to write a bot that passes off as a really, really stupid IRC user | I wonder if that'd be possible | it's the Turing Test with a spin!"

(Pipes indicate IRC caesura.)
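
The "automatic thinking and typing delay" for problem 1) could be as simple as the following sketch. The function name and the default speeds are guesses for illustration, not measurements of anyone's typing:

```python
def reply_delay(message, think_seconds=1.5, chars_per_second=6.0):
    """Seconds a bot might wait before sending message.

    A fixed pause for "thinking" plus a pause proportional to the
    length of the reply for "typing". Both defaults are arbitrary.
    """
    return think_seconds + len(message) / chars_per_second


reply = "it's the Turing Test with a spin!"
print("wait %.1f seconds, then say: %s" % (reply_delay(reply), reply))
```

A bot would sleep for that long before answering. Problem 2), remembering context, isn't touched by this at all.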

2006-11-11 15:19 UTC:

How about PUNCTUS ELEVATUS to indicate an IRC caesura?


Also, using "date: " rather than "- date" might be more clear.

2006-11-12 09:41 UTC:

2006-11-12 14:57 UTC:

Public vs. private, topic vs. topic is not a distinction that I used to make before I knew about the internet; all of my notes were private, and organisation didn't particularly matter since it was only for my benefit. Now that I try to make as much public as possible, I am faced with the continual stumbling block of presentation. It would be nice to just write notes again without having to worry about their content, without having to worry about whether I should publish them in case I later want to make a book out of them or whether some copyright somewhere is being violated. My commonplace should be available to everyone, but that's not really possible in a society that isn't too generous about doling out that freedom.

In my early days on the internet, I felt a lot more excited about the whole thing because, for example, I didn't have to worry about clean URIs. My main worry was not having to pay for it: I felt the web should be free. Instead, I conceded to the whole notion of having to pay for a domain and a server, and I started to worry about presentation and so on. All of this should be a solved problem... like it or not, inamidst and the other mess of sites that I have *are* the conceptual library that I want to build, the idea that I keep returning to of starting anew now that I've learned more about organisation and presentation.

I guess there are different expectancies. The technology community expects me to have nice URIs and use subversion. The earth lights and antiquarianism communities expect me to publish papers and books. Even the kind random people who are interested in what I do would probably prefer me to have a weblog so that they can follow what I'm doing more easily. Perhaps that latter is a "solution" whereas the former two are problems.

2006-11-12 15:07 UTC:

It occurs to me that this file is a block oriented IRC channel that's not going across IRC and is instead being exported/logged directly to HTTP.

2006-11-13 13:00 UTC:
- [Draft] Process Suggestion: EO in TR: Don't

2006-11-13 15:34 UTC:

<sbp> I'm thinking about nuking miscoranda and starting a new tech weblog
<sbp> since I've got a huge list of topics I want to write about and am not
 making progress on them
<danja> I wouldn't nuke miscoranda though, it does work and you might want to
 go back to it one day...
<sbp> oh yeah, I'll just leave it dormant
<danja> right
<sbp> actually I was thinking about HTTP redirecting the feeds to the new blog
<sbp> since I think most people are subscribed to miscoranda for its tech rants
<sbp> whereas I sorta just wax lyrical about whatever on there
<danja> fun reads those though
<sbp> thanks. does my head in to have both tech and non-tech in the same place
<danja> I guess if you felt wax-inclined you could always merge, Planet SBP or
<sbp> heheh. yeah
<sbp> is a Planet of Planets a Solar System?
<sbp> anyway

2006-11-14 16:38 UTC:

"what I've been thinking about is: what do people want to find out in a Semantic Search Engine? | and I think actually you don't need to store all of the triples | for example, one query that's useful is: is the range of property ?x a Literal or Resource? that's pretty important to know if you have an RDF editor, for example | so you'd have to index that specifically when you crawl in order to be efficient on looking it up | and you could do it very cheaply: ?p :literals "500"; :resources "2" . | you don't have to store all the triples where it's a literal and all where it's a resource | another use case I was thinking of was namespace prefixes. what prefixes do people use for various namespaces? again, you'd have to index that separately of triples | one rather more triple lookup oriented thing is to find existing properties. I've often wanted this when working on new projects: should I use an existing date property, or make up my own? | what I need to know there are: a) what are the most widely used date properties in existence currently; and b) what do they actually do compared to what I need them to do? | a search engine would be better at finding the former, and should give a link to the latter | then I was wondering about Julie-like things, and how to provide that service when you have, say, 500 million triples. What Would Google Do! | and I figured that actually, perhaps you could just index where various terms appear |so if contains :sbp foaf:nick "sbp"... | all you do is note the fact that "sbp" is in that document (as an object, perhaps), foaf:nick, and :sbp | then when you do a query, you just look for documents that mention all of those in the correct roles | now, of course, at that point you might not get the right answer because I might have :StateBankPakistan rdfs:label "sbp" . :crs foaf:nick "crschmidt" . 
it'd be a false positive | but I wonder if, in the general case, that'd be useful enough when you also incorporate a PageRank algorithm for RDF, which I've also been thinking about | basically documents which use well known properties get higher rankings, where a well known property is one used on lots of different sites, lots of different times, and has been around for quite a while | so I figure that in some *handwave* way it might be a good way to explore the Semantic Web, perhaps with a decent UI on top of it like Tabulator only done properly... | the important thing is to ground it in real use cases, so, for example, if I want to find out what conference DanC is at now. he publishes his data, but how do I find that? with Google, or would a Semantic Search Engine do it better? if not, what about when I have a start point for the data? then would a Semantic Search Engine help? | I think if you know the properties that people are using for such stuff then it would definitely be | so the idea is that it would help people to keep using the same properties as one another, and then people would start to remember them as much as they would remember that is news and is comedy | and all without having to index the entire amount of triples that you crawl... | I guess the moral in all of this is that the RDF isn't just some big distributed RDB. it's decentralised and it's messy and it's new, and you have to treat it like that | possible downside: people don't know how to use the Semantic Web. I don't want to give people a pencil and have them clean their ears out with it"

- From IRC, chatting to crschmidt
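
A rough sketch of the two cheap indexes described above: per-property counts of literal versus resource objects, and a note of which terms appear in which roles in which documents. Everything here is hypothetical (the document names, the sample statements, the function names); it only illustrates the idea of not storing the triples themselves:

```python
from collections import Counter, defaultdict

# Hypothetical crawl output:
# (document, subject, predicate, object, object_is_literal)
CRAWLED = [
    ("doc1", ":sbp", "foaf:nick", "sbp", True),
    ("doc1", ":sbp", "foaf:knows", ":crs", False),
    ("doc2", ":StateBankPakistan", "rdfs:label", "sbp", True),
    ("doc2", ":crs", "foaf:nick", "crschmidt", True),
]


def build_indexes(statements):
    """Build the two summaries without keeping any triples."""
    range_counts = defaultdict(Counter)  # predicate -> literals/resources
    term_roles = defaultdict(set)        # (role, term) -> documents
    for doc, subj, pred, obj, is_literal in statements:
        range_counts[pred]["literals" if is_literal else "resources"] += 1
        term_roles[("subject", subj)].add(doc)
        term_roles[("predicate", pred)].add(doc)
        term_roles[("object", obj)].add(doc)
    return range_counts, term_roles


def candidate_documents(term_roles, subject=None, predicate=None, obj=None):
    """Documents mentioning every given term in the given role."""
    wanted = [("subject", subject), ("predicate", predicate), ("object", obj)]
    sets = [term_roles.get(key, set()) for key in wanted if key[1] is not None]
    return set.intersection(*sets) if sets else set()


range_counts, term_roles = build_indexes(CRAWLED)
print(dict(range_counts["foaf:nick"]))
print(sorted(candidate_documents(term_roles, predicate="foaf:nick", obj="sbp")))
```

Asking for foaf:nick with the object "sbp" returns both documents, though only doc1 actually says it: doc2 is the false positive, since the index records roles per document and not per triple. Adding the subject :sbp to the query narrows it to doc1.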

2006-11-15 10:41 UTC:

Discussing Beautiful Code with Javier...

Aaron: Feed Validator, djb's code...
PPR: emacs, tex, scheme, vms, mumps, zork, lisp
sbp: nxml-mode, Exile map generator, kt's C backdoor, MJD's

MediaWiki for Swhack/Pushing Buttons/+?
Freenet for the web

2006-11-15 17:45 UTC:

Check out PSIs and foaf:primaryTopic.

2006-11-17 11:38 UTC:

What data do I publish?

What data should I publish?

(Or, what data would I publish if it were easier?)

Hmm. That's not all that much.

2006-11-17 11:50 UTC:

Have an attic for stuff I'm reluctant to publish because of its quality? Could be a good idea, even though its semantics are slightly different from the development equivalent.

2006-11-17 16:41 UTC:

Javier suggests writing a Turing Incomplete post about domain name prices, but I tell him that it wouldn't be a very long one and is more suitable for here, so...

Domain name prices suck, but it turns out that you can get a domain name for $6.89 per year for a ten year period, plus 25c/year ICANN fees, which turns out to be only about 37 GBP. That's not bad. This is for a .info domain, which is cheaper than all the other generic domains... I wonder why? Perhaps because .info is less commonly used still. For the other domains, it's probably worth doing a similar thing: just register them for lots of years.

The gamble is that the prices will rise with inflation in future rather than with sanity. In other words, domain names are really still extravagantly overpriced, but as long as they stay that way, doing a kind of bulk registration into the future will be beneficial.
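
The arithmetic behind the ten year figure, as a quick check. The dollar and pound amounts are the ones quoted above; the exchange rate isn't stated anywhere in these notes, so the sketch only derives the rate that the "about 37 GBP" figure implies:

```python
from decimal import Decimal

YEARS = 10
REGISTRATION_USD = Decimal("6.89")  # per year, as quoted above
ICANN_FEE_USD = Decimal("0.25")     # per year, as quoted above
QUOTED_GBP = Decimal("37")          # "about 37 GBP", as quoted above

total_usd = YEARS * (REGISTRATION_USD + ICANN_FEE_USD)
implied_usd_per_gbp = total_usd / QUOTED_GBP

print("total: $%s over %d years" % (total_usd, YEARS))
print("implied rate: %.2f USD per GBP" % implied_usd_per_gbp)
```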

2006-11-19 13:19 UTC:
- Opening a .app from the command line
Summary: open /Applications/

2006-11-20 09:52 UTC:
- Arithmetical Croquet
From /The Diaries of Lewis Carroll/, ed. R. Green (1954).

2006-11-20 09:59 UTC:
- only seems to apply to msnbot, not much of a big deal

2006-11-20 11:32 UTC:
- Abstract Syntax Notation 06

Sandro Hawke's new grammar thing.

Grammar <- Regexps, ABNF
  Types <- Algol, Pascal, Modula3
 Syntax <- Python
   URIs <- Notation3

What would the perfect Semantic Web format look like? Some melding of JSON, Notation3, and Pluvo? That points to some triple-levelled language which has transparent semantic datatypes, a declarative semantic level, and a more procedural programming level.

On the other hand, what about nonce formats? For example, I came up with the following to capture Will-o'-the-wisp synonyms:

Ignis fatuus
   Ignes fatui
      2006-04-06: 12100
   YYYY-MM-DD: 11600
   2006-04-06: 157000
   @ln singular

It's a kind of indented, key based format. So what about grammar annotations like Sandro was working on way back in 2001? Or something like Perl6 where you have an extensible core (you can define things like new traits and operators, even circumfixes and so on).
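
The indentation in the Will-o'-the-wisp example is already enough to recover a tree. A minimal sketch of a reader for it, assuming three spaces per level as in the example; the function name and the node layout are invented for illustration, and it makes no attempt to interpret the keys or the @ln line:

```python
SAMPLE = """\
Ignis fatuus
   Ignes fatui
      2006-04-06: 12100
   YYYY-MM-DD: 11600
   2006-04-06: 157000
   @ln singular
"""


def parse_indented(text, indent=3):
    """Return a list of {'text': ..., 'children': [...]} nodes."""
    root = {"text": None, "children": []}
    stack = [(-1, root)]  # (depth, node) pairs for the open ancestors
    for line in text.splitlines():
        if not line.strip():
            continue
        depth = (len(line) - len(line.lstrip(" "))) // indent
        node = {"text": line.strip(), "children": []}
        # Close every open node that is not an ancestor of this line.
        while stack[-1][0] >= depth:
            stack.pop()
        stack[-1][1]["children"].append(node)
        stack.append((depth, node))
    return root["children"]


tree = parse_indented(SAMPLE)
for child in tree[0]["children"]:
    print(child["text"], len(child["children"]))
```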

On the schema language used, note how RDFS and OWL are very strict and classical, using Classes instead of Prototypes. And RDF Datatypes are strict like static typing rather than dynamic or duck typing. What if there were a format which was more flexible and dynamic?

2006-11-20 12:05 UTC:

@graph ${URI}Graph
   dc:title "RDF/XML Syntax Specification (Revised)"
   doc:editor {
     foaf:name "Dave Beckett"

2006-11-20 12:21 UTC:

A while ago, Mark Nottingham wrote about JSON and XML:

It made me wonder why Classes and Datatypes are separated not only at the syntax level in RDF, but even at the ontology level. How about a more flexible approach to defining new datatypes syntactically?

For example, let's say we have a format which doesn't have a date datatype, and we want to extend it to include one. Let's say that the range of :modified is a :Date thing, and we want to extend Literal to include it. This was one of the actual considered approaches to datatypes in RDF, I recall, and it didn't make it: rather a shame!

Anyway, in the N3 syntax, we have "literals" but not extensible barenames. It'd be easy to collect \S+ and then put it through some filters. 2006-10-25 would be good to use as a date.

<> :modified 2006-10-25.
:modified rdfs:range [ rdfs:subClassOf rdfs:Literal ],

I think DanC was working on something like this with @datatype.

17:07:10 <DanC> hmm... @namepattern "\d\d?[A-Z][a-z][a-z]\d\d\d\d" <>.
17:08:17 <DanC> or @datatype "[+-]?\d+" <http://...#integer>.

Which points to
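
The "collect \S+ and then put it through some filters" idea could look something like this sketch. The first two patterns are copied from DanC's lines above; the third, for dates like 2006-10-25, and all of the type names are assumptions made up for the example:

```python
import re

# (type name, pattern) pairs, tried in order. The names are placeholders.
PATTERNS = [
    ("namepattern-date", re.compile(r"\d\d?[A-Z][a-z][a-z]\d\d\d\d")),
    ("integer", re.compile(r"[+-]?\d+")),
    ("iso-date", re.compile(r"\d{4}-\d{2}-\d{2}")),  # assumed, not DanC's
]


def classify(token):
    """Return the name of the first pattern matching the whole token."""
    for name, pattern in PATTERNS:
        if pattern.fullmatch(token):
            return name
    return None


for token in "19Nov2006 -42 2006-10-25 sbp".split():
    print(token, classify(token))
```

Anything that matches no pattern comes back as None, and would presumably be left to the ordinary barename or error handling.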

2006-11-20 14:01 UTC:

The 19Nov2006 format is pretty neat (cf. ShorthandRDF).

2006-11-20 14:39 UTC:

Benjamin Franklin was a schemer-- a schemer for good works. He wanted to start a hospital, for instance, and proposed to find private donors and get matching funds from the state. (When he had a proposal, by the way, he liked to be modest about it-- he'd say "some gentlemen friends" were proposing it. This softened personality-based resistance... and generally he'd ultimately get the credit anyway.) Noting that many legislators scoffed at the notion that he could privately raise 2000 pounds, he proposed a conditional appropriation, to be made only if this figure was raised. Now many assemblymen could vote for the measure convinced that they would never have to pay; and Franklin could tell subscribers that their contribution would be doubled by the Assembly. P.S.: He got his hospital.
Autobiography, Benjamin Franklin

2006-11-20 16:24 UTC:

I've just taken wordcounts of ten of the most awesome articles online, and found that interestingly the average came out to be very, very close to 5000: it was 4952. The range is from about 2000-10000, so it's like 2-10 What Planet essays, centering on 5. It'd be nice to write some more articles of this kind of length; I rarely do so.

2006-11-21 09:42 UTC:

John Cowan requests that I reveal the top ten articles that I wordcounted yesterday. They are, in no particular order:

They're all quite long, which contrasts heavily to the kinds of articles that I write, only two of which come up to even the required length, let alone the quality:

What Planet essays were generally up to about 1000 words in length, so it'd just be like stringing five of them together, but on the other hand they were quite compact (cf. writing44.html in the grande list of ten).

In any case, I still think that 5000 is a good length to aim for for an awesome article, so I've been trying to identify topics that I could write that much on, and it turns out that there aren't really many. Of course, if you look at the Grande List, it's telling that only one person has two entries on there, Mark Rosenfelder. Really, really good articles are rarities.

Still, I don't think it's a bad thing to want to aim to produce really cool and useful works online!

2006-11-21 16:32 UTC:

Old ideas include...

2006-11-21 16:43 UTC:

On vislangs:

2006-11-21 16:44 UTC:

2006-11-23 14:01 UTC:

I was reading 1 Henry IV last night, and I was thinking about Shakespeare's use of glittering, whirling verbs. I was trying, a bit like Franklin after the Spectator, to come up with similar usage patterns, and set myself "he ran across the field" as an example... the first verb that sprang to mind was "he pegged it across the field", which is common slang in my dialect.

So I looked up peg, v., in the OED and to my surprise it's been around for quite a while, since at least the mid-18th century and possibly earlier if it was as slang a term then as it is considered now:

10. a. intr. and (in later use) trans. with it. colloq. To move vigorously or hastily. Also with away, off.

1748 T. SMOLLETT Roderick Random I. xxvii. 247 The captain, by his sole word and power and command, had driven sickness a pegging to the tevil, and there was no more malady on poard. 1808-18 J. JAMIESON Etymol. Dict. Sc. Lang., To Peg off, or away, to go off quickly.

From the second quote it looks as though it was current in Scotland, though this usage is probably in the definition and maybe the author wasn't only familiar with Scots English. I don't really follow the etymology of the phrase from the OED though. It puts the sense under "II. Other uses", and apart from perhaps peg in the sense of throw a missile, it doesn't really relate to anything else. Perhaps a bit of an anomaly.

2006-11-24 09:56 UTC:

Yet another quite established word that I thought was more likely to be recent, local slang:

grizzle, v2
2. To fret, sulk; to cry in a whining or whimpering fashion. Hence 'grizzling
vbl. n. and ppl. a.
1842 Catnach Ballad in Westm. Gaz. 7 Apr. (1899) 2/2/ Useless is our grumbling,
our grizzling, or mumbling.


On the other hand, the OED does say of the term that it's "local". The Westm. Gaz. must be the Westmoreland Gazette, Westmoreland being a county that's now been subsumed into Cumbria. One of the later quotes is from the "Kentish Gloss." though, and sense one of the verb, to grit the teeth, is from the westcountry. So by local they appear to mean "local to England"!

2006-11-24 10:21 UTC:

Ugh, now that Flickr has been Yahooinated, it prompts for login on, presumably, private pages. So now I have a Flickr account, which saddens me; though I have to admit that Flickr is very, very handy when looking for content to use in things like

Oh and another good thing is that it allows exclamation marks in the username, which is rather unusual.

2006-11-24 19:24 UTC:

Ooh! Notation3 editor + Ajax + Tabulator's RDF engine + RDFPath + RDFe...

2006-11-24 19:41 UTC:

@@ Ways of dealing with subgraphs returned as path matches.

2006-11-25 11:27 UTC:

On paths vs. regular expressions, the following:


Is rather like /<node()>(?=<rdf:type>)/ in regexp terms. Term1/Term2/Term3 is a bit like /(?<=<Term1><Term2>)Term3/, so it's kinda strange that it should be so common a construct in paths; it's not in regexp. I wonder about a slightly different syntax for paths... tests and forward traversals being so common that they should be huffmanised somehow. "/" (for forward traversals) could become a simple space, " ", and "[...]" for tests could become "& ..." or something, though I'm not sure if that's any clearer.

   *[rdf:type/foaf:Person foaf:name/"Sean B. Palmer"]/foaf:knows/

Would be nice as...

   @prefix default foaf.
   * (a Person; name "Sean B. Palmer") knows * (a Person) mbox *

Perhaps *(...) -> [...].

   [a Person; name "Sean B. Palmer"] knows [a Person] mbox *

That's rather neat and intuitive; very N3-like too.
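
The regexp comparison at the top of this entry can be poked at directly in Python. This is only a sketch: flattening a path into a string of <...> tokens is an assumption made for illustration, not any real path serialisation.

```python
import re

# Assumption: a walk through the graph is flattened into a string of
# <...> tokens, purely to illustrate the lookaround analogy.
walk = "<Term1><Term2><Term3>"

# Term1/Term2/Term3 as a lookbehind: match Term3 only when the two
# preceding steps were Term1 and Term2.
traversal = re.compile(r"(?<=<Term1><Term2>)<Term3>")

# A test such as *[rdf:type] as a lookahead: match any node that is
# immediately followed by an rdf:type arc, without consuming the arc.
test = re.compile(r"<[^<>]+>(?=<rdf:type>)")

hit = traversal.search(walk)
miss = traversal.search("<Other><Term2><Term3>")
node = test.search("<sbp><rdf:type><foaf:Person>")
```

Here hit matches the final token, miss is None because the prefix differs, and node matches only the token sitting in front of the rdf:type arc.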

2006-11-25 13:12 UTC:

Part of my idea about N3 editing yesterday was that one could have a kind of TurtlePath, which would be to Turtle what XPath is to XML: unlike RDFPath, it would allow you to traverse the syntax rather than the semantics.

This might be a good idea given that people organise things in Turtle files in particular ways. For example, you might split documentation and ontology data into two sections. This could extend to fairly arbitrary divisions in the syntax, so I might have people I know most at the top of my FOAF file, and other acquaintances at the bottom.

In the graph, there would be no syntactic distinction between groups, even though there is in the file. So an editor that used a graph traversal mechanism would be losing data: data which is put in there for convenience of editing!

2006-11-26 10:40 UTC:

<sbp> blufr: heh, awesome. it's like Call My Bluff
<d8uv> We should try to bluff eachother
<d8uv> sbp: The creator of Python has an irrational fear of dish soap, and thus
 must throw away his dishes after he's done with them
<sbp> ...probably true!
<sbp> d8uv: Jim Carrey once got into trouble with the Birmingham Metropolitan
 Police for not having bushy enough eyebrows
<d8uv> False. He's afraid of soap in general, and is only barely smellier than
 other open-source language developers
<d8uv> And that's false
<sbp> correct. Jim Carrey has NEVER been in trouble with the police
<d8uv> sbp: Giraffes can talk
<sbp> to humans, or to one another?
<sbp> I'll go for false, either way
<sbp> d8uv: Munchkins and Oompa-Loompas are actually the same creatures, only
 exposed as different dimensional corners
<d8uv> The answer is true, however, their voice is too high-pitched for people
 to hear
<d8uv> Munchkins and oompaloompas...
<d8uv> I'll go with true
<sbp> correct
<sbp> as far as anyone can tell
<sbp> damn, losing 2-0 at the moment
<d8uv> Ok
<d8uv> Lemme get a good one
<d8uv> Coca-Cola was originally invented as a herbicide
<sbp> false!
<sbp> Pepsi is just Coca-Cola with added sugar
<d8uv> False. It was made as an insecticide
<sbp> ooh
<d8uv> 2-1 now
<sbp> yay
<sbp> don't forget to answer mine
<sbp> I already have the answer typed out
<sbp> so you know I'm not cheating
<sbp> I will paste it in when you answer
<d8uv> Oh I thought you were explaining your answer
<sbp> nope
<d8uv> True
<sbp> it's false. Coca-Cola is Pepsi with added sugar (that's why it tastes
<sbp> 2-2!
<sbp> we should market this
<d8uv> Or put it in whit
<sbp> oh right, good idea!

2006-11-26 11:14 UTC:

So in Morbus' List of Sri Lankan musicians ( he links to something called Managing User Creativity:

For some reason it made me think about Iceland. I'd like to do some kind of interactive game online that kinda tries to mirror society on an island the size of Iceland or so, only I'm not sure how to get all the interactions working and such.

I was thinking about having a kind of coordinates system, and you'd subscribe to one particular set of coordinates, and then start writing about shit you're doing there. Then if other people were nearby, you'd hear some of their posts too, their writings, depending on how far away they are from you: the further away they are, the smaller the percentage of their writings you'd hear.

But that's a kind of silly idea... I dunno, I was trying to think of a kind of adult version of I guess. It got me eventually thinking about how there is already a kind of online society, and trying to make fantasy games is almost like trying to make micronations: for a micronation to work you need territory and consensus. To make some kind of "online nation", there's just no way to succeed to any degree of reality on the former point, but on the consensus point, MMORPGs and things like Second Life do pretty well.

If someone created a text-based Second Life that didn't suck, that might go down well. MOOs were probably intended to be that... I don't like MOOs particularly, so they don't qualify under "don't suck" for me.
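
Going back to the coordinates idea for a moment, the "further away, smaller percentage" rule might look something like this. It's a rough sketch: the falloff formula, the falloff constant and the function names are all assumptions, and any curve that shrinks with distance would do.

```python
import math
import random

def hearing_chance(listener, writer, falloff=10.0):
    """Chance (0..1) of hearing a post; 1.0 on the same spot, less further off.

    The 1 / (1 + d / falloff) curve is an arbitrary choice for illustration.
    """
    distance = math.hypot(listener[0] - writer[0], listener[1] - writer[1])
    return 1.0 / (1.0 + distance / falloff)

def heard(posts, listener, rng=random):
    """Filter (coordinates, text) posts down to the ones this listener hears."""
    return [text for (where, text) in posts
            if rng.random() < hearing_chance(listener, where)]
```

With the default falloff, someone 50 units away hears roughly one post in six, and someone on the same coordinates hears everything.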

2006-11-26 11:30 UTC:

Septentrionalists Unite!

2006-11-26 11:31 UTC:

Heh, the constellations are like the Rorschach Test of civilisation.

Cf. "The constellation of Ursa Major has been seen by many distinct civilizations as a bear. [...] In earlier times, Greek mythology did not consider Ursa Major a bear, and instead its 3 bright stars (situated in the tail) were seen as apples growing on a tree (sometimes represented by the fainter stars in the remainder of the constellation)."

2006-11-26 11:41 UTC:

The Medievals did not believe in a flat earth; they believed it was spherical, and writers such as Mandeville and Dante show that they understood that down points in different directions in different parts of the earth. They also knew that the universe was immense-- the stars, for instance, were said to be larger than the earth; and the South English Legendary says that if a man could travel upward at forty miles a day, he could not reach the fixed stars in 8000 years. (That is, they're more than 116 million miles high.)
- The Discarded Image, C.S. Lewis

2006-11-26 11:44 UTC:

Thou divine Nature, how thyself thou blazon'st
In these two princely boys! They are as gentle
As zephyrs blowing below the violet,
- Shakespeare; Belarius, Cymbeline IV.ii

2006-11-26 11:50 UTC:

"What makes portly Sir John so entertaining? How is it, when his actions would repulse many in both a modern and medieval context, we find ourselves so attracted to this lying tub of lard? Speculation over the years has produced many possible answers, one no more likely than the next."
- Kenneth MacLeish, Longman Guide to Shakespeare's Characters (1986)

2006-11-27 10:01 UTC:

Even though I'm subscribed to an assload of friends in my aggregator (Google Reader), they're mostly of the serious "dude, we blogging" variety rather than the "LOLLISH DONGS" comic variety, so except on one of those rare occasions when d8uv is blogging I don't tend to chuckle at the posts. This morning, on the other hand, I got rather a chuckle from Pat Hall...

Excuse me but how do you say "Airbag" in Finnish?
- from Hacklog: Blogamundo by Patrick Hall

   How DO you say "airbag" in Finnish?

   Update: Someone in the know tells me it's /turvatyyny/, which
   means "safety pillow." Phew, I feel better now.

- excuse-me-but-how-do-you-say-airbag-in-finnish/

2006-11-27 10:06 UTC:


"Sense of 'mild breeze' is c.1610."

So Shakespeare's use in Cymbeline may be what is being referred to here...

"It is believed to have been written around 1609."

He says that the "zephyrs blow", so unless he means that they suck and he's the daddy of modern rocks/rules/sucks/blows slang, he must be using it in the mild breeze sense.

2006-11-27 13:37 UTC:

On Javier looking up U+1368 in phenny...

<sbp> okay it really bugs me that OS X doesn't come with a decent font covering
 a substantial portion of Unicode

Especially since I'd normally use

But the site is down at the moment.

2006-11-28 19:38 UTC:

Thinking about RDFPaths for editing again: in a striped syntax, "*term" could type something as being an arc... it saves you a space over the normal "* term" to get into arc typing. Silly striping. Also, I think that "->" would be better to get a subgraph than surrounding with "{" and "}", since that way you don't have to go back to the beginning to do it. Or "}" could be a synonym... hmm.

2006-11-28 20:31 UTC:

Better CWM N-Triples normaliser:

   sed 's/^ *//g; s/  */ /; s/ *$//g; /^ *$/d'
   e.g. $ cwm --rdf sbp/foaf.rdf --ntriples | sed 's/^ *//g; s/  */ /;
            s/ *$//g; /^ *$/d' | sort > ~/foaf.nt

Works as long as there aren't literal subjects; and because CWM seems to get the second spacing right (i.e. it uses a single space after the predicate).
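
For comparison, roughly the same normalisation in Python. A sketch only: the function name is made up, it only handles spaces (like the sed), and sorted() orders by code point, which may differ from what sort(1) does under your locale.

```python
import re

def normalise(lines):
    """Mimic the sed + sort pipeline on an iterable of N-Triples lines."""
    out = []
    for line in lines:
        # s/^ *//  and  s/ *$//  (the newline is dropped first)
        line = line.rstrip("\n").strip(" ")
        # /^ *$/d
        if not line:
            continue
        # s/  */ /  without a g flag: only the first run of spaces
        line = re.sub(r" +", " ", line, count=1)
        out.append(line)
    return sorted(out)
```

Like the original, it leaves every space after the first run alone, so it leans on CWM getting the later spacing right.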

2006-11-29 11:03 UTC:

Since [prop objt] is like a lookahead, it's comparable to objt < prop < *

So, looking up contact details for people...

   [nick "crschmidt"] name|email|mbox|etc. *

Is there a way to compact that predicate query? Perhaps foaf:*? Perhaps there should be a class called foaf:ContactProperty, so then you could do:

   [nick "crschmidt"] [a ContactProperty] *

You could even define a similar thing in a file of your own and load it in.
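
A toy evaluation of those two queries over a list of triples, to see whether the ContactProperty idea holds together. Everything here is assumed for illustration: the function names, the use of bare strings for terms, and the data, which is placeholder rather than anyone's real contact details.

```python
# Placeholder data; only the nick comes from the query above.
triples = [
    ("_:x", "nick", "crschmidt"),
    ("_:x", "name", "Placeholder Name"),
    ("_:x", "mbox", "mailto:placeholder@example.org"),
    ("_:x", "knows", "_:y"),
    ("_:y", "nick", "someoneelse"),
    ("name", "a", "ContactProperty"),
    ("email", "a", "ContactProperty"),
    ("mbox", "a", "ContactProperty"),
]

def subjects_with(triples, prop, objt):
    """[prop objt]: every subject carrying this property/value pair."""
    return {s for (s, p, o) in triples if p == prop and o == objt}

def follow(triples, nodes, props):
    """nodes p1|p2|... *: objects reached from nodes over any of props."""
    return sorted(o for (s, p, o) in triples if s in nodes and p in props)

def props_of_class(triples, cls):
    """[a cls] in predicate position: every property typed as cls."""
    return {s for (s, p, o) in triples if p == "a" and o == cls}

start = subjects_with(triples, "nick", "crschmidt")
explicit = follow(triples, start, {"name", "email", "mbox"})
by_class = follow(triples, start, props_of_class(triples, "ContactProperty"))
```

Both forms return the same objects, so the class version is just a way of not spelling out the alternation each time.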

So, this is navigation through a Graph: a kind of topological exploration syntax. But you also want to be able to manipulate the Graph. For example, foaf:knows doesn't differentiate between people you've met in real life and those you know online. I might want to tab through all of my contacts:

   #include <>
   [nick "sbp"] foaf:knows *

Display them by credentials--a cascade of name, nick, email--and then skip through them adding ?person a :Online, or a :Physical, or both.

So we're talking about browse and edit modes for a Semantic Web editor, or rather agent. But there's a third mode: reason.

Browse -> Find and step through data
Edit -> Add and remove data
Reason -> Infer new data from old data

The BER theory of Semantic Web UI, I guess.

I remember Seth going mad because people didn't use dc:format hints on their rdfs:seeAlso arcs. Of course, dc:format is too generalised anyway; we could do with a subPropertyOf it. RDF profiling is probably important too: one could do root namespace sniffing to see if there's likely to be FOAF involved, I guess, but then that's a general discovery problem.

We'd also need extensible interface semantics.

Perhaps RDF is like SGML on the early web, and we're lacking an HTML. I mean, we have rdfs:seeAlso and dc:format, but they suck in expressing intent. A real link language for the Semantic Web (Semantic Web Link Language... SWeLL? Swell) would provide better roles and purposes and whatnot. Perhaps even a behavioural aspect.

2006-11-29 11:24 UTC:

How do we encourage short but unique names on the Semantic Web? Perhaps by using reversed domain names for properties. Perhaps even a URI scheme for it:

This is rather related to the TAG standardised fields issue.

2006-11-29 12:09 UTC:

Okay, I've expanded on the idea in the following:
   Use Case/Story and Suggestions on standardizedFieldValues-51
   Date: Wed, 29 Nov 2006 12:07:33 +0000

It basically suggests three ways of solving standardizedFieldValues-51:

1) Use reverse domain names in the URI space.
2) Invent an algorithm for human readably compressing HTTP URIs.
3) Invent a new DNS-like identifier space.

But suggests that ultimately, the way to solve it may be implementation based, especially on the Semantic Web.
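
Option 1 could be as simple as the following. This is a guess at one possible mapping, not the scheme from the message: the function name and the example.org URI are hypothetical, and the result isn't reversible, since dots in path segments become ambiguous.

```python
from urllib.parse import urlsplit

def reverse_domain_name(uri):
    """Turn an HTTP URI into a dotted, reversed-domain style name."""
    parts = urlsplit(uri)
    # vocab.example.org -> org, example, vocab
    labels = parts.hostname.split(".")[::-1]
    # Non-empty path segments, then the fragment if there is one
    segments = [seg for seg in parts.path.split("/") if seg]
    if parts.fragment:
        segments.append(parts.fragment)
    return ".".join(labels + segments)

short = reverse_domain_name("http://vocab.example.org/contact#nick")
```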

2006-11-29 12:52 UTC:

Taglyphs... It'd be cool if one could generate 32x16 bitmaps of geometric shapes and symbols from the (domain name, date) tuple in tag: URIs.
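
One way it might work, as a sketch: hash the tuple, use the 256 bits for the left half, and mirror it to get something vaguely geometric. The choice of SHA-256, the mirroring and the function names are all assumptions.

```python
import hashlib

def taglyph(domain, date):
    """Return a 32x16 grid of 0/1 values derived from (domain, date)."""
    digest = hashlib.sha256(f"{domain},{date}".encode("utf-8")).digest()
    # 32 bytes -> 256 bits, enough for a 16x16 left half
    bits = [(byte >> i) & 1 for byte in digest for i in range(8)]
    rows = []
    for y in range(16):
        left = bits[y * 16:(y + 1) * 16]
        rows.append(left + left[::-1])  # mirror to fill 32 columns
    return rows

def render(rows):
    """Draw the grid with # for set bits and . for clear ones."""
    return "\n".join("".join("#" if bit else "." for bit in row) for row in rows)

glyph = taglyph("example.org", "2006-11-29")
```

print(render(glyph)) gives a rough idea of what comes out. The same tuple always gives the same glyph.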

2006-11-29 13:01 UTC:

for person in {[nick "sbp"] knows *}
 @show person [is cascade of my:FoafContactProps] *
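
In plain Python over a list of triples, that loop might come out as below. A sketch under assumptions: the cascade order, the function names and the data are placeholders, and "cascade" is taken to mean "show the first property in the list that has a value".

```python
# Placeholder data; terms are bare strings for brevity.
triples = [
    ("_:me", "nick", "sbp"),
    ("_:me", "knows", "_:a"),
    ("_:me", "knows", "_:b"),
    ("_:a", "name", "Placeholder A"),
    ("_:a", "nick", "aaa"),
    ("_:b", "nick", "bbb"),
]

# Stand-in for my:FoafContactProps: the order to try properties in.
CONTACT_CASCADE = ["name", "nick", "mbox"]

def cascade(triples, node, props):
    """Return the value of the first property in props that node has."""
    for prop in props:
        for (s, p, o) in triples:
            if s == node and p == prop:
                return o
    return None

me = {s for (s, p, o) in triples if p == "nick" and o == "sbp"}
known = [o for (s, p, o) in triples if s in me and p == "knows"]
labels = [cascade(triples, person, CONTACT_CASCADE) for person in known]
```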

2006-11-29 14:22 UTC:

An RDF pattern, from speaking with cygri and danbri on #swig:

   "Reuse properties until it hurts; then change."


The idea is that sometimes it's unclear whether it's suitable to reuse some property or other construct in RDF. But the idea of consensus of properties is central to the well-maintained growth of the Semantic Web. Therefore, one should be optimistic in the grey areas. Flagging this in your documentation might be a good idea, as danbri notes.

2006-11-29 14:59 UTC:

RewriteRule ^(.+)$ azimuth.cgi/$1 [L,QSA]

For some reason, Apache started adding "redirect:$SCRIPT_NAME" to the front of $1 after an apt-get upgrade, so after putting in this kludge to azimuth.cgi:

   # Strip the bogus prefix that Apache now puts on the front of the path
   weird = '/redirect:' + env('SCRIPT_NAME')
   if path.startswith(weird):
      path = path[len(weird):]

We called in The DrBacchus, and he suggested shoving in [NS] to the rewrite, which also worked. No idea why the breakage in the first place though.

2006-11-29 18:15 UTC:

On Resource vs. Literal hints in RDF editors:

<sbp> another thing I'd like people to annotate is whether properties will
 generally have Literals, that is the RDF data type (one of URI, bNode,
 Literal) not the class, as objects or URIs/bNodes
<sbp> since rdfs:Literal is a subClassOf rdfs:Resource, you can't use that
 schema information to find out
<sbp> it's an extremely good hint though...
<sbp> such information would be useful for an RDF editor which types objects
 for you
<DanC> { ?P s:range s:Literal } is valuable information; I don't know why you'd
 want to know whether the object term is likely to be a bNode vs. a URI
<sbp> not bNode vs. URI; Literal vs. bNode/URI. if you have rdfs:range
 rdfs:Literal, you can tell that the object is going to be an instance of the
 class rdfs:Literal, but you can't tell whether it's going to be a URI (<...>)
 or Literal ("...") in the syntax, since literals are a subclass of resource
 and can hence presumably be identified by URIs too
<sbp> actually, thinking about it I guess it doesn't matter. all identifiable
 Literals must have an "..." form
<DanC> right


Sean B. Palmer,