Strange Strands

15 Sep 2006

Noting and Unnoting

I was going to write a miscoranda essay yesterday entituled [sic; I got that from a paper about Benjamin Franklin] "Doing and Undoing", but I can't remember exactly what it was about other than the fact that it was a little analogy. It was basically going to be on the subject of having not written much, and on the basic editorial function of performing a change being captured as a discrete thing that you can undo. I wrote a kind of almost working Palimpsest network editor implementation, but haven't released it yet, and the principle behind that is very strong on these atomic operations.
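
To make the "change as a discrete, undoable thing" idea concrete, here is a minimal sketch. It is not the Palimpsest code, which isn't shown here; the names Insert and Editor are invented for illustration, and it only handles insertion.

```python
from dataclasses import dataclass, field


@dataclass
class Insert:
    """One atomic edit: insert `text` at character offset `pos`."""
    pos: int
    text: str

    def apply(self, doc: str) -> str:
        return doc[:self.pos] + self.text + doc[self.pos:]

    def undo(self, doc: str) -> str:
        # Remove exactly the span that apply() inserted.
        return doc[:self.pos] + doc[self.pos + len(self.text):]


@dataclass
class Editor:
    """Hypothetical editor that records every operation it performs."""
    doc: str = ""
    history: list = field(default_factory=list)

    def do(self, op: Insert) -> None:
        self.doc = op.apply(self.doc)
        self.history.append(op)

    def undo(self) -> None:
        # Undo the most recent operation, if there is one.
        if self.history:
            self.doc = self.history.pop().undo(self.doc)


editor = Editor()
editor.do(Insert(0, "Doing"))
editor.do(Insert(5, " and Undoing"))
editor.undo()
print(editor.doc)
```

Because each operation knows how to reverse itself, undo is just a matter of popping the history; a delete operation would follow the same pattern but would need to remember the text it removed.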

So anyway, one thing that I noticed today is that I can take notes more easily when I don't have to write them as a series. I already knew that I find it much easier to write when people aren't watching me, e.g. writing to a weblog that I know nobody's subscribed to, but I thought that might be the only major factor. It turns out that even the style of a weblog is a hindrance to my writing. I think it comes partly from the fact that it's a write-once medium, so I'm enticed to get it right the first time, which means that I invest too much effort and overparticularise it. It stops me from publishing a very loose half-baked idea with no text, since I don't feel that I can go back and make significant revisions to old posts.

I also feel that the whole idiom of writing serially causes a kind of mush of the output: when you have a discrete idea, you can go into DocumentMode to write about it. But weblogs are more like ThreadMode. ThreadMode is mushy and unordered. Most wikis tend to end up in ThreadMode even though DocumentMode is what they're really novelly good at (Wikipedia is a major exception of course, and proves just how excellent DocumentMode can be; and it seems that more and more wikis are doing the right thing as a result).

This is all pretty much apropos of the Taxonomy of Documents that I'm still working on. One of the categories is simply for low-grade text notes, modelled on a directory that I already have which is growing pretty fast. The idea is that each text file just captures some idea in at least a few words, so I can expound on stuff later. It doesn't matter if I later move it out entirely, or split files up, or whatever; and each file is published using a CGI script to append the last modification time and details about the author and where the text was originally from. It feels much more fun writing those notes than it does to write essays like these. There are useful things documented herein, but therein, in the rough text notes, is where I'm able to capture things in more of a DocumentMode.
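
The publishing step might look roughly like the sketch below. This is an assumption about the shape of such a script, not the actual CGI script: render_note and cgi_response are hypothetical names, and the footer layout is made up.

```python
import os
from datetime import datetime, timezone


def render_note(path: str, author: str, source: str) -> str:
    """Return the note's text followed by a small provenance footer."""
    with open(path, encoding="utf-8") as f:
        body = f.read().rstrip("\n")
    # Last modification time of the file, reported in UTC.
    mtime = datetime.fromtimestamp(os.path.getmtime(path), tz=timezone.utc)
    footer = (
        f"Last modified: {mtime:%Y-%m-%d %H:%M} UTC\n"
        f"Author: {author}\n"
        f"Source: {source}"
    )
    return f"{body}\n\n--\n{footer}\n"


def cgi_response(path: str, author: str, source: str) -> str:
    """Prefix the rendered note with the header a CGI script must emit."""
    header = "Content-Type: text/plain; charset=utf-8\r\n\r\n"
    return header + render_note(path, author, source)
```

Since the footer is computed at request time from the file itself, moving or splitting a note needs no bookkeeping beyond the move or split.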

Perhaps I need to be able to distinguish between minor and major edits, and revisit old files in such a way that they get published in the feed again only if they've been majorly revised, with the old version kept around somewhere, though I'm not sure how that'd work exactly. Of course, my Changes and Updates service covers the diffs and so forth, but I'd like to have good URIs for all of the content, as usual, and a user-friendly and sleek interface to it.
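
One possible way to draw the minor/major line, sketched with Python's standard difflib. The 20% threshold and the name is_major_edit are assumptions chosen for illustration; nothing here reflects how the Changes and Updates service works.

```python
import difflib

# Assumed cut-off: an edit is "major" if more than 20% of the words changed.
MAJOR_THRESHOLD = 0.2


def is_major_edit(old: str, new: str, threshold: float = MAJOR_THRESHOLD) -> bool:
    """Classify an edit by how much of the word sequence changed."""
    matcher = difflib.SequenceMatcher(None, old.split(), new.split())
    changed = 1.0 - matcher.ratio()  # 0.0 = identical, 1.0 = nothing shared
    return changed > threshold


old = "each text file captures some idea in a few words"
typo_fix = "each text file captures some thought in a few words"
rewrite = "weblogs are more like ThreadMode than DocumentMode"

print(is_major_edit(old, typo_fix))
print(is_major_edit(old, rewrite))
```

Comparing words rather than characters keeps a one-word fix in a long note well under the threshold; a feed generator could then republish only when this returns True.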

The other big difference, of course, is that they're text. I've been thinking about a kind of hybrid of Text and HTML, using basically a very, very light wiki syntax that gets converted into HTML. The aim of the format is that one should be able to type up a document without really thinking about the syntax all that much, and it should still be convertible into HTML. It's like semi-directed text mining. But I'm having problems deciding how preformatted sections and headers especially will work. I even wrote a little statistical analysis script to see what the characteristics of headings were compared to short line paragraphs, and couldn't find much of a sensible match. Of course that doesn't mean that I couldn't come up with some sensible rules to conform to, but I'd like to have the syntax be as naturally derived, basically descriptivist, as possible.
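
For illustration only, here is what a very small converter could look like under some assumed rules: blocks are separated by blank lines, a block whose lines are all indented by four spaces is preformatted, and a single short line without closing punctuation is a heading. These rules are guesses rather than the format described above, and they show exactly the ambiguity mentioned: a short one-line paragraph with no full stop would be mistaken for a heading.

```python
import html
import re


def looks_like_heading(lines: list[str]) -> bool:
    """Assumed rule: one short line with no closing punctuation."""
    if len(lines) != 1:
        return False
    line = lines[0].strip()
    return len(line) <= 60 and not line.endswith((".", ",", ";", ":", "?", "!"))


def text_to_html(text: str) -> str:
    out = []
    # Blocks are runs of lines separated by one or more blank lines.
    for chunk in re.split(r"\n\s*\n", text.strip("\n")):
        if not chunk.strip():
            continue
        lines = chunk.split("\n")
        if all(line.startswith("    ") for line in lines):
            # Preformatted: drop the four-space indent, keep line breaks.
            body = "\n".join(line[4:] for line in lines)
            out.append(f"<pre>{html.escape(body, quote=False)}</pre>")
        elif looks_like_heading(lines):
            out.append(f"<h2>{html.escape(lines[0].strip(), quote=False)}</h2>")
        else:
            # Ordinary paragraph: join the wrapped lines with spaces.
            body = " ".join(line.strip() for line in lines)
            out.append(f"<p>{html.escape(body, quote=False)}</p>")
    return "\n".join(out)


sample = (
    "Text Tidy\n\n"
    "Convert plain text files into HTML.\nKeep it heuristic.\n\n"
    "    print('hello <world>')\n"
)
print(text_to_html(sample))
```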

Perhaps I should take all of the plain text files floating about on inamidst.com and try to write a huge and massively heuristic program that can convert all of them into nice HTML. Sometimes, the complex things are just that: complex, and you can't break them down into nice constituents. For example, HTML Tidy is pretty dense, crappy code, because it's dealing with dense, crappy tagsoup. Random text files present a rather similar problem, so they may have to be dealt with in a rather similar way. Text Tidy?

Strange Strands, Noting and Unnoting, by Sean B. Palmer
Archival URI: http://inamidst.com/strands/note
