Strange Strands

14 Jun 2006

Principle of Least Annotation

From about the 27th to the 30th May I was participating in the Kalusa unplanned constructed language project, my dabblings with which I've now published. I'd written some notes previously about Egeslic, which was a code name for a kind of play language that I wanted to work on. With Kalusa, I've been rather reinspired to do that, though the techniques that Kalusa fosters are more widely applicable than even the field of constructed language; I think it would be a useful mechanism for learning an existing natural language too.

It also got me wondering about the divide between natural languages and programming languages, currently traversed mainly by projects such as Lojban on the natural side and AppleScript on the programming side. I think that the reason these projects don't prosper may be related to the reason that little languages, subsets of English used for command input and so on, don't work: because they're in the uncanny valley of language recognition. We always want to use more features than a language subset provides, and it's difficult to remember the arbitrary restrictions. This is why, for example, people always ask for feature creep on some of the more natural language commands for Phenny, such as "phenny: tell nickname about something". People often ask for more verbs than the current enumeration of "tell" and "ask", notwithstanding that almost any statement can be periphrased quite cleanly into ones containing just those two verbs.
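To make the "arbitrary restrictions" point concrete, here is a minimal sketch in Python of what such a little language looks like from the inside. It is not Phenny's actual code: the pattern, the function name parse_command, and the exact command shape are assumptions made purely for illustration.

```python
import re

# Hypothetical grammar, not Phenny's real one:
#   "phenny: <verb> <nickname> <message>", where <verb> is "tell" or "ask".
COMMAND = re.compile(r"^phenny:\s+(tell|ask)\s+(\S+)\s+(.+)$")


def parse_command(line):
    """Return (verb, nickname, message), or None if the line is outside the subset."""
    match = COMMAND.match(line.strip())
    if match is None:
        return None
    return match.groups()


# Inside the subset: parsed into its three parts.
print(parse_command("phenny: tell nickname about something"))
# Outside the subset: a perfectly natural request that is simply rejected.
print(parse_command("phenny: remind nickname about something"))
```

The second call reads just as naturally as the first, yet it falls outside the enumeration of verbs, which is exactly the kind of restriction that is hard for a user to remember.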

Notwithstanding this, I've been thinking about a framework for making a language work with the requirements I have for such a language, or rather the reasons that I have. These were mostly outlined on the Egeslic page, but essentially the most important aspects can be outlined in negation, i.e. by saying what such an invented language should not be. There shall be no fantasy or constructed world aspects; it shall not be a Sapir-Whorf trial; there should be no prominent logical aspect; it should not succumb to conlangfest features; it shall not set out to change the world or be revolutionary or anything particularly serious; and nor shall it be a philosophical or ontological language.

On the positive side, there are two main things that I want to achieve with such a language: coining interesting words, and experimenting with the principle of least annotation. And of course there are the higher goals and values of it being an art form, and there being the possibility for great creativity in grammar and semantics. But to get a language started, I think it's as well to keep the grammar as simple as it's going to be at its simplest, so I'm thinking about starting on a subset such as is machine parsable. In other words, something a bit like the codifications that Panini made to Sanskrit, though not even that advanced (he was way ahead of his time). I suppose this is a good general approach in creating any conlang. Tolkien went with the very wise approach of making a protolanguage from which to evolve your regular languages, to make them more natural, but I think that my negation of fantasy aspects might rule against that. On the other hand, it may also be a good way of ensuring that the language is more usable.

So, the principle of least annotation. If you look at programming languages, it's interesting to see that features such as dynamic and static typing are starting to move towards being hybridised. In Perl 6, you can use either dynamic or static typing depending on your current whim. Extending this to a natural language, I think it would be nice to convey the minimal amount of semantic information needed to get a point across. So, instead of referring to a man, you refer to a person if the gender need not be known. If you've already referred to a man and nobody else, the next pronoun isn't gender typed. There need be no case marking on constructions such as "the man ate food". Indeed, there need be no articles or other determiners unless there is really a solid reason why they ought to be there. Of course, people are always free to choose which words they want to use, but the idea is that the language itself is huffmanised to make the easy things easy and the hard things possible, in such a way as to make the most semantically efficient thing to do also the most syntactically efficient thing to do. People can still choose not to do that, and that will be a case of rhetoric and emphasis and poetry, but for the core cases this seems an exciting approach to me.
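For anyone unfamiliar with the term, "huffmanised" alludes to Huffman coding, where the most frequent symbols get the shortest codes. The following is only a rough sketch of that idea in Python, not a design for the language: the function name huffman_code_lengths and the word frequencies are made up for illustration.

```python
import heapq
from itertools import count


def huffman_code_lengths(frequencies):
    """Return {symbol: code length in bits} for a Huffman code over the symbols."""
    if not frequencies:
        return {}
    if len(frequencies) == 1:
        # A lone symbol still needs one bit to be written down at all.
        return {symbol: 1 for symbol in frequencies}
    tiebreak = count()  # unique counter so the heap never has to compare dicts
    heap = [(freq, next(tiebreak), {symbol: 0})
            for symbol, freq in frequencies.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        # Merge the two least frequent subtrees; every symbol in them
        # moves one level deeper, so its code grows by one bit.
        freq_a, _, depths_a = heapq.heappop(heap)
        freq_b, _, depths_b = heapq.heappop(heap)
        merged = {s: d + 1 for s, d in {**depths_a, **depths_b}.items()}
        heapq.heappush(heap, (freq_a + freq_b, next(tiebreak), merged))
    return heap[0][2]


# Invented relative frequencies, purely for illustration.
usage = {"person": 45, "food": 30, "eat": 15, "heresy": 10}
print(huffman_code_lengths(usage))
```

With these invented numbers the commonest word ends up with the shortest code and the rarest with the longest, which is the sense in which the easy things stay easy while the hard things remain possible.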

The grammatical framework is just one side of the coin, of course. There's the whole phonology and feel of the words to consider. I rather like Indo-European languages in general, but if you look at the success of English I think that the Germanic underpinnings with Romance words are a good recipe for success there, and I think that loanwords from all sorts of places are a fine way of enriching a language. Though I've preadmonished the fantasy aspect, I'm not sure about edge cases such as idioms. It's fun to come up with new idioms, and it almost seems like even if you avoid the fantasy aspects of a language, just the very condition of inventing a language is a breeding ground for making new idioms and so on. In Kalusa, the principle of "heresy" caught on very quickly, for example, as a general pejorative insult, because it was so clearly comical that it could be used to put things down in a polite manner whilst being quite ridiculous about it. It became a kind of in-joke, and at the same time a very key feature of the language.

Kalusa has an almost Polynesian feel to it, phonologically and texturally speaking, which is not something that I like too much. I rather follow Tolkien in liking Welsh and Finnish &c., though again I would certainly seek to pull in loanwords from all over the shop. And that actually includes Kalusa, since some of the words and ideas from it are quite beautiful.

As to how such a language can even get started, I think that the corpus approach as espoused by Kalusa is a very good one indeed. But since I'd be basically pursuing it individually, it'd be a good idea to establish a framework for the language beforehand, which would mean a lot of research into various very basic parts of a language, such as whether I should use SVO or VSO or any of the other alternative sentence structures; and whether I want a nominative-accusative or an ergative-absolutive language, and so on. Such is the fun of making a language, of course.

Strange Strands, Principle of Least Annotation, by Sean B. Palmer
Archival URI: http://inamidst.com/strands/lannot
