
A Haiku Language

The language that Richard Adams created for his widely beloved novel Watership Down, Lapine, has been called "a haiku of a language". Since it was created for a work of fiction, Lapine is what's known as an artistic language: a kind of constructed language made with aesthetic pleasure in mind. The reasons why people make artistic languages are varied. Though Adams made his for a book, many linguists create a language in order to understand better how language works, or to try out linguistic oddities. Very often, these languages don't amount in size or complexity to much more than Lapine does, or, to give a completely unrelated example, Toki Pona.

I've been noodling on constructed languages to some extent for many years, but in December 2006 I started to learn more voraciously how language works and how languages can be constructed. Many of my friends are conlangers; I consulted both them and online works about conlangs. But in many ways I don't fall into some of the perceived stereotypes of a conlanger. I tend to abhor science fiction and fantasy writing, for example, with a few exceptions (notably Douglas Adams and J.R.R. Tolkien). Nor am I interested in creating a philosophical or a logical language, like Toki Pona or Lojban. As a friend of mine said, "communication is primarily aimed at conveyance of meaning, and less at universal truth".

On the other hand, I love coining words, and feel the sounds of words and the textures of languages distinctly. Words like "mizzen", "bailiwick", and "lambent" stick in my mind; and like Tolkien, I enjoy Welsh and Finnish for their textures. I also like humour. When I participated with some friends in the Kalusa corpus-driven conlang experiment, "wumung", meaning friend, was one of our favourite words because it made us laugh our arses off every time we used it.

So I have no inflated ideas about making a language which the whole world will learn, bringing about world peace in the process; or even creating a language which is in any respect good. Heck, I don't even have much of an idea of creating a language. I'm interested in coming up with a "haiku of a language" like Lapine, something which is small enough to fit in your pocket, and makes you smile and maybe makes you think. It won't be as good as Lapine, nor based on it since it doesn't have the same goal, but it might mean that I learn something about language on the way.

These, then, are some notes about that process.

Creating a Language

To create a language, the first thing that I did was to learn a lot about the process, which involved a) learning about how language works, and b) learning about how creating a language works, and how it's been done by others. So I did a heap of background reading to get a kind of layman's working knowledge of the beast, but when it came down to actually starting to create a language, I had no idea how to do it. That's really why I'm writing this guide, because whilst lots of people seem to be giving hints on how language works and what considerations you have to have in creating a language, all of which is helpful, nobody actually talks about how to create the language once you have that knowledge.

All the same, before you can get to that step of actually sitting down and doing the work, you do need that background knowledge. So let's step through my own understanding of just what makes a language.

How Language Works

Ignoring the myriad complexity of the field, what are the general nuts and bolts of a language? A good way to get an impression of this is to read something like Mark Rosenfelder's Language Construction Kit; the simple answer is that a language consists of small units that denote meaning, and that these can be combined using a grammar to create longer expressions. These small units are called morphemes (which don't necessarily have a one-to-one mapping with words!), and are usually realised in speech by configurations of particular units of sound called phonemes, and in writing by some kind of script, for example an alphabet.

So a constructed language must have a morphemery and a grammary. The only way to start creating these is by having some kind of realisation of the morphemes, so you'll need either a phonology or a script. Most guides say to start with a phonology, because that's how real languages work. But since I'm using a computer and I'm lazy, I tend to start off with a script, and let the phonology work itself out. This is dangerous because it means you can end up using a really complicated phonology. Since I'm not a phonologist, however, I don't give two hoots.
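As a rough illustration of the script-first approach, here is a minimal Python sketch that strings letters together into candidate word shapes. Everything in it is an assumption made up for the example: the letter inventory, the consonant-vowel syllable pattern, and the names make_word and make_words are all hypothetical, not part of any language described here.

```python
import random

# Hypothetical script: these letters are placeholders chosen for illustration.
CONSONANTS = "ptkmnslw"
VOWELS = "aeiou"


def make_word(rng, syllables=2):
    """Build one candidate word from consonant-vowel syllables."""
    return "".join(
        rng.choice(CONSONANTS) + rng.choice(VOWELS) for _ in range(syllables)
    )


def make_words(count, seed=0):
    """Return `count` candidate words; the seed makes the run repeatable."""
    rng = random.Random(seed)
    return [make_word(rng) for _ in range(count)]


for word in make_words(5):
    print(word)
```

Keeping the syllable pattern this rigid is one way to stop the implied phonology from getting out of hand, at the cost of rather samey words.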

There are various concepts that are pretty much universal across languages, that every language has a morpheme for. When making a vocabulary for a new language, it's usually a good idea to start with these.

Universal Morphemes

Where should we look for things that are more or less universal across languages? Just trying to think them up turns out not to be all that good an idea, though inspiration can certainly be a part of the process. Better is to consult something like the following:

Whilst these are valuable resources, the thread of Ariadne that unites them all is that they're crackers. In other words, trying to create a universal, philosophical, or logical language exposes one of the most common fallacies in what people think about language: we think we can do better than ourselves, but we can't. We're still debating how language really, really works: the field of cognitive semantics is rife, and dudes like Chomsky and Talmy have been putting out various bits of theoretical fluff for people to flamegrill for many years now. We learn languages relatively easily, but the collective efforts of many linguists throughout the years still can't dot the i's and cross the t's on how it all really works.

In the past, the thought of a perfect language tended to manifest itself in the search for the mother tongue: the first ever, primigenial language. As Umberto Eco says in Serendipities: Language and Lunacy, the primigenial language was sought because it "had revelatory value for, in speaking it, the speaker would recognize the nature of the named reality". Eco goes on to say:

Raffaele Simone suggested that much of the search for a perfect language derived from a sort of neurotic uneasiness, because people would like to find in words an expression of the way the world works, and they are regularly disappointed. This is certainly true. In the legitimist tradition, the assertion of the sacrality of language aims not so much at reconstructing a primigenial language as at rediscovering the traces of our natural languages.

At the far end of the backlash against universality and objectivity are Lakoff and Johnson, who according to Wikipedia say that an embodied philosophy "would show the laws of thought to be metaphorical, not logical; truth would be a metaphorical construction, not an attribute of objective reality". But here this particular train track disappears off into the mists, and for us it's just a branch line.

So, who cares how and why language works? Well, linguists, and philosophers. But with no more noble a goal than having a bit of a lark and coining some fun words, we can safely ignore those bringdowns. As with most comedies, the matter is going to have to speak for itself: res ipsa loquitur.

Nuts and Bolts

Rosenfelder's guide to creating a language is broken into sections all of which essentially ask some questions about the language that you want to create. Here are some of the questions from the grammar section:

This is where we go to cuckoo land, because even the range of choices involved in the first question is highly complex but awesome. Let's take a look at it as a use case.

Morphological Typology

The term "morphological typology" means the measure of how many morphemes, bits of meaning, there are in each word. English generally has just one or a few morphemes per word. For example "ducks" can be broken down into "duck", the animal, and "-s", meaning more than one. You can't break it down any further in meaning. So there are two morphemes in that one word. Some languages, such as Chinese, have even fewer morphemes per word on average.
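To make the counting concrete, here is a small Python sketch that works out morphemes per word from hand-made segmentations. Only the "ducks" breakdown comes from the text above; the other entries, and the names SEGMENTED and morphemes_per_word, are assumptions added for illustration. Real segmentation is a judgment call made by a person, not something this code discovers.

```python
# Hand-segmented words: each word maps to a list of its morphemes.
# "ducks" is the example from the text; the rest are illustrative additions.
SEGMENTED = {
    "duck": ["duck"],
    "ducks": ["duck", "-s"],
    "unhelpful": ["un-", "help", "-ful"],
}


def morphemes_per_word(words):
    """Average number of morphemes per word over hand-segmented words."""
    total = sum(len(SEGMENTED[word]) for word in words)
    return total / len(words)


print(len(SEGMENTED["ducks"]))                             # 2
print(morphemes_per_word(["duck", "ducks", "unhelpful"]))  # 2.0
```

The lower the average, the more isolating the sample; the higher it is, the more the words are built up from several morphemes.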

On the other end of the scale are languages like Finnish, which have lots of morphemes per word. In the case of Finnish, this is because it's agglutinative: it just sticks all the morphemes together and calls it a word. Finnish is kinda mad... as someone on QDB said, "Finnish sounds like two alphabets went to nuclear war and the fallout mutated it into something seriously fucked up with lots of i's and e's".

Languages like English and Chinese are called isolating or analytic languages, because they isolate the morphemes into words. Languages like Finnish are called synthetic languages because they synthesise words from lots of morphemes, making radioactive morpheme-cauldrons of words in the case of Finnish. Actually, languages like Mohawk take it even further: washakotya'tawitsherahetkvhta'se means "he made the thing that one puts on one's body ugly for her", i.e. he spoiled her dress.

So far it might sound like "morphological typology" is a measure of how much words are crammed together, but as well as agglutination, that process of cramming stuff together, there's another way to make a synthetic language: fusion. In fusional languages, morphemes are so fused together that it's hard to separate them out. It's hard to give examples, but usually it means overloading some bit of the word with lots of meanings: in Latin, boni is plural, nominative, and masculine because of the -i suffix.
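The difference between sticking morphemes together and fusing them can be sketched as data in Python. The Latin breakdown follows the boni example above; the Finnish word taloissani ("in my houses") is my own choice of illustration rather than something from the text, and the glosses and function names are simplified assumptions.

```python
# Each word is a list of (written form, meanings carried by that form).
# Agglutinative: every morpheme carries one meaning.
FINNISH_TALOISSANI = [
    ("talo", ["house"]),
    ("i", ["plural"]),
    ("ssa", ["in"]),
    ("ni", ["my"]),
]

# Fusional: a single suffix carries several meanings at once.
LATIN_BONI = [
    ("bon", ["good"]),
    ("i", ["plural", "nominative", "masculine"]),
]


def surface(word):
    """Join the written forms of the morphemes into the finished word."""
    return "".join(form for form, _ in word)


def max_meanings_per_morpheme(word):
    """Largest number of meanings packed into any one morpheme."""
    return max(len(meanings) for _, meanings in word)


print(surface(FINNISH_TALOISSANI), max_meanings_per_morpheme(FINNISH_TALOISSANI))
print(surface(LATIN_BONI), max_meanings_per_morpheme(LATIN_BONI))
```

Both words are synthetic, but the Finnish one can be cut cleanly into four pieces with one job each, whereas the Latin -i can't be cut any further even though it does three jobs.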

Probably because I'm English, I prefer mainly isolating languages. But there's no way that you can avoid tacking morphemes together if you're used to just adding -s on the end of a word to make it a plural... it's extremely hard to buck the conventions of your own native language. So my tendency is to have a language with a morphological typology somewhat like English, only perhaps a little more or a little less isolating just to try to make it at least nominally different.

So morphological typology isn't a particularly interesting thing to think about when constructing a language, but it shows just how much you can take for granted if you don't read about it.

* * *


This guide is incomplete because I'm more interested in tinkering than documenting. Normally when I abandon a half-finished, fairly interesting essay like this, though, I let it sit around and fester, accruing dust, crumbs, bed bunnies, and glistening gold florins. But this time I figured I'd publish what I've got, and write at the end about why I abandoned it.

I'd been thinking about languages perhaps constructed with something like the following ideas:

But then I figured that actually, it'd be really fun to try to do something like Kalusa, where lots of my friends were involved, so I got to thinking again about how to do a collaborative constructed language. I'm thinking about a thing called Nomilang, a cross between Nomic and Conlang, so I'm basically abandoning this draft to think about that.

There's also always a chance that I'll get back to editing this again at some point.

Sean B. Palmer, 2006-12-21