An Haiku Language

<title>An Haiku Language</title>
<link rel="stylesheet" type="text/css" href="style.css" />
<h1>An Haiku Language</h1>

<!-- 
<address>
by <a href="http://inamidst.com/sbp/">Sean B. Palmer</a>, December 2006
</address>
-->

<p>The language that Richard Adams created for his widely beloved novel
Watership Down, <a
href="http://en.wikipedia.org/wiki/Lapine_language">Lapine</a>, has been <a
href="http://www.langmaker.com/featured/lapine.htm">called</a> "a haiku of a
language". Since it was created for a work of fiction, Lapine is what's known
as an <a href="http://en.wikipedia.org/wiki/Artistic_language" >artistic</a>
language: a kind of constructed language made with aesthetic pleasure in mind.
The reasons why people make artistic languages are varied. Though Adams made
his for a book, many linguists create a language in order to understand better
how language works, or to try out linguistic oddities. Very often, these
languages don't amount in size or complexity to much more than Lapine does, or,
to give a completely unrelated example, Toki Pona.</p>

<p>I've been noodling on constructed languages to some extent for many years,
but in December 2006 I started to learn more voraciously how language works and
how languages can be constructed. Many of my friends are conlangers; I consulted
both them and online works about conlangs. But in many ways I don't fall into
some of the perceived stereotypes of a conlanger. I tend to abhor science
fiction and fantasy writing, for example, with a few exceptions (notably
Douglas Adams and J.R.R. Tolkien). Nor am I interested in creating a
philosophical or a logical language, like Toki Pona or Lojban. As a friend of
mine <a href="http://swhack.com/logs/2006-12-20#T16-00-33">said</a>,
"communication is primarily aimed and conveyence of meaning, and less at
universal truth".</p>

<p>On the other hand, I love coining words, and feel the sounds of words and
the textures of languages distinctly. Words like "mizzen", "bailiwick", and
"lambent" stick in my mind; and like Tolkien, I enjoy Welsh and Finnish for
their textures. I also like humour. When I participated with some friends in
the <a href="http://kalusa.fiziwig.com/">Kalusa</a> corpus-driven conlang
experiment, "wumung", meaning friend, was one of our favourite words because it
made us laugh our arses off every time we used it.</p>

<p>So I have no inflated ideas about making a language which the whole world
will learn, bringing about world peace in the process; or even creating a
language which is in any respect good. Heck, I don't even have much of an idea
of creating a language. I'm interested in coming up with a "haiku of a
language" like Lapine, something which is small enough to fit in your pocket,
and makes you smile and maybe makes you think. It won't be as good as Lapine,
nor based on it since it doesn't have the same goal, but it might mean that I
learn something about language on the way.</p>

<p>These, then, are some notes about that process.</p>

<h2>Creating a Language</h2>

<p>To create a language, the first thing that I did was to learn a lot about
the process, which involved a) learning about how language works, and b)
learning about how <em>creating</em> a language works, and how it's been done
by others. So I did a heap of background reading to get a kind of layman's
working knowledge of the beast, but when it came down to actually starting to
create a language, I had no idea how to do it. That's really why I'm writing
this guide, because whilst lots of people seem to be giving hints on how
language works and what considerations you have to have in creating a language,
all of which is helpful, nobody actually talks about how to create the language
once you have that knowledge.</p>

<p>All the same, before you can get to that step of actually sitting down and
doing the work, you do need that background knowledge. So let's step through my
own understanding of just what makes a language.</p>

<h2>How Language Works</h2>

<p>Ignoring the myriad complexity of the field, what are the general nuts and
bolts of a language? A good way to get an impression of this is to read
something like Mark Rosenfelder's <a href="http://www.zompist.com/kit.html"
>Language Construction Kit</a>; the simple answer is that a language consists
of small units that denote meaning, and that these can be combined using a
grammar to create longer expressions. These small units are called morphemes
(which don't necessarily have a one-to-one mapping with words!), and are
usually realised in speech by configurations of particular units of sound
called phonemes, and in writing by some kind of script, for example an
alphabet.</p>

<p>So a constructed language must have a <strong>morphemery</strong> and a
<strong>grammary</strong>. The only way to start creating these is by having
some kind of realisation of the morphemes, so you'll need either a phonology or
a script. Most guides say to start with a phonology, because that's how real
languages work. But since I'm using a computer and I'm lazy, I tend to start
off with a script, and let the phonology work itself out. This is dangerous
because it means you can end up using a really complicated phonology. Since I'm
not a phonologist, however, I don't give two hoots.</p>

<p>There are various concepts that are pretty much universal across languages,
that every language has a morpheme for. When making a vocabulary for a new
language, it's usually a good idea to start with these.</p>

<h2>Universal Morphemes</h2>

<p>Where should we look for things that are more or less universal across
languages? Just trying to think them up turns out not to be all that good an
idea, though inspiration can certainly be a part of the process. Better is to
consult something like the following:</p>

<ul>
<li><a href="http://en.wikipedia.org/wiki/Natural_semantic_metalanguage"
>Natural Semantic Metalanguage</a> - some primitives proposed by Anna Weetabix
(the only possible mnemonic for "Wierzbicka") back in 1972. There are some 61
proposed primitives and the list has grown fivefold since its inception. At
this rate, in 120 years the number of primitives will be about 40,000, which is
bigger than most people's vocabularies.</li>
<li><a href="http://www.tokipona.org/nimi.html">Toki Pona Nimi Ale</a> - the
word list of the brutally minimal Toki Pona, invented by Sonja Elen Kisa.
Wikipedia wryly observes on oligosynthetic languages that the fact that "no
existing language, living or dead, has been demonstrably shown to exhibit
oligosynthetic properties has led some linguists to regard true oligosynthesis
as impossible (or at any rate, wildly impractical) for productive use by human
beings." So now even though you may not know what oligosynthetic means in
practice, you've learned something about the oligosynthetic Toki Pona.</li>
<li><a href="http://www.animal.helsinki.fi/lojban/gismu-search-form.html" >The
Lojban Gismu</a> - the basic vocabulary of the logical language Lojban. The
language was set up as an attempt to see whether unnatural language affects the
mind or whether the mind will reshape the unnatural language. Since there are
basically no speakers of the language, all it has proved so far is that the
mind tends to ignore silly psycholinguistic experiments.</li>
</ul>

<p>Whilst these are valuable resources, the thread of Ariadne that unites them
all is that they're crackers. In other words, trying to create a universal,
philosophical, or logical language exposes one of the most common fallacies in
what people think about language: we think we can do better than ourselves, but
we can't. We're still debating how language really, really works: the field of
cognitive semantics is rife with disagreement, and dudes like Chomsky and Talmy have been putting
out various bits of theoretical fluff for people to flamegrill for many years
now. We learn languages relatively easily, but the collective efforts of many
linguists throughout the years still can't dot the i's and cross the t's on how
it all really works.</p>

<p>In the past, the thought of a perfect language tended to manifest itself in
the search for the mother tongue: the first ever, primigenial language. As
Umberto Eco says in <em>Serendipities: Language and Lunacy</em>, the
primigenial language was sought because it "had revelatory value for, in
speaking it, the speaker would recognize the nature of the named reality". Eco
goes on to say:</p>

<blockquote>
<p>Raffaele Simone suggested that much of the search for a perfect language
derived from a sort of neurotic uneasiness, because people would like to find
in words an expression of the way the world works, and they are regularly
disappointed. This is certainly true. In the legitimist tradition, the
assertion of the sacrality of language aims not so much at reconstructing a
primigenial language as at rediscovering the traces of our natural
languages.</p>
</blockquote>

<p>At the far end of the backlash against universality and objectivity are
Lakoff and Johnson, who according to Wikipedia <a
href="http://en.wikipedia.org/wiki/Embodied_philosophy">say</a> that an
embodied philosophy "would show the laws of thought to be metaphorical, not
logical; truth would be a metaphorical construction, not an attribute of
objective reality". But here this particular train track disappears off into
the mists, and for us it's just a branch line.</p>

<p>So, who cares how and why language works? Well, linguists, and philosophers.
But with no more noble a goal than having a bit of a lark and coining some fun
words, we can safely ignore those bringdowns. As with most comedies, the matter
is going to have to speak for itself: <em>res ipsa loquitur</em>.</p>

<h2>Nuts and Bolts</h2>

<p>Rosenfelder's guide to creating a language is broken into sections all of
which essentially ask some questions about the language that you want to
create. Here are some of the questions from the grammar section:</p>

<ul>
<li>Is your language inflecting, agglutinating, or isolating?</li>
<li>Do you have nouns, verbs, and adjectives?</li>
<li>How do you indicate plural, case, and gender forms of adjectives and
nouns?</li>
<li>Do nouns have gender?</li>
<li>Does the verb inflect by person, gender, and/or number?</li>
<li>What distinctions are made in the verb?</li>
<li>What are the personal pronouns?</li>
<li>What are the other pronouns?</li>
</ul>

<p>This is where we go to cuckoo land, because even the range of choices
involved in the first question is highly complex but awesome. Let's take a
look at it as a use case.</p>

<h2>Morphological Typology</h2>

<p>The term "morphological typology" means the measure of how many morphemes,
bits of meaning, there are in each word. English generally has just one or a
few morphemes per word. For example "ducks" can be broken down into "duck", the
animal, and "-s", meaning more than one. You can't break it down any further in
meaning. So there are two morphemes in that one word. Some languages, such as
Chinese, have even fewer morphemes per word on average.</p>

<p>On the other end of the scale are languages like Finnish, which have lots of
morphemes per word. In the case of Finnish, this is because it's agglutinative:
it just sticks all the morphemes together and calls it a word. Finnish is kinda
mad... as someone <a href="http://www.bash.org/?42831">on QDB</a> said,
"Finnish sounds like two alphabets went to nuclear war and the fallout mutated
it into something seriously fucked up with lots of i's and e's".</p>

<p>Languages like English and Chinese are called <strong>isolating</strong>
or analytic languages, because they isolate the morphemes into words. Languages
like Finnish are called <strong>synthetic</strong> languages because they
synthesise words from lots of morphemes, making radioactive morpheme-cauldrons
of words in the case of Finnish. Actually, languages like Mohawk take it even
further: <em>washakotya'tawitsherahetkvhta'se</em> means "he made the thing
that one puts on one's body ugly for her", i.e. he spoiled her dress.</p>
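
<p>As a rough sketch of the morphemes-per-word idea, here are a few lines of
Python that average the count over some hand-segmented words. The hyphenated
segmentations, the sample word lists, and the function name are all assumptions
made for illustration; real morpheme boundaries are rarely this tidy, and this
isn't a standard linguistic measure.</p>

```python
# Hand-segmented sample words; a hyphen marks an assumed morpheme boundary.
ENGLISH = ["the", "duck-s", "walk-ed", "slow-ly"]
FINNISH = ["talo-i-ssa-ni"]  # "in my houses": house + plural + "in" + "my"


def morphemes_per_word(segmented_words):
    """Return the average number of hyphen-separated morphemes per word."""
    counts = [len(word.split("-")) for word in segmented_words]
    return sum(counts) / len(counts)


print(morphemes_per_word(ENGLISH))  # 1.75
print(morphemes_per_word(FINNISH))  # 4.0
```

<p>A lower average leans isolating and a higher one leans synthetic; where
exactly to draw the line is a judgement call.</p>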

<p>So far it might sound like "morphological typology" is a measure of how much
is crammed into each word, but as well as agglutination, that process of
cramming stuff together, there's another way to make a synthetic language:
fusion. In fusional languages, morphemes are so fused together that it's
hard to separate them out. It's hard to give examples, but usually it means
overloading some bit of the word with lots of meanings: in Latin, <em>boni</em>
is plural, nominative, and masculine because of the <em>-i</em> suffix.</p>
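
<p>One way to picture the difference is as two lookup tables, sketched here in
Python. The table names and the <code>gloss</code> function are hypothetical,
the Finnish word <em>taloissani</em> ("in my houses") is brought in only as an
agglutinative contrast, and the Latin entry is simplified to the single reading
given above.</p>

```python
# Agglutinative style: each suffix carries one piece of meaning.
# The glosses follow Finnish "talo-i-ssa-ni" ("in my houses").
AGGLUTINATIVE_SUFFIXES = {
    "i": {"number": "plural"},
    "ssa": {"case": "inessive"},
    "ni": {"possessor": "first person singular"},
}

# Fusional style: one suffix carries several meanings at once.
# Simplified to the single reading of Latin "-i" given in the text.
FUSIONAL_SUFFIXES = {
    "i": {"number": "plural", "case": "nominative", "gender": "masculine"},
}


def gloss(suffixes, table):
    """Merge the features contributed by each suffix, in order."""
    features = {}
    for suffix in suffixes:
        features.update(table[suffix])
    return features


# Three suffixes, three features: one meaning apiece.
print(gloss(["i", "ssa", "ni"], AGGLUTINATIVE_SUFFIXES))
# One suffix, three features: the meanings can't be pulled apart.
print(gloss(["i"], FUSIONAL_SUFFIXES))
```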

<p>Probably because I'm English, I prefer mainly isolating languages. But
there's no way that you can avoid tacking morphemes together if you're used to
just adding -s on the end of a word to make it a plural... it's extremely hard
to buck off the conventions of your own native language. So my tendency is to
have a language with a morphological typology somewhat like English, only
perhaps a little more or a little less isolating just to try to make it at
least nominally different.</p>

<p>So morphological typology isn't a particularly interesting thing to
think about when constructing a language, but it shows just how much you can
take for granted if you don't read about it.</p>

<p>* * *</p>

<h2>Colophon</h2>

<p>This guide is incomplete because I'm more interested in tinkering than
documenting. Normally when I abandon a half-finished, fairly interesting essay
like this, though, I let it sit around and fester, accruing dust, crumbs, bed
bunnies, and glistening gold florins. But this time I figured I'd publish what
I've got, and write about why I abandoned it at the end.</p>

<p>I'd been thinking about languages perhaps constructed with something like
the following ideas:</p>

<ul>
<li>Trying to encode Chinese characters, with radical and pronunciation, in a
roman alphabet. So you'd have the pronunciation bit saying what a word sounds
like, and then a radical bit giving the meaning. This would be good for a
language with a limited syllabary phonetically. Both king and lemon might be
"ba", with different tones or something, but you could use "baba" for king and
"bani" for lemon in text: both would be pronounced simply "ba".</li>
<li>Making a language with none of the universal components, as much as
possible, so that for basic idioms you use strange circumlocutions. If there's
no word for have, you might use some phrase whose constituents are "thing with
you" or something like that. On the other hand, if it has "with" then I guess
that could be used for "have". Perhaps some kind of metaphor would do it.</li>
</ul>
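
<p>The first of those ideas can be sketched in a few lines of Python.
Everything here is hypothetical: the two-letter radicals, their glosses, the
fixed radical length, and the function names are made up just to fit the "baba"
and "bani" example.</p>

```python
# Hypothetical radicals: a silent two-letter suffix hinting at the meaning.
RADICALS = {"ba": "person", "ni": "fruit"}
RADICAL_LENGTH = 2  # assumption: every radical is exactly two letters


def spell(sound, radical):
    """Written form: the spoken syllable followed by the silent radical."""
    if radical not in RADICALS:
        raise ValueError("unknown radical: " + radical)
    return sound + radical


def pronounce(written):
    """Spoken form: drop the silent radical from the end of the word."""
    sound = written[:-RADICAL_LENGTH]
    radical = written[-RADICAL_LENGTH:]
    if not sound or radical not in RADICALS:
        raise ValueError("not a sound plus a known radical: " + written)
    return sound


king = spell("ba", "ba")   # "baba"
lemon = spell("ba", "ni")  # "bani"
print(king, lemon, pronounce(king), pronounce(lemon))  # baba bani ba ba
```

<p>The two words stay distinct on the page but collapse to the same syllable
when spoken, which is the whole point of the scheme.</p>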

<p>But then I figured that actually, it'd be really fun to try to do something
like Kalusa, where lots of my friends were involved, so I got to thinking again
about how to do a collaborative constructed language. I'm thinking about a
thing called Nomilang, a cross between Nomic and Conlang, so I'm basically
abandoning this draft to think about that.</p>

<p>There's also always a chance that I'll get back to editing this again at
some point.</p>

<address>
<a href="http://inamidst.com/sbp/">Sean B. Palmer</a>, 2006-12-21
</address>