Strange Strands - Bootstrapping a Collaborative Language

21 Oct 2006

Bootstrapping a Collaborative Language

With the goal of trying to make conlanging both more fun and more like the way natural language actually works, I had an idea about a collaborative language project. I was discussing computational linguistics stuff with Pat Hall, as is very often my wont, and we started talking about Kalusa, the corpus-driven conlang project that we both worked on back in May. Pat was lamenting the prescriptivist nature of the participants and the fact that the process was unnatural as a result: people were focussing on the grammar more than on the meaning and on just getting a rough consensus. As a result, it was less fun.

He was thinking about linguistic games that people could play, to turn it into more of a shared activity, and also about blogging in Kalusa. Then it struck me: what if, instead of using a corpus to drive the language, we used blogging? In other words, a project where a list of registered participants each maintained a weblog on which they invented a language. The catch would be that none of them could use any English translations. Instead, for example, you'd all have to use an initial minuscule bootstrap grammar and vocabulary and work from there onwards (similar to how the corpus started).

Your duty as a participant would be to come up with a new weblog post once or twice a day. It would just be a few words per post at first, some very rudimentary sentences. Jokes and so forth would be encouraged as much as possible, but the goal of the game, if you want to think of it as a game, is to communicate with a bunch of other people using a language that you all create together out of nothing. You'd have to read everybody else's posts and try to decipher as much of them as possible. Then in your next day's posts, you'd try to use what you've learned from the other people, and contribute your own new stuff back too.

Of course, this is just a thought experiment. It could be complete chaos when it's actually tried, but having said that, the original corpus experiment showed some very fascinating things. For example, it's possible to create a rudimentary language really quickly. It's possible to learn it and communicate with it and, most interestingly, come up with a shared culture just as quickly. It just naturally evolves of its own accord, as a microculture always does when you get a bunch of people together, only it gets expressed in the fabric of the language itself. Hopefully that would still happen with a blog-driven conlang.

One of the motivations is that it'd turn the process from being one of competition into one of cooperation. With the corpus-driven method of development, it's all too easy to get very defensive about your own words and grammar, and to basically compete for inclusion. It's more about generation than about understanding. This naturally attracted the types of people who liked to be mavericks, who liked to bring in their own already defined languages, and who liked to add things to the language which weren't consistent with what was already there. With the blog-driven method, it could still happen but it wouldn't matter: the people who didn't want to cooperate could be safely ignored. If there were arguments over a particular point of grammar, the language could be forked. It'd still likely be understood by the participants, which is the point. If you started to end up with mutually independent variants of the language, then you'd end up with a fork. But that all happens organically, and there's nothing to prevent the natural things from happening.

Moreover, there's more of a motivation to do the opposite, to actually understand what's going on and try not to fork, because that's what makes it fun. The aim is to have a bunch of people chatting together as normal but using their new constructed language, so eventually it just becomes a kind of forum, only with a rather linguistic slant.

There are a couple of problems with the idea. The first is: how do you enforce the rule (that translations shall not be provided), or how do you punish people who infringe it? The second is how to stop people from colluding behind the scenes, which may include the first problem of people explaining what they meant. On the second problem, it seems that it would actually be a good thing to encourage. Even with the corpus-based development process, there was some amount of that going on behind the scenes. It's just a form of cooperation, and it's quite fun. But the first problem really seems to be a thorny one indeed.

As for the operation of the project, all you'd need are two things: a page explaining the bootstrap grammar and vocabulary, and an OPML document listing the feeds of all the contributors. From there on, the main bulk of the conversation should take place on the weblogs. If somebody wanted to fork, they could just post new OPML feeds on their blog, though it's debatable as to how long it'd take for the language to evolve to such a point where that would be possible. Of course, before it's that complex you shouldn't need a fork anyway.
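
To give a rough idea of how little machinery that involves, here is a minimal sketch in Python of writing and reading such an OPML list. The participant names, the example.org feed addresses, the document title, and the function names build_opml and read_feeds are all made up for illustration, and the attributes follow the usual subscription list convention (type, text, xmlUrl) rather than anything the project would necessarily settle on.

```python
import xml.etree.ElementTree as ET


def build_opml(title, feeds):
    """Return an OPML string with one feed outline per (name, feed_url) pair."""
    opml = ET.Element("opml", version="2.0")
    head = ET.SubElement(opml, "head")
    ET.SubElement(head, "title").text = title
    body = ET.SubElement(opml, "body")
    for name, feed_url in feeds:
        # Subscription-list convention: type="rss" plus an xmlUrl attribute.
        ET.SubElement(body, "outline", type="rss", text=name, xmlUrl=feed_url)
    return ET.tostring(opml, encoding="unicode")


def read_feeds(opml_text):
    """Return the (name, feed_url) pairs found in an OPML string."""
    root = ET.fromstring(opml_text)
    return [
        (outline.get("text"), outline.get("xmlUrl"))
        for outline in root.iter("outline")
        if outline.get("xmlUrl")
    ]


# Hypothetical participants; these feeds do not exist.
participants = [
    ("participant-one", "http://example.org/one/feed.xml"),
    ("participant-two", "http://example.org/two/feed.xml"),
]

document = build_opml("Collaborative language: participant feeds", participants)
print(document)
```

Under the same assumptions, a fork would be nothing more than somebody calling build_opml with a different list of feeds and publishing the result on their own weblog.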

There are lots of other points that could be debated, such as whether comments on the blogs should be allowed, what language they should be in, and so on, but the idea is straightforward enough. Whether it'd work or not, or even be worth trying, is another thing!

Strange Strands, Bootstrapping a Collaborative Language, by Sean B. Palmer
Archival URI: http://inamidst.com/strands/collaborative
