On Learning to Write Languages

I’ve been learning to write languages recently.

I read Steve Yegge’s thought-provoking post, in which he talks about how, if you know how to deal with language problems like lexing, parsing, translating, and compiling, then you know how to solve a large number of common programming problems.

I’ve been using very simple custom languages at work to write integration tests. Just little bits of work, but they’ve really helped quality by allowing us to write loads and loads of tests quickly and confidently. I think we have about 500 integration tests written which rely on small setup languages.

I think this has become possible because the system we’re writing against is pretty stable. The underlying classes and database tables we’re writing against don’t change too often.

This seems to be the key time for writing your own languages; the underlying libraries have reached a point of stability, and you are being asked to do complex things to the underlying data.

So if you deal with classes or database tables called ‘Document’, ‘Alert’, and ‘Error’, then you can start making statements using those objects; things like

‘When the document is saved, if the document is not Signed Off, alert the document owner and log an error’

Now, it should be possible to write a translator that turns this into c-sharp; something like;

public void OnSaved(Document document)
{
    if (document.State != DocumentState.SignedOff)
    {
        SendAlert(document.Owner);
        LogError(document);
    }
}

The first version is significantly easier to understand and write. You can show this to your customer and ask if he agrees with the statement. The language helps communication. The c-sharp is no help at all in communicating.

So, custom languages can help put together systems that are easier to understand, because the language is tuned to the problem, and easier to modify, because the code is invariably shorter than it would be in the general-purpose programming language.
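As a rough illustration of the translator idea, here is a minimal sketch in JavaScript that turns the one sentence above into the C# shown earlier. It only understands that single sentence shape, and everything in it (the `translateRule` function, the `ACTIONS` table, the phrase-to-call mapping) is my own assumption for the sake of the example, not a real tool.

```javascript
// Hypothetical phrase-to-C# table; these mappings are assumptions for illustration.
const ACTIONS = new Map([
  ['alert the document owner', 'SendAlert(document.Owner);'],
  ['log an error', 'LogError(document);'],
]);

// Translate one rule of the fixed shape
//   "When the document is <event>, if the document is not <State>, <action> and <action>"
// into the source of a C# method, returned as a string.
function translateRule(rule) {
  const match = /^When the document is (\w+), if the document is not ([\w ]+), (.+)$/.exec(rule);
  if (!match) {
    throw new Error('Unrecognised rule: ' + rule);
  }
  const [, event, state, actionText] = match;

  // "saved" becomes "OnSaved"; "Signed Off" becomes "SignedOff"
  const methodName = 'On' + event[0].toUpperCase() + event.slice(1);
  const stateName = state.replace(/\s+/g, '');

  // Each action phrase must be one we know how to translate.
  const calls = actionText.split(' and ').map((phrase) => {
    const call = ACTIONS.get(phrase.trim());
    if (!call) {
      throw new Error('Unknown action: ' + phrase);
    }
    return '        ' + call;
  });

  return [
    'public void ' + methodName + '(Document document)',
    '{',
    '    if (document.State != DocumentState.' + stateName + ')',
    '    {',
    ...calls,
    '    }',
    '}',
  ].join('\n');
}

console.log(translateRule(
  'When the document is saved, if the document is not Signed Off, alert the document owner and log an error'
));
```

A real version would want a proper lexer and grammar rather than one regular expression, but even this much shows how the domain vocabulary maps onto calls in the underlying code.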

To my mind, if I can learn to write interpreters, compilers, and translators, it allows me to write software in a way that is significantly easier to maintain.

There are, however, two big problems;

First, learning to write languages is not trivial. It’s a significant investment of time. Your manager is not going to be happy about a proposal that starts “Can I spend the next month learning about languages and not writing production code…” so I think you have to learn about these things in your own time.

Second, once you know how to write interpreters, they are themselves fairly hefty beasts. If it takes you 300 lines and a day of work to write a lexer and parser, you’d better be certain you save more than 300 lines and a day of work in the course of writing scripts in the new language; otherwise, what was the point? So you have to pick your battles, picking only those areas that are ripe for better automation.

If you meet these two criteria — you’ve learned languages on your own time and you’re picking an area that’ll benefit from it — I think writing your own languages is a very valuable ability.

So, I’m now reading heavily in the area, writing my own lexers and parsers by hand, and starting to look at automated tools like ANTLR and Irony (the Irony .NET Language Implementation Kit).

iOS4 bluetooth keyboard first impressions

I installed iOS4 yesterday; here are my first impressions of the Bluetooth keyboard support. Being able to use a keyboard with the phone, without jailbreaking, is absolutely great. The phone starts to become a really viable writing platform; for blog posts, mail, and fiction. I’ve got one of the old foldable keyboards originally designed for a Palm, and it folds down to the size of a paperback book. This makes it really easy to write while on the move.

Of course, this makes it even more important to have a decent editor. I’m still going for Simplenote, which syncs with the cloud, but I’ve got a terrible urge to write some kind of Subversion integration so that I have all my writing in the same source-controlled location.

As yet there are still some issues; for instance, the Bluetooth keyboard means that the on-screen keyboard doesn’t need to be on-screen, but the WordPress app still reserves space for it.

‘This’ in JavaScript and C#

I noticed something today while learning jQuery, and that’s the way the keyword `this` differs between C# and JavaScript. It surprised me when I saw some JavaScript that looked like;

01 $(document).ready(function() {
02     $('div').each(function() {
03         this.style.color = 'blue';
04     });
05 });

and I realised that this wouldn’t work in C#; at least, not the same way it works in JavaScript. In the JavaScript above, the `this` on line 03 refers to each `div` element that’s being iterated over.

Now consider similar C# code;

class Document
{
    // `Div` is a stand-in element type for this illustration
    List<Div> divList = …;

    void Ready()
    {
        divList.ForEach(delegate (Div div) {
            this.style.color = "blue";
        });
    }
}

In C#, `this` doesn’t refer to the div, but to the Document class.

In both pieces of code, we’re creating a function with a reference to `this`, but they mean different things;

– In C#, `this` means `the object that declares the function`
– In JS, `this` means `the object the function is being invoked on.`

To see the difference, realize that you can attach the same function to two different JavaScript objects, and you’ll see `this` referring to each one in turn. Here’s a piece of JavaScript to illustrate;

var func = function() {
    alert(this.name);
};

var obj1 = { 'name': 'first object', 'func': func };
var obj2 = { 'name': 'second object', 'func': func };

obj1.func();
obj2.func();

When you run this, you get two alerts: `first object` and `second object`. But when you run this in C#;

Action func = delegate() {
    // MessageBox.Show takes a string, so convert the hash code
    MessageBox.Show(this.GetHashCode().ToString());
};

var obj1 = new { func = func };
var obj2 = new { func = func };

obj1.func();
obj2.func();

You see the same hashcode in both message boxes. It’s the hashcode of the object that contains this method.

So. Don’t confuse the meaning of `this` in C# and JavaScript. They are very different beasts.

Now, if you want C#’s semantics in JavaScript, you have to take account of this behaviour. With my C# head on, I was tempted to understand `this` as a _variable name_, but it isn’t; it’s a keyword. To make it work like C#, you need to create a _real_ variable, and use that variable in the function. Like so;

var outerThis = this; // declare a real variable
func = function() { alert(outerThis.name); }

And this will give you C# semantics in JavaScript.
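To see the two behaviours side by side, here is a small self-contained sketch. The `owner` and `other` objects and their function names are made up for the example; the point is only that one function reads `this` and the other reads a captured variable.

```javascript
const owner = {
  name: 'owner',
  makeFuncs: function () {
    const outerThis = this; // a real variable holding the declaring object
    return {
      usesThis: function () { return this.name; },           // depends on the caller
      usesOuterThis: function () { return outerThis.name; }, // fixed at creation time
    };
  },
};

const funcs = owner.makeFuncs();

// Attach both functions to a different object and call them there.
const other = {
  name: 'other',
  usesThis: funcs.usesThis,
  usesOuterThis: funcs.usesOuterThis,
};

console.log(other.usesThis());      // other
console.log(other.usesOuterThis()); // owner
```

The first call follows the object the function is invoked on; the second sticks with the object that declared it, which is the behaviour a C# programmer expects.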

iPhone app review: RedLaser

This is a capsule review of the iPhone app, [RedLaser](http://www.redlaser.com/). It’s a barcode-scanning application which looks up scanned products on Amazon and Google. In short; start the app, point your phone at the barcode, get online price comparisons.

The app is extremely simple to use, and cheap, too. It can save you money in a purchase or two. I used it at Borders the other day, scanned a book, and found a copy six quid cheaper somewhere else online. Since the app costs less than two quid, it’s a great little moneysaver. It also acts as a nice ‘outboard memory,’ storing a list which can form a wishlist. Scan in books you want to remember, and it’ll keep the list and let you email it off.

Because it uses Amazon and Google product search, it doesn’t work well with things that are very cheap, or own-brand products. I wondered if I could use it as a shopping list (scan stuff as it runs out) but, well, no-one sells Paxo stuffing on the Internet, so no dice.

What it seems to excel at is products that make good presents; books, DVDs, Xbox games, and board games all worked well. I think I will be using it for my own Christmas wishlist, and for keeping track of presents for friends and family.

PS: a little tip. I had a couple of books fail to scan properly, until I noticed that the books had _two_ adjacent barcodes. Cover up the smaller one with your thumb and it’ll work perfectly.

Grammar Rant #1: ‘More Unique’

With a title like that, you may expect me to rant about the terrible phrase ‘more unique,’ and why you should never use it. It’s one of those long-held orthodoxies about English that ‘more unique’ is illogical, because something is either unique, or it isn’t, and thus ‘more unique’ is nonsense. The same advice holds for absolutes like ‘perfect,’ ‘full,’ or ‘fatal’. (Various examples; [Wikipedia](http://en.wikipedia.org/wiki/Comparison_%28grammar%29), [Dr Grammar](http://www.drgrammar.org/faqs/#53), [Grammar Girl](http://grammar.quickanddirtytips.com/modifying-absolutes.aspx))

This isn’t true. There are times when you can use the phrase, and this post covers those situations.

Uniqueness describes how one thing in a group has a property exhibited by no other; ‘the only left-handed pupil in the class,’ or ‘the only green apple in the orchard.’ Uniqueness always exists within a set of things; in the preceding examples, pupils in a class or apples in an orchard. That set of things, however, may be part of a larger set. A class of pupils is part of a school, a district, and all the kids of the same age. Sets of apples may be found in baskets, orchards, or supermarkets.

When you move from comparing a small set to a larger set, you find that the rules of uniqueness change. Left-handedness may be unique in a class but not in a school. Blue eyes may be unique in a family of brown-eyed children, but red eyes may be unique within a much larger grouping (for example, from rare cases of [albinism](http://en.wikipedia.org/wiki/Albinism)).

When you shift from a small set to a larger set, then, the rules of comparison change; in doing so, the phrase ‘more unique’ reflects a _new kind of uniqueness_ — uniqueness in the larger set. So it becomes reasonable, though not very stylish, to say;

> No-one in the family had blue eyes except John, but Jane’s more unique red eyes entranced him.

or

> Every child in the race had already won the blue ribbon for being the fastest in their class, but now they were competing for the prestigious and far more unique Flanders Cup, awarded to the fastest of the fast.

So, in summary; if the set changes from a small one to a larger one, ‘more unique’ makes logical sense, and means ‘unique within the larger set.’

Of course, the main reason to challenge any orthodoxy is when it impedes eloquence. While [Truth is beauty and beauty truth](http://englishhistory.net/keats/poetry/odeonagrecianurn.html), _beauty should always be allowed to win_. Otherwise we would never have this;

> We the People of the United States, in Order to form a [more perfect Union](http://en.wikipedia.org/wiki/Preamble_to_the_United_States_Constitution), establish Justice, insure domestic Tranquility, provide for the common defence, promote the general Welfare, and secure the Blessings of Liberty to ourselves and our Posterity, do ordain and establish this Constitution for the United States of America.

Which is a glorious sentence. Fie upon those who would have us write ‘in Order to form a better union.’

I may tackle other orthodoxies in the future. Stay tuned.

Fantasia on a theme by Samuel Delany

This year, my long-time friend Derek Muir attended the Clarion writer’s workshop. One thing he brought back was [an essay called ‘Thickening the Plot’][ttp] by Samuel R. Delany, who had been an instructor at Clarion in the 70s. I bought the larger book that contains it, ‘About Writing: 7 essays, 4 letters, and 5 interviews.’

[ttp]: http://books.google.co.uk/books?id=FYD26bt8Wz0C&pg=PA69&lpg=PA69&dq=delaney+thickening+the+plot&source=bl&ots=3REApfQal-&sig=FCXanSN7dkRhIgmpa-C7ce-Jrm8&hl=en&ei=6AeXSu6UK9KNjAfZg_iQDQ&sa=X&oi=book_result&ct=result&resnum=1#v=onepage&q=&f=false

It’s a wonderful book. In the essay that follows, I have taken some of his ideas, mixed in my own interpretations and thoughts, and generally corrupted and polluted his work to create my own synthesis. What follows, then, should not be taken as his views, but my own.

In the introduction to the book, Samuel R. Delany makes a distinction between three levels of literary skill: those who can’t write, good writers, and talented writers. The first group is basically most people; not illiterate, but people who have neither inclination nor aptitude nor necessity. The next group, the good writers, are those who can write grammatically, want to put down fiction, and may have done so. We can imagine that maybe they did well during English lessons at school and were told they were better than their peers, or were encouraged by parents. They are probably avid readers. The talented group are people with something extra beyond competence and desire.

It’s the middle group — the good writers — who produce generally poor fiction. How could they not? Who else produces it? Certainly not the non-writers, because they aren’t producing anything, and not the talented, because their stuff is the really good fiction. For me, it was counterintuitive that competent writers are — must be — responsible for bad fiction. Now that that idea is in my head, I can’t seem to shake it.

Delany writes:

“However paradoxical it sounds, _good writing_ as a set of strictures (that is, when the writing is good and nothing more) produces most bad fiction. On one level or another, the realization of this is finally what turns most writers away from writing. _Talented writing_ is, however, something else. You need talent to write fiction.”

If you can string a sentence together, but you know you’re not writing to the same standard as the authors of the great classics — authors like Milton, Austen, Dickens, Melville, Hemingway, Lovecraft, Orwell — then you are a Good Writer. And you are probably producing dross.

This comparison with the greatest authors of all time should loom over you and oppress your soul. You must be crushed down by the weight of a thousand masterpieces, each one pressing upon your soul and leaving you awestruck. Then, you must take these great works and absorb them, read them to find their secrets, learning from them what you need to first do as well as them, and then surpass them. Without the visceral, breathless need to reach the very pinnacle of excellence, and without the knowledge of what constitutes that pinnacle, chances are you will not succeed.

Why must you strive to reach so high? Isn’t it enough to be merely good? No. Here’s why.

Let’s imagine you’re a science fiction writer, and you want to write novels, have them published, and see them sell. What you want specifically is this; you want your novel shelved alongside _Neuromancer_ and _1984_ and _The Hitchhiker’s Guide to the Galaxy_ and _The Moon is a Harsh Mistress_. You want me to flick through all these books, and yours, and for me to put those books back on the shelf and take yours to the tills. Well, then you’d better write better cyberpunk than Bill Gibson, or better dystopias than George Orwell, or I’ll just buy their books instead. You must aspire to beat the best authors in history at their own game; you are in competition with them. It is not viable to attempt anything less.

For myself, I imagine some avid reader, entering a bookshop and glancing, by chance, at the spine of my book, shelved there under ‘Cooper’. Just to the left are books by C.J. Cherryh and Orson Scott Card. To the right, Samuel Delany and Philip K. Dick. _Even while staring at my book,_ his peripheral vision hits four Hugo Award-winning novels. When he reaches out his hand to pick up my book, with the smallest twitch his fingers brush _Ender’s Game_ and _Do Androids Dream of Electric Sheep?_.

My book has to be very good indeed to make that sale.

The Dark Matter of Writing

I just did an on-line typing speed test, and it turns out I managed 54 words per minute. Which made me wonder — at that speed, how much time would it take me to get a novel written? If I could operate at that rate while writing fiction, what could I get done?

Turns out 24 hours of typing at that speed gets 78,000 words down on paper. That’s a good sized novel, in 24 hours.

Now, if you can actually type out the contents of a novel in 24 hours, what does it mean when someone says they spent 6 months writing a novel? Because it certainly doesn’t mean they spent six months typing. (Butt on seat for two hours a day, that’d be about a million words.)
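For anyone who wants to check the sums, here they are as a short JavaScript sketch. The 54 words per minute is my measured speed from above; treating six months as 182 days is an assumption I've made to get a round figure.

```javascript
const TYPING_SPEED_WPM = 54;    // from the typing test
const DAYS_IN_SIX_MONTHS = 182; // assumption: roughly half a year

// Words produced by typing flat out for the given number of hours.
function wordsTyped(hours) {
  return TYPING_SPEED_WPM * 60 * hours;
}

const oneDayOfTyping = wordsTyped(24);
const sixMonthsAtTwoHoursADay = wordsTyped(2 * DAYS_IN_SIX_MONTHS);

console.log(oneDayOfTyping);          // 77760
console.log(sixMonthsAtTwoHoursADay); // 1179360
```

So the 78,000 figure is the first number rounded, and "about a million" is, if anything, an underestimate.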

In fact, it makes you wonder why we call it ‘writing’ in the first place. There may have been a time, I suppose, back when we pressed styluses into clay, when the content we were writing was simple and the process of making marks was laborious. Back then it was reasonable to say we were spending our time writing. But that’s not so any more. With a comfortable word processor, getting words down is trivial. So calling ourselves ‘writers’ is perhaps disingenuous. What we’re doing isn’t writing — can’t be, or we’d be finished much sooner. People would comfortably knock out a novel in a weekend. The bulk of the activity has to be something else; imagining, maybe? daydreaming?

This is one of those posts where I don’t have the answers, but I hope an interesting question. Like dark matter, the real activities of writing fiction are somewhat invisible to us, but constitute the vast majority of the whole. Because from the outside we just look like we’re typing, what we do is called writing. But I feel that might be like calling the act of driving ‘seatbelt-wearing.’ Sure, you wear your seatbelt throughout the whole process, but that’s not where your attention is. It’s not what you’re doing.

So there are some consequences. If writing time is trivial, then any idea which uses the word ‘writing’ to describe fiction-creation is literally incorrect.

For example, a piece of advice handed down with great regularity is this; ‘write every day.’ Is it because the act of committing words is important _per se_? No. The other ‘dark matter’ fiction-making activities must be engaged on a regular basis. ‘Write every day’ gets us sitting down and our heads running the processes that create scenes and characters and drama.

What’s worth considering, then — what constitutes story-making and literature-creation — is a series of processes, mostly mental, mostly transitory, the _final_ process being the mechanical typing of words. I can think of three main processes;

1. Imagine a scene in your story-world.
2. Convert the scene to language in your mind’s ear.
3. Transcribe the language onto paper or computer.

But note how step 3, the writing, is the thing we’re urged to do. It’s backwards. We should be advised to daydream every day. We should be told to babble about what we see in our minds every day. I suppose it’s not surprising that we don’t hear this advice, but these things are the core processes of fiction.

I suppose what I’m looking for, after all this, is an understanding of the mental processes that make up fiction writing. And I don’t want it described in terms of the end results. ‘Writing’ creates written words. ‘Characterisation’ creates characters. ‘Plotting’ creates plot. But words, character, and plot are all artefacts, the output of something, and it doesn’t much help to just make nouns into verbs and talk about writing and characterisation and plotting. I want to talk, and think, about causes, the things that cause plot and cause character and cause language.

PS: It took me exactly 1 hour to write this post of 662 words. Which means only 20% of the time was taken by writing — 80% of the time taken was taken by these other processes.
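The 20% figure comes from a quick calculation, sketched here in JavaScript. It assumes I typed the post at my test speed of 54 words per minute, which is probably generous.

```javascript
// Fraction of the total time that pure typing would account for.
function typingShare(words, wordsPerMinute, totalMinutes) {
  const minutesSpentTyping = words / wordsPerMinute;
  return minutesSpentTyping / totalMinutes;
}

const share = typingShare(662, 54, 60);
console.log(Math.round(share * 100) + '% typing'); // 20% typing
```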