I’ve been learning to write languages recently.
I read Steve Yegge’s thought-provoking post, in which he talks about how, if you know how to deal with language problems like lexing, parsing, translating, and compiling, then you know how to solve a large number of common programming problems.
I’ve been using very simple custom languages at work to write integration tests. Just little bits of work, but they’ve really helped quality by allowing us to write loads and loads of tests quickly and confidently. I think we have about 500 integration tests written which rely on small setup languages.
I think this has become possible because the system we’re writing against is pretty stable. The underlying classes and database tables we’re writing against don’t change too often.
This seems to be the key time for writing your own languages; the underlying libraries have reaches a point of stability, and you are being asked to do complex things to the underlying data.
So if you deal with classes or database tables called ‘Document’, ‘Alert’, and ‘Error’, then you can start making statements using those objects; things like
‘When the document is saved, if the document is not Signed Off, alert the document owner and log an error’
Now, it should be possible to write a translator that turns this into c-sharp; something like;
public void OnSaved(Document document)
{
if (document.State != DocumentState.SignedOff)
{
SendAlert(document.Owner);
LogError(document);
}
}
The first version is significantly easier to understand and write. You can show this to your customer and ask if he agrees with the statement. The language helps communication. The c-sharp is no help at all in communicating.
So, custom languages can help put together systems that are easier to understand, because the language is tuned to the problem, and easier to modify, because the code is invariably shorter than it would be in the general-purpose programming language.
To my mind, if I can learn to write interpreters, compilers, and translators, it allows me to write software in a way that is significantly more easy to maintain.
There are, however, two big problems;
First, learning to write languages is not trivial. It’s a significant investment of time. Your manager is not going to be happy about a proposal that starts “Can I spend the next month learning about languages and not writing production code…” so I think you have to learn about these things in your own time.
Second, once you know how to write interpreters, they are themselves fairly hefty beasts. If it takes you 300 lines and a day of work to write a lexer and parser, you’d better be certain you save more than 300 lines and a day of work in the course in writing scripts in the new language — otherwise what was the point? So you have to pick your battles, picking only those areas that are ripe for better automation.
If you meet these two criteria — you’ve learned languages on your own time and you’re picking an area that’ll benefit from it — I think writing your own languages is a very valuable ability.
So, I’m now reading heavily in the area, writing my own lexers and parser by hand, and starting to look at automated tools like ANTLR and Irony. Irony .Net Language Implementation Kit