Learning Languages Followup; Test Languages

I managed to write a new language, and 135 unit tests in that language, in a single day. When I talked about big wins coming from writing in the correct language, this is what I meant.

So, the language is a test language for a function library. We have arithmetic, date, string, and other functions that we need to test. Each function is identified internally by a GUID, and may be configured. So the language looks like;

    #
    # Check Array Sum function
    #
    declare Sum = {11669A5A-45BA-46c0-A6F6-97CDE4F5CAA5}
    Sum(null) = null
    Sum([]) = 0
    Sum([1.0, 2.0]) = 3.0

In this short script, I add comments and give a function a name (binding the name `Sum` to the function identified with the id `{11669A5A-45BA-46c0-A6F6-97CDE4F5CAA5}`). Then, I define three tests; `Sum(null) = null` means what you would expect; call the sum function, passing in a single null parameter; the result should be null.
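
To give a feel for how little machinery a language like this needs, here is a minimal Python sketch of an interpreter for the script above. It is not the real implementation: the `REGISTRY` lookup, the `array_sum` stand-in and the `run_script` helper are all made up for illustration, and I'm assuming arguments and expected values can be read as JSON literals.

```python
import json
import re

def array_sum(values):
    """Stand-in for the real Sum function: null in, null out."""
    return None if values is None else float(sum(values))

# Hypothetical registry; the real library's GUID lookup is not shown in the post.
REGISTRY = {"{11669A5A-45BA-46c0-A6F6-97CDE4F5CAA5}": array_sum}

DECLARE = re.compile(r"^declare\s+(\w+)\s*=\s*(\{[0-9A-Fa-f-]+\})$")
TEST = re.compile(r"^(\w+)\((.*)\)\s*=\s*(.+)$")

def run_script(text):
    """Run every test line and return (line, passed) pairs."""
    names = {}
    results = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # blank line or comment
        declared = DECLARE.match(line)
        if declared:
            # bind the friendly name to the function behind the GUID
            names[declared.group(1)] = REGISTRY[declared.group(2)]
            continue
        test = TEST.match(line)
        if test is None:
            raise ValueError("cannot parse line: " + line)
        name, args_src, expected_src = test.groups()
        args = json.loads("[" + args_src + "]")  # assumption: JSON literal syntax
        expected = json.loads(expected_src)
        results.append((line, names[name](*args) == expected))
    return results

SCRIPT = """
#
# Check Array Sum function
#
declare Sum = {11669A5A-45BA-46c0-A6F6-97CDE4F5CAA5}
Sum(null) = null
Sum([]) = 0
Sum([1.0, 2.0]) = 3.0
"""

for line, passed in run_script(SCRIPT):
    print("PASS" if passed else "FAIL", line)
```

Each test line stays a one-liner, and all the plumbing lives in one place.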

Having defined this language (which, I think, took me about an hour) I was then able to write about 135 tests with relative ease. The equivalent C# unit tests would be full of repetition and would not express their meaning anywhere near as fully. You’d have something like;

    [TestMethod]
    public void TestSumNullIsNull()
    {
        double? expected = null;
        var thefunction = FieldModifierHost.Instance()["{11669A5A-45BA-46c0-A6F6-97CDE4F5CAA5}"];
        var maker = thefunction.MakeMethod();
        var instance = maker(new object[]{});
        var result = instance(null);
        Assert.AreEqual(expected, result);
    }

Which is frankly impenetrable.

PS: I’ve just had a colleague add a number of tests, without any instruction, and he’s managed to put confidence tests around a function he wants to change in minutes. Unit test languages FTW!

Creative writing on the iPhone followup – iUI

I just wrote about the problems of note taking on the iPhone and PC. It may not be as difficult as I previously thought. There is a great project called iUI on Google Code which makes it easy to roll your own iPhone-flavoured website. I got this very basic screen up and running within minutes. I may look into creating an Ajax app using iUI, jQuery, markdown, and WMD.

Python-style string formatting for C#

[Jon Skeet][skeet] recently asked in one of [his posts][op]:

> it would be really nice to be able to write:
>
> `throw new IOException("Expected to read {0} bytes but only {1} were available", requiredSize, bytesRead);`

Which would do the same as

    throw new IOException(String.Format(
        "Expected to read {0} bytes but only {1} were available",
        requiredSize, bytesRead));

And it got me wondering about the String.Format method, and how much uglier it makes C# code to read than, say, the equivalent python code. Alongside each other;

    // C#
    string message = String.Format(
        "Expected to read {0} bytes but only {1} were available",
        requiredSize, bytesRead);

    // python
    message = "Expected to read %s bytes but only %s were available" % (requiredSize, bytesRead)

I think I’d solve the problem, not by creating a new constructor for `IOException`, but by making String.Format part of the C# syntax. It works very nicely for python, and it’s such a common thing to do that I think it would warrant a change to the language. Given how cumbersome String.Format is, it’s often shorter and clearer to use simple string concatenation. This makes things rather inconsistent.

Here’s what I came up with. It’s a ‘first draft’, and more for interest’s sake than as something I’d put into production.

Instead of passing an object array in as the values, I’m reading from the properties of an object. So you can do it with objects or tuples;

    var person = new Person()
    {
        firstname="Steve",
        surname="Cooper"
    };

Then you can inject the tuple into a format string like this;

    string message = "{firstname} {surname} says injecting properties is fun!".ㄍ(person);
    // message == "Steve Cooper says injecting properties is fun!"

So you’ll see this weird thing on the end of the format string that looks like a double-chevron. This is supposed to look like a double arrow, pushing values into the format string. In fact, it’s the [Bopomofo letter ‘G’][g] and therefore a perfectly normal C# method name.

Here’s the code for the double-chevron method. I say again, this is _just a proof of concept_, not production code. Use at your own peril. (In fact, don’t use. Write your own. It’ll be more solid.)

    public static class StringFormatting
    {
        public static string ㄍ(this string format, object o)
        {
            // Matches a {name} placeholder and captures the property name.
            var rx = new System.Text.RegularExpressions.Regex(@"{(?<name>\w+)}");
            var match = rx.Match(format);
            while (match.Success)
            {
                string name = match.Groups["name"].Value;
                // Escape every brace so String.Format ignores the other placeholders...
                format = format
                    .Replace("{", "{{")
                    .Replace("}", "}}")
                    ;
                // ...then turn the current placeholder into a positional one.
                format = format.Replace("{{" + name + "}}", "{0}");

                object prop = o.GetType().GetProperty(name).GetValue(o, null);
                format = string.Format(format, prop);
                match = rx.Match(format);
            }
            return format;
        }
    }
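
For comparison, here is a rough Python sketch of the same property-injection idea. The `inject` function and the `Person` dataclass are my own illustrative names, not part of any library; the sketch simply swaps each `{name}` placeholder for the attribute of that name.

```python
import re
from dataclasses import dataclass

@dataclass
class Person:
    firstname: str
    surname: str

def inject(fmt, obj):
    """Replace each {name} in fmt with str(obj.name)."""
    return re.sub(r"\{(\w+)\}", lambda m: str(getattr(obj, m.group(1))), fmt)

person = Person(firstname="Steve", surname="Cooper")
message = inject("{firstname} {surname} says injecting properties is fun!", person)
print(message)
```

Because the substitution happens in a single pass, a property value that itself contains braces is left alone rather than being expanded again.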

[skeet]: http://msmvps.com/blogs/jon_skeet/default.aspx
[op]: http://msmvps.com/blogs/jon_skeet/archive/2009/01/23/quick-rant-why-isn-t-there-an-exception-string-params-object-constructor.aspx
[g]: http://www.alanwood.net/unicode/bopomofo.html

SonicFileFinder for Visual Studio

For those of you who use Visual Studio all day, can I suggest that you install [SonicFileFinder][sff]?

[sff]: http://jens-schaller.de/blog/2008/12/15/295.htm

This lovely little addin by Jens Schaller gives you a way to find files in your current solution with a few keypresses. Invoke it, and you see a dialogue like this;

[Screenshot: the Sonic File Finder dialogue]

Type in a fragment of a filename, and you’ll get a filtered list of files matching that fragment. Choose a file, hit ‘return’, and the file opens in the code editor.

Basically, if you know the name of your file, you no longer need to use the Solution Explorer. As codebases get bigger and bigger, this addin gets more valuable as the Solution Explorer gets worse.

Highly recommended, plus it now works with F#, C#, and VB.NET projects.

A lisp macro virgin tells all

I finished my first lisp macro, and I want to tell the world.

I’ll talk about what a lisp macro is, what makes it unique in the world of programming, and why it’s a technique only possible in lisp. I’ll then take you through an example.

So firstly, what’s a lisp macro, and why would you want to write one?

So, you may have seen lisp programs before, and you’ll recognise them instantly — Larry Wall, the inventor of [Perl][], said they had all the aesthetic appeal of a bowl of porridge mixed with toenail clippings;

    (defun accumulate (combiner lst initial)
      (let ((accum initial))
        (dolist (i lst)
          (setf accum (funcall combiner accum i)))
        accum))

He has a point. They are butt-ugly. But hell, the best he came up with is [Perl][], so he can `$_@++` right off. (I’m pretty sure that’s valid Perl, too 😉 )

It’s ugly, in an aesthetic way, but it’s amazingly practical. It’s got an engineering beauty to it. If you look at that snippet above, you’ll notice that the whole program is made out of exactly three types of symbols;

* open parenthesis: `(`
* close parenthesis: `)`
* symbols, like `defun`, `accum` and `setf`

All simple lisp programs are like this. Just brackets to group stuff together, and stuff that needs grouping. Compare that with C#, where you might find;

* parentheses for;
    * function calls; `print("hello")`
    * special forms; `using(OdbcConnection con = …)`
* semi-colon to end statements; `int x = 1;`
* curly brackets for;
    * code blocks; `{ /* code block */ }`
    * array initialisers; `string[] words = { "hello", "world" };`
* square brackets for array indexing; `x[3] = 4;`

and the list goes on. I gave up because there are too many to list.

So lisp has this seriously small syntactic footprint. You can have a thing, or a group of things in brackets. It’s simple. It’s *so* simple that you can start doing crazy stuff in lisp that you just can’t do otherwise. That crazy stuff goes by the name of macros.

I can write a program that takes a chunk of lisp (remember, just a thing or a list of things), cuts it up, and reassembles it. That creates new lisp code.

So imagine you do a lot of work on three-dimensional arrays. You find yourself, over and over, writing nested loops that say;

    for x in range(100):
        for y in range(100):
            for z in range(100):
                # do something to matrix[x,y,z]

And frankly, you’re bored of typing it over and over. What you really want to do is something like;

    for {x 100, y 100, z 100}:
        # do something to matrix[x,y,z]

You want a brand new bit of syntax for multiple-value looping. Can you add it to python? Nope. C? Nope. Java? Nope.

But now look at the lisp version;

I could, theoretically, write this

    (domanytimes (x 100 y 100 z 100)
      body)

and, because it’s just a list of stuff, I can chop and change that into this new bit of lisp;

    (dotimes (x 100)
      (dotimes (y 100)
        (dotimes (z 100)
          body)))

I’ll show you how in a second, but notice what’s possible: I can write my own looping construct (`domanytimes`) and lisp will rewrite it into many simpler looping constructs (the built-in `dotimes`).

Is that particularly special? Well, yeah. I’ve written new syntax. I’ve defined a new way of looping that is no different from the standard loops. I’ve basically added something new to the language. Lisp is now better at dealing with multi-dimensional loops. Try adding a new loop to ruby, or javascript. Make python understand

    for x in range(100), y in range(100), z in range(100):
        # body here

and you’ll find you can’t.

So I’ve made my version of lisp a bit better at handling loops. If I were writing database code, I could make lisp better at writing SQL statements or data access layers. C# recently got built-in DAL logic with [LINQ][], and it’s great, but only the C# team can write it. Whereas a lisper could write this sort of code;

    (sql-select (ID NAME) from PROJECT where (DUEDATE > TODAY))

and it’d do basically the same thing as [LINQ][].

So that’s the whys and wherefores. Here’s the how of the `domanytimes` macro.

`domanytimes` takes two parts; the loop variables `(x 100 y 100 z 100)` and whatever body you want to execute. We’re going to write a program that skims two elements from the front of the loop variables (say, `x` and `100`) and uses them to write a built-in `dotimes` loop; so a program which converts

    (domanytimes (x 100 y 100) body)

into

    (dotimes (x 100)
      (domanytimes (y 100) body))

and then again to give you

    (dotimes (x 100)
      (dotimes (y 100)
        body))
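
If the rewriting step feels abstract, here is a small Python sketch of the same transformation, modelling lisp forms as nested Python lists. This is only an illustration of the list-chopping, not a macro; `expand_domanytimes` and `to_lisp` are names I've made up, and the sketch wraps the body in `progn` at the innermost level.

```python
def expand_domanytimes(loop_list, body):
    """Rewrite the parts of (domanytimes (x 100 y 100) body...) into nested dotimes forms."""
    if not loop_list:
        # no loop variables left: just run the body
        return ["progn", *body]
    # skim a variable and its count from the front of the loop list
    var, count, *rest = loop_list
    return ["dotimes", [var, count], expand_domanytimes(rest, body)]

def to_lisp(form):
    """Print a nested list the way lisp would write it."""
    if isinstance(form, list):
        return "(" + " ".join(to_lisp(part) for part in form) + ")"
    return str(form)

expanded = expand_domanytimes(["x", 100, "y", 100], ["body"])
print(to_lisp(expanded))
# (dotimes (x 100) (dotimes (y 100) (progn body)))
```

The difference in lisp is that this rewriting runs inside the compiler, on the program's own source, so the new form behaves like built-in syntax.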

Here’s the `domanytimes` macro, in all its eye-bleeding horror;

    (defmacro domanytimes (loop-list &body body)
      "allows you to write (domanytimes (x 10 y 10) …)
    instead of (dotimes (x 10) (dotimes (y 10) body))"
      (if (null loop-list)
          ;; no loop variables left; we have our form to execute
          `(progn ,@body)
          ;; we have more loops to arrange
          (let ((fst (car loop-list))
                (snd (cadr loop-list))
                (rst (cddr loop-list)))
            `(dotimes (,fst ,snd)
               (domanytimes ,rst ,@body)))))

There. Wasn’t that fun? 😉

It looks nasty, I know. All lisp looks nasty. But it’s actually created something new in the language. As far as I understand it, lisp has survived for fifty years basically because the macro system lets you write macros which can add any new kind of syntax you like. You can knock up a set of macros to [implement OO][clos], and suddenly lisp is OO. You can knock up macros for manipulating lazy lists, and suddenly lisp has [lazy evaluation][lazy]. You can knock up data access layer macros, and it’s got a version of [LINQ][linq]. There seems to be nothing you can’t hack lisp into being.

And if you want to know how the hell that works, I’d recommend [Practical Common Lisp][pcl], which is online and free.

[linq]: http://msdn2.microsoft.com/en-gb/netframework/aa904594.aspx
[lazy]: http://en.wikipedia.org/wiki/Lazy_evaluation
[clos]: http://en.wikipedia.org/wiki/CLOS
[perl]: http://www.perl.com/
[pcl]: http://gigamonkeys.com/book/

Modifying large codebases in dynamic and static languages

I’ve been wondering recently about dynamic languages, and static languages, and the relative benefits.

I’m struggling with this question because I write C# 3 by day, and am learning python in the evenings. I’m only writing small python scripts at the moment and I’d like to write larger pieces, but I’m concerned about how easy it’ll be to make certain types of change.

For example. You’ve got 100,000 lines of code. You also have a logging function that looks like this;

    void Log(string message)

And it’s called about 200 times in your code. You decide you need a severity; so you change the signature to

    void Log(string message, LoggingSeverity severity) { .. }

Now, how long does it take to find all the calls to the Log() function that need to be updated? Under C#, about ten seconds. Once every call has been fixed, the code is almost certain to work correctly.

Consider, on the other hand, the python function

    def log(message):

What happens if you change the signature to

    def log(message, severity):

There is no compiler to tell you where the log function is called. You’ve just introduced 200 bugs.

It’s made even worse by duck typing; maybe you have two loggers — a deployment logger which writes to a database, and a test logger which writes to stdout. You update the database logger so it has severity. Your tests continue to pass, but your deployed system will fail.
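
Here is a small Python sketch of that two-logger scenario. The class and function names (`ConsoleLogger`, `DatabaseLogger`, `process_order`, `call_fails`) are invented for illustration, and the "database" is just a list.

```python
class ConsoleLogger:
    """Test logger: still has the old one-argument signature."""
    def log(self, message):
        print(message)

class DatabaseLogger:
    """Deployment logger: updated to require a severity."""
    def __init__(self):
        self.rows = []  # stands in for a database table

    def log(self, message, severity):
        self.rows.append((severity, message))

def process_order(logger):
    # One of the 200 call sites that nobody updated.
    logger.log("order processed")

def call_fails(logger):
    """Return True if the un-updated call site blows up with this logger."""
    try:
        process_order(logger)
    except TypeError:
        return True
    return False

print(call_fails(ConsoleLogger()))   # the test logger still accepts the call
print(call_fails(DatabaseLogger()))  # the deployed logger raises TypeError
```

Nothing complains until the line actually runs against the database logger, which is exactly the failure I’m worried about.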

So it seems to me that static languages give you much more power to make changes to large codebases. I’d love to know if, and where, the mistakes are in my thinking.

lisp, the beautiful hydra

I’m currently learning the programming language, [Lisp][]. If you’re not a programmer, you may wish to simply ignore this post…

Lisp doesn’t have the mindshare it deserves. At fifty, it is the second-oldest programming language in the world, after [Fortran][]. I’m starting to see the whole history of programming as a struggle between the elder brother, Fortran, and the younger sister, Lisp.

[Lisp]: http://en.wikipedia.org/wiki/Lisp_programming_language
[Fortran]: http://en.wikipedia.org/wiki/Fortran

Fortran Vs Lisp

Programming is, at its core, an attempt to write down the solution to a problem so precisely that a mechanical device can perform the solution. We start with our own ideas, then use the program and a compiler to give us machine instructions;

> ideas -> program -> machine instructions

There are two approaches; one is to find better ways to describe machine instructions, which is what FORTRAN does. The other is to find better ways to describe ideas, which is what Lisp does. These two languages established for us an axis; almost every programming language thereafter fits somewhere in between these two giants.

So far, my impression is that lisp is an excellent language, let down by its awful libraries and tools; compared to Ruby, Python, Java, or C#, it just doesn’t have the libraries, and getting extant libraries installed is a dog. Worse, there is no canonical implementation, which means that your code may or may not work on someone else’s lisp; even worse news for libraries. It’s a pity, because the language itself seems beautiful and powerful. Meh.

Object-oriented vs class-oriented programming

In his well-reasoned blog post, [Chuck Hoffman argues][ch] that what are normally called object-oriented programming languages should probably more rightly be called class-oriented languages. The distinction hopefully becomes clear when you consider this example.

[ch]: http://nothinghappens.net/?p=214

You are modelling people, and you want to create a person type. You should be able to strike up a conversation, so we want a ‘greet’ method for each. Our people (Alice, Bert, Charlie, and Dennis) all respond differently;

- Alice responds to a greeting with “Hi!”, or a surly “what!?” if she hasn’t had her morning coffee.
- Bert responds with either “don’t bother me, I’m walking Spot” or “what can I do for you?”, depending on whether he is walking his dog.
- Charlie responds with either “good morning”, “good afternoon”, or “good evening”, depending on the time of day.
- Dennis responds with “Hello, world!”

Now, in C#, that’s really tricky. Each person uses a different function to answer your greeting. But in C#, the Person class can only have one implementation. You could munge them all together;

    class Person
    {
        public virtual string Greet()
        {
            if (isBert && isWalkingSpot) { return "don't bother me, I'm walking Spot"; }
            else if (isAlice && !hasHadCoffee) { return "what!?"; }
            // … etc
        }
    }

But that is monstrous. You could create subclasses;

    class Dennis: Person
    {
        public override string Greet() { return "Hello, World!"; }
    }

But this isn’t a class of thing; Dennis is singular. There’s not a whole class of Dennises, just a single solitary one.

What you really want to be able to do is something like this; (excuse the made-up syntax)

    Person Alice = new Person();
    Alice.HasCoffee = false;
    Alice.Greet = { (HasCoffee ? "Hi!" : "what!?") };

    Person Dennis = new Person();
    Dennis.Greet = { "Hello, World" };

That’s what an object-oriented, rather than a class-oriented, version of C# might look like.
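
For what it’s worth, a dynamic language can get close to this today. Below is a minimal Python sketch, assuming a bare `Person` class of my own invention, that attaches a different `greet` method to each individual object using the standard library’s `types.MethodType`.

```python
import types

class Person:
    def greet(self):
        return "Hello"  # default greeting shared by the whole class

# Alice gets her own greet method, depending on her own state.
alice = Person()
alice.has_coffee = False
alice.greet = types.MethodType(
    lambda self: "Hi!" if self.has_coffee else "what!?", alice)

# Dennis gets a different one; no Dennis subclass required.
dennis = Person()
dennis.greet = types.MethodType(lambda self: "Hello, world!", dennis)

print(alice.greet())
print(dennis.greet())
```

The behaviour lives on the object rather than on the class, which is the distinction the post is drawing.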

C# Coding; Missing Functions on IEnumerable

Me old mucker Spencer pointed out today that C# 3’s newly-refurbished IEnumerable<T> interface lacks some basic features. Specifically, it lacks methods named after the classic Map, Filter, and Reduce functions seen in functional languages (LINQ spells them Select, Where, and Aggregate). The first two are familiar as List<T>.ConvertAll, and List<T>.FindAll. The third isn’t so familiar, but is still very useful. I’ve also thrown in an implementation of ForEach for free.

[Ben Hall](http://blog.benhall.me.uk/2007/08/converting-ienumerable-to-ienumerable.html) points out that it’s possible to extend the class, but I wanted to get a full, commented implementation of the three functions. Feel free to use this code in your own work.

So, here they are;

IEnumerableExtras
-----------------

    public static class IEnumerableExtras
    {
        /// <summary>
        /// Do 'action' to every item in the list.
        /// </summary>
        /// <typeparam name="T">The source type</typeparam>
        /// <param name="list">the IEnum</param>
        /// <param name="action">the action to perform.</param>
        public static void ForEach<T>
            (this IEnumerable<T> list, Action<T> action)
        {
            foreach (T item in list) { action(item); }
        }

        /// <summary>
        /// Convert every item in the list using the converter
        /// function
        /// </summary>
        /// <typeparam name="T">The source type</typeparam>
        /// <typeparam name="TResult">The destination type</typeparam>
        /// <param name="list">the list to convert</param>
        /// <param name="converter">a function to convert
        /// one item to another.</param>
        /// <returns>all items in the list converted by
        /// the converter function.</returns>
        public static IEnumerable<TResult> Map<T, TResult>
            (this IEnumerable<T> list, Converter<T, TResult> converter)
        {
            foreach (T item in list)
            {
                yield return converter(item);
            }
        }

        /// <summary>
        /// Returns a new enumerator containing only those
        /// elements which return true from 'condition'.
        /// </summary>
        /// <typeparam name="T">The source type</typeparam>
        /// <param name="list">the list to filter</param>
        /// <param name="condition">the 'keep in' condition</param>
        /// <returns>the items for which condition(item)
        /// is true</returns>
        public static IEnumerable<T> Filter<T>
            (this IEnumerable<T> list, Predicate<T> condition)
        {
            foreach (T item in list)
            {
                if (condition(item))
                {
                    yield return item;
                }
            }
        }

        /// <summary>
        /// Reduces a list of items to a single item; can be
        /// used to, say, sum a list of integers, or
        /// concatenate a number of strings, or find the
        /// maximum value in a collection.
        /// </summary>
        /// <typeparam name="T"></typeparam>
        /// <param name="list"></param>
        /// <param name="reducer"></param>
        /// <returns></returns>
        public static T Reduce<T>
            (this IEnumerable<T> list, Func<T, T, T> reducer)
        {
            // 'using' makes sure the enumerator is disposed when we finish.
            using (IEnumerator<T> enumerator = list.GetEnumerator())
            {
                if (enumerator.MoveNext())
                {
                    // we have some items; start combining them together.
                    T aggregator = enumerator.Current;
                    while (enumerator.MoveNext())
                    {
                        aggregator = reducer(aggregator,
                            enumerator.Current);
                    }
                    return aggregator;
                }
                else
                {
                    // there was nothing in the list; return default.
                    return default(T);
                }
            }
        }
    }

And here’s an **example program**;

    static void Main(string[] args)
    {
        IEnumerable<double?> maybeDoubles =
            new List<double?> {1, 2, null, 3, 4, null, null, null};

        // remove all the empty values: [1,2,3,4]
        IEnumerable<double?> noNulls = maybeDoubles.Filter(x => x.HasValue);

        // convert Nullable<double> to non-nullable double: [1,2,3,4]
        IEnumerable<double> notNullable = noNulls.Map(x => x.Value);

        // convert to strings so we can display them. ["1", "2", "3", "4"]
        IEnumerable<string> stringVersions = notNullable.Map(x => x.ToString());

        // join the strings together with commas "1, 2, 3, 4"
        string displayString = stringVersions.Reduce( (s1, s2) => s1 + ", " + s2);

        // show us the result;
        Console.WriteLine(displayString);
        Console.ReadLine();
    }
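
If you read python more easily than C#, here is a rough sketch of the same three functions written as generators. The names `map_items`, `filter_items` and `reduce_items` are mine, chosen to avoid shadowing python’s built-ins; like the C# versions, the first two are lazy and the reduce returns a default (`None`) for an empty input.

```python
def map_items(items, converter):
    """Lazily convert every item using the converter function."""
    for item in items:
        yield converter(item)

def filter_items(items, condition):
    """Lazily keep only the items for which condition(item) is true."""
    for item in items:
        if condition(item):
            yield item

def reduce_items(items, reducer):
    """Combine all items into one; return None if there are no items."""
    iterator = iter(items)
    try:
        aggregator = next(iterator)
    except StopIteration:
        return None  # nothing in the list; return a default
    for item in iterator:
        aggregator = reducer(aggregator, item)
    return aggregator

maybe_numbers = [1, 2, None, 3, 4, None, None, None]
no_nulls = filter_items(maybe_numbers, lambda x: x is not None)
string_versions = map_items(no_nulls, str)
display_string = reduce_items(string_versions, lambda s1, s2: s1 + ", " + s2)
print(display_string)
```

Nothing is evaluated until `reduce_items` starts pulling values through the chain, which mirrors how `yield return` behaves in the C# versions.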

So there you go. Enjoy.