Michael Higgins ([info]sui66iy) wrote,
@ 2008-02-11 18:46:00
Previous Entry  Add to memories!  Tell a Friend!  Next Entry
Types and tests
Are unit tests the static type system of dynamic languages?

A static type system does the following things:

* It provides a minimal level of documentation telling the programmer what sort of data a function expects to operate on and return;

* It provides an analytical scaffolding that allows a compiler to automatically check for certain kinds of errors;

* The same scaffolding allows certain program transformations to be done in a "safe" way. These are often optimizations (done transparently by the compiler) or re-factoring operations (done explicitly by the programmer with the aid of an IDE).

On the down side, all this type annotation scaffolding causes code to become more verbose. This verbosity can be a worse-than-linear effect as the program grows: connecting different pieces of code together in a strongly typed system often causes a lot of extra code to be written just to make the types and protocols match up.

Dynamic languages, on the other hand, don't support this sort of scaffolding. While values still have types, these types can change over the course of the program, and there is no way to specify that certain functions expect certain types.

The big win is that dynamic languages tend to be very concise, largely because all of that annotation to explain to the computer (and to other programmers) what sort of data is expected goes away. Similarly, when connecting together pieces of a large program, it's not necessary that types match exactly. As long as the parts do the right thing, all is well. (Needless to say, guaranteeing that can be problematic.)

But you lose the self-documenting feature of type systems and the ability to automatically (and statically) detect type errors and do optimizations and re-factoring.

Instead, dynamic language proponents put even more emphasis than usual on unit testing. Instead of relying on the compiler to check errors, the test suite checks the errors. The tests can also serve as documentation: how is the function usually used in practice? Look at the tests.

Tests are, in fact, much more powerful than types. A unit test can verify any computable set of semantics, whereas type systems can only tell you that the right sort of value is in the right place.

So tests are better, right? You can define any semantics you need, write tests that verify those semantics, and you've got your documentation and error-checking covered, right? No need for those messy types.

Well, sure, kind of.

The first problem is that you've got to write the tests. No one ever has 100% test coverage. Far from it. (Think about how many cases you would need to completely test the "add" function.) Writing tests is time consuming, usually fairly boring, and itself prone to errors. Type systems have soundness properties, though, and if your type system is sound you know that your program has "coverage."

The second problem is that the semantic strength of tests is its undoing as far as writing tools goes. Because type systems are simple and (relatively) easy to formally characterize and prove things about, you can write program transformations that preserve types. But tests are Turing machines: you can't prove shit about what they do in general, so you can't write code that "understands" the tests and uses them to understand the underlying program.

My opinion is that the concision of dynamic languages is, overall, enough of a benefit that you can live without automatic re-factoring. (A good chunk of automatic re-factoring is done to try to help with the bloat that came about from the type system in the first place.) And it's not impossible to write re-factoring tools that work heuristically --- you just better have good tests to make sure they don't break your code!

All of this was sort of semi-inspired by Stevey's recent characterization of types as being akin to comment bloat. I see where he's coming from: strong type systems can be awfully frustrating. It's intolerable to write (or read) Java without a smart IDE that helps with all the type cruft. And it's so painful when you encounter a situation where duck-typing would "just work" but you have to introduce wrappers in order to match types.

The big difference, though, is that the computer can't read the comments, but it can read the types, and it really can do some cool stuff with that information. Tests are, ironically, too powerful to subsititute for types completely.

(I do completely agree with Stevey that "huge long comments" are a sign of young programmers. Comments should be brief and to the point. The idea is to clarify the non-obvious, not to re-implement the program in English. Also: comments cannot be tested, so are quite often wrong or obsolete. A wrong comment is worse than no comment at all.)


(Post a new comment)


[info]jeremiahblatz
2008-02-12 02:20 am UTC (link)
I think the main difference between unit tests and static typing is that unit tests tend to go for operational correctness, whereas static typing goes for theoretical correctness. Most unit-testifull methods that I've encountered seem to go big on the "find a bug? write a test for it then fix it!" school, which I suspect drives fans of static stuff into a tizzy.

Which is to say that unit testing provides a similar function as static types, but the two functions are very different. They both provide, I think, a foundation of Known Correctness from which one can gain confidence or find more interesting errors. The kind of foundation they provide is very different, though.

(Reply to this) (Thread)


[info]sui66iy
2008-02-12 02:34 am UTC (link)
Well, I don't think static types even give you theoretical correctness. All they can do is say, "yup, the types match", which doesn't help you if your fancy Increment(x) function actually subtracts 1.

I think where I was coming from is that there's a temptation to think of a sort of a "stack of correctness":

Syntax checks
Types check
Tests pass
...
Program is actually correct

The thing I was sort of wondering about is, does the "tests pass" level completely subsume the "types check" level? And the thing is, it doesn't. There's the practical reason: type checks can have 100% coverage, where unit tests really can't (practically speaking). But there's also the theoretical reason: you can do stuff with the type information that you can't do with the "tests pass" information, which is sort of interesting...

(Incidentally, it's cool that someone actually reads my technical entries ;-)

(Reply to this) (Parent)


[info]je2i
2008-02-12 04:46 am UTC (link)
(Incidentally, it's cool that someone actually reads my technical entries ;-)


I read them too. I'm just generally too lazy busy to comment.

(Reply to this)


[info]jcreed
2008-02-17 06:29 pm UTC (link)
The idea that static types require comment bloat is mercifully not true. Languages that have expressive type systems and have compilers smart enough to support type inference are a joy to use.

(Reply to this)


Create an Account
Forgot your login or password?
Login w/ OpenID
English • Español • Deutsch • Русский…