Thursday 27 September 2012

BobSleigh - Personal Experiences of Test Driven Development

In a previous post I wrote about creating a new SourceForge project called bobsleigh (http://bobsleigh.sourceforge.net), which will be able to take a database connection, parse its schema and generate code etc. from it.  I have been coding away fairly solidly on BobSleigh for a few weeks now and am close to doing a release (the current source code can be accessed using SVN).  At the start of the project I defined various high level goals: it must be simple, powerful, familiar and also completely TDD'd (test driven developed).  In this post I will outline my experiences of creating a project that is totally TDD, along with some of the pitfalls and advantages I encountered.

Easily the hardest thing I found about TDD was getting started with the project.  For example, I created an empty BobSleighTest unit test class and found myself staring blankly at the screen for a few minutes...  I guess the main reason for this was that at the start of the project all I had were some vague high level requirements, which in turn had to be broken down into items granular enough that I could write a test for each of them.  Usually at the start of any project I am able to churn out code as soon as an idea comes into my head.  However, having to create a test first really forced me to think out my design.  Like most pieces of software, bobsleigh has 3 main stages: -
  1. Input - BobSleigh expects its input to be a "bobsleigh.xml" file in the folder where it gets executed.  This is similar to ANT looking for a "build.xml" (other filenames are allowed, but this requires a command line argument that points to the different input filename).
  2. Process - BobSleigh then parses the input file, connects to a database and parses its schema, performs validation, creates an in-memory model and parses one or more sections (similar in concept to ANT's sections).
  3. Output - BobSleigh finally takes the in-memory model and delegates it to a 3rd party template engine provider (e.g. FreeMarker or Velocity), which creates source code and other files depending on what the user has defined in the various templates.
Again, this is a very high level overview of what the BobSleigh project should do.  The next question then was, which stage should I concentrate on first, i.e. the input, the process or the output?  I picked the first one, input, as this was the easiest one to get me started.  The bobsleigh.xml file gets parsed by BobSleighConfig.java and tested using BobSleighConfigTest.java (in the same package but in a different source folder).  It didn't take long for the code-base to grow, with a test method always created before the implementation method.  Cobertura was used to measure code coverage.
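
A first test for the input stage might look something like the sketch below.  It only covers the filename rule described in stage 1: use "bobsleigh.xml" in the working folder unless a command line argument names a different file.  The class InputFileResolverSketch and the method resolveInputFile are made-up names for illustration and are not taken from the BobSleigh source, and the assumption that the filename is the first argument is mine.  Plain checks in a main() method stand in for a real unit test framework here, purely so that the sketch runs without any extra libraries.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical sketch, not BobSleigh's actual code.
class InputFileResolverSketch {

    // Default input filename, as described in the post.
    static final String DEFAULT_INPUT = "bobsleigh.xml";

    // Assumption: the first command line argument, if present, names the input file.
    static Path resolveInputFile(Path workingDir, String[] args) {
        if (args == null || args.length == 0) {
            return workingDir.resolve(DEFAULT_INPUT);
        }
        return workingDir.resolve(args[0]);
    }

    // Tiny stand-in for a test framework assertion.
    static void check(boolean condition, String message) {
        if (!condition) {
            throw new AssertionError(message);
        }
    }

    public static void main(String[] args) {
        Path dir = Paths.get("project");

        // Written first: with no arguments the default file is expected.
        check(resolveInputFile(dir, new String[0]).equals(dir.resolve("bobsleigh.xml")),
                "default input file should be bobsleigh.xml");

        // Written second: an argument overrides the default filename.
        check(resolveInputFile(dir, new String[] {"other.xml"}).equals(dir.resolve("other.xml")),
                "argument should override the default input file");

        System.out.println("all checks passed");
    }
}
```

In the test-first spirit, the two checks in main() would be written before resolveInputFile has a body, fail, and then drive the smallest implementation that makes them pass.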

If I could go back to the start of the project, would I use TDD again?  The answer is yes, absolutely, 100%!!!  I can't emphasise this enough and plan to use TDD as much as I can in my day job.  In previous projects I would use TDD now and again if it was a complex piece of code.  I would also sometimes write the unit test after the code was implemented.  This is the first project where I have used TDD for all of the code, and the main advantages that I see are as follows: -
  1. Refactoring - in my opinion, this is one of the greatest benefits of TDD if you have high test coverage (bobsleigh currently has 100% code and branch coverage in all non-GUI packages).  For example, I decided to implement the ability to have multiple schema objects (instead of just one) and although conceptually it looked like a simple change, it affected every part of the system.  At each refactor I was able to run the entire suite of tests again, see exactly where things were breaking and fix each broken test one-by-one.  Once all the tests were passing again, confidence in the system was high.
  2. Better Design - another of the top benefits of TDD is that it forces you to think about design, even if the project is being developed in an incremental and iterative manner.  TDD seems to create classes which are simpler and less coupled to each other.
  3. Unit tests help document the code - if you are looking at a class / method in the system, it is easy to see its purpose by loading up its equivalent unit test class.
  4. Faster debugging - another nice side effect of having good test coverage is that it is very easy to debug all methods of all your classes.  No more wasting time writing main() methods simply to create an entry point for debugging!
  5. Proof that your code works - as the test suite grows (and as long as all your tests pass) you can say with confidence that your code works for what it was designed to do.
There are many other advantages to TDD and unit testing in general, but these are the main ones that I can think of.  I put refactoring at the top of the list because refactoring tends to happen at the start of a project more than when the project matures and increases in size.  This is because people tend to break parts of a program when they refactor, and without unit tests these broken parts go unnoticed until QA, a customer etc. finds the broken code.  This in turn leads to a fear of refactoring, so generally only fairly simple refactors occur (e.g. class renaming using Eclipse's built-in refactoring tools).  With good code coverage, this fear is replaced with confidence, and with consistent refactoring the code quality grows and grows.

Happy coding!
Bob