Tag Archives: Software Development

Can You Just Take a Look at this Legacy Code?

As a programmer there are a number of books which people will tell you are must read books for any professional – which do change over time as programming techniques evolve. However the books are fairly consistent in that they all tend to be written from the point of view of a green field system, starting from first principles, ensuring you build a maintainable system.

But is that realistic? You might be lucky and get in at the beginning of a brand new startup, or you could land a job at a consultancy where you’re always writing bespoke code, but for most programmers an awful lot of their career will be dealing with the joys of legacy code.

It may be that you come into an established company with many years of development and thousands of lines of code debt and changing technologies.

Alternatively you could be handed the thing programmers often dread the “business developed application” – often these are mired in corporate politics as well, with strained relations between the business area that developed the application and the IT department. Indeed in one company I worked for there was a semi-secret development team in one part of the business formed as a result of the IT department saying no one too many times! In most cases these business developed applications are produced by people whose strength is in understanding how the business works, but are inexperienced as developers, which often produces a double hit of problems in that the business logic is usually poorly documented, and the code is also of poor quality.

Other examples I’ve come across are prototype systems that have almost accidentally ended up as critical systems, and something that happens surprisingly often is a big company takes on responsibility for a third party product either because they don’t want to upgrade to a supported version, or because the third party company is abandoning a product altogether.

The common factor in all of these is that you’re taking on a codebase that is less than ideal, so all these coding books that assume you’re starting from scratch aren’t overly useful. All the benefits of test driven development protecting you when you make changes are really not much good when you have incomplete or totally missing tests. It’s incredibly difficult to find your way around a badly structured code base if you’re used to textbook structures and accurate documentation.

What do you do? Edit and pray it works? Rewrite the whole system from scratch?

All of which brings me back to where I started, and the excellent Working Effectively with Legacy Code by Michael Feathers. The book starts from the entirely pragmatic position that you are going to be working on dodgy code a lot of the time, and if you don’t want to make it worse you need to get it sorted out. It is also realistic in that it gives you techniques to gradually improve the code as a business will rarely be able to spare the time and resources to totally rewrite something.

The really basic concept around which a lot of the more complicated techniques are built is that whilst you can’t bring all of a codebase under test immediately you can grow islands of properly tested code within the codebase that gradually spread out as you work on other parts of the codebase over time. To create these islands you need to separate them from the rest of the codebase, which is where a lot of the complexity comes from, but Feathers offers a variety of different techniques for making those separations. The ultimate aim is that as much of your legacy codebase is brought under test, and the codebase as far as possible conforms to modern principles like DRY and SOLID, whilst at the same time allows you to produce the changes and improvements your users or customers are demanding to the legacy code.

I hesitate to say that any programming book is an essential read, but if like most programmers you’re faced with a chaotic legacy codebase Working Effectively with Legacy Code is a book that certainly gives you a lot of practical advice of how to make things better.

A Whole Load of Empty

I have a lot of respect for Jeff Atwood and his Coding Horror blog. He often has interesting and informed insights into software development, and generally knows what he is talking about.

Yesterday he posted an article under the heading “The Best Code is No Code At All” where, backed by comments from Wil Shipley he argues for the benefits of code brevity – put simply reducing the volume of code a developer has to read to understand how an application works. This is certainly something I agree with. However then he gives an example of a simple change to improve code brevity:

Instead of

if (s == String.Empty)

he suggests

if (s == "")

He backs the suggestion with the following statement:

It seems obvious to me that the latter case is better because it’s just plain smaller. And yet I’m virtually guaranteed to encounter developers who will fight me, almost literally to the death, because they’re absolutely convinced that the verbosity of String.Empty is somehow friendlier to the compiler. As if I care about that. As if anyone cared about that!

This is one occasion where I significantly disagree with Jeff on this when it comes to development in .Net.

Firstly, using “” is error prone. Put a space between the quotes, and the compiler won’t pick it up – it is still a valid string literal. It may be a minor change, but it will break your application, and is the kind of typo that is a pain to find. A typo in String.Empty will be picked up by the compiler, rather than coming up in testing, or worse still on a customer site.

Secondly, using String.Empty is more efficient (although as any .Net programmer should be able to tell you, checking the length of the string is more efficient still). To understand why String.Empty is more efficient it helps to understand how .Net handles strings. In .Net, a string is immutable – it never changes, so for example routines that concatenate strings together to build up database queries will be repeatedly creating new strings throughout the whole process. The .Net Framework helpfully provides a StringBuilder object to use in such situations. However the same is true when considering any literal, so “” has to become a string object, and yet you already have String.Empty which is exactly that.

However, .Net has another trick up it’s sleeve, in that it maintains something called the string intern pool. Every string literal is stored in this pool, and when a new literal is encountered that is already in the pool the version from the pool is used instead, saving the overhead of creating a new object. Whilst that speeds up literals somewhat, the application still has to do slight more than if it is just handed a suitable object straight off. Using String.Empty is slightly more efficient than “” – and it’s nothing to do with being friendlier to the compiler.

It’s entirely fair to argue that the performance differences are negligible in most situations, a point with which I’d agree, however the differences magnify if you are producing code with a lot of literals, and a lot of string comparisons. As an aside check out this posting from a former member of the CLR team at Microsoft where he ponders whether automatically interning the entire pool when an assembly is loaded is a good or bad idea.

Put simply, if you know you’re going to be repeatedly comparing to a string value, and performance is an issue, don’t use a string literal, as every comparison you’ll get the overhead of it looking up in the string intern pool – and if the string value is an empty string, don’t bother with either and use String.Length, or better still String.IsNullOrEmpty if there is the possibility that the string is null (or Nothing in VB.Net), both of which are faster than a string comparison in that situation.

So sorry Jeff – I see your point about brevity, but I’m sticking with String.Empty!

Three Screen Development

Three Screen Working

So this week I got my new machine set up, and have joined the prestigious ranks of three screen software developers. Having said that – it’s not quite the arrangement that Bill Gates has, in that it’s two machines driving the three.

The laptop is pretty obvious on the left, and the primary screen for my new machine is on the right. The screen in the middle is connected to both the laptop (via a VGA cable) and to the new machine via a DVI cable, so I can switch it between the two machines depending on what I’m doing. As I mentioned a few days ago, I’m using Synergy to tie the whole lot together, sharing a single keyboard and mouse, and linking the clipboard on the two machines. It’s not quite as convenient as having all three monitors on a single machine, as you can’t move windows between all the screens, but it is still pretty useful.

Generally I have my e-mail on the laptop screen, and then develop on the two larger screens. This actually threw up and interesting case of ‘to clever by half’ from the graphics card in the new machine. I had the smaller screen on the right plugged into DVI socket number 1, and the widescreen monitor into DVI socket number 2, but it helpfully decided that I really wanted the higher resolution widescreen monitor as my primary screen, and swapped them over in software. Eventually after some digging around I found the relevant settings to switch the right hand screen back to being the primary monitor, leaving me able to swap the centre screen between the laptop and the new machine as required.

Incidentally, with regards to the Vista experiments, the new machine is still generally running Windows XP, primarily because our companies chosen anti-virus software is currently not Vista compatible.