20050104

Why Software Projects Go Wrong

A rather ambitious title for a humble post.

Reading Slashdot (as I tend to spend much of my time doing), I found this post, which leads to a formal survey. The survey itself is interesting, but it suffers from the same weakness as any survey: it hinges on people's perceptions.

So, I thought that adding my own perspective on why software projects don't work well would contribute something to the body of knowledge. After all, the study does not reveal exactly why software projects go bad; it only reveals what we think goes bad.

So, here's my modest contribution:

Mismatched metaphors

This is, to me, the worst problem in software development today. It's a meta-problem that encompasses using the wrong methodology, unrealistic time pressures, lack of customer involvement, and even design complexity. Those things can arise without the mismatched metaphor problem, but a mismatched metaphor can lead to all of them.

Good examples of this are the idea of software as manufacturing, or the application of construction-related methods to software. Others include the insistence that you need artificial deadlines to stimulate people, that programmers are fungible resources who can be swapped freely, that software complexity scales linearly, that components should fit together like Lego blocks, and so on.

Software is not like manufacturing or construction. Software construction is what the compiler does: it turns a design spec (source code) for algorithms into a final product (an executable algorithm in machine code). In the days before advanced programming languages, "hand-assembling" the code, as well as the translation from pseudocode to assembly, could be seen as construction. Those steps are largely automated today. Time for software construction scales roughly linearly with hardware power (not quite, because there are hard limits like disk access times, and some code optimizations run in quadratic time or worse, but much of code generation is made up of linear processes).

Writing software is mostly a design activity. Some parts are construction: writing the same string-sifting algorithm for the 100th time, translating a high-level algorithm into a language that's not quite abstract enough, and so on. Those tasks will become easier and mostly automated with time, as better libraries and languages appear. Some may remain, for the same reason we don't write everything in pseudocode: we don't know how to turn every algorithm description into efficient code without human aid.

It is therefore closer to the design stage of construction or manufacturing.

An aside: construction projects are very often late, mostly due to mismatched designs or misbehaving contractors, so it's not as if they have a sterling record with their methodologies. Besides, from the construction point of view, software has a relatively good track record, as long as you work in a shop with decent configuration and build management. Manufacturing has a better construction record, but the investment it requires is tremendous: try retooling your whole factory in a few weeks to produce a very different widget. That happens very often in software.

Mismatched metaphors are sometimes applied by project managers, but they're only deadly when applied from a higher level. When most of upper management believes that software is "like" some other thing (often a process where fabrication is very expensive), be very afraid. Software is not like any of that. If it's close to anything, you could pick, say, producing music on tape. Except that music is never tweaked once it's on tape! So there are really very few things that are similar.

To summarize: Software is unlike any other human creative activity. Treat it as such and respect its differences.

Unrealistic expectations

While people who don't know software shouldn't decide on methodology based on other things they know, people who know software just a little bit can be very dangerous, too. Often, they have certain expectations as to how easy or hard certain things are, and they'll estimate times based on those expectations instead of asking their staff.

To be fair, I've mostly seen this happen when past staff had the nasty habit of padding their schedules way too much, either to goof off or to hide the fact that they had no clue what was going on. But that's not always the case, and if the staff changes, there's no reason to keep judging the new staff by the old standards.

But sometimes they just have no idea how complex the software has become, or they were used to shipping very "quick and dirty" code (this is the case for many senior execs at startups who used to code) and that low level of quality is no longer acceptable. Also, technology changes, and unfortunately some new technologies in the realm of software have reduced programmer productivity rather than augmented it.

Now, this is no excuse to let programmers goof off. Find one who has a history of delivering good results. Good estimating skills are nice too, but most valuable is the ability to smell "trouble areas" in requirements and designs. That person is probably a good sounding board on whether estimates are bunk. If such a person is giving you vague answers, it's likely that the problem is mis-scoped or that the code is complex enough that the person is genuinely unsure. In those cases, be willing to give them some slack.

Also, never underestimate the power of an existing codebase to ruin your day. Especially if those who wrote that codebase have left.

In summary: Don't expect anything. Ask your technical staff what to expect; it's more likely to be correct.

Fads

It's interesting how software is often managed with the expectation that it works like a manufacturing project, while management itself behaves like a bunch of youths, sniffing for the latest trends and acting like fad-crazy dolts.

For better or for worse, the IT industry is full of "fads" that keep ruining projects. They do provide a lot of fun for skilled programmers, giving them new things to do and learn. But unless you're building a prototype or have a lot of spare cycles to spend on exploratory products, chasing them is not a good way to select technology.

There are, of course, genuinely new technologies, and those should be adopted. What makes fads different, though, is that they always claim to be a silver bullet, yet you must apply them to all your code and use them properly if you are to reap the benefits. So huge rewrites of a somewhat crufty but at least working codebase are ordered.

Note that there are reasons to rewrite large pieces of code, especially if the code has serious structural problems or was left to the tender mercies of bit rot. But implementing a new technology by rewriting an existing application is very dangerous. First, because unless you're fanatical about documentation and process, there is no complete enumeration of everything that application does. The application will have a bunch of features, behaviours and quirks that everyone is used to but that nobody really knows completely, and the new technology is almost never one of the things users actually care about. What you need is either an adapter, or a fully new product with a new name, so people don't expect the same thing. For more on this, see this Joel on Software article. Note that Joel was later proven wrong about Mozilla; in the long run, they did the right thing. But they aren't a commercial entity, and most software is written by commercial entities that cannot afford to wait that long, because waiting that long only means the code will be lost.

So, to summarize this somewhat rambling point: don't let fads rule your product roadmap. Don't accept projects that embrace a new, never-used technology without prototyping first. New technologies must always be explored, and programmers should be given the chance to play with them from time to time (on company time, otherwise it won't get done; don't worry, once you've given them some company time to play with something, they'll probably put in more of their own time, because it's fun). But don't let the latest trade-press articles make you plunge headlong into the new whizz-bang without due consideration and a technical exploration.

Piling up stuff

This one's simple: the temptation, when a product is already late or borderline, is always to add features "while we're at it." This makes no sense, but still happens. Don't do it. It makes things later. Ship the damn thing, whatever it takes, then schedule time to add those additional items.

Unless your customer is very tolerant of slippage, adding stuff when you're already late does not make sense. Programmers will likely still kill themselves to get it out, and you end up with a pissed-off customer (since the product was promised earlier) and a pissed-off staff. The former can be damage-controlled; the latter cannot. A pissed-off staff is a very, very bad thing, and you should take it seriously.

If you must add stuff, or fix bugs, move the deadline. If you do anything else, you're playing the ostrich: the deadline will slip anyway. And no, the slack time the PM put in cannot be used to absorb the extra stuff; it's already spoken for, because software is still an inexact science.

Multiple lines of power

This one is my favorite. It's also the most common.

When it comes to making decisions on a project, there should be one person with the power to move deadlines, cut features, leave bugs in, send programmers home to rest, and so on. That person is the one responsible for the project succeeding or failing, usually called the project manager.

If you give responsibility to somebody without giving them those fundamental powers, you are setting that person up for failure. That person is likely to sense that and not work at the best of her abilities, to say the least!

I've been on countless projects where technical decisions were made in advance or mid-project, without input from the developers, and over the head of the project manager. This is wrong, and a recipe for problems in the very near term.

That doesn't mean the project manager should ignore advice from others. But that person makes the final call; responsibility should come with equivalent power.

Well, that was my own somewhat conceited list. Most of those are management responsibilities, though, and unlike some programmers, I do not believe software always dies because of management problems. So here's a list of "programmer sins," if you will:

Falling in love with your own cleverness

I think every programmer has seen at least one of these:

  • A program with code so compact and so obscure it's impossible to figure out what it does, but the original author claims it's "really efficient;"
  • A program full of design patterns and deep inheritance hierarchies, but which doesn't use them (how to tell? Look for a lot of unused classes, or a lot of intermediate base classes that have only one derived class...)

If you have, I pity you, because you've found code written by a programmer in love with his own cleverness. The resulting software will usually work for the first version; the next version will probably slip badly, because it carries so much obscure code or extra baggage that nobody can tell how to do the work.
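
To make the second bullet concrete, here's a small, entirely hypothetical Java sketch of that kind of cleverness: three layers of inheritance built "for flexibility," when only one concrete class ever gets used. (The names are invented for illustration.)

    // Hypothetical example: a speculative hierarchy where only
    // CsvReportExporter is ever instantiated anywhere in the codebase.
    abstract class AbstractExporter {
        abstract void export(String data);
    }

    abstract class AbstractReportExporter extends AbstractExporter {
        // A "hook" that no subclass ever overrides differently.
        protected String header() { return ""; }
    }

    class CsvReportExporter extends AbstractReportExporter {
        @Override
        void export(String data) {
            System.out.println(header() + data);
        }
    }
    // A single class that writes the CSV directly would have done the same job.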

Refusing to compromise

Clean design and clean code are incredibly important, but sometimes you just have to grit your teeth and do something you know is a bad compromise. This is never warranted if the project hasn't slipped (and a project that was late before it even started does not qualify as a slipped project!), or for things that affect persistent data (because you usually can't fix those later, at least not without import/export conversions and other nastiness). It's also a bad idea if the shop where you work never schedules time to clean up those compromises before starting the next project; you'll know after a few releases. If that's your shop, well, I pity you; I'd also tend to consciously let that particular release slip so you can do it properly, or later releases will slip much worse.

But face it, sometimes the company faces non-technical requirements that will force you to do something nasty. Don't refuse it. Make sure you document it well, though.

Overgeneralization

It takes a lot of practice to be able to write reusable toolkits and frameworks, and even then, they're likely not to be that general. Resist the temptation until you've studied a lot and built a couple, complete with mistakes.

Beginners (myself included when I started out), especially bright ones, tend to write very general pieces of code. Coupled with a lack of experience at writing readable code, this can create a little piece of code that nobody can quite figure out. Plus, all that effort spent on generalizing the solution is likely to be wasted, since you'll find that:

  • People end up doing custom solutions instead because the general solution is too hard to use or too obscure;
  • The software ends up never needing so much generalization anyhow.

Reusable code is hard to do. It requires good taste, experience with what level of abstraction is needed, and the ability to write extremely clear code. If you feel you lack any of these, don't write generalized code.
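
A contrived Java illustration of the difference (all names made up): the first version answers the actual requirement; the second wraps it in a pluggable "aggregation framework" that no caller will ever need, and that the next maintainer will have to puzzle out.

    import java.util.List;

    class OrderMath {
        // What the requirement actually asked for: total the line items.
        static double orderTotal(List<Double> lineItems) {
            return lineItems.stream().mapToDouble(Double::doubleValue).sum();
        }
    }

    // The overgeneralized version: a configurable framework
    // for a problem that only ever needed a sum.
    interface Aggregator<T, R> {
        R aggregate(List<T> items);
    }

    class SumAggregator implements Aggregator<Double, Double> {
        public Double aggregate(List<Double> items) {
            return items.stream().mapToDouble(Double::doubleValue).sum();
        }
    }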

Copy-paste programming

This usually works well for the first release, but the second release will suffer: bugs are duplicated by every cut-and-paste. Try to factor common code out into functions. That's what they are for.
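
A minimal before/after sketch in Java (the names are invented for illustration): the same validation pasted into two places becomes one shared helper, so a bug fix only has to land once.

    // Before: the same check pasted into two call sites.
    //   if (name == null || name.trim().isEmpty()) throw new IllegalArgumentException("missing name");
    //   if (city == null || city.trim().isEmpty()) throw new IllegalArgumentException("missing city");

    // After: factored into one function; fix it once, every caller benefits.
    static String requireNonBlank(String value, String fieldName) {
        if (value == null || value.trim().isEmpty()) {
            throw new IllegalArgumentException("missing " + fieldName);
        }
        return value;
    }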

One would think people would know this, but every piece of code I've ended up having to maintain used copy-paste in obvious ways in several places. Don't do that! Avoid it like the plague! With modern refactoring tools that can extract a function from a piece of code, time constraints aren't even a good excuse anymore.

Granted, I've been known to do it myself for optimization reasons (because Java strings suck), but it was always very, very well documented, and I always kept the copied code right above the pasted code. So if you really have to do it, at least make sure it's not all over the place.

Too many things at the same time

Note that this is sometimes a management pathology as well. But programmers shouldn't be too quick to point at management, as it sometimes happens because they did not communicate certain problems.

Programmers should always resist the temptation to do things in one big batch that cannot be tested in the interim. Do things in small steps. Test each of those steps to make sure every step does what you expected and has no embarrassing side effects. Fix any problems, then go to the next step.

The point is that by doing things one at a time, you can tell which change broke what.

Projects sometimes fail because they spin out of control. Programmers do huge batch commits to the source tree that break everything at once. A big commit is not a problem per se; but if the whole thing was done in one shot, and the huge commits keep repeating, expect trouble. Things will start breaking left and right, and nobody will be able to figure out what caused it.

The last project I worked on that had this problem still exhibited severe instability until we got fed up and spent nearly a year rewriting huge parts of it. That was after nearly a year and a half of it being completely b0rked, unpredictable, and unable to change to accommodate features required by customers. Last I heard, parts of it are still b0rked. As I remember, it started going unstable when people did those huge, deep-impact commits that always changed 2-3 different things at the same time.

Well, that's pretty much it. I'm not sure whether those are the "top" reasons for failure in software projects, but they're the ones I've seen the most often for projects that got in serious trouble.
