Unit testing and the narrowly averted Citicorp Center disaster

It was almost a disaster...

I was working on a project earlier today. Typically I do test-driven development, where I’ll build unit tests that verify each class first, and then build the code for the class after the tests are done. But once in a while, I’ll do a small, quick and dirty project, and I’ll think to myself, “Do I really need to write unit tests?” And then, as I start building it, it’s obvious: yes, I do. It always comes at the point where I’ve added one or two classes and realize that I have no idea if those classes actually work. I’ll realize that I’ve written a whole bunch of code, and I haven’t tested any of it. And that starts making me nervous. So I turn around and start writing unit tests for the classes I’ve written so far… and I always find bugs. This time was no exception.
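If you’ve never worked test-first, here’s a minimal sketch of what that rhythm looks like, using Python’s unittest module; the PriceCalculator class and its discount rule are hypothetical, invented purely for illustration:

    import unittest

    # Test-first: these tests are written before the class exists.
    # They fail until the class is implemented, and they define
    # exactly what "works" means for it.
    class TestPriceCalculator(unittest.TestCase):
        def test_discount_is_applied(self):
            calc = PriceCalculator(discount_rate=0.10)
            self.assertAlmostEqual(calc.total(100.00), 90.00)

        def test_no_discount_by_default(self):
            calc = PriceCalculator()
            self.assertAlmostEqual(calc.total(100.00), 100.00)

    # Written second, with just enough code to make the tests pass.
    class PriceCalculator:
        def __init__(self, discount_rate=0.0):
            self.discount_rate = discount_rate

        def total(self, price):
            return price * (1 - self.discount_rate)

    if __name__ == "__main__":
        unittest.main()

The toy class isn’t the point; the order is. By the time the class exists, there’s already something that can tell you whether it actually works.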

This time, for some reason, that exercise reminded me of the story of the nearly disastrous Citicorp Center building.

Citicorp Center was one of the last skyscrapers to go up during New York City’s building boom of the 1960s and 1970s. A lot of New Yorkers today probably don’t realize that it was one of the more interesting feats of structural engineering of its time. The building went up on a site occupied by St. Peter’s, a turn-of-the-century Lutheran church that would have to be demolished to make way for the skyscraper. The church agreed to let Citicorp demolish it, on one condition: that it be rebuilt on the same site.

The engineer, Bill LeMessurier, came up with an ingenious plan: put the base of the building up on columns and cantilever the edge of the building over the church. Take a look at it on Google Maps’ Street View — you can pan up, navigate around, and see just how much of a structural challenge this was.

The building was completed in 1977. A year later, LeMessurier got a call from an engineering student studying the Citicorp building. Joe Morgenstern’s excellent 1995 New Yorker article about the building describes it like this:

The student wondered about the columns–there are four–that held the building up. According to his professor, LeMessurier had put them in the wrong place.

“I was very nice to this young man,” LeMessurier recalls. “But I said, ‘Listen, I want you to tell your teacher that he doesn’t know what the hell he’s talking about, because he doesn’t know the problem that had to be solved.’ I promised to call back after my meeting and explain the whole thing.”

Unfortunately, it was LeMessurier who turned out to be mistaken, and the article describes the problem in all its gory detail: when he recalculated, LeMessurier discovered the building was dangerously vulnerable to quartering winds (winds that hit the building diagonally, at its corners), a case the original analysis had never tested, and one made far worse by a construction change that swapped welded joints for weaker bolted ones. It’s a fascinating story, and I definitely recommend reading it — it’s a great example of how engineering projects can go wrong. It’ll probably seem eerily familiar to most experienced developers: after a project is done, someone uncovers what seems to be a tiny snag, which turns out to be disastrous and requires a huge amount of rework.

Rework in a building isn’t pretty. In this case, it required a team to go through and weld steel plates over hundreds of bolted joints throughout the building, all of it done after hours so nobody would find out and panic.

But what I found especially interesting about the story had to do with testing the building:

On Tuesday morning, August 8th, the public-affairs department of Citibank, Citicorp’s chief subsidiary, put out the long delayed press release. In language as bland as a loan officer’s wardrobe, the three-paragraph document said unnamed “engineers who designed the building” had recommended that “certain of the connections in Citicorp Center’s wind bracing system be strengthened through additional welding.” The engineers, the press release added, “have assured us that there is no danger.” When DeFord [a Citicorp executive] expanded on the handout in interviews, he portrayed the bank as a corporate citizen of exemplary caution–“We wear both belts and suspenders here,” he told a reporter for the News–that had decided on the welds as soon as it learned of new data based on dynamic-wind tests conducted at the University of Western Ontario.

There was some truth in all this. During LeMessurier’s recent trip to Canada, one of Alan Davenport’s assistants had mentioned to him that probable wind velocities might be slightly higher, on a statistical basis, than predicted in 1973, during the original tests for Citicorp Center. At the time, LeMessurier viewed this piece of information as one more nail in the coffin of his career, but later, recognizing it as a blessing in disguise, he passed it on to Citicorp as the possible basis of a cover story for the press and for tenants in the building.

Tests were at the center of this whole mess. Insufficient testing at the beginning of the project caused the problem, and more testing (the new dynamic-wind tests at the University of Western Ontario) showed them the way out. Tests got them into the situation, and tests got them out.

So what does this have to do with software?

I have a hunch that anyone who’s done a lot of test-driven development will see the relevance pretty quickly. The quality of your software — whether it does its job or fails dramatically — depends on the quality of your tests. It’s easy to think that you’ve done enough testing, but once in a while your tests uncover a serious problem that would be painful — even disastrous — to repair. And as LeMessurier found, it’s easy to run tests that give a false sense of security because they’re based on faulty assumptions.
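Here’s a toy sketch of what that looks like in code. It’s hypothetical through and through (the stress formula and the limit are made up, and bear no resemblance to real structural math), but it shows how a test can quietly encode the very assumption it ought to be checking, just as the original 1973 analysis only considered perpendicular winds while quartering winds raised the strain on some braces by roughly 40 percent:

    import math

    def bracing_stress(wind_speed, angle_degrees):
        # Toy model of the stress wind puts on one brace when it hits
        # the building at a given angle (made-up formula, illustration only).
        radians = math.radians(angle_degrees)
        return wind_speed ** 2 * (math.cos(radians) + math.sin(radians))

    def test_bracing_within_limit():
        limit = 10000  # hypothetical allowable stress
        # The faulty assumption is baked right into the test: only a
        # perpendicular (0-degree) wind is ever checked.
        assert bracing_stress(100, 0) <= limit  # passes: exactly 10,000
        # The case the test never exercises, a 45-degree quartering wind,
        # would fail: bracing_stress(100, 45) is about 14,142, roughly
        # 40% over the limit.

The test passes, but only because it shares the design’s blind spot. More of the same tests won’t help; what helps is questioning the assumptions the tests are built on.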

I’ve had arguments many times over my career with various people about how much testing to do. I can’t say that I’ve always handled them perfectly, but I have found a tactic that works. I point to the software and ask which of the features doesn’t have to work properly. But it’s good to remind myself how easy it is to question the importance of tests. It’s so easy, in fact, that I did it myself earlier today. And that’s why it’s important to have examples like Citicorp Center to remind us of how important testing can be.

How spending a little extra time and money on design might have saved Microsoft over a billion bucks

I really wanted an Xbox 360.

My old PS2 is showing its age, and I wanted to upgrade to a new system as soon as I finished the last few missions of GTA: Vice City Stories — especially now that it looks like Manhunt 2 won’t be coming out for PS2 any time soon. I’m a huge fan of the GTA series, and I’m especially psyched about GTA4. I grew up in Brooklyn, on a block that looks more than a little like a GTA4 screenshot.

But then something happened…


A couple of weeks ago, my plans changed. Jenny was happily thrashing away on Guitar Hero when her TV screen just went blank. She looked down at her console, which had suddenly gone quiet. (That’s pretty noticeable, apparently, because the Xbox 360 is a really loud machine… which, as it turns out, is important to our story.) Much to her disappointment, she saw those three telltale LEDs that every Xbox owner dreads: the red ring of death.

Luckily, Jenny’s 360 lasted long enough for her to take advantage of the Xbox 360 service site that Microsoft launched earlier this month. But her poor console was just the latest in a long line of casualties. Some retailers estimate that 30% of Xbox 360s need repair, and we’ve seen plenty of anecdotal evidence that gamers are unhappy. It’s costing Microsoft sales and spooking investors. Microsoft is doing everything they can to fix the problem — they’ve extended the warranty to three years, at a cost of over a billion dollars. But it’s a real mess.

As much as I want a new console, I’m not going to buy a 360 until I can be reasonably sure I won’t have to return it. By the time I eventually get one, hopefully they’ll have figured out how to make it quieter, too. I’m certain that I’m not the only one who’s decided to put off buying an Xbox. And that’s bad news for Microsoft.

So what can we, as software developers, learn from the Xbox 360 fiasco?


Software people like us have a nasty habit of dismissing hardware problems as if they have nothing to do with us. We tend to think that designing software is really different from building hardware. And sure, there are definitely differences. We don’t have to worry about assembly lines, products getting damaged in shipment, or those pesky laws of physics that can prove to be such an irritating limitation when you have to design physical objects.

And it’s easy to dismiss the Xbox 360 failure as one of those unfortunate things that falls into that last category of physical faults. There’s a great Tech-On! article that gives us the dirt on exactly what’s caused the problem. It’s an excellent post-mortem on what amounts to terrible thermal design.

For those of you who’ve never taken a computer apart, here’s a little background information. Dealing with heat is an important part of modern computer design. Computer processors generate a lot of heat — so much that if you don’t come up with a way to get rid of it, they’ll fry themselves. So computer manufacturers typically attach a heat sink to a processor. A heat sink is basically just a big radiator with fins or pins that lets air circulate and draw the heat away. (I once roasted a Pentium 4 processor by popping its heat sink off while the computer was running, just to see what would happen. It went “poof”.) A lot of processors run too hot even for a heat sink alone; in that case, you need to stick a fan on top of the heat sink to cool it off. That’s why some computers are so noisy: they need fans to keep them cool.

It turns out that the Xbox 360 generates far too much heat, and a lot of people speculate that when that heat builds up past a critical point it unseats the GPU (a separate processor that’s used for graphics). Microsoft has so far refused to comment on exactly what the problem is, but as time goes on there does seem to be some consensus forming about it. And that Tech-On! article seems to have found a smoking (heh) gun.

But that’s just the hardware stuff. What does that have to do with building better software?

The punchline for all of this came at the end of that Tech-On! article, and it’s why I think this whole incident is so interesting. Here’s what it said:

Finally, we opened the chassis of the Xbox 360 repaired in May 2007 and compared it with the other Xbox 360 we purchased in late 2005.

“Huh? The heat sinks and fans are completely identical, aren’t they?”

To our surprise, the composition of the repaired Xbox 360 looked completely the same as that of the Xbox 360 purchased in late 2005. It turned out that Microsoft provided repair without changing the Xbox 360’s thermo design at least until May 2007.

The repaired units weren’t replaced with ones that had a better design. They were the same — as far as they could tell, Microsoft just replaced a broken unit with one that hadn’t broken yet. That’s probably why we’re seeing various reports of repeated breakdowns.

What that tells me is that the design of the Xbox 360 is deeply flawed, and that design flaw has already cost Microsoft well over a billion dollars. And it’s that flawed design that can teach us a whole lot about our own software projects.


So what does this all mean for us developers? Well, for the more cynical among us, it could just mean a whole lot of job security. I’ve met COBOL programmers who charge ridiculous amounts of money to maintain aging systems. But while those jobs pay well, they sound tedious and awful to me. Does anyone really aspire to spend years patching an aging software system? Most programmers will tell you that maintaining old systems is the worst part of the job. If you love designing new and innovative software, then the last thing you want to do is get your career stuck in maintenance mode.

And that’s what Microsoft is learning with the Xbox 360. I’m not a thermal design expert, but I am absolutely positive that they could have come up with a different design that wouldn’t fail so often. And while it may have cost more money to design the system and build each unit, I sincerely doubt those extra costs would have added up to over a billion dollars. Sure, the extra design work might have delayed the launch… but now there are plenty of us who aren’t buying the system because we don’t want to be stung by the rampant quality problems.

Had Microsoft designed the system properly in the first place, they wouldn’t be in this mess now. And that’s the big lesson for us to learn. Oddly enough, it’s not a new lesson… in fact, it’s a pretty old one. One way to look at the Xbox thermal problem is to see it as a design defect that wasn’t caught until after the product was shipped.

Look what I found in an old 1997 issue of Windows Tech Journal… it’s an article by one of our favorite authors, Steve McConnell, called “Upstream Decisions, Downstream Costs”. The article lays out a scenario that most of us will recognize immediately: a fictional software company runs into problems because they don’t do enough planning up front, and end up getting buried with bugs, which cause awful delays. It also has a chart that anyone who’s read a few software engineering textbooks will recognize, showing that the earlier a bug is introduced in the project and the later it’s caught, the more expensive it is to fix.

So now we’ve seen a good, real-world situation where better design practices would have saved a whole lot of money. But what can we do about it in our own projects?

First and foremost, this gives us more ammunition when arguing with our coworkers and our bosses for more time to design our software. It’s really easy to get frustrated during the design phase of a software project, when a few people are generating a lot of paper or diagrams but nobody’s working on the code yet. That’s one of the things that we pointed out in our first book, Applied Software Project Management — that finding problems too late can sink projects. Luckily, there’s a relatively painless fix: adopt good review practices.

This is something that our friends in the open source world are really good at. Jenny and I talked about this in an ONLamp.com article we wrote last year called “What Corporate Projects Should Learn from Open Source”. A lot of high-profile, successful open source projects have very careful reviews, where they scrutinize scope and design decisions before they start coding. (To be fair, a lot of high-profile, successful closed source projects do the same, but we can’t just go to their websites and see their review results.)

So the moral of the story is that it often costs less to spend more time and money on design up front. And I bet there are some Microsoft shareholders who will agree.

Why “gold plating” is a lousy name

A few days ago I posted an answer to a question about gold plating and scope creep to the Head First PMP forum. I’m not surprised the question came up — people really seem to have trouble with the concept of gold plating. And I don’t think it’s because it’s a tough concept to get. I think it’s because it’s got a lousy name.


In the usual gold plating scenario, a programmer adds features that were never requested because they’re “cool” or fun or seem like they’d be really useful. And sometimes they are — but more often, they’re just wasted effort, at least from the perspective of the person paying the programmer’s salary. Like I pointed out in my last post that mentioned gold plating, I completely sympathize. I’m definitely guilty of gold plating. There was one project I led about ten years ago where I created an entire scripting language, complete with interpreter, that was totally unnecessary. As far as I know, that product is still being used today, and not a single person has ever written one script for it. But it was definitely cool (or, at least, I thought so). More importantly, I really did think it would be useful, and make the software better. Classic gold plating.

On the surface, the “gold plating” analogy does seem intuitive, but it starts to break down under closer scrutiny. Think about what gets gold plated: all sorts of things, from cheap jewelry to expensive pens, get encrusted with, for lack of a better word, bling. And that’s what I pointed out in that forum post:

Gold plating is what we call it when the project team does work on the product to add features that the requirements didn’t call for, and that the stakeholder and customer didn’t ask for and don’t need. It’s called “gold plating” because of the tendency a lot of companies have to make a product more expensive by covering it in gold, without actually making any functional changes. (For example, there are plenty of watches and fountain pens you can buy from luxury companies that are identical to their cheaper versions, except that they’re covered in gold.)

This got me thinking about gold plating, and why it gives people so much trouble. Is gold plating in a software project actually similar to gold plating in real life? Or is it an odd, somewhat mismatched analogy?

To answer that question, we’ll need to take a step back and look at gold plating in the real world. There are a few different strains of gold plating, and they serve different purposes. First there’s the traditional, purely decorative gold plating. That’s the one we know and love: take an ordinary object, slap some gold on it, and charge a whole lot more. That’s the sort of product that’s associated with decadence and conspicuous consumption. It’s where the Gilded Age got its name.

Making a product arbitrarily more expensive is certainly a way to sell more of it to a certain sort of consumer. But while there are numerous modern examples of opulent gold plating, it’s fallen out of favor somewhat. More importantly, it’s not really a great analogy for gold plating in software.

What it’s been replaced with is a somewhat similar but definitely distinct way to enhance (read: sell “upscale” versions of) products. This “enhancement” is done by adding features that are actually useful, but that go far beyond the needs of the product’s typical consumers.

Here’s an example. How many suburban homes really need an industrial refrigerator or a restaurant-quality range? Those appliances have been a selling point of “luxury” and “upscale” kitchens for years. But the difference between them and, say, gilded kitchen items is that the “professional” appliances are almost certainly worth the price — if you actually need them. Which you don’t, if you only use your kitchen to cook dinner for four every couple of days and thaw the occasional frozen turkey. But that’s not the point. The kitchen itself has been “upgraded” with cool but unnecessary items. And, in this case, it sells.


That’s certainly not the only example of products packed with features that are potentially useful, but which are, for the average owner of those products, essentially unnecessary (and by and large unused). There’s the slowly fading American love affair with the SUV; the top-of-the-line faucets, fixtures, and general home accouterments that litter the country’s McMansions; and pretty much everything in the Brookstone, Hammacher Schlemmer and Sharper Image catalogs. We have no shortage of amateur hill climbers with professional mountaineering gear, weekend golfers with $1,500 titanium clubs, and basement workshops stocked with industrial hardware used to build the occasional birdhouse. Every single one of those things, in the right hands, is almost certainly worth it. I’ve been playing bass guitar for about 20 years, and I know that there’s a big difference between the average $300 instrument and one that costs ten times as much. (Which is not to say you can’t spend $3,000 on a crappy bass, but that’s a different issue entirely.) Certainly, a beginner will see some small benefit from using a better instrument. But it’s probably not worth the price for someone who will only pick it up once every few months.

Neither of these two ideas is a perfect analogy for software gold plating. In some ways, gold plating in software is a lot like gilded products. In other ways, it’s similar to the kind of overkill that people indulge in when they buy “top-of-the-line” products unnecessarily. More accurately, it’s really a mixture of the two.

So what drives us to gold plating? We add unrequested (and eventually unused) features because we want to build stuff that’s cool, and it never occurs to us that we’re building something our users won’t need. I like to think of Google as the ultimate in gold plating. Their latest offering, Google Street View, is a great example of cool software that doesn’t seem to meet an obvious need. And I love it — I think they did a great job with it, and I’m honestly impressed with the way a whole lot of moving parts came together the way they did. Maybe someone will think of a really great use for it. But isn’t that a solution in search of a problem, by definition?

A good rule of thumb is that people will generally only pay for a product that’s useful. One of my favorite ways to describe quality is to consider two pieces of software. The first one is beautifully designed, very well built, very stable, never crashes, has a very intuitive user interface, is extremely secure, and is generally a pleasure to use — but it doesn’t do the job you need it to do. The other is terribly built, painful to use, and crashes at least once a day, but it does 50% of what you need. That second one is the one you’re going to use, because it at least meets some of your needs. And any feature in either program that doesn’t meet your needs (or the needs of any other users) is pure gold plating.

Which brings me back to the name. I don’t think “gold plating” does justice to the real phenomenon it refers to. It’s more than just gilding software by making it pretty (and/or more expensive). It’s about the way we genuinely feel that we’re making the software better by adding features that may be really cool, but that we, as programmers, simply can’t see are useless.

And the sooner we can figure out how to avoid doing that, the better our software will be.

(Luckily, we’ve got some really good tools to help us avoid gold plating. I’ll talk about them soon in another post.)