How spending a little extra time and money on design might have saved Microsoft over a billion bucks

I really wanted an Xbox 360.

My old PS2 is showing its age, and I wanted to upgrade to a new system as soon as I finished the last few missions of GTA: Vice City Stories, especially now that it looks like Manhunt 2 won’t be coming out for PS2 any time soon. I’m a huge fan of the GTA series, and I’m especially psyched about GTA4. I grew up in Brooklyn, on a block that looks more than a little like a GTA4 screenshot.

But then something happened…


But a couple of weeks ago my plans changed. Jenny was happily thrashing away on Guitar Hero, when her TV screen just went blank. She looked down at her console, which had suddenly gone quiet. (That’s pretty noticeable, apparently, because the Xbox 360 is a really loud machine… which, as it turns out, is important to our story.) Much to her disappointment, she saw those three telltale LEDs that every Xbox owner dreads: the red ring of death.

Luckily, Jenny’s 360 lasted long enough for her to take advantage of the Xbox 360 service site that Microsoft launched earlier this month. But her poor console was just the latest in a long line of casualties. Some retailers estimate that 30% of Xbox 360s need repair, and we’ve seen plenty of anecdotal evidence that gamers are unhappy. It’s costing Microsoft sales and spooking investors. Microsoft is doing everything it can to fix the problem: it has extended the warranty to three years, and that’s costing the company over a billion dollars. But it’s a real mess.

As much as I want a new console, I’m not going to buy one until I know that it won’t break. I still plan on getting a 360, but not until I can be reasonably sure that I won’t have to return it. By the time I eventually get one, hopefully they’ll have figured out how to make it quieter. I’m certain that I’m not the only one who’s decided to put off buying an Xbox. And that’s bad news for Microsoft.

So what can we, as software developers, learn from the Xbox 360 fiasco?


Software people like us have a nasty habit of dismissing hardware problems as if they have nothing to do with us. We tend to think that designing software is really different from building hardware. And sure, there are definitely differences. We don’t have to worry about assembly lines, product getting damaged in shipment, or those pesky laws of physics that can prove to be such an irritating limitation when you have to design physical objects.

And it’s easy to dismiss the Xbox 360 failure as one of those unfortunate things that falls into that last category of physical faults. There’s a great Tech-On! article that gives us the dirt on exactly what’s caused the problem. It’s an excellent post-mortem on what amounts to terrible thermal design.

For those of you who’ve never taken a computer apart, here’s a little background information. Dealing with heat is an important part of modern computer design. Computer processors generate a lot of heat — so much that if you don’t come up with a way to get rid of it, they’ll fry themselves. So computer manufacturers will typically attach a heat sink to a processor. A heat sink is basically just a big radiator with fins or poles that lets air circulate and draw away the heat. (I once roasted a Pentium 4 processor by popping its heat sink off while the computer was running, just to see what would happen. It went “poof”.) A lot of processors are too hot even for heat sinks; in that case, you’ll need to stick a fan on top of it to cool it off. That’s why some computers are so noisy: they need fans to keep them cool.

It turns out that the Xbox 360 generates far too much heat, and a lot of people speculate that when that heat builds up past a critical point it unseats the GPU (a separate processor that’s used for graphics). Microsoft has so far refused to comment on exactly what the problem is, but as time goes on there does seem to be some consensus forming about it. And that Tech-On! article seems to have found a smoking (heh) gun.

But that’s just the hardware stuff. What does that have to do with building better software?

The punchline for all of this came at the end of that Tech-On! article, and it’s why I think this whole incident is so interesting. Here’s what it said:

Finally, we opened the chassis of the Xbox 360 repaired in May 2007 and compared it with the other Xbox 360 we purchased in late 2005.

“Huh? The heat sinks and fans are completely identical, aren’t they?”

To our surprise, the composition of the repaired Xbox 360 looked completely the same as that of the Xbox 360 purchased in late 2005. It turned out that Microsoft provided repair without changing the Xbox 360’s thermo design at least until May 2007.

The repaired units weren’t replaced with ones that had a better design. They were the same. As far as the Tech-On! writers could tell, Microsoft just replaced a broken unit with one that hadn’t broken yet. That’s probably why we’re seeing various reports of repeated breakdowns.

What that tells me is that the design of the Xbox 360 is deeply flawed, and that design flaw has already cost Microsoft well over a billion dollars. And it’s that flawed design that can teach us a whole lot about our own software projects.


So what does this all mean for us developers? Well, for the more cynical among us, it could just mean a whole lot of job security. I’ve met COBOL programmers who charge ridiculous amounts of money to maintain aging systems. But while their jobs pay well, personally they sound tedious and awful to me. Does anyone really aspire to spend years patching an aging software system? Most programmers will tell you that maintaining old systems is the worst part of the job. If you love designing new and innovative software, then the last thing you want to do is get your career stuck in maintenance mode.

And that’s what Microsoft is learning with the Xbox 360. I’m not a thermal design expert, but I am absolutely positive that they could have come up with a different design that wouldn’t fail so often. And while it may have cost more money to design the system and build each unit, I sincerely doubt those extra costs would have added up to over a billion dollars. And maybe the extra design work would have delayed the launch… but now there are plenty of us who aren’t buying the system because we don’t want to be stung by the rampant quality problems.

Had Microsoft designed the system properly in the first place, they wouldn’t be in this mess now. And that’s the big lesson for us to learn. Oddly enough, it’s not a new lesson… in fact, it’s a pretty old one. One way to look at the Xbox thermal problem is to see it as a design defect that wasn’t caught until after the product was shipped.

Look what I found in an old 1997 issue of Windows Tech Journal… it’s an article by one of our favorite authors, Steve McConnell, called “Upstream Decisions, Downstream Costs”. The article lays out a scenario that most of us will recognize immediately: a fictional software company runs into problems because they don’t do enough planning up front, and end up getting buried with bugs, which cause awful delays. It also has a chart that anyone who’s read a few software engineering textbooks will recognize, showing that the earlier a bug is introduced in the project and the later it’s caught, the more expensive it is to fix.
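If you like to see that kind of chart as numbers, here’s a minimal sketch of the idea in Python. To be clear, the phase names and the 3x growth factor are assumptions I made up for illustration; they are not the figures from McConnell’s article. The only thing the sketch tries to show is the shape of the curve: the more phases a defect survives, the more it costs to fix.

```python
# Phases in the order a project moves through them (hypothetical names).
PHASES = ["requirements", "design", "construction", "test", "release"]

# Hypothetical growth factor: each phase a defect survives multiplies
# its fix cost by this amount. Not a measured value.
GROWTH_PER_PHASE = 3.0


def relative_fix_cost(introduced: str, caught: str) -> float:
    """Return the relative cost of fixing a defect.

    1.0 means the defect was caught in the same phase that introduced it.
    """
    start = PHASES.index(introduced)
    end = PHASES.index(caught)
    if end < start:
        raise ValueError("a defect cannot be caught before it is introduced")
    return GROWTH_PER_PHASE ** (end - start)


if __name__ == "__main__":
    # A design defect gets more expensive the longer it goes unnoticed.
    for caught in PHASES[1:]:
        cost = relative_fix_cost("design", caught)
        print(f"design defect caught in {caught}: {cost:.0f}x")
```

Under these made-up numbers, a design flaw that slips all the way to release costs many times what it would have cost to catch in a design review, which is the Xbox 360 story in miniature.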

So now we’ve seen a good, real-world situation where better design practices would have saved a whole lot of money. But what can we do about it in our own projects?

First and foremost, this gives us more ammunition when arguing with our coworkers and our bosses for more time to design our software. It’s really easy to get frustrated during the design phase of a software project, when a few people are generating a lot of paper or diagrams but nobody’s working on the code yet. That’s one of the things that we pointed out in our first book, Applied Software Project Management — that finding problems too late can sink projects. Luckily, there’s a relatively painless fix: adopt good review practices.

This is something that our friends in the open source world are really good at. Jenny and I talked about this in an ONLamp.com article we wrote last year called “What Corporate Projects Should Learn from Open Source”. A lot of high-profile, successful open source projects have very careful reviews, where they scrutinize scope and design decisions before they start coding. (To be fair, a lot of high-profile, successful closed source projects do the same, but we can’t just go to their websites and see their review results.)

So the moral of the story is that it often costs less to spend more time and money on design up front. And I bet there are some Microsoft shareholders who would agree.

Q&A: How to succeed in business analysis without really trying

Hi everyone! We’re back from a summer break. No, we didn’t go off to some vacation paradise; we were crunching away on our next big O’Reilly release, Head First C#. We’ll post more about it as we get closer to publication. According to its Amazon page, the book is due out in just a couple of months.

In the meantime, we’ll keep the new posts coming for our regular readers. We love you, readers! And we especially love readers who send us questions, because, as it turns out, Jenny and I get a lot of great questions sent to us. Here’s one we got a few days ago:

Dear Jenny and Andrew,

I’ve been told by a couple of people “business analyst” positions were safe locally in the near future. I found “Applied Software Project Management ” on the internet. I’m currently trying to improve some of my skills, in an effort to re-enter the local job market. I’ve been a developer, programmer, analyst for over twenty years. I love building new software. I know you have a number of books that would be helpful. I’m also reading some of the “Head First” books. Is “Applied Software Project Management” the best place to start to build these skills?

Thank you in advance for your time,

Eric D.
Clemmons, NC USA

Now, this is an especially good question because it lets us plug our first book, Applied Software Project Management. More importantly, it’s a topic that’s near and dear to my heart, since I spent a few years managing a really incredible team of business analysts a while back. I learned an enormous amount from them, and I feel lucky that I can share a little of that with you, too.


It was near the tail end of the Silicon Alley dot-com days, although I was working at a decidedly non-dot-com company. At the time, I’d already been managing a team of software engineers — developers, architects and the like — for a few years at a New York-based financial software company. I felt a lot like Miles Silverburg on Murphy Brown: a relatively fresh-faced manager with a team of seriously seasoned and talented people who, to be honest, knew a lot more about the job than I did. Luckily, I’m a quick study, and I had some really great people to learn from.

First and foremost, I learned a lot more about what makes a good business analyst. Very few people start out as business analysts; they usually move into the field from another area. Some of my team members started off as software engineers, others started off on the business side. What they all had in common was a deep understanding of users and how to make sure their needs were met. Throw in a healthy dose of project management knowledge, add a pinch of quality engineering, and top it off with a solid background in software requirements engineering tools and techniques, and you end up with a well-rounded, highly effective business analyst.

So that’s a tall order — and I should know, since I had to hire a whole team of people who fit that bill. I’ve gotten a lot of skepticism over the years from people who don’t believe that you really need someone who has such a broad background. And I’m sure that there are plenty of people with the title “business analyst” who can barely do their jobs… because there are people with every title in every industry who can barely do their jobs. But why does it take so much to be a good business analyst?

Consider what the real job of a business analyst is in a software project. She’s got to talk to users and figure out what their needs are. That’s a tough job, because users have a tricky habit of asking for things that don’t necessarily fit their needs. Time and time again I saw the people on my team tease the real needs out of users who, to be honest, didn’t really understand what they wanted. It requires some very solid elicitation skills, and a good familiarity with software requirements engineering tools and practices (like this one, this one and this one). Because the other part of the business analyst’s job is taking those needs and turning them into a functional solution that can be implemented in software.

That’s one thing that Jenny and I spend a lot of time talking about in Applied Software Project Management. We go over exactly what it means to construct a functional specification, and how that’s different from designing software. Just as importantly — and this is where a good understanding of quality engineering comes in — you need to make sure the users actually understand that solution, which means holding lots of reviews.

Finally, a good business analyst needs to be able to work with the rest of the engineering team to make sure that those needs and requirements actually get built into software. And that’s where a good understanding of project management comes in. She’ll have to work with the team to make sure they understand everything that needs to be built, and she’ll usually play a big role in the estimation — which means she needs a good understanding of estimation practices [PDF].

So to answer Eric’s original question: where should someone start building skills to get a job as a business analyst? If you looked at any of those links, then it probably won’t surprise you that I’d recommend Applied Software Project Management to anyone who wants to get a head start as a business analyst. I also really like Karl Wiegers’ book, Software Requirements (2nd Edition). That book was a huge help to me when I was first learning about requirements engineering. I also learned a lot from Software Requirements Engineering (2nd Edition), a compilation of classic IEEE papers edited by Richard Thayer and Merlin Dorfman, including the classic Jim Rumbaugh paper on use cases, which I consider one of the most important papers ever written about requirements engineering. The best book on use cases that I’ve found is Use Cases: Requirements in Context (2nd Edition) by Daryl Kulak and Eamonn Guiney. When I was hiring business analysts, I would definitely have been pleased with a candidate who was familiar with the concepts in these books.