I was unhappy with the result of our pair programming meeting for various reasons: we were all unhappy with how things were going, and I was pretty sure that we were doing something wrong but didn’t know what it was. We’d adopted short-term measures to ease some of the pains, but I didn’t see them as leading to a coherent solution that I’d be happy with.
After thinking about it for a day, I decided that our changes were leading in a direction that I certainly wasn’t happy with: while I’m still not sure of the merits and demerits of pairing, I am sure that it’s good for us to spend more time focusing on the quality of our code, and to spend more time in general talking about code. If we’re going to pull back on pairing, we should still try not to give up on that goal: so I instituted a policy that all non-trivial checkins would require a code review. (If the code was entirely developed while pairing, that counts as the code review, of course.) Code reviews are probably not quite as good as pairing for quality control, but they’re a lot better than nothing: I know that, when I was working on GDB, I got a lot of useful feedback from others’ code reviews, for example.
I felt better after that: people were talking more, the checkins were a bit cleaner. Not a lot cleaner, but that will come: editing, like any other skill, improves with practice.
A week or so later, we ran into another problem: the assignment that one of my team members was working on that week wasn’t done, it wasn’t clear to me when it would be done, and I wasn’t at all confident that I’d like the results when I saw them. (Of course, my lack of confidence may have been largely caused by my lack of information: maybe it was great code, and I just had no easy way of telling.)
This wasn’t an isolated instance: when we estimated a story as taking a full week to accomplish, it would turn out to take more than a week most of the time. We were fooling ourselves with our estimates, and we were skimping on design: it’s one thing to be against “Big Design Up Front”, but that doesn’t mean that some amount of design isn’t appropriate.
And now a bunch of things clicked. I’d been aware for several months that we weren’t really planning in the XP way: the relevant issue here is that we were working exclusively in terms of “stories” (basically, features with user value that can be implemented in a week or less), but not breaking them down into “tasks” (individual technical steps necessary to implement the features, each of which can be accomplished in a single pairing session). When I first realized that we were doing the planning wrong, it wasn’t clear to me that this difference was a big deal, but all of a sudden introducing tasks seemed to solve several problems that we were having:
- Breaking a long story into tasks should make it easier to accurately estimate the story’s duration, with a bit of practice: a six task story will probably take longer than a four task story, but that wouldn’t have been so obvious before breaking it up into tasks.
- The process of breaking a story into tasks gives us a chance to talk about the story together and do an appropriate amount of up-front design.
- If a task takes longer than expected (in particular, longer than a day), that’s an immediate warning sign that something unexpected has turned up. We can deal with the problem right then, by calling an impromptu design session and breaking up the task into smaller tasks as appropriate.
- In the unhappy event that a story still takes longer than a week to accomplish, at least I’ll have a much better idea of its current status, because I’ll know what tasks have been accomplished and what tasks haven’t been accomplished.
- It seems plausible that it will significantly improve our mood towards pairing: it’s not much fun showing up in the middle of somebody else’s project, working on it for a little while without really knowing what’s going on, and then leaving while that person continues. It’s a lot better if you come in at the beginning of a coherent project, work on it together for a few hours, and finish it.
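To make the bookkeeping in the list above concrete, here is a minimal sketch in Python. It is purely illustrative: the `Task` and `Story` classes, their field names, and the sample data are hypothetical, not a tool we actually use. The one-day limit comes from the warning sign described above.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    """One technical step of a story (hypothetical structure)."""
    name: str
    days_spent: float = 0.0
    done: bool = False


@dataclass
class Story:
    """A story broken down into tasks (hypothetical structure)."""
    name: str
    tasks: list = field(default_factory=list)

    def status(self):
        """Summarize progress as a count of finished tasks."""
        finished = sum(1 for task in self.tasks if task.done)
        return f"{finished} of {len(self.tasks)} tasks done"

    def overrunning(self, limit_days=1.0):
        """Names of unfinished tasks that have already taken more than the limit."""
        return [
            task.name
            for task in self.tasks
            if not task.done and task.days_spent > limit_days
        ]


# Made-up example: a six-task story with one task running long.
story = Story("example story", [
    Task("task 1", 0.5, done=True),
    Task("task 2", 1.0, done=True),
    Task("task 3", 0.5, done=True),
    Task("task 4", 0.5, done=True),
    Task("task 5", 1.0, done=True),
    Task("task 6", 1.5, done=False),
])
summary = story.status()
warnings = story.overrunning()
```

Anything that shows up in `warnings` is a candidate for an impromptu design session and a split into smaller tasks; `summary` is the status report I'd otherwise have no easy way of getting.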
We’ve been doing this for a grand total of a week now; it’s probably largely my imagination, but I’m a lot happier with how things are going. We actually had a pretty bad week in terms of completing stories (we were still underestimating how long our long stories would take), but the one problematic story was in much better shape: we’d finished 5 of the 6 tasks that we’d broken that story into, and we knew the last task was turning out to be more complicated than we expected, so we found a coherent way to split it into two tasks.
In our weekly meeting on Friday, most of the stories were fairly well-defined, but one of them was pretty amorphous. So we spent about 20 minutes breaking it up into tasks, talking about pros and cons, with lots of people chipping in about what they remembered about the different pieces of affected code. At the end, there was general agreement that the story was significantly less scary than it had seemed before we started talking about it.
And maybe it’s my imagination, but I think I’ve been enjoying pairing more. Yesterday, for example, I had a very pleasant time writing a really solid class. I particularly appreciated my partner’s winces whenever I chose a bad name for a variable: joke all you want, but little things like that are important. (Incidentally, we also tried out programming by intention some more, with good results.)
Not everything is perfect yet, but I’m much more optimistic than I was. We’re still underestimating large stories, but hopefully tasks will give us a better handle on that. Significant issues still remain with pairing: in particular, our differences in familiarity with different parts of the code and in programming background make pairing hard, but I can deal with that, and those differences will lessen over time. As long as we have a plausible path for improvement there, I’m happy.
On the one hand, I feel a bit silly that we didn’t start using tasks a lot earlier: I should have been paying more attention to what the XP books were saying, because the authors of those books have a lot of useful experience. (Incidentally, it’s fascinating reading the XP mailing list.) And I’ll certainly keep on rereading various XP books to find more mismatches between our practice and their descriptions that might shed light on problems we’re having. On the other hand, making mistakes is a classic way to learn, and for good reason: I have a much more active grasp of this issue than I would have if we’d done things right from the start.
My next management issue, aside from monitoring this one: reading about Scrum, to see if we can use that as a blanket methodology for the entire software team (i.e. my group, the other two groups parallel to it, and my manager’s group). It’s compatible with but less specific than XP, and explicitly addresses issues involving multiple groups; with luck, it will be something we can all get behind. But I have some reading to do to learn more about it, to see if I think it is a good match for current and potential problems that the larger group has.
Are you going back and comparing the estimates with the actual amount of time? (Tracking your velocity?) You might be able to establish an average underestimate percentage and modify the estimates based on it.
8/8/2005 @ 8:58 am
We’re doing some amount of measuring. Specifically, we’re measuring how many hours we spent programming each week, how many stories we under- and over-estimated, how many stories we expected to finish but didn’t. So we probably could establish an average underestimate percentage based on that.
I’d rather leave that as a bit of a last resort: for one thing, I think we have more of an estimation problem on long stories than on small stories, which suggests to me that we have to be better about splitting stories. But if we don’t get some traction on our estimates soon, I think that applying a blanket percentage makes sense.
8/8/2005 @ 9:53 am
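The blanket-percentage idea from this exchange can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the history data is made up, and the function names and the five-day cutoff for a "long" story are hypothetical, not measurements or tools from the team.

```python
def underestimate_factor(history):
    """Ratio of actual to estimated time over finished stories.

    history is a list of (estimated_days, actual_days) pairs.
    A result above 1.0 means stories took longer than estimated.
    """
    total_estimated = sum(estimated for estimated, _ in history)
    total_actual = sum(actual for _, actual in history)
    if total_estimated <= 0:
        raise ValueError("need at least one story with a positive estimate")
    return total_actual / total_estimated


def factor_by_size(history, long_cutoff_days=5):
    """Separate factors for long and short stories (cutoff is an assumption).

    Raises ValueError if either group is empty.
    """
    long_stories = [(e, a) for e, a in history if e >= long_cutoff_days]
    short_stories = [(e, a) for e, a in history if e < long_cutoff_days]
    return underestimate_factor(long_stories), underestimate_factor(short_stories)


def adjusted_estimate(raw_estimate_days, factor):
    """Scale a raw estimate by a previously measured factor."""
    return raw_estimate_days * factor


# Made-up history: (estimated days, actual days) per finished story.
history = [(5, 7), (2, 2), (5, 8), (1, 1)]

factor = underestimate_factor(history)           # 18 / 13, roughly 1.38
long_factor, short_factor = factor_by_size(history)
padded = adjusted_estimate(5, long_factor)
```

With numbers like these, the blanket factor would pad every estimate by about 38%, while the split shows that the short stories were estimated accurately and all of the error sits in the long ones. That is the pattern that argues for splitting stories first and applying a blanket percentage only as a fallback.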