It probably would surprise people who have interacted with me recently to hear it, but I actually spent a fair amount of time a few years back trying to get good (as an individual, as part of a team) at estimating: reading the literature (both agile and otherwise), trying it out, and refining and repeating in an attempt to get better. These days, I do not care about estimates nearly as much (or at least I don’t care for estimates nearly as much!); that’s basically because I haven’t seen spending time on estimates as a particularly direct road towards doing software development well as measured in terms that matter to me. I understand estimates in the context of, say, XP, but within XP I see them as a means to the end, and other approaches to those ends seem more productive / fundamental to me.

Still, estimates aren’t going away (and, incidentally, Ron Jeffries recently wrote a very interesting post that touched on the subject); so, it’s time for another installment in the ongoing series of “David tries to understand his reactions to some aspect of software development practices”.

 

I actually don’t think that I would react negatively to all discussions of estimates. To that end, I’ve come up with three aspects of any such discussion that are important to me; if the discussion of estimates doesn’t acknowledge at least two and probably three of them, I’ll probably start off with a strong instinctive negative reaction. But if it touches on all of them, I’ll be a lot happier. Those aspects are:

  1. Why are we estimating? How will we deliver software differently if we do estimate than if we don’t?
  2. What does the estimate mean? We can’t predict precisely when something will be done; so what’s the underlying probability model, and how are we boiling that probability model down to one or two numbers?
  3. How are we getting better at estimation? Estimation is hard, but with feedback it is possible to improve.

These are, of course, interrelated, but probably the first of them is the most important; as a mathematician, though, the second one of those bugs me the most, so I’ll start there.

 

So: what is your probability model? Honestly, I’ll be happy with any initial interaction that shows that you are thinking from a probabilistic point of view at all. Here is my stab at what I think probability distributions in estimating software projects are like:

  • Error models are more likely to be multiplicative than additive: i.e. rather than being off by plus or minus two days, you’re more likely to be correct if you think that it’s equally likely to take either twice as long or half as long as your estimate.
  • I don’t believe that it’s purely multiplicative either, though: things are more likely to go catastrophically bad than they are to go gloriously well.
  • If your estimates are accurate to within a 2x factor more than, say, three quarters of the time, then you’re doing a good job at estimating. But for some kinds of work, even that is hard to attain: bug fixes in particular are notoriously hard to estimate.
  • Software engineers who haven’t actively practiced estimating are very unlikely to get close to the most likely spot on the estimation graph: instead, we’ll default to saying the fastest time that we can imagine something being done by.

Or, to reduce my discomfort to a smaller number of bullet points: if you’re asking me for an estimate, either ask me for a range or specify what portion of the curve of answers you want to be to the left of the number I provide. I can work with either of those.
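To make the “which portion of the curve” question concrete, here is a minimal sketch, not a definitive model. It assumes a lognormal distribution, which is symmetric in log space: that matches the “twice as long or half as long” bullet but not the heavier bad-side tail from the bullet after it, so the high percentiles it produces are probably too optimistic. The function names and the five-day example are hypothetical; the “within 2x three quarters of the time” calibration is taken from the list above.

```python
import math
from statistics import NormalDist


def sigma_for_within_factor(factor: float, probability: float) -> float:
    """Log-space standard deviation such that the actual time lands between
    estimate / factor and estimate * factor with the given probability
    (assuming a lognormal error model centered on the estimate)."""
    z = NormalDist().inv_cdf(0.5 + probability / 2)
    return math.log(factor) / z


def estimate_at_percentile(median_days: float, sigma: float, percentile: float) -> float:
    """Duration that the given fraction of outcomes falls to the left of."""
    z = NormalDist().inv_cdf(percentile)
    return median_days * math.exp(sigma * z)


# "Within 2x three quarters of the time" (assumed calibration from the text).
sigma = sigma_for_within_factor(2.0, 0.75)

# Hypothetical task whose most likely middle-of-the-curve answer is 5 days.
for p in (0.10, 0.50, 0.90):
    print(f"{p:.0%} of outcomes within {estimate_at_percentile(5.0, sigma, p):.1f} days")
```

Asking for a range corresponds to reading off two percentiles (say 10% and 90%); asking for a single number corresponds to naming the percentile up front.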

 

That’s the second aspect. Moving back to the first aspect, here are some possible good answers to the question of why we are estimating:

  • We are planning external commitments (a press release, a contract) around work that isn’t yet completed.
  • We are trying to choose between tasks that are of approximately equal business value, and we want to know which can get done faster.
  • We want to have a discussion about what it means for a piece of work to get done or what the candidate pieces of work are at all, and we are using estimates as a sneaky Jedi trick to help that discussion along.

And here are some less good answers to the question of why we are estimating:

  • We have a particular fascination with predicting in advance exactly what collection of work will get done one or two or three months from now.
  • We see other people doing it, so we’ll do it as well.
  • We think programmers are lazy, so we want to use estimates as a tool to fight that.

I may be somewhat eccentric in putting the first of those less good answers on the less good list: that’s garden variety Scrum or XP. The thing is, though, it strikes me as a quite difficult task that won’t actually affect what you’ll do over that time period. Whereas stripping down that question in one way or another (asking only about a subset of the tasks or else relaxing the time bound) does lead to actionable choices: they’re basically the first and second choices on the good list.

I think the third answer on the good list is probably the best answer of them all; I wish I were better at that. (Here’s a good post by Esther Derby on the subject.) I think the third answer on the second list is the most common subtext for questions about estimations.

One reason why the choice of question matters even within the range of good answers is how it interrelates with your answer to the probability space question. If you’re making an external commitment to getting something done, I would recommend that you aim for a point on the probability space that’s pretty far to the right of the curve, rather than putting down a more aggressive estimate and flipping a coin. And I would also recommend only treating a small subset of the work items that way: that way, you can drop other items in order to improve your chances of meeting the date. (So I certainly wouldn’t recommend talking about a commitment to a 50% estimate for an entire iteration’s worth of work, as some Scrum treatments seem to do.) Whereas if you’re choosing between items of equal business value, then you’re probably better off trying to figure out the middle of the probability curve. (And you might want to compare the variance of the business probability curve with the variance of the engineering probability curve!) If you’re going for the third answer, then the fact that different types of work have significantly different probability curves can lead to interesting discussions.
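The point about committing to only a small subset of the work can be illustrated with a rough simulation. This is a sketch under stated assumptions, not a planning tool: every item is assumed to follow an independent lognormal distribution with the same spread, and the item sizes, the `sigma` value of 0.6, and the function name are all hypothetical.

```python
import math
import random


def chance_of_fitting(median_days, sigma, budget_days, trials=20_000, seed=0):
    """Monte Carlo estimate of the probability that all items together
    finish within budget_days, assuming independent lognormal durations."""
    rng = random.Random(seed)  # seeded so that repeated runs agree
    hits = 0
    for _ in range(trials):
        total = sum(rng.lognormvariate(math.log(m), sigma) for m in median_days)
        if total <= budget_days:
            hits += 1
    return hits / trials


# Hypothetical iteration: five items, each with a median of two days.
iteration = [2.0, 2.0, 2.0, 2.0, 2.0]
must_have = iteration[:2]   # the small subset we actually commit to
budget = sum(iteration)     # ten days: the sum of the median estimates

p_all = chance_of_fitting(iteration, 0.6, budget)
p_subset = chance_of_fitting(must_have, 0.6, budget)
print(f"commit to everything:      {p_all:.0%} chance of making the date")
print(f"commit to must-haves only: {p_subset:.0%} chance of making the date")
```

Under these assumptions, committing to the whole iteration at the sum of the medians is worse than a coin flip, because the long right tail pulls the total upward, while the must-have subset has plenty of slack within the same date.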

 

And then there’s the question of getting better. I don’t have a lot to say here, because it’s mostly pretty obvious: pick a model with some parameters missing, gather data, and then try to figure out what the actual values are for those parameters. Story points plus velocity is one traditional way; skipping prediction entirely and measuring cycle time is a way that I’ve been curious about recently; I’m sure you can come up with other ideas. The main pitfall is not having a model at all, and the second main pitfall is leaving work out of your measurements without taking that into account in your model.
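As one example of “pick a model with some parameters missing, gather data, and fit”, here is a minimal sketch. It assumes a multiplicative error model, so it works with the logarithm of actual over estimated time, and it fits two parameters: a typical bias factor and a spread. The history data and the function name are hypothetical, and the history should include all the work done, not only the items that went well.

```python
import math
from statistics import mean, stdev


def fit_log_ratio(history):
    """history: list of (estimated_days, actual_days) pairs, at least two.

    Returns (bias, spread): bias is the typical factor by which actuals
    exceed estimates (the geometric mean of actual / estimated), and spread
    is the standard deviation of the log ratios."""
    log_ratios = [math.log(actual / estimated) for estimated, actual in history]
    bias = math.exp(mean(log_ratios))
    spread = stdev(log_ratios)
    return bias, spread


# Hypothetical record of past work items: (estimated days, actual days).
history = [(2, 3), (1, 1), (5, 12), (3, 4), (8, 7), (2, 6)]

bias, spread = fit_log_ratio(history)
print(f"actuals typically run {bias:.2f}x the estimate")
print(f"log-space spread: {spread:.2f}")
```

A bias well above 1 suggests the estimates are landing near the fastest imaginable time rather than the middle of the curve; the spread says how wide a range to quote around the corrected number.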

Actually, the real main pitfall is not wanting to improve at all: I tend to suspect that that’s associated with my third bad reason for wanting to estimate. Though even there, talking about improvement can potentially lead to interesting discussions: if you end up heading in a direction saying that you expect programmers to give optimistic estimates and to work evenings and weekends if those estimates are wrong or don’t meet your business desires, then I would rather have that underlying assumption be overt than not, and maybe talking about estimates will help bring that out.

 

So, to summarize: if we have a discussion about estimates and spend time talking about why we’re estimating, what our probability model is, and how we’ll improve our ability to estimate in a way that is congruent with those starting points, I’ll try not to be a jerk about it. (Though I may probe at whether the “why we’re estimating” answer is pointing at an underlying need that we should be attacking in other ways than estimating.) If the discussion starts from a different starting point, though, my reaction will probably depend more on my underlying emotional reserves that day.
