Significant changes will indeed keep your forecast from coming true. But mostly people are not asking for predictions in situations with significant changes.
Incremental changes will just be incorporated in the forecasts over time. The forecast continually adjusts to new information becoming available.
It's interesting to know that after about ten completed work items the numbers become pretty stable. So it's easy to reset or adjust the forecast in case of a big change in the team.
I was personally surprised to learn how stable the output of a team is if its composition doesn't change. Tools have some impact on productivity, but not so big that you have to throw away your predictions. And no one complains if you over-deliver a little.
> people are not asking to predict for situations with significant changes
That's true for upcoming changes. Your predictions will also be of poor quality right after a significant change, because you lose the link with historical data.
In the end there might be only a small window between two changes in which the forecasts are worth anything.
To be clear, I'm not saying you shouldn't make predictions anyway, just that the effort to come up with "a system" is not worth it in a lot of situations.
For context, the tech industry has one of the highest turnover rates; a team losing or gaining a member is not some rare event. Neither is a new boss coming in to change processes or teams.
There are situations that are way more predictable than the stock market. We as humans use the past to successfully predict the future all the time. For example we predict that we won't be able to walk through walls not because we understand how atoms bounce off each other but because we experienced not being able to do that.
Software development is certainly more predictable than the stock market, but less predictable than the walking-through-walls example.
In a stable team, past performance is often a strong indicator of what will happen next. It's not perfect and black swans can occur, but no one is asked to predict whether your project will get cancelled, for example.
I agree with this response. We are normally not asked to predict for situations where something big changes in the team. But I of course acknowledge that these things do happen. When you have a stable team, the numbers that this method yields are also very stable.
My experience is that most project managers take a non-probabilistic approach.
Say you have your usual task breakdown and assign each task a time/budget estimate in terms of "low", "most likely", and "high". The intuitive answer is to sum up the "most likely" values for your total estimate. However, this ignores the probability that a delay in one task affects others.
Instead, if you take into account the covariance between tasks (using historical or simulated data), you often find that the "most likely" summation has quite a low probability of being met. When we applied this at our org, there was less than a 20% chance we'd meet or beat that intuitive estimate. No wonder we were chronically over budget and over schedule!
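To illustrate: a minimal Monte Carlo sketch with made-up numbers, assuming independent, right-skewed triangular task estimates (positive correlation between tasks would make the odds even worse):

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical tasks: (low, most likely, high) estimates in days.
tasks = [(2, 3, 8), (1, 2, 6), (4, 5, 12), (3, 4, 10)]

n_trials = 100_000
# Sample each task from a triangular distribution and sum per trial.
totals = sum(rng.triangular(lo, mode, hi, n_trials) for lo, mode, hi in tasks)

naive = sum(mode for _, mode, _ in tasks)  # sum of "most likely" = 14 days
print(f"Naive 'most likely' total: {naive} days")
print(f"P(total <= naive):         {np.mean(totals <= naive):.0%}")
print(f"80th percentile:           {np.percentile(totals, 80):.1f} days")
```

Because each task's distribution has a long right tail, the chance of the total coming in at or under the naive sum is far below 50%.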
I've been reading "Software Estimation: Demystifying the Black Art" by Steve McConnell.
He introduces a distinction that, at least for me, has been instrumental: estimates and plans are different things.
Estimates are honest, based on past performance data, and probabilistic by their very nature.
Plans are, on the other hand, built with a target date in mind, taking into account the estimate previously made, desired delivery dates from customers and everything we are so used to.
By planning task fulfillment closer to the estimates, you decrease the risk of the plan failing. You can build a shorter schedule by assuming that staff will work overtime, by assuming more optimistic estimates and so on, but then the risk of failure will be higher. That risk will, of course, never be zero.
It's a simple distinction, but it has important implications. We no longer feel the pressure to make pessimistic, and therefore dishonest, estimates out of fear of being pressed to cut the schedule. It also gave us a better argumentative tool for negotiating schedules with our clients.
I think it's also useful for making all the probabilities a bit clearer to project managers. It's like "OK, I know that you need me to commit with a delivery date, but I'm also going to make clear to you that there are some risks involved and I wanna make everybody aware of them"
That's an important distinction. The way we handled it was by letting managers define their acceptable level of risk and then using the model to derive the estimates in that context.
For example, if they were OK with a 60% chance of making or beating a cost estimate, the forecast could be much more aggressive than under, say, a management expectation of a 90% chance of being on budget.
It’s a straightforward enough primer that it can be done in Excel, including simulating the data if necessary.
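As a rough sketch of the risk-to-percentile mapping in Python (illustrative simulated costs, not real data):

```python
import numpy as np

# Illustrative simulated project costs; in practice these would come
# out of a Monte Carlo run over the task estimates, as above.
rng = np.random.default_rng(7)
totals = rng.lognormal(mean=3.0, sigma=0.3, size=100_000)

# An acceptable risk level maps directly to a percentile of the
# simulated outcomes: "60% chance of making or beating the estimate"
# is just the 60th percentile.
for confidence in (60, 90):
    print(f"{confidence}% confidence estimate: {np.percentile(totals, confidence):.1f}")
```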
Even if this type of model is too simple for actual estimation, it’s a useful (and sobering) tool to help managers understand why their intuitive estimates can so often be incorrect.
You can intuit how much “active time” it will take you, personally, to do something. How can you intuit how long a task is going to spend in a queue waiting to be worked on because your team doesn’t have capacity, or another team “down the chain” doesn’t have capacity?
We have queuing theory because people are bad at intuiting the latter, and I don’t even think we're anywhere close to good, as an industry, at intuiting the former.
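To make that concrete, the textbook M/M/1 queueing result (a sketch with made-up rates, not a claim about any particular team) shows average queue time exploding as utilization approaches 100%, which is exactly the part intuition misses:

```python
# Mean time a job waits in queue for an M/M/1 system:
# Wq = rho / (mu - lambda), where rho = lambda / mu is utilization.
def mm1_queue_wait(arrival_rate: float, service_rate: float) -> float:
    rho = arrival_rate / service_rate
    assert rho < 1, "queue is unstable at >= 100% utilization"
    return rho / (service_rate - arrival_rate)

service_rate = 1.0  # say the team completes one task per day on average
for utilization in (0.5, 0.8, 0.9, 0.95, 0.99):
    wait = mm1_queue_wait(utilization * service_rate, service_rate)
    print(f"{utilization:.0%} busy -> {wait:6.1f} days waiting in queue")
```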
You can talk about BS (in the context of software) like queuing theory, or you can actually write software. I suggest The Mythical Man-Month.
Sometimes I think humans developed language only to be able to pretend to be doing something:
The best hunter of the tribe kills a mammoth, but he is not verbally talented. Now an army of bureaucrats appears and tells everyone that they were instrumental in slaying the prey by applying some BS methodology. The tribe is gaslit; the bureaucrats gain importance, influence and economic wealth.
Queuing theory is a branch of mathematics. It is useful, in a software context, for things like predicting server capacity and predicting response times of programs. It is also regularly used to predict things like hospital wait times.
Here is a very good introduction, I hope you can learn something new from it (:
There's no actual use of queuing theory in the article though; it's just mentioned as a sort of irrelevant justification. It's not even a Monte Carlo simulation, it's a bootstrap. You definitely don't need queuing theory to run a bootstrap.
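For reference, such a throughput bootstrap is only a few lines (a sketch with hypothetical numbers, not the article's code):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical history: items completed in each of the last ten weeks.
weekly_throughput = np.array([3, 5, 2, 4, 6, 3, 4, 5, 2, 4])
items_remaining = 40

# Bootstrap: resample past weeks with replacement until the backlog is
# done; the distribution of week counts across trials is the forecast.
weeks_needed = []
for _ in range(10_000):
    done = weeks = 0
    while done < items_remaining:
        done += rng.choice(weekly_throughput)
        weeks += 1
    weeks_needed.append(weeks)

for p in (50, 85, 95):
    print(f"P{p}: done within {np.percentile(weeks_needed, p):.0f} weeks")
```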
If you build a general ledger application for the 10th time, sure, forecasting is fairly straightforward. Nothing I do at my (very large, non-tech but highly software-driven) employer has ever been done here before. All estimates are officially treated as if they were accurate to the day, but changes happen so often during the lifetime of a project that you may as well use a random number. I call it a "nod nod wink wink" estimate: everyone wants it to be accurate, but no one really expects it to mean anything, other than the budget people.
One of my favorite managers required I give him estimates.
I hated it because we both knew the number was bullshit.
On the other hand, having to think about the estimate and give him something, even if at times it was a guess, I still found beneficial. It meant I focused better, stayed on task, and often delivered on time anyway.
I'm not saying everyone needs the accountability rails, but some people excel with this particular helper.
Unit of work is explicitly vague because it refers to the units you are using in your project, be it user story, requirement, epic, bug etc.
You do not need to know the size of each unit. In fact the method I describe acknowledges that there is variance in the size of each unit.
Some people advocate "same sizing" units of work. I have always thought that was a weird idea because it would imply making small units bigger. How would you make the unit that describes changing the color of a button the same size as integrating with an API? You couldn't, even with perfect knowledge of the amount of work required.
> Throughput is the amount of work that comes out of your project by unit of time. It’s up to you to decide what units make sense for your context. Days, weeks, sprints, stories, bugs, epics – anything goes as long as you’re consistent.
Using days/weeks/sprints as your unit of work for determining throughput seems circular. If you want to know how many weeks of work your team can produce per week then you don't need a monte carlo simulation to tell you that.
Using stories/bugs/epics is flawed too, I think. You can have a fantastic model for your team's throughput in stories per week, but that doesn't tell you anything about when the project is going to be done unless you know how many stories there will be. There are two variables here (throughput and quantity) and you can't get useful information out of the product of them for free.
To see why, imagine that you take only the minimum amount of effort to very roughly divide the project into sensible-seeming chunks. In that case, your throughput in chunks per week will be meaningless (i.e. your model will have a confidence window which is uselessly wide) because the chunk division will have barely any relationship at all to the amount of actual work in each chunk. Now imagine that you're a bit more diligent in your project planning and look in a bit more detail at the work that will be involved. You've done some work to clarify what code will need to be written, and your confidence window will narrow accordingly. Now imagine that you're even more diligent. And so on, and so on. You eventually end up with zero error in your model, but in the process you've completely determined what code will need to be written, and the project is finished! Congratulations, you've invented waterfall. There's no free lunch here as the intro paragraphs of the blog post promise.
I assume that in practice you're stopping at some point between the extremes of doing nothing and planning out every line of code, and then applying your statistical model at that point, but I think you'll still be thwarted by the (at that point) partial disconnect between the chunk divisions and the actual amount of work in each chunk. We've all experienced innocent-seeming tasks that end up consuming vast amounts of time unexpectedly, and varying degrees of this phenomenon are what fundamentally tie the amount of error in any pure "stories per week" model to the amount of effort you spend planning and estimating the stories. No free lunch.
Exactly. I read the article fairly carefully, but the logic seemed circular to me. For a novel project, the problem of how to divide it up into predictable units of work remains. It's always the little black swan subprojects that destroy an estimate, and this approach doesn't help with those unforeseen events. If I could avoid those reliably, I'd be able to estimate projects with a high degree of accuracy even without this approach.
In my experience, only increased subject matter expertise and a solid team with lots of experience can minimize the black swan events, and even then not 100%. They still happen with disheartening frequency.
And as you say, it doesn't buy you much if you want to measure how much work your team does per day/week/month, either, for the reasons you describe.
So what happens in your method when you hit a major unforeseen issue, as frequently happens in a project? If your prior 2 projects went by without a problem, then you'll still be sitting on a bad estimate.
Yes, over time this should start to even out with your method, but who keeps the same team, or even stays in the same job, for more than 3-4 major projects? It's fairly unusual these days. Yes, I know there are some senior devs who stay at the same job for 5 or even 10 years, but their team, and even the company's hiring standards, are likely to change significantly during that time.
That said, it is an intriguing idea, and one that bears more study. I'm not yet convinced but I may try it out.
The thinking here was that work whose cycle time is higher than that of 85% of the work we previously completed is probably worth some extra attention. Is the person working on it stuck? Is the problem so complex that it might be useful to sit down and have a look at it together?
We all felt it was a good motivator to work together on a regular basis. Or at least step up the amount of communication around that unit of work.
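A minimal sketch of that check (hypothetical cycle times and item IDs):

```python
import numpy as np

# Hypothetical cycle times (days) of recently completed work items.
completed_cycle_times = np.array([1, 2, 2, 3, 3, 4, 5, 5, 6, 8, 9, 14])
threshold = np.percentile(completed_cycle_times, 85)

# Hypothetical in-progress items and their current age in days.
in_progress = {"PROJ-101": 3, "PROJ-102": 11, "PROJ-103": 7}
for item, age in in_progress.items():
    if age > threshold:
        print(f"{item}: {age} days in progress (85th percentile is "
              f"{threshold:.1f} days), worth a closer look")
```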
My intuition would be: our "forecasting" solution is never going to be 100% right. Here's an example where it's completely wrong: scrap the "forecast" and get the work done. Don't sleep in the office overnight, don't throw more developers at it; acknowledge that no forecasting system will ever get predictions of creative work completely right.
When starting out, I normally recommend taking a small segment of your process first and then widening it as you get more experience.
For the start point I usually try to think of these aspects:
- Can work still be cancelled at this point in the process? If so, it's probably not the right spot to set the start date.
- Can work still be reprioritized at this point in the process? If so, that will significantly widen the prediction intervals of the forecast.
Also note that setting the start date at the moment of commitment still gives you valuable insight into how much work you will be able to take on next.
For the end date, I think the moment the software is released is the most interesting point to measure. As always, it depends on context: if you do a yearly release, this doesn't make sense because the granularity of all your predictions will be a year.