I think the Agile community needs to change how it measures success for Agile teams. The ways that we gather metrics and the information we seek out of those metrics is actually getting in the way of what’s most important, making working software.
The way I see it, there’s two major problems:
The observer effect: The observer effect states that observing a process can impact its output. For instance, telling a team that you’ll be keeping a close eye on their velocity might cause that team to overestimate their items in order to increase their velocity. This is especially dangerous when working with story points since there’s no way to compare the validity of an estimate. Should your Scrum team stop using story points?
The streetlight effect: The streetlight effect is our human tendency to look for answers where it’s easy to look rather than where the actual information is. For instance, counting the lines of code produced is easy but doesn’t tell us anything about the quality of the application, the functionality it provides or even the effectiveness.
So let’s apply these concepts to some common Metrics. What’s easy to measure?
Unit Tests written: Most agile developers write a lot of unit tests, test driven development creates even more tests(both of which create better quality code). So measuring a developer’s productivity by the number of tests they create must be good!
Actually, the observer effect kills this one dead. Telling a developer that they’ll be measured on the number of tests they write ensures they’ll create many tests with no respect to the quality of those tests. Our goal is not to ship tests, our goal is to ship working code, I’ll take fewer better tests than more crappy tests any day.
Individual Velocity: Once again the observer effect makes this a bad metric. If a developer knows he’s being individually graded on his performance and also knows that he only gets credit for the things he specifically works on then he’s actively discouraged from contributing to the group. He’s placed in the very un-agile situation of competing with his team rather than contributing to it.
In a perfect world an Agile team is collaborating, interacting, discussing and reviewing almost everything they do. This is a good thing for building quality software and solving problems fast but this level of interaction makes it nigh impossible to separate a person’s individual productivity from the group, so don’t try, you’ll simply hurt your team’s ability to make good software.
Team Velocity: This is one of the most misunderstood metrics in all of Scrum. A team’s velocity is unique to them. It simply can’t be compared to another team. Let’s say that team A estimates a certain amount of work at 50 pts for a sprint and team B estimates that same work at 150 pts for the same sprint. Now if both teams finish their sprint successfully then team A has a velocity of 50 pts and team B has a velocity of 150 pts. Which team is more productive? Neither. They both did the same amount of work.
This metric is particularly evil because it encourages teams to fudge the numbers on their estimates which can affect the team’s ability to plan their next sprint. If the team can’t properly plan a sprint then that puts your entire release in danger of shipping late.
For more about your Scrum team’s velocity, you can check out an earlier blog post I wrote.
Okay smart guy, what metrics should we use?
Glad you asked, we measure productivity by the working software we deliver. We measure actual output rather than contributing factors. This approach is more Agile because it frees the team to build software in whatever way can better contribute to their success rather than whatever way creates better metric scores. It’s also much more logical since working software is something that we can literally take to the bank (after it’s been sold of course).
So what are the actual new metrics?
Value Delivered: You’ll need your product owner for this. Ask him to give each user story a value that represents its impact to his stakeholders. You can enumerate this with an actual dollar amount or some arbitrary number of some kind. At the end of each sprint you’ll have a number that can tell you how much value you’ve delivered to your customers through the eyes of the product owner.
This metric does not measure performance, instead it measures impact. Ideally your product owner will prioritize higher value items towards the top of the backlog and thus each sprint will deliver the maximum value possible. If you’re working on a finite project with a definite end in sight, your sprints will start out very high value and gradually trend towards delivering less and less value as you get deeper into the backlog. At some point, the cost of development will eclipse the potential value of running another sprint, that’s typically a good time for the team to switch to a new product.
On Time Delivery: People sometimes tell me that Agile adoption failed at their company because they couldn’t give definite delivery dates to their clients. I don’t buy this. One thing that an Agile team should definitely be able to do is deliver software by a certain date. It’s possible that a few stories may not be implemented but those are typically the lowest value stories that would have the least amount of impact on the client. That being said, a team’s velocity should be reasonably steady, if it goes up or down it should do so gradually. Wild swings in velocity from sprint to sprint make long term planning harder to do.
Here’s the metric: if a team forecasts 5 stories for an upcoming sprint and they deliver 5 stories then they earn 2 points toward this metric. If they deliver 4 stories or they deliver less than 2 days early (pick your own number here) then they earn one point. If they deliver more than 2 days early or they only deliver 3 (out of 5) stories they earn no points. At the end of a quarter or the end of a release or the end of the year the team will be judged by how accurately they can forecast their sprints.
So what we’re measuring is value delivered to the customer and on time delivery of that software. Which are the only two real metrics you can literally cash checks with.