Testing as a Productivity Metric

As part of our iteration reports that we compose and send out to interested parties every two weeks (after retrospectives), we keep track of a handful of metrics. Most of the metrics are pulled from our kanban board (number and distribution of tasks and blocking issues), our happy-sad-deadcat-shovel retrospective notes, and our test stats. One interesting metric we’ve started to experiment with is the application of “# of tests” as a productivity metric.

Productivity metrics are hard things to track, especially in a software development context. Some teams use number and/or size of tasks completed (aka velocity) – but tasks can be of variable length and complexity. Some teams use lines of code – but a good refactoring could easily produce 10,000 lines of code changed, while a 3-day bug fix might result in a one-line resolution. Some teams use number of commits – but commits can be as long or as short, as large or as small as each individual developer feels comfortable with. So how does one measure progress and productivity?

Productivity as a Metric

First off, I think it’s important to realize that all of these indicators have to be taken with a grain of salt. While you might be able to derive some larger meaning from a large set of data, a dip in one particular productivity metric for a single iteration doesn’t mean your team (or an individual team member) did a terrible job. Variations in productivity can be produced by any number of factors, including sick days, vacation, meetings, research spikes, and so on.

Productivity metrics, therefore, can really only be applied when looking at a range of data. If your team is consistently performing at a lower than acceptable rate (where “acceptable” should be reached by team consensus, not management decree), then the team should start trying new things to address that problem: fewer meetings, professional development, more team members, etc. After trying those new approaches, if the overall metric increases and sustains its growth, you know you’ve introduced a positive change. Conversely, if productivity decreases over the long term in response to a change (new management, loss of a team member, etc.), you know something negative has happened and needs to be addressed.
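
To make "looking at a range of data" concrete, here is a minimal sketch of one way to smooth a per-iteration metric and compare its level before and after a process change. The function names, the three-iteration window, and the sample numbers are all hypothetical; this is an illustration, not how our iteration reports are actually produced.

```python
def rolling_mean(values, window=3):
    """Average each value with up to `window - 1` values before it."""
    if window < 1:
        raise ValueError("window must be at least 1")
    smoothed = []
    for i in range(len(values)):
        chunk = values[max(0, i - window + 1): i + 1]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed


def sustained_change(before, after, window=3):
    """Difference between the latest smoothed value after a change
    and the latest smoothed value before it."""
    if not before or not after:
        raise ValueError("need at least one iteration on each side")
    return rolling_mean(after, window)[-1] - rolling_mean(before, window)[-1]


# Hypothetical metric values per iteration (e.g. new tests written).
before_change = [12, 9, 11, 10]
after_change = [11, 14, 13, 15]

print(sustained_change(before_change, after_change))
```

A single low iteration barely moves the smoothed value, which is the point: only a shift that persists across several iterations shows up as a meaningful difference.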

Testing Productivity

So why does number of tests work in this context? Well, on our team we have a pretty strong focus on testing – a feature isn’t considered done until at least one unit and/or integration test has been written for it. The key word there is “done” – by having a new test (something that is easily tracked), we’ve more or less indicated that the feature we’re working on is in a state where we can write passing tests for it. That is as good a measure of productivity (completed features) as any.

Another key tenet of this is that the metric is naturally derived – we don’t have to go out of our way at all to calculate it; we just run our test suite as we do on any other day. There’s no extra work, no estimating, and no data analysis needed to gauge the progress of the team.

Team Over Individuals

Of course, this metric is subject to the same scrutiny as everything else. It’s just as easy for developers to write a lot of small, easy tests to give a perception of performance, which is why I think it’s important for productivity metrics to be treated as a very general measure of overall team health. You shouldn’t be basing performance reviews on a team member’s individual test contributions; instead, use the metric as one of many indicators of the team’s success rate. If you turn metrics into deterministic measures of an employee’s success or failure, employees will game the system and the metric will lose all value.

In summary, “number of tests” is turning out to be an interesting metric that we can use in combination with other metrics to assess team productivity and health. This works for us because of a team-level focus on testing. It’s important to find a metric that is fairly consistent and can be naturally derived from your workflow, and also to apply the lessons from that metric effectively. All of these things can be achieved by asking your team for ideas about, and responses to, metric analysis.