
Measuring What Matters in DevOps

Explores the critical role of measuring meaningful metrics in DevOps environments. Discusses how metrics drive behavior, the importance of social and collaborative measurements, establishing improvement baselines, and the shift from failure prevention to rapid recovery strategies.

This document explains the importance of measuring what matters in DevOps, highlights the value of social and DevOps metrics, and describes how shifting measurement strategies can drive cultural and operational improvements.


Measurement is a powerful driver of behavior in organizations. When teams are evaluated on specific metrics, their actions align with those measurements, often to the exclusion of other important activities. This principle is illustrated by Steven Kerr’s classic paper, "On the Folly of Rewarding A, While Hoping for B," which argues that individuals and teams focus on what is measured and rewarded.

The Risks of Misaligned Metrics

If organizations measure the number of widgets produced or lines of code written, the result will be an increase in those outputs, regardless of their actual value. For example, measuring lines of code can encourage verbose, inefficient coding, as developers are incentivized to maximize quantity over quality. Similarly, ranking employees against each other can foster antisocial behavior, as individuals may be less inclined to collaborate if their success is measured in competition with peers.

Encouraging Social Coding and Collaboration

To promote positive behaviors such as collaboration and knowledge sharing, organizations should measure and reward social interactions among developers. Metrics that track code sharing and reuse help create a culture of social coders. Two useful metrics are: who is leveraging the code being built, and whose code is being leveraged. These metrics encourage both sharing and reuse, reducing redundant work and increasing the value of shared solutions.
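
As a rough illustration of the two metrics above, the sketch below counts reuse in both directions. The `reuse_events` records and the names in them are hypothetical; in practice such data might come from dependency manifests or import analysis, which the source does not specify.

```python
from collections import Counter

# Hypothetical reuse records: (consumer, author) means that
# `consumer` reused code written by `author`.
reuse_events = [
    ("alice", "bob"),
    ("carol", "bob"),
    ("bob", "dave"),
    ("alice", "dave"),
    ("alice", "bob"),
]


def social_metrics(events):
    """Return two counters: reuse per consumer and reuse per author.

    Self-reuse is ignored because it says nothing about sharing.
    """
    consumers = Counter()  # who is leveraging the code being built
    authors = Counter()    # whose code is being leveraged
    for consumer, author in events:
        if consumer == author:
            continue
        consumers[consumer] += 1
        authors[author] += 1
    return consumers, authors


consumers, authors = social_metrics(reuse_events)
print("Reuse by consumer:", consumers.most_common())
print("Reuse by author:", authors.most_common())
```

Reporting both counters side by side rewards the person who shares and the person who reuses, rather than only one side of the exchange.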

Setting Baselines and Goals for Improvement

Continuous improvement in DevOps requires establishing a baseline for current performance, such as deployment time, cost per release, or number of defects. Setting clear, achievable goals, such as reducing deployment time or the number of team members required for a release, enables teams to track progress and adjust strategies as needed. Success is measured by progress toward these goals, and the process is repeated for ongoing improvement.
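
One simple way to express progress toward a goal is the fraction of the gap between baseline and goal that has been closed so far. The sketch below assumes this definition, and the deployment-time figures are made up for illustration.

```python
def progress_toward_goal(baseline, current, goal):
    """Fraction of the baseline-to-goal gap closed so far.

    Works whether the goal is lower than the baseline (deployment time)
    or higher (deployments per week). Values below 0 mean regression;
    values above 1 mean the goal has been exceeded.
    """
    if baseline == goal:
        raise ValueError("goal must differ from baseline")
    return (baseline - current) / (baseline - goal)


# Hypothetical figures: deployment time in minutes.
baseline_minutes = 120.0
goal_minutes = 30.0
current_minutes = 75.0

progress = progress_toward_goal(baseline_minutes, current_minutes, goal_minutes)
print(f"Progress toward goal: {progress:.0%}")  # prints "Progress toward goal: 50%"
```

Once the goal is reached, the current value can become the new baseline and the cycle repeats.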

Shifting from Failure Prevention to Recovery

Traditional approaches focused on preventing failures, for example by maximizing mean time to failure (MTTF). Modern DevOps emphasizes mean time to recovery (MTTR): it accepts that failures will occur and prioritizes rapid recovery to minimize their impact. Techniques like containerization and microservices enable quick recovery, often without customers noticing any disruption. This shift in mindset supports resilience and continuous service availability.
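
Mean time to recovery is the average time between a failure and the restoration of service. A minimal sketch of that calculation follows; the incident timestamps are hypothetical and would normally come from an incident tracker or monitoring system.

```python
from datetime import datetime, timedelta

# Hypothetical incident log: (failure detected, service restored).
incidents = [
    (datetime(2024, 1, 3, 10, 0), datetime(2024, 1, 3, 10, 12)),
    (datetime(2024, 1, 17, 22, 30), datetime(2024, 1, 17, 22, 34)),
    (datetime(2024, 2, 2, 8, 15), datetime(2024, 2, 2, 8, 29)),
]


def mean_time_to_recovery(incidents):
    """Average duration between failure and restoration."""
    if not incidents:
        raise ValueError("at least one incident is required")
    total = sum(
        (restored - failed for failed, restored in incidents),
        timedelta(),  # start value so timedeltas can be summed
    )
    return total / len(incidents)


mttr = mean_time_to_recovery(incidents)
print(f"MTTR: {mttr.total_seconds() / 60:.1f} minutes")  # prints "MTTR: 10.0 minutes"
```

Tracking this value against a baseline shows whether recovery is actually getting faster over time.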

Conclusion

Measuring and rewarding the right behaviors is essential for driving improvement in DevOps. Social metrics foster collaboration, while DevOps metrics provide insight into progress and guide goal setting. By focusing on recovery rather than prevention and continuously refining measurement strategies, organizations can achieve lasting cultural and operational transformation.


FAQs

The principle is that individuals and teams will focus on what is measured and rewarded, often to the exclusion of other important activities.

Measuring lines of code can encourage verbose, inefficient coding, as developers are incentivized to maximize quantity over quality.

Ranking employees against each other can foster antisocial behavior and reduce collaboration, as individuals may compete rather than cooperate.

Metrics that track code sharing and reuse, such as who is leveraging code and whose code is being leveraged, help promote social coding and collaboration.

Setting baselines and clear goals enables teams to track progress, adjust strategies, and achieve continuous improvement.

Shifting focus from failure prevention to rapid recovery supports resilience and continuous service availability.

Measuring and rewarding the right behaviors drives improvement, fosters collaboration, and guides cultural and operational transformation.

Term | Definition
Social metrics | Measurements that encourage collaboration and code sharing
DevOps metrics | Indicators that track progress and guide goal setting in DevOps
Mean time to recovery | The average time it takes to restore service after a failure
Baseline | The starting point for measuring improvement in a process or activity