Build Failure Recovery Time is an important DevOps metric because it tracks a team’s ability to recover quickly from a failed build. As the name suggests, it measures the time between a failed build and the completion of the next successful build on a particular code branch or pipeline.
As such, it is an important measure of DevOps maturity: it tracks a team’s ability to manage failures effectively and, when a failure does occur, to ‘fail fast and recover fast’.
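To make the definition concrete, here is a minimal Python sketch of how the metric could be calculated from a list of build records. The `Build` record, its field names and the sample data are hypothetical, not Plandek's data model. The sketch also assumes that when several builds fail in a row, the clock starts at the first failure in that streak.

```python
from datetime import datetime, timedelta
from typing import NamedTuple


class Build(NamedTuple):
    """Hypothetical build record; field names are illustrative only."""
    branch: str
    finished_at: datetime
    succeeded: bool


def recovery_times(builds):
    """Return (branch, recovery duration) pairs, one per failure streak.

    Assumption: recovery is measured from the first failed build in a
    streak to the completion of the next successful build on that branch.
    """
    first_failure = {}  # branch -> finish time of the first failure in the open streak
    results = []
    for build in sorted(builds, key=lambda b: b.finished_at):
        if not build.succeeded:
            # Keep the earliest failure if the branch is already broken.
            first_failure.setdefault(build.branch, build.finished_at)
        elif build.branch in first_failure:
            # First success after a failure closes the streak.
            started = first_failure.pop(build.branch)
            results.append((build.branch, build.finished_at - started))
    return results


# Made-up sample data for illustration.
builds = [
    Build("main", datetime(2024, 1, 1, 9, 0), True),
    Build("main", datetime(2024, 1, 1, 10, 0), False),
    Build("main", datetime(2024, 1, 1, 10, 30), False),
    Build("main", datetime(2024, 1, 1, 11, 15), True),
    Build("feature", datetime(2024, 1, 1, 10, 5), False),
    Build("feature", datetime(2024, 1, 1, 10, 25), True),
]

for branch, duration in recovery_times(builds):
    print(branch, duration)
# feature 0:20:00
# main 1:15:00
```

Grouping by branch matters here: a successful build on one branch says nothing about whether a different, broken branch has recovered.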
Example Build Failure Recovery Time chart – Plandek DevOps metrics dashboard
Example Build Failure Recovery Time drill-down charts – Plandek DevOps metrics dashboard
As the charts above show, DevOps metrics platforms like Plandek (www.plandek.com) provide drill-down views that identify the source of a failure (by project and branch) and the recovery time.
Build Failure Recovery Time is one of many DevOps metrics used to improve the integration and deployment process. It is related to the DORA metrics popularised in the book ‘Accelerate’ by Forsgren, Kim and Humble.
Other related DevOps metrics and DORA metrics include:
- Deployment Frequency – perhaps the most critical DevOps metric of all. Deployment Frequency tracks the frequency with which increments of code are deployed to staging, testing and production.
- Build Failure Rate – an extremely helpful metric which identifies the percentage of workflows that fail and the overall risk this poses to development. Failed builds are a significant source of risk, both in day-to-day development and when responding to incidents, because of the delays they cause.
- Mean Build Time – a related metric which, as the name suggests, analyses the time taken for each build. It helps identify slow build processes that affect the team’s ability to deliver software. A steadily increasing mean build time should be addressed, as it will drive longer Cycle Times. We particularly like filtering by status to help you keep an eye on slow builds which ultimately end in failure.
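As a rough illustration of two of the related metrics above, the following Python sketch computes a Build Failure Rate and a Mean Build Time split by status. The `(status, duration)` tuples, the status labels and the sample figures are assumptions made for the example, not the output of any particular CI/CD tool.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical workflow runs: (status, build duration in seconds).
runs = [
    ("success", 300),
    ("failure", 540),
    ("success", 320),
    ("failure", 600),
    ("success", 280),
]


def build_failure_rate(runs):
    """Percentage of runs whose status is 'failure' (0.0 for no runs)."""
    if not runs:
        return 0.0
    failures = sum(1 for status, _ in runs if status == "failure")
    return 100 * failures / len(runs)


def mean_build_time_by_status(runs):
    """Mean duration in seconds for each status, e.g. to spot slow failing builds."""
    durations = defaultdict(list)
    for status, seconds in runs:
        durations[status].append(seconds)
    return {status: mean(values) for status, values in durations.items()}


print(build_failure_rate(runs))         # 40.0
print(mean_build_time_by_status(runs))  # success averages 300 s, failure 570 s
```

In this made-up sample the failing builds are also the slowest, which is the pattern that filtering Mean Build Time by status is meant to surface.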
Key use cases
Build Failure Recovery Time is a very useful DevOps metric to help teams understand how effectively they recover from failed builds, thereby reducing the time taken between code completion and production.
An improvement in Build Failure Recovery Time will reduce your Lead Time for Change (a key DORA metric) and demonstrates that the DevOps team is knowledgeable, quick to react and efficient in its response.
Build Failure Recovery Time is particularly important for organisations looking to develop their Agile DevOps maturity.
Reducing Build Failure Recovery Time will reduce overall Lead Time for Change and improve DevOps effectiveness.
Plandek works by mining data from toolsets used by delivery teams (such as Jira, Git, CI/CD tools and Slack), to provide end-to-end delivery metrics/analytics to optimise software delivery predictability, risk management and process improvement.
Plandek is a global leader in this fast-growing field, recognised by Gartner as one of the top nine global vendors in its DevOps Value Stream Management Market Guide (published in September 2020).
Plandek is based in London and works with clients globally to apply predictive data analytics and machine learning to deliver software more effectively.