Release Confidence
A metric I’ve been mulling over the past few months is something called “Release Confidence”. The idea is to measure the confidence implementation teams have in the code they’re shipping to production - that it’s feature complete, bug-free, and won’t cause any issues when deployed. While it’s not something you can measure with hard numbers, there are a few related metrics you can keep an eye on, and “low” release confidence tends to breed productivity-killing behaviours that compensate for it.
Generally I’m not a fan of wishy-washy metrics like this, where things are measured on “vibes”, but having worked with teams where release confidence was low, I can say it’s definitely a ‘feeling in the air’. Teams with low release confidence are generally more anxious about, or over-prepared for, “release day”, where releasing a new version of their software is a dedicated event with lots of manual ceremony. Conversely, teams with high release confidence treat releases as a minor process, often just kicking off an automated deployment script and verifying the end result.
There’s always going to be a certain amount of anxiety around releases, as they often rely on external resources beyond your control. What if the network drops mid-deploy? What happens if a server upgrade fails? What if AWS goes down? These all require a bit of faith, and consequently a bit more manual work to verify and defend against. Those unavoidable situations I consider out of scope for this particular metric. Release confidence, instead, is directly affected by the processes and procedures you do have control over as a team.
Actual Metrics of Release Confidence
While it’s hard to measure a team’s vibes around releases (short of something like retrospectives and employee surveys), there are a few hard measurements you can take to get a sense of what a product’s release confidence is like. These measurements are more about the side-effects of low or high confidence, and in aggregate they can paint a good picture.
- Test Coverage
- How easy is it for a developer to say “everything is working as expected” after writing code? Does it require a lot of manual effort, or is there a solid foundation of test coverage?
- Defect Density
- Measuring bugs both pre- and post-release, are we catching issues before they show up in production? Are the bug counts unmanageably high after a feature is merged into the main branch? (“Bugs” here means issues recorded as new bug tickets after implementation - developer-caught bugs aren’t included)
- Deployment Success Rate
- When actually performing a release, have you ever had to abort or roll back the changes? Have you had a release branch “fail” (i.e. have so many bugs and integration issues that it was un-releasable)?
- Feedback Loop
- What is the average time between code being written and bugs being reported for that code? Are developers informed of issues within a day or two, or does it take weeks after they’ve implemented the feature?
- Time to Recovery
- When something does go wrong in production, how long does it take your team to respond to it?
- Release Effort
- When the company does decide to release the newest version, how long of a process is it? How many manual steps are involved?
All these metrics are good measurements of the quality of your releases and the processes around feature development, which directly feed into the team’s confidence in their releases.
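As a rough illustration of how a few of these could be computed, here’s a sketch in Python; the record shapes and field names are made up for this example, and in practice you’d pull the real data from your deployment tooling and bug tracker.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import mean

# Hypothetical records - in reality these would come from your deployment
# tooling and bug tracker.
@dataclass
class Deployment:
    started: datetime
    succeeded: bool                        # False if aborted or rolled back
    recovered_at: datetime | None = None   # when service was restored after a failure

@dataclass
class Bug:
    code_merged: datetime   # when the offending change landed in main
    reported: datetime      # when the bug ticket was opened

def deployment_success_rate(deploys: list[Deployment]) -> float:
    """Fraction of releases that didn't need an abort or rollback."""
    return sum(d.succeeded for d in deploys) / len(deploys)

def mean_time_to_recovery(deploys: list[Deployment]) -> timedelta:
    """Average time from a failed release to service being restored."""
    failures = [d for d in deploys if not d.succeeded and d.recovered_at]
    return timedelta(seconds=mean(
        (d.recovered_at - d.started).total_seconds() for d in failures))

def feedback_loop(bugs: list[Bug]) -> timedelta:
    """Average time between code landing and a bug being reported against it."""
    return timedelta(seconds=mean(
        (b.reported - b.code_merged).total_seconds() for b in bugs))
```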
What does high Release Confidence look like?
In short, teams with high release confidence aren’t fazed by releases - they’re just another step in the process. Releases on these teams tie up as few resources as possible and are as automated as can be, relying on devops scripts and automated tests to deploy and verify the new version of the product. Here are a few of the benefits:
- Releases require as few resources as possible
- A release can be managed by one person, or even a separate team, as the release doesn’t demand specialized knowledge or dedicated attention
- Manual testing is optimized
- Having confidence in your automated tests enables developers and quality assurance analysts to focus their manual effort on the areas that need attention
- Higher innovation
- Removing concern over the fragility of the software project enables developers to try new things and innovate approaches, which also leads to…
- Lower Tech Debt
- Because it’s easier to confirm if things are working as expected, it’s easier to remove and avoid tech debt. Compromises don’t have to be made during the development process to ensure that nothing breaks in a fragile system
- Lower Bug Rates
- Automated testing pulls its weight here by being able to quickly inform developers that their changes may have affected existing functionality. The feedback loop is small in a system with release confidence - coders are informed within minutes about bugs, rather than days (or weeks!) later after manual testing effort
- Failsafes
- When releases do go wrong, the team has a clear plan in place to roll back to prior versions that can be executed with little ceremony.
- Team Confidence
- Knowing that their product is solid, team members are more likely to feel job satisfaction, pride, motivation, and engagement, which leads to a bunch of knock-on benefits, from internal trust between teams all the way to a positive effect on recruitment
What does low Release Confidence look like?
Conversely, teams with low release confidence usually have entire manual processes set up around releases. Things are rushed, many people are brought in, and there’s a shared anxiety during the process, followed by a collective sigh of relief when it’s completed. Some signs of low release confidence on an implementation team include:
- Increased anxiety and stress during releases
- Team members are usually on edge during releases, crossing their fingers that nothing goes wrong, and heaven forbid a hiccup does happen - then there’s a rush to determine who broke what and why.
- Hesitation, procrastination, reduced velocity
- Because there’s no clear way to say “yes, this project is still in a good state”, the development cycle is directly affected, adding repetitive throw-away work in the form of manual testing and over-verification, plus a general sense of “tiptoeing” through the codebase
- Risk aversion, lack of innovation
- Without knowing how changes may affect other components in the system, developers are inadvertently trained to “do what works” and repeat patterns (and their associated tech debt) for new features. Innovation is effectively punished with undetected bugs and instability that become the developer’s problem days and weeks later.
- Longer feedback loop
- Developers aren’t informed of bugs until the issues are found much later in the process, if at all - often through manual testing that is prone to error and omission. False bugs take up developer time, and uncaught bugs that make their way into production reduce product confidence further.
- Reduced Team Confidence
- Knowing that your code could break at any moment due to changes that you (or another member of your team) added weeks ago doesn’t really inspire confidence or pride in the project; your job becomes more about making sure you don’t break anything than about adding value.
How do we inspire Release Confidence?
There are a few things teams can do to improve their release confidence.
Automated Testing!
It still blows my mind how many companies are missing the boat on automated testing. Whether it’s unit testing, partial system integration testing, or full-on end-to-end testing, a large number of companies have minimal automated testing or none at all. It’s particularly shocking for teams who deal with money transactions or financial institutions, where a bug could legitimately lead to a wrongly charged customer or the denial of a loan. The promised post specifically on automated testing is coming soon :)
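As a small taste of what that foundation can look like, here’s a minimal unit-test sketch using pytest. The `charge_customer` function and `InsufficientFundsError` are toy stand-ins inlined so the example is self-contained - they’re not a reference to any real billing library.

```python
# test_billing.py - run with `pytest`. charge_customer is a toy stand-in for
# real billing logic, inlined here so the example is self-contained.
from decimal import Decimal

import pytest

class InsufficientFundsError(Exception):
    pass

def charge_customer(account: dict, amount: Decimal) -> None:
    """Deduct `amount` from the account, refusing to overdraw it."""
    if account["balance"] < amount:
        raise InsufficientFundsError(f"cannot charge {amount}")
    account["balance"] -= amount

def test_charge_deducts_exact_amount():
    account = {"balance": Decimal("100.00")}
    charge_customer(account, Decimal("19.99"))
    assert account["balance"] == Decimal("80.01")

def test_overdraft_is_rejected_and_balance_untouched():
    account = {"balance": Decimal("5.00")}
    with pytest.raises(InsufficientFundsError):
        charge_customer(account, Decimal("19.99"))
    assert account["balance"] == Decimal("5.00")
```

Tests like these are cheap enough to run on every push, which is exactly what shrinks the feedback loop described earlier.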
Improved Release Processes
Making sure your releases are as scripted as possible is a good start here. The less manual work involved in a release, the lower the chance for error and the less anxiety felt during the release. You can automate acceptance testing, use a blue/green deployment pattern, and even automate rollbacks where necessary. You’ll never completely remove the anxiety of pushing to production, but making sure that what does happen happens consistently and without manual intervention means the only things you’ll have to debug are external dependencies (and with blue/green, you won’t be doing it while prod is down).
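To make that shape concrete, here’s a rough sketch of what a scripted blue/green release could look like. The `deploy_to`, `run_smoke_tests`, and `route_traffic_to` helpers are placeholders for whatever your infrastructure actually provides (CLI calls, a cloud SDK, a load-balancer API) - this isn’t a real deployment tool.

```python
#!/usr/bin/env python3
"""Sketch of a scripted blue/green release. deploy_to, run_smoke_tests, and
route_traffic_to are stubs standing in for your platform's own tooling."""
import sys

LIVE, IDLE = "blue", "green"  # in practice you'd query which colour is currently live

def deploy_to(env: str, version: str) -> None:
    raise NotImplementedError("push the build for `version` to the given environment")

def run_smoke_tests(env: str) -> bool:
    raise NotImplementedError("run automated acceptance tests against `env`")

def route_traffic_to(env: str) -> None:
    raise NotImplementedError("point the load balancer at `env`")

def release(version: str) -> int:
    deploy_to(IDLE, version)            # live traffic is untouched so far
    if not run_smoke_tests(IDLE):
        print(f"Smoke tests failed on {IDLE}; {LIVE} stays live.")
        return 1                        # "rollback" here is simply not switching
    route_traffic_to(IDLE)              # the cut-over is the only risky moment
    if not run_smoke_tests(IDLE):
        route_traffic_to(LIVE)          # automated rollback to the previous colour
        return 1
    return 0

if __name__ == "__main__":
    sys.exit(release(sys.argv[1]))
```

The nice property of this pattern is that the rollback path is trivial: either you never switch traffic in the first place, or you point it back at the environment you never touched.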
Foster a culture of confidence and safety
It’s important that when issues do arise, the team focuses on the what, why, and how, not the who. High-severity issues are everyone’s problem, not a specific person’s. When issues are resolved, it’s important to identify the causes so that guards can be put in place to avoid the issue in the future (automated tests, validations), but taking the time to point out who caused the issue, or questioning their motives or skill, does nothing other than waste everyone’s time and demoralize the entire team.
High bug counts aren’t necessarily a measure of developer skill; they should first be investigated as a process failure (particularly if it’s a cross-developer issue). With each bug the team should be asking itself “is there a way this could have been caught by the developer beforehand?”
Enhancing team devops skills
Releases shouldn’t be a black box or mystery to the implementation team. I know most companies nowadays have a separate “devops” team who handles most of the scripting and resourcing, but I’ve always been a fan of enabling the team to own their own deployments. Of course there’s saved effort in having consistent scripts between projects and shared resources, and that’s where a devops role (as an internal service that teams have access to) makes sense, but ultimately the team should understand how their code is deployed, what happens, what resources are involved, and how they integrate with one another.
Observability
There’s a difference between feeling confident and proving that you should be confident. Automated testing is one component of that; observability is the other side, demonstrating that things are working (and continue to work) as expected. Knowing that when issues do arise you’ll be able to catch them before end users do, via log and performance alerting, gives the team another level of confidence. Observability can also be applied at the regression-testing level: if your main branch suddenly runs 20ms slower or has new warnings compared to the previous night’s run, you’ve shrunk the feedback window for a possible bug and caught the issue before it shows up in production. Tracking down an issue introduced in the last 24 hours is much easier than trying to figure out which change over the last four weeks caused it.
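As a sketch of that kind of nightly check, the script below compares two benchmark-timing files and fails the build when something has regressed. The file format, benchmark names, and 10% threshold are arbitrary choices for illustration, not a reference to any particular tool.

```python
"""Fail the nightly build if tonight's benchmarks are noticeably slower than
last night's. Expects two JSON files mapping benchmark name -> milliseconds,
e.g. {"checkout_flow": 412.5, "search_api": 88.2}."""
import json
import sys

THRESHOLD = 1.10  # flag anything more than 10% slower than the previous run

def load(path: str) -> dict[str, float]:
    with open(path) as f:
        return json.load(f)

def main(previous_path: str, current_path: str) -> int:
    previous, current = load(previous_path), load(current_path)
    regressions = {
        name: (previous[name], ms)
        for name, ms in current.items()
        if name in previous and ms > previous[name] * THRESHOLD
    }
    for name, (old, new) in regressions.items():
        print(f"REGRESSION {name}: {old:.1f}ms -> {new:.1f}ms")
    return 1 if regressions else 0  # non-zero exit fails the nightly job

if __name__ == "__main__":
    sys.exit(main(sys.argv[1], sys.argv[2]))
```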