What is an incident post mortem? And how do I do one?

Learn from software incidents in a blameless format and discover how to prevent them from happening again in the future.

Author: Tom Norman

In the world of software development, anything can happen. There are times when things don't go as planned, and projects may fail. When that happens, a post mortem is a useful tool for determining what went wrong and how to avoid similar problems in the future.

An incident post mortem examines the details of the incident, its cause, and the steps that were taken to resolve it. This type of post mortem is particularly important for identifying weaknesses in a project so that similar incidents can be prevented in the future. The post mortem report will typically include recommendations for improvements to the project, as well as any lessons learned from the incident.

Here are the steps to follow when performing an incident post mortem:

1. Gather the team

Bring together everyone who was involved in the project, including developers, QA testers, project managers, and anyone else who played a role. This will help ensure that everyone’s perspective is taken into account.

2. Define the problem

Clearly articulate the issue that led to the project’s failure. Was it a technical issue or a process failure? If it was a technical issue, was it a bug or a problem with system architecture? If it was a process failure, what could have been done differently in the planning and execution phases?

3. Analyze the events

Review the project’s timeline from start to finish, documenting every milestone and decision made along the way. Determine which events led up to the problem and identify any missteps that may have contributed to the issue.
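If the events live in several places (deploy logs, alerts, chat), it can help to merge them into one chronological list before the discussion. The following is a minimal sketch in Python, not a prescribed tool: the `TimelineEvent` fields, the `source` labels, and the sample events are all invented for illustration.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class TimelineEvent:
    timestamp: datetime
    source: str        # hypothetical label, e.g. "deploy", "alert", "chat"
    description: str


def build_timeline(events):
    """Return the events sorted from earliest to latest."""
    return sorted(events, key=lambda e: e.timestamp)


def largest_gap(timeline):
    """Return the pair of consecutive events with the longest time between them.

    Returns None when there are fewer than two events.
    """
    if len(timeline) < 2:
        return None
    pairs = zip(timeline, timeline[1:])
    return max(pairs, key=lambda pair: pair[1].timestamp - pair[0].timestamp)


# Invented sample data, entered in the order it was collected.
events = [
    TimelineEvent(datetime(2024, 1, 10, 14, 32), "alert", "Error-rate alert fires"),
    TimelineEvent(datetime(2024, 1, 10, 14, 5), "deploy", "New release rolled out"),
    TimelineEvent(datetime(2024, 1, 10, 15, 10), "deploy", "Release rolled back"),
    TimelineEvent(datetime(2024, 1, 10, 14, 40), "chat", "On-call engineer acknowledges alert"),
]

timeline = build_timeline(events)
for event in timeline:
    print(event.timestamp.strftime("%H:%M"), f"[{event.source}]", event.description)
```

Looking at the longest gap between consecutive events (here, the time between acknowledging the alert and rolling back) is one simple way to spot where the response may have stalled and where the team should ask more questions.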
 

Craft's Incident Post Mortem Template

4. Determine the root cause

After analyzing the events that led to the failure, record what you have identified as the root cause. We really like the 5 Whys method, which was popularized by Taiichi Ohno at Toyota. Start with the problem and ask "why" five times; the answers should reveal the root cause of the issue. Each "why" builds on the answer to the previous one, which helps you dig deeper into the issue and identify its root cause.
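To make the chaining concrete, here is a small Python sketch of a 5 Whys record in which each question is built from the previous answer. The function names `five_whys` and `root_cause`, and the example incident and answers, are hypothetical and only illustrate the idea.

```python
def five_whys(problem, answers):
    """Pair each answer with the "why" question it responds to.

    The first question asks about the problem itself; every later
    question asks about the answer that came before it.
    """
    chain = []
    statement = problem
    for answer in answers:
        chain.append((f"Why did this happen: {statement}?", answer))
        statement = answer
    return chain


def root_cause(chain):
    """Treat the final answer in the chain as the candidate root cause."""
    return chain[-1][1] if chain else None


# Invented example incident and answers.
problem = "Checkout page returned errors for 40 minutes"
answers = [
    "The payment service ran out of database connections",
    "A new release opened a connection per request",
    "The connection pool setting was removed during a refactor",
    "The change was not covered by a load test",
    "Load tests are not part of the release checklist",
]

chain = five_whys(problem, answers)
for number, (question, answer) in enumerate(chain, start=1):
    print(f"{number}. {question}")
    print(f"   {answer}")
print("Candidate root cause:", root_cause(chain))
```

In this invented example the chain moves from a technical symptom to a process gap, which is often where the most useful follow-up work is found.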

5. Create an action plan

Take the lessons learned and use them to create an action plan that outlines how to prevent similar issues from happening in the future. This plan should include recommendations for process changes, technology updates, or training for team members.
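An action plan is easier to follow up on when each recommendation is recorded as a separate item. The sketch below is one possible shape, written in Python; the three categories come from the list above, while the `owner` and `due` fields, the `overdue` helper, and the sample items are assumptions added for illustration.

```python
from dataclasses import dataclass
from datetime import date

# Categories taken from the step above: process changes,
# technology updates, and training for team members.
CATEGORIES = {"process", "technology", "training"}


@dataclass
class ActionItem:
    description: str
    category: str
    owner: str         # assumption: each item has a named owner
    due: date          # assumption: each item has a due date
    done: bool = False

    def __post_init__(self):
        if self.category not in CATEGORIES:
            raise ValueError(f"unknown category: {self.category}")


def overdue(plan, today):
    """Return the items that are still open and past their due date."""
    return [item for item in plan if not item.done and item.due < today]


# Invented sample plan.
plan = [
    ActionItem("Add load tests to the release checklist", "process",
               "QA lead", date(2024, 1, 24)),
    ActionItem("Restore connection pool limits", "technology",
               "Backend team", date(2024, 1, 17), done=True),
    ActionItem("Run an on-call handover workshop", "training",
               "Engineering manager", date(2024, 2, 14)),
]

for item in overdue(plan, today=date(2024, 2, 1)):
    print(f"Overdue: {item.description} (owner: {item.owner})")
```

Reviewing the overdue items at a regular check-in is one way to make sure the plan does not stall once the post mortem meeting is over.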

6. Communicate the findings

Once the post mortem is complete, make sure to communicate the findings to stakeholders, including the project sponsor, senior management, and other team members. This will help ensure that everyone understands what happened and what steps are being taken to prevent similar issues in the future.

 

In conclusion, an incident post mortem is a valuable tool for improving software development processes. By following these steps, your team can learn from past mistakes and create a plan for moving forward. Remember that the goal of a post mortem is not to place blame, but rather to identify areas for improvement and ensure that everyone is working together to achieve success.
