How ITIL Differentiates Problems and Incidents
- Date: 04 April, 2019
It’s a question we still get asked all the time, even in the new ITIL 4 Foundation classes:
How do you differentiate between incidents and problems?
To address this issue and offer clarification, this article will identify the differences between incidents and problems, how they are related, and why it matters.
What is an incident?
According to ITIL 4, an incident is an unplanned interruption to a service, or reduction in the quality of a service. Whether something is classified as an incident often depends on whether the service level agreement (SLA) was breached. However, ITIL allows for raising an incident even before an SLA has been breached in order to limit or prevent impact. For example, automated system monitoring may notice a degradation in response time or another error before an SLA is breached or a customer even notices. In layman’s terms, an incident is the representation of an outage.
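To make the monitoring example concrete, here is a minimal Python sketch of a check that raises an incident before the SLA is actually breached. The function name, the `Incident` fields, and the threshold values are all hypothetical and chosen only for illustration; ITIL does not prescribe any particular implementation.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Incident:
    service: str
    summary: str
    sla_breached: bool


def check_response_time(service: str, observed_ms: float,
                        warning_ms: float, sla_ms: float) -> Optional[Incident]:
    """Return an Incident if response time is degraded, otherwise None.

    warning_ms is an assumed early-warning threshold set below the SLA limit,
    so an incident can be raised before the SLA itself is breached.
    """
    if observed_ms >= sla_ms:
        return Incident(service, f"Response time {observed_ms} ms breaches SLA", True)
    if observed_ms >= warning_ms:
        return Incident(service, f"Response time {observed_ms} ms is degraded", False)
    return None


# Hypothetical thresholds: warn at 800 ms, SLA limit at 1000 ms.
early = check_response_time("checkout", 850, warning_ms=800, sla_ms=1000)
```

In this sketch, `early` is an incident with `sla_breached` set to `False`: the degradation is recorded and can be acted on while the service is still within its agreed limits.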
What is a problem?
According to ITIL 4, a problem is a cause, or potential cause, of one or more incidents. Problems can be raised in response to a single significant incident or multiple similar incidents. They can even be raised without the existence of a corresponding incident. For example, monitoring may reveal an issue that has not yet resulted in an incident but that, if left unchecked, may cause more issues. In layman’s terms, a problem is the representation of the cause, or potential cause, of one or more outages.
Why does best practice distinguish between incidents and problems?
The point of distinguishing between incidents and problems is the same as separating cause and effect. Problems are the cause, and incidents are the effect.
ITIL 4 encourages organizations to distinguish between the two because they are often treated and resolved differently. Addressing an incident simply means that whatever service was impacted has been “temporarily” restored. It does not mean that the incident will not recur at some time in the future. When we say “temporarily,” keep in mind that could mean one minute or 10 years. The point is that a resolution to an incident is not always permanent.
Problems, however, are the cause of incidents. We might use different techniques to identify the underlying causes of a problem, potential workarounds and ultimately a structural resolution to the problem.
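The cause-and-effect relationship can be sketched as a small data model. This is only an illustration under assumed names (`Problem`, `Incident`, `may_recur`, and the record IDs are hypothetical, not taken from ITIL or any ticketing tool): one problem links to any number of incidents, and restoring every incident does not by itself resolve the problem.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Incident:
    incident_id: str
    service: str
    restored: bool = False  # service restored, possibly only temporarily


@dataclass
class Problem:
    problem_id: str
    suspected_cause: str
    incidents: List[Incident] = field(default_factory=list)  # may be empty
    workaround: Optional[str] = None
    permanently_fixed: bool = False

    def may_recur(self) -> bool:
        """Related incidents can recur until the cause is structurally fixed."""
        return not self.permanently_fixed


# A reactive problem raised after two similar incidents.
leak = Problem("PRB-1", "memory leak in the payment service")
leak.incidents.append(Incident("INC-1", "payments"))
leak.incidents.append(Incident("INC-2", "payments"))

# Incident management restores service; the underlying cause remains.
for incident in leak.incidents:
    incident.restored = True
leak.workaround = "restart the service nightly"

# A proactive problem raised from monitoring, with no incident yet.
disk = Problem("PRB-2", "disk usage growing steadily on the log server")
```

Here every incident linked to `leak` is restored, yet `leak.may_recur()` still returns `True` until `permanently_fixed` is set. The second problem, `disk`, has no incidents at all, matching the case where a problem is raised before any outage occurs.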
Effective incident management ensures that, as a service provider, you can keep the promises you made in your SLAs by providing a mechanism to restore service quickly when necessary. Problem management ensures that you can respond reactively to incidents so that they don’t recur, and proactively prevent incidents from happening in the first place.
These are separate practices in ITIL 4 because they often require different skill sets and activities. Incident management aims to restore service quickly, in line with any SLAs that are in place, whereas problem management aims to eliminate the underlying causes of incidents.