Attendees of our webinar “Understanding the Difference Between Incident Management and Problem Management” had a lot of great questions for the host, Michael Scarborough, Global Knowledge ITIL course director and ITIL Expert. So many, in fact, that we decided to share them, along with Michael’s answers.
Q: In general the service desk isn’t responsible for restoring a service that went down. This is handled by the technical groups. Wouldn’t we expect the technical groups to be fairly involved with incident management too?
A: According to best practice, the service desk is responsible for incident management and is focused on restoration of service. However, technical groups would likely be involved as well when it is necessary to escalate an incident that requires more specialized knowledge to restore service.
ITIL recommends that service restoration occur as close to the user as possible, for several reasons. First, escalation takes time, which delays restoration of service, so the user and customer are impacted longer. Second, escalating an incident tends to cost more than handling it directly at the service desk, because it engages more specialized technical resources. Finally, too much escalation can affect project timelines when more skilled technical people are continually interrupted to handle incidents that the service desk could reasonably handle directly.
I suggest reading what the “ITIL Service Operation” book says about the role of the service desk and incident management for more information about this topic.
Q: Is it best practice to have a team dedicated to root cause analysis?
A: ITIL doesn’t necessarily specify that a team be dedicated to root cause analysis. However, it does recommend that an organization have a problem management process that identifies and potentially removes problems from the environment. This could be resourced in many ways, for example with a dedicated team, or by making it a part of people’s regular roles in the organization.
Q: We have separate problem management and incident management. How can we educate incident management on when to engage problem management?
A: I would recommend that you base the engagement on a couple of things. First, the impact of the incident may be a good indicator of when to engage problem management. Those incidents that have higher impact are the ones that you generally don’t want to see repeated, so it makes sense to invoke problem management in those cases. Second, I would recommend that if a service desk is experiencing the same incident repeatedly, then problem management should be engaged to research and potentially remove the root cause.
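As an illustration only, the two triggers described above (high impact, or the same incident recurring) could be sketched as a simple rule. Everything here is an assumption made for the example rather than something ITIL defines: the `Incident` fields, the 1-to-4 impact scale, the use of a `category` field to decide what counts as "the same incident," and both threshold values. An organization would choose its own.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Incident:
    incident_id: str
    category: str  # hypothetical key used to decide what "the same incident" means
    impact: int    # hypothetical scale: 1 = highest impact, 4 = lowest


# Both thresholds are illustrative assumptions, not values from ITIL.
HIGH_IMPACT_MAX = 2     # impact 1 or 2 counts as "high impact"
REPEAT_THRESHOLD = 3    # third occurrence of the same category triggers engagement


def should_engage_problem_management(incident: Incident,
                                     history: Iterable[Incident]) -> bool:
    """Return True if either trigger applies to this incident."""
    # Trigger 1: high-impact incidents are the ones we least want repeated.
    if incident.impact <= HIGH_IMPACT_MAX:
        return True
    # Trigger 2: the same kind of incident keeps coming back.
    earlier = sum(1 for past in history if past.category == incident.category)
    return earlier + 1 >= REPEAT_THRESHOLD


history = [
    Incident("INC-1", "vpn-login", 4),
    Incident("INC-2", "vpn-login", 4),
    Incident("INC-3", "printer", 3),
]
repeat = Incident("INC-4", "vpn-login", 4)
outage = Incident("INC-5", "email", 1)
one_off = Incident("INC-6", "printer", 3)
```

With this sample data, `repeat` and `outage` would both be routed to problem management, while `one_off` would stay with incident management alone.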
Q: If we see multiple incidents – wide impacting – is this a problem or does it still lie in incident management?
A: Both. Incidents represent the interruption of a service. In this case, these are incidents. Users and customers want restoration of service, which is what incident management does. However, all incidents have a root cause. Problem management is involved in investigating, diagnosing and potentially correcting the root causes of incidents. I would expect in this case that incident management would restore service, and problem management would research and potentially correct the root causes of these incidents.
Q: The focus is on SERVICE, but if the device covers multiple services, is there a difference in how it is addressed?
A: A device that is a component of multiple services might represent potentially higher impact in the event of a failure. I would say the difference in how it is addressed might be that this device failing would result in incidents and problems being of a higher priority due to the higher level of impact.
Q: In our organization we just started a dedicated ITSM group. I am both the incident manager and problem manager. I am responsible for the process itself, but the service desk is not under our ITSM. Have you ever seen an organization split the service desk and incident management process and be successful?
A: I’ve never really seen these split. Typically most of what the service desk does is incident management and request fulfillment, and more often than not, the service desk – or someone very close to it – owns the incident management process.
Q: Question about who owns problem management. I would think an operations management group would own the problem but might assign it to a technical group for analysis/resolution, in the same way that the help desk owns the incident. So I would not think that a technical group would own problem management. Thoughts?
A: ITIL does not prescribe who exactly owns problem management. It could be many different people in an organization. About the only constraint is where ownership of problem management should not sit. ITIL discusses many things that are related but somewhat mutually exclusive, and incident and problem management are two of them, meaning that it wouldn’t be appropriate to have the same owner for both. The reason is simple: sometimes, to resolve a problem, one has to cause an incident, which is the opposite of what incident management wants to happen.
Q: How much of a root cause analysis should the overall problem manager (as opposed to a technical problem manager) perform/document?
A: I’m not sure I completely understand the question, as ITIL does not define a role called “overall problem manager.” This sounds more like an organization-specific thing than something defined in the best practice.
Q: How do the incident process owner and incident manager differ? Is the incident manager invoked when a P1 occurs?
A: An “owner,” such as the incident management process owner, is accountable for the incident management process. This means that he or she ensures that a process exists, proper funding is available, and the process is happening as intended, among other things. An incident manager is a role that carries out the incident management process, or from an ITIL perspective, is “responsible.” An incident manager is more of a hands-on person who performs the steps of incident management and ensures operational coordination of the overall process.
Whether or not an incident manager is invoked when a priority one (P1) incident occurs is an organizational decision. However, in most organizations, P1 incidents are so impactful that a clearly designated incident manager would be allocated.
Q: I heard that multiple related incidents in a short period are a problem. Is that a correct statement to make?
A: No, but it is a common misunderstanding. Incidents are representative of the interruption or disruption to a service. Problems are representative of the root cause of one or more incidents. In this case, these multiple related incidents are representative of the impact of the disruption, and there is most likely an underlying shared root cause for each of these incidents that would be represented as a problem.
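One way to picture the distinction is as two separate record types, where a problem points at the incidents it is believed to explain. The sketch below is illustrative only: the class names, fields, and sample data are assumptions for the example, not a schema from ITIL or from any particular ITSM tool.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class IncidentRecord:
    incident_id: str
    service: str
    restored: bool = False  # incident management's goal: get this to True


@dataclass
class ProblemRecord:
    problem_id: str
    suspected_root_cause: str
    incidents: List[IncidentRecord] = field(default_factory=list)

    def link(self, incident: IncidentRecord) -> None:
        """Associate an incident with this problem; the incident stays an incident."""
        self.incidents.append(incident)

    def unrestored_incident_ids(self) -> List[str]:
        """Incidents still awaiting restoration, regardless of root cause work."""
        return [i.incident_id for i in self.incidents if not i.restored]


# Three related incidents in a short period: still three incidents.
incidents = [
    IncidentRecord("INC-101", "email", restored=True),
    IncidentRecord("INC-102", "email", restored=True),
    IncidentRecord("INC-103", "calendar"),
]

# One problem record represents their shared (hypothetical) root cause.
problem = ProblemRecord("PRB-7", "failing storage controller")
for inc in incidents:
    problem.link(inc)
```

Here the three incidents are each restored on their own timeline, while the single problem record tracks the underlying cause they appear to share.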