Four key indicators suggest that your Problem Management approach is failing:
- Incidents close too early and then recur
- Incidents remain open during Root Cause Analysis (RCA), so you lose visibility into when service was restored
- Incidents miss agreed-on Service Level Agreement (SLA) targets
- A large number of incidents remain open, and many never close
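If your ITSM tool can export incident records, the four indicators above can be counted roughly as in the sketch below. This is a minimal illustration, not a feature of any particular tool: the `Incident` fields, the `failure_indicators` function, and the thresholds (`recur_window`, `hold_gap`, `stale_after`) are all assumptions that you would adapt to your own data.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class Incident:
    # Hypothetical record shape; field names are assumptions, not a real ITSM schema.
    id: str
    service: str
    opened: datetime
    restored: Optional[datetime]  # when service was restored, if recorded
    closed: Optional[datetime]    # when the record was closed, if closed
    sla_target: timedelta         # agreed time to restore service


def failure_indicators(incidents, now,
                       recur_window=timedelta(days=7),
                       hold_gap=timedelta(days=1),
                       stale_after=timedelta(days=30)):
    """Count how often each of the four symptoms appears (rough proxies only)."""
    # 1. Closed, then another incident opened on the same service soon after.
    recurred = sum(
        1 for c in incidents
        if c.closed is not None and any(
            o.id != c.id and o.service == c.service
            and c.closed <= o.opened <= c.closed + recur_window
            for o in incidents)
    )
    # 2. Service was restored, but the record stayed open well beyond that point
    #    (for example, while RCA was still running).
    held_open = sum(
        1 for i in incidents
        if i.restored is not None and (i.closed or now) - i.restored > hold_gap
    )
    # 3. Restoration took longer than the SLA target (unrestored counts up to now).
    sla_missed = sum(
        1 for i in incidents
        if (i.restored or now) - i.opened > i.sla_target
    )
    # 4. Still open after a long time.
    stale_open = sum(
        1 for i in incidents
        if i.closed is None and now - i.opened > stale_after
    )
    return {
        "recurred_after_close": recurred,
        "held_open_after_restore": held_open,
        "sla_missed": sla_missed,
        "stale_open": stale_open,
    }
```

Non-zero counts are a prompt for a closer look rather than proof of failure; the thresholds in particular depend on your services and agreements.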
If you currently have one or more of the symptoms above, Problem Management probably occurs as a series of disconnected activities, primarily within a single technology domain or “silo.” Problem Management within a single silo is common, but it is usually less effective than Problem Management that spans multiple technology silos. If coordinated efforts that span more than one silo rarely occur, then you probably don’t have effective Problem Management.
Examine why your Problem Management is silo-oriented rather than syndicated:
- Does your IT Service Management (ITSM) support tool link and relate Incident and Problem Management records?
- Do the second- and third-line “resolver” staff have good working relationships with first-line Incident Management staff?
- Do all responders and resolvers, regardless of silo, understand the business impact?
- Do you have Configuration Models, Problem Models, Operational Level Agreements (OLAs), escalations and related knowledge readily available?
- Do all resolver staff receive regular training on the business implications of the services they support and the processes they follow?
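When the tool does link Incident and Problem records, those links also give a rough measure of how silo-bound your Problem Management is. The sketch below assumes a hypothetical export of `(incident_id, problem_id)` link pairs and a mapping from each incident to the silo that handled it; the function names and identifiers are illustrative only.

```python
from collections import defaultdict


def silos_per_problem(links, incident_silo):
    """Map each problem id to the set of silos its linked incidents came from."""
    result = defaultdict(set)
    for incident_id, problem_id in links:
        result[problem_id].add(incident_silo[incident_id])
    return dict(result)


def cross_silo_share(links, incident_silo):
    """Fraction of problems whose linked incidents span more than one silo."""
    by_problem = silos_per_problem(links, incident_silo)
    if not by_problem:
        return 0.0
    spanning = sum(1 for silos in by_problem.values() if len(silos) > 1)
    return spanning / len(by_problem)


# Hypothetical sample data: two problems, one of which spans two silos.
links = [("INC1", "PRB1"), ("INC2", "PRB1"), ("INC3", "PRB2")]
incident_silo = {"INC1": "network", "INC2": "database", "INC3": "network"}
share = cross_silo_share(links, incident_silo)
```

A share close to zero suggests that problems are rarely worked across silos, which is the pattern described above.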
To start organizing Problem Management, create a new role titled “Problem Manager.” This should be a single point of contact who owns the Problem Management process. The Problem Manager should work with your team to define escalation paths and integration with Incident Management. The teams need to create and agree to OLAs with shared priorities that empower the Problem Manager to assemble resources as needed. The Problem Manager coordinates activities while technical support staff, vendors, or both solve the problem. Once the problem is resolved, the Problem Manager releases resources back to the organization, ensures knowledge management tasks are complete, and then returns to their normal role (until the next problem).
Problem Management is more than simply waiting for the next outage. It is both proactive and reactive, and it involves multiple departments as well as suppliers and vendors.
By understanding and separating Incident and Problem Management, you can reduce the quantity and duration of service disruptions. The key to success is that, unlike Incident Management, Problem Management doesn’t require a functional organization.