LP4.7: Resolve problems
Previous process: Resolve incidents and service requests
The problem management process in YaSM (fig. 1) is about managing the lifecycle of all problems, where a problem is the underlying cause of one or several (potential) incidents. The primary objective of the problem resolution process is to prevent service incidents from happening, and to minimize the impact of incidents which cannot be prevented.
Compatibility: YaSM problem management is aligned with ISO 20000, the international standard for service management (see ISO/IEC 20000-1:2018, section 8.6), and it corresponds to the practice of 'ITIL 4 problem management'.
YaSM's problem management process has the following sub-processes:
- LP4.7.1: Pro-actively identify problems
- Process objective: To improve overall availability of services by proactively identifying problems. This process aims to identify and solve problems and/or provide suitable workarounds before (further) service incidents occur.
- LP4.7.2: Categorize and prioritize problems
- Process objective: To record and prioritize the problems with appropriate diligence, in order to facilitate a swift and effective resolution.
- LP4.7.3: Analyze and resolve problems
- Process objective: To identify the underlying causes of problems and to determine the most appropriate and economical problem solution. If possible, a temporary workaround should be supplied while no full solution is available.
- LP4.7.4: Monitor outstanding problems
- Process objective: To constantly monitor outstanding problems with regard to their processing status, and to take corrective action as required.
- LP4.7.5: Close problems
- Process objective: To ensure that the problem resolution has been successful and all related information is up to date.
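The sub-processes above can be read as stages in the lifecycle of a problem. The following is a minimal sketch of that reading, not a YaSM definition: the status names, the mapping of statuses to sub-processes, and the rule that an unsuccessful resolution returns to analysis are all assumptions made for illustration.

```python
from enum import Enum


class ProblemStatus(Enum):
    # Hypothetical status names; YaSM does not prescribe these values.
    IDENTIFIED = "identified"                    # e.g. result of LP4.7.1
    PRIORITIZED = "prioritized"                  # LP4.7.2
    IN_ANALYSIS = "in_analysis"                  # LP4.7.3
    WORKAROUND_PROVIDED = "workaround_provided"  # LP4.7.3, temporary workaround
    RESOLVED = "resolved"                        # LP4.7.3, full solution
    CLOSED = "closed"                            # LP4.7.5


# Assumed transition rules. LP4.7.4 (monitoring) observes every open
# status and is therefore not modelled as a status of its own.
ALLOWED_TRANSITIONS = {
    ProblemStatus.IDENTIFIED: {ProblemStatus.PRIORITIZED},
    ProblemStatus.PRIORITIZED: {ProblemStatus.IN_ANALYSIS},
    ProblemStatus.IN_ANALYSIS: {
        ProblemStatus.WORKAROUND_PROVIDED,
        ProblemStatus.RESOLVED,
    },
    ProblemStatus.WORKAROUND_PROVIDED: {ProblemStatus.RESOLVED},
    # Closure verifies the resolution; if it failed, go back to analysis.
    ProblemStatus.RESOLVED: {ProblemStatus.CLOSED, ProblemStatus.IN_ANALYSIS},
    ProblemStatus.CLOSED: set(),
}


def advance(current, target):
    """Return target if the transition is allowed, else raise ValueError."""
    if target not in ALLOWED_TRANSITIONS[current]:
        raise ValueError(f"cannot move from {current.value} to {target.value}")
    return target
```

Keeping the transitions in a single table makes it easy to adapt the sketch to the statuses an organization actually uses.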
This section lists the documents and records produced by the problem resolution process. YaSM data objects are marked with an asterisk [*]; other objects carry no marker.
- Incident model
- An incident model contains the pre-defined steps that should be taken for dealing with a particular type of incident. The aim of providing incident models is to ensure that recurring incidents are handled efficiently and effectively. [*]
- Problem record
- A set of data with all details of a problem, documenting the history of the problem from registration to closure. A problem is defined as the underlying cause of one or more (potential) incidents, although the cause may not be known at the time a problem record is created. Often, a workaround is provided for a problem while a full resolution is not yet available. [*]
- Recovery plan
- Recovery plans contain detailed instructions for returning specific services and/or systems to a working state, which often includes recovering data to a defined consistent state. [*]
- Service operation manual
- A service operation manual specifies the activities required for the operation of a service and its underlying infrastructure. The information in the service operation manual is meant to describe the day-to-day tasks in a way that is useful for operational staff. Some instructions related to the operation of particular applications, systems or other infrastructure components may be documented in separate technical manuals or 'standard operating procedures (SOPs)'. [*]
- Suggested process modification
- A suggestion for modifying one or several service management processes. Suggestions for process modifications or improvements may originate from anywhere within the organization.
- Suggested service modification
- A suggestion for modifying a service, for example to improve service quality or economics. Suggestions may originate from anywhere within or outside of the service provider organization.
- Support request
- A request to support the resolution of an incident or problem, usually issued by the incident or problem manager when further assistance is needed from technical experts or external suppliers.
[*] "YaSM data objects" are those documents or records for which the YaSM model provides detailed recommendations: every YaSM object has an associated checklist describing its typical contents, and an associated lifecycle diagram depicting how the status of the object changes as it is created, updated, read and archived by various YaSM processes.
"Other objects" are mostly informal data or information for which YaSM is less prescriptive about the contents. They have no associated lifecycle diagrams or checklists.
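As an illustration of the problem record described above, the following sketch models a record that documents a problem's history from registration to closure, allows the cause to be unknown at creation time, and can carry a workaround. All field names and the `PRB-0001` identifier are hypothetical; the authoritative list of typical contents is the YaSM checklist for the problem record.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List, Optional, Tuple


@dataclass
class ProblemRecord:
    # Hypothetical fields, chosen to mirror the description in the text.
    problem_id: str
    description: str
    related_incident_ids: List[str] = field(default_factory=list)
    root_cause: Optional[str] = None   # may be unknown when the record is created
    workaround: Optional[str] = None   # temporary measure until a full resolution exists
    history: List[Tuple[datetime, str]] = field(default_factory=list)

    def log(self, event: str) -> None:
        """Append a timestamped entry to the record's history."""
        self.history.append((datetime.now(timezone.utc), event))

    @property
    def has_known_cause(self) -> bool:
        return self.root_cause is not None


record = ProblemRecord("PRB-0001", "Intermittent login failures")
record.log("registered")
record.workaround = "Restart the authentication service"
record.log("workaround documented")
```

The history is kept as an append-only list so that the record retains every step between registration and closure.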
Process metrics are used, for example, to assess if the service management processes are running according to expectations.
For suggestions of suitable metrics, please refer to the list of metrics for the problem resolution process.
Roles and responsibilities
Process owner: Problem manager
- The problem manager is responsible for managing the lifecycle of all problems, where the primary objective is to prevent incidents from happening if possible, and to minimize the impact of incidents that cannot be prevented. Apart from resolving the underlying causes of (potential) incidents, the problem manager often provides workarounds while a full solution is not yet available.
|ID||YaSM sub-process||Problem manager||Service owner||Technical domain expert|
|LP4.7.1||Pro-actively identify problems||AR||-||-|
|LP4.7.2||Categorize and prioritize problems||AR||-||-|
|LP4.7.3||Analyze and resolve problems||AR||R||R|
|LP4.7.4||Monitor outstanding problems||AR||-||-|
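The responsibility matrix above can also be held as data and checked automatically. The sketch below assumes the usual RACI reading of the codes (A for accountable, R for responsible), which the table itself does not spell out, and it contains only the rows listed above. The function names are hypothetical.

```python
# Responsibility codes copied from the table; roles marked "-" are omitted.
RACI = {
    "LP4.7.1": {"Problem manager": "AR"},
    "LP4.7.2": {"Problem manager": "AR"},
    "LP4.7.3": {
        "Problem manager": "AR",
        "Service owner": "R",
        "Technical domain expert": "R",
    },
    "LP4.7.4": {"Problem manager": "AR"},
}


def roles_with(code, sub_process):
    """Return the roles whose entry for sub_process contains the given code."""
    return [role for role, codes in RACI[sub_process].items() if code in codes]


def has_single_accountability(matrix):
    """Check that every sub-process has exactly one accountable role."""
    return all(
        sum("A" in codes for codes in roles.values()) == 1
        for roles in matrix.values()
    )
```

A check of this kind is useful when the matrix is tailored to an organization, because a sub-process with no accountable role, or with several, is a common source of unclear ownership.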
Is based on: The problem resolution process from the YaSM Process Map.
by: Stefan Kempter