If it ain’t broke, don’t fix it, as the saying goes. However, even unbroken IT installations must be changed through patches, upgrades or redesigns to meet new business objectives. ITIL Problem Management processes tackle the issue by taking a problem-solving, root-cause approach.
ITIL applies metrics to track progress in reducing the number of problems. MTBF (mean time between failures) is one such metric. So is MTTR (mean time to repair). The first metric is often used, the second less so, yet both are good indicators of the quality of ITIL processes for managing problems.
MTBF concerns the time between failures, as its name suggests. MTTR, on the other hand, is the time between a service being interrupted and then resumed. Problem diagnosis often accounts for a large portion of an MTTR statistic: 80% is common.
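The two definitions above can be made concrete with a minimal sketch. The outage records are hypothetical, and MTBF is assumed here to mean the average uptime between the end of one outage and the start of the next; some organisations measure it from start to start instead.

```python
from datetime import datetime, timedelta

# Hypothetical outage records: (service interrupted, service resumed).
outages = [
    (datetime(2024, 1, 1, 9, 0), datetime(2024, 1, 1, 10, 0)),
    (datetime(2024, 1, 11, 9, 0), datetime(2024, 1, 11, 11, 0)),
    (datetime(2024, 1, 21, 9, 0), datetime(2024, 1, 21, 12, 0)),
]


def mean_delta(deltas):
    """Average a non-empty list of timedelta values."""
    return sum(deltas, timedelta()) / len(deltas)


def mttr(outages):
    """Mean time from a service being interrupted to being resumed."""
    if not outages:
        raise ValueError("at least one outage is required")
    return mean_delta([resumed - interrupted for interrupted, resumed in outages])


def mtbf(outages):
    """Mean uptime between consecutive outages (assumed definition)."""
    ordered = sorted(outages)
    if len(ordered) < 2:
        raise ValueError("at least two outages are required")
    gaps = [
        next_start - prev_end
        for (_, prev_end), (next_start, _) in zip(ordered, ordered[1:])
    ]
    return mean_delta(gaps)


print("MTTR:", mttr(outages))
print("MTBF:", mtbf(outages))
```

With these sample records the repair times are one, two and three hours, so the MTTR is two hours.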
If the diagnosis phase can be shortened significantly, MTTR can frequently be reduced significantly as well. For example, if diagnosis takes 80% of a 100-minute MTTR, halving the diagnosis time cuts MTTR to 60 minutes. Finding out what change caused the interruption can be accelerated by having ready access to the change information you need to consult.
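As a rough illustration of what ready access to change information might look like, the sketch below lists the changes applied shortly before an interruption, newest first. The change log entries, the `recent_changes` helper and the four-hour window are all hypothetical; a real installation would query its own change records.

```python
from datetime import datetime, timedelta

# Hypothetical change log entries: (time applied, component, description).
changes = [
    (datetime(2024, 3, 4, 8, 0), "db01", "Applied OS patch"),
    (datetime(2024, 3, 5, 13, 30), "web02", "Updated TLS configuration"),
    (datetime(2024, 3, 5, 14, 45), "db01", "Raised connection limit"),
    (datetime(2024, 3, 5, 16, 0), "web02", "Restarted service"),
]


def recent_changes(changes, interrupted_at, window=timedelta(hours=24)):
    """Return changes applied within `window` before the interruption, newest first."""
    earliest = interrupted_at - window
    candidates = [c for c in changes if earliest <= c[0] <= interrupted_at]
    return sorted(candidates, key=lambda c: c[0], reverse=True)


# Service interrupted at 15:00; look back four hours for likely causes.
suspects = recent_changes(
    changes, datetime(2024, 3, 5, 15, 0), window=timedelta(hours=4)
)
for applied_at, component, description in suspects:
    print(applied_at, component, description)
```

The most recent change appears first because it is often the first candidate worth checking, although that ordering is only a heuristic.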
In other words, the more control you have over your IT environment, the smaller your MTTR is likely to be, reflecting well on your ITIL implementation.
Control can be boosted with a number of procedures and processes. Keeping track of changes made is an obvious one. Small IT installations might do this manually.
As IT departments grow larger, an automated solution using a configuration management database (CMDB) may be more efficient, or even indispensable. Upstream, a designated team to review proposed changes can help to reduce or avoid problems.
Controls to detect accidental or unauthorised changes can also be effective, possibly also based on a CMDB. Finally, tracking MTTR and reviewing data from different systems shows where improvement is most needed or can be achieved most quickly. In short, both MTBF and MTTR are valuable indicators and should therefore both be used.
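To illustrate reviewing MTTR data from different systems, here is a minimal sketch that groups repair times by system and ranks the systems from slowest to fastest to repair. The system names and repair times are invented for the example.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical repair times in minutes, tagged by system.
repairs = [
    ("email", 30), ("email", 50),
    ("erp", 120), ("erp", 180), ("erp", 150),
    ("intranet", 20),
]


def mttr_by_system(repairs):
    """Return (system, MTTR in minutes) pairs, highest MTTR first."""
    grouped = defaultdict(list)
    for system, minutes in repairs:
        grouped[system].append(minutes)
    averages = {system: mean(values) for system, values in grouped.items()}
    return sorted(averages.items(), key=lambda item: item[1], reverse=True)


ranking = mttr_by_system(repairs)
for system, minutes in ranking:
    print(f"{system}: {minutes} min")
```

In this sample data the ERP system has by far the highest MTTR, so it would be the first place to look for improvement.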