The main use of MTTA is to track team responsiveness and alert system effectiveness. Mean Time to Repair is a high-level measure of the speed of your repair process, but it doesn't tell the whole story. When the cause of a failure is unclear, a repair may have to be attempted several times before finding the root cause. There can be any number of areas that are lacking, like the way technicians are notified of breakdowns, the availability of repair resources (like manuals), or the level of training the team has on a certain asset. Mean Time Between Failures (MTBF): This measures the average time between failures of a repairable piece of equipment or a system. Because MTTR represents the average time taken to address an issue, it is calculated by adding up all time spent on unscheduled or corrective maintenance in a period, and then dividing this total by the number of incidents in that period. To show incident MTTA, we'll add a metric element driven by a Canvas expression. It also helps to have a backup on-call person to step in if an alert is not acknowledged soon enough. That makes it the easiest way to show you how to recreate these capabilities. Which means your MTTR is four hours. Mean Time to Repair and Mean Time Between Failures (or Faults) are two of the most common failure metrics in use. Fold in mean time between failures and the picture gets even bigger, showing you how successful your team is at preventing or reducing future issues. Mean time to detect is one of several metrics that support system reliability and availability.
Mean time to recovery, or mean time to restore, is the average time it takes to recover from a product or system failure. It is easy to understand and provides a nice performance overview of the whole incident management process, but it does not say which part of that process can or should be improved. The use of checklists and compliance forms is a great way to ensure that critical tasks have been completed as part of a repair. For example, if MTBF is very low, it means that the application fails very often. In other words, low MTTD is evidence of healthy incident management capabilities. In comparison to mean time to respond, mean time to repair starts not when an alert is received, but when the repairs actually begin. In today's always-on world, outages and technical incidents matter more than ever before. For failures that require system replacement, people typically use the term MTTF (mean time to failure). So how do you go about calculating MTTR? Calculate MTTR by dividing the total time spent on unplanned maintenance by the number of times an asset has failed over a specific period. Put another way, you can calculate MTTR by adding up the total time spent on repairs during any given period and then dividing that time by the number of repairs. Dividing the total time it took to recover from failures by the number of failures then shows the MTTR for a given system. The metric is used to track both the availability and reliability of a product. With all this information, you can make decisions that'll save money now and in the long term. A high MTTR might be a sign that improper inventory management is wreaking havoc on repair times, and it can give you the insight needed to put in place a better system for your spare parts. Now that we have all of the different pieces of our Canvas workpad created, we get an extremely useful incident management dashboard. And that's it!
It's also a valuable way to assess the value of equipment and make better decisions about asset management. Undergoing a DevOps transformation can help organizations adopt the processes, approaches, and tools they need to go fast and not break things. It's not meant to identify problems with your system alerts or pre-repair delays, both of which are also important factors when assessing the successes and failures of your incident management programs. It's also only meant for cases when you're assessing full product failure. From there, you should use records of detection time from several incidents and then calculate the average detection time. The MTTR calculation assumes that tasks are performed sequentially. Before diving into MTTR, MTBF, and MTTF, there is a clear distinction to be made. With Vulnerability Response you can do the following: configure vulnerability groups, CI identifiers, notifications, and SLAs. The ServiceNow wiki describes this functionality. Is the team taking too long on fixes? A healthy MTTR means your technicians are well-trained, your inventory is well-managed, and your scheduled maintenance is on target. A shorter MTTR is a sign that your MIT is effective and efficient. This is where observability tools shine: they give organizations the power to take a glimpse at the internals of their systems by looking at signals recorded outside the systems.
You can also look at your MTTR and ask yourself questions. When you start tracking MTTR in your business and begin collecting data on your performance, how do you know what you should be aiming for? Let's have a look. Mean time to acknowledge is the average time it takes for the team responsible for a system to acknowledge an incident after an alert is triggered. Mean time to recovery includes the full time of the outage, from the time the system or product fails to the time that it becomes fully operational again. This incident resolution prevents similar incidents from happening again. The basic formula is: MTTR = Total corrective maintenance time ÷ Number of repairs. In calculating MTTR, certain things are generally assumed. For example, if you spent a total of 120 minutes (on repairs only) on 12 separate incidents, your mean time to repair would be 10 minutes. So, let's say we're assessing a 24-hour period and there were two hours of downtime in two separate incidents. There's another, subtler reason we'll examine next. Using MTTR to improve your processes entails looking at every step in great detail and identifying areas of potential improvement, and it helps you approach your repair processes in a systematic way. At this point, it will probably be empty as we don't have any data. For instance, in the software development field, we know that bugs are cheaper to fix the sooner you find them. Diagnosing a problem accurately is key to rapid recovery after a failure, as no repair work can commence until the diagnosis is complete. After all, we all want incidents to be discovered sooner rather than later, so we can fix them ASAP. But they also can't afford to ship low-quality software or allow their services to be offline for extended periods.
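The 24-hour example above (two hours of downtime spread over two incidents) can be turned into an MTBF figure. The sketch below assumes the common convention that MTBF is operating time (the period minus downtime) divided by the number of failures; the function name and parameters are illustrative and do not come from any particular tool.

```python
def mean_time_between_failures(period_hours: float,
                               downtime_hours: float,
                               failure_count: int) -> float:
    """Return MTBF in hours: operating time divided by number of failures."""
    if failure_count <= 0:
        raise ValueError("failure_count must be positive")
    if downtime_hours > period_hours:
        raise ValueError("downtime cannot exceed the assessed period")
    operating_hours = period_hours - downtime_hours
    return operating_hours / failure_count


# 24-hour period, 2 hours of downtime, 2 separate incidents
mtbf = mean_time_between_failures(24, 2, 2)
print(f"MTBF = {mtbf:.1f} hours")  # MTBF = 11.0 hours
```

Under these assumptions the system ran for 22 hours between two failures, which gives 11 hours between failures on average.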
Dividing the total resolution time by the number of incidents then gives the mean time to resolve. The MTTR calculation generally assumes that repair tasks are completed in a consistent manner, that repairs are carried out by suitably trained technicians, and that technicians have access to the resources they need to complete the repairs. A high MTTR can point to delays in the detection or notification of issues, a lack of availability of parts or resources, or a need for additional training for technicians. How does it compare to our competitors? With that said, typical MTTRs can be in the range of 1 to 34 hours, with an average of 8. Mean Time to Repair is generally used as an indication of the health of a system and the effectiveness of the organization's repair processes. Things meant to last years and years? Speaking of unnecessary snags in the repair process, when technicians spend time looking for asset histories, manuals, SOPs, diagrams, and other key documents, it pushes MTTR higher. The outcome will be standard instructions that create a standard quality of work and standard results. However, if you want to diagnose where the problem lies within your process (is it an issue with your alerts system?), you will need more than MTTR alone. To solve this problem, we need to use other metrics. The greater the number of 'nines', the higher the system availability. And then add mean time to failure to understand the full lifecycle of a product or system. Eventually, you'll develop a comprehensive set of metrics for your specific business and customers that you'll be able to benchmark your progress against, and this is the best way to decide what a good MTTR looks like for you.
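To make the remark about 'nines' concrete, here is a minimal sketch that derives availability from MTBF and MTTR. It assumes the widely used steady-state relation availability = MTBF / (MTBF + MTTR), which the text above does not state explicitly; the function name and the sample numbers are illustrative only.

```python
def availability(mtbf_hours: float, mttr_hours: float) -> float:
    """Return availability as a fraction between 0 and 1.

    Assumes the steady-state relation: MTBF / (MTBF + MTTR).
    """
    if mtbf_hours < 0 or mttr_hours < 0:
        raise ValueError("MTBF and MTTR must be non-negative")
    if mtbf_hours + mttr_hours == 0:
        raise ValueError("MTBF and MTTR cannot both be zero")
    return mtbf_hours / (mtbf_hours + mttr_hours)


# Hypothetical system: fails every 11 hours on average, takes 1 hour to repair
ratio = availability(mtbf_hours=11, mttr_hours=1)
print(f"Availability: {ratio:.2%}")  # Availability: 91.67%
```

Raising MTBF or lowering MTTR both push the ratio toward 1, which is what adding 'nines' (99%, 99.9%, 99.99%) means in practice.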
To calculate the MTTA, we calculate the total time between creation and acknowledgement and then divide that by the number of incidents. Another service desk metric is mean time to resolve (MTTR), which quantifies the time needed for a system to regain normal operating performance after a failure occurs. This indicates how quickly your service desk can resolve major incidents. Instead, it focuses on unexpected outages and issues. The most common time increment for mean time to repair is hours. This can be set within the, To edit the Canvas expression for a given component, click on it and then click on the. If this sounds like your organization, don't despair! It's probably easier than you imagine. Essentially, MTTR is the average time taken to repair a problem, and MTBF is the average time until the next failure. MTTR can stand for mean time to repair, resolve, respond, or recovery. Organizations of all shapes and sizes can use any number of metrics. When allocating resources, it makes sense to prioritize issues that are more pressing, such as security breaches. Technicians might have a task list for a repair, but are the instructions thorough enough? In other cases, there's a lag time between the issue, when the issue is detected, and when the repairs begin. But it can also be caused by issues in the repair process. When it comes to system outages, every second results in more financial loss, so you want to get your systems back online ASAP. Mean time to repair is not always the same amount of time as the system outage itself.
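The MTTA calculation described above (total time between creation and acknowledgement, divided by the number of incidents) can be sketched as follows. The incident timestamps are made-up sample data, and the function name is hypothetical; in a real setup these values would come from your incident records.

```python
from datetime import datetime, timedelta


def mean_time_to_acknowledge(incidents):
    """Return MTTA as a timedelta.

    incidents: list of (created, acknowledged) datetime pairs.
    """
    if not incidents:
        raise ValueError("at least one incident is required")
    # Sum the creation-to-acknowledgement gaps, then divide by the count
    total = sum((ack - created for created, ack in incidents), timedelta())
    return total / len(incidents)


# Hypothetical incidents, acknowledged after 5, 15, and 10 minutes
incidents = [
    (datetime(2023, 1, 2, 9, 0), datetime(2023, 1, 2, 9, 5)),
    (datetime(2023, 1, 3, 14, 0), datetime(2023, 1, 3, 14, 15)),
    (datetime(2023, 1, 4, 22, 30), datetime(2023, 1, 4, 22, 40)),
]

mtta = mean_time_to_acknowledge(incidents)
print(f"MTTA = {mtta}")  # MTTA = 0:10:00
```

Working with timedelta values keeps the units explicit, so the same function works whether acknowledgement takes seconds or days.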
Repair stages include reassembling, aligning, and calibrating the asset, and setting up, testing, and starting up the asset for production. One of the ways used frequently (especially in Incident Management) is the 'Time Worked' field. Lowering MTTR means improving the speed of system repairs, essentially decreasing the time they take. MTBF is calculated using an arithmetic mean. Because there's more than one thing happening between failure and recovery. In short, we'll get the latest update for all incidents and then use the filterrows Canvas expression function to keep the ones we want based on their status. They all have very similar Canvas expressions with only minor changes. To calculate the MTTD for the incidents above, simply add all of the total detection times and then divide by the number of incidents. The calculation results in 53. Some of the industry's most commonly tracked metrics are MTBF (mean time before failure), MTTR (mean time to recovery, repair, respond, or resolve), MTTF (mean time to failure), and MTTA (mean time to acknowledge), a series of metrics designed to help tech teams understand how often incidents occur and how quickly the team bounces back from those incidents. Are alerts taking longer than they should to get to the right person? Consider Scalyr, a comprehensive platform that will give you excellent visualization capabilities, super-fast search, and the ability to track many important metrics in real time. As an example, if you want to take it further, you can create incidents based on your logs, infrastructure metrics, APM traces, and your machine learning anomalies. If diagnosis of issues is taking up too much time, consider introducing failure codes. This will reduce the amount of trial and error required to fix an issue, which can be extremely time-consuming.
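The MTTD arithmetic described above (add up the detection times, divide by the number of incidents) can be sketched in a few lines. The incident IDs and detection times below are hypothetical sample values, not the incidents behind the figure of 53 quoted in the text, whose underlying data is not reproduced here.

```python
from statistics import mean

# Hypothetical detection times in minutes (time from incident start to detection)
detection_minutes = {
    "INC-1": 30,
    "INC-2": 45,
    "INC-3": 60,
}

# MTTD is the arithmetic mean of the individual detection times
mttd = mean(list(detection_minutes.values()))
print(f"MTTD = {mttd} minutes")  # MTTD = 45 minutes
```

A single slow detection can dominate the average, so it is worth looking at the individual values as well as the mean.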
So, which measurement is better when it comes to tracking and improving incident management? In some cases, repairs start within minutes of a product failure or system outage. MTTR can be used to measure stability of operations and availability of resources, and to demonstrate the value of a department, repair team, or service. Tracking the total time between when a support ticket is created and when it is closed or resolved is an effective method for obtaining an average MTTR metric. In even simpler terms, MTBF is how often things break down, and MTTR is how quickly they are fixed. The higher the time between failures, the more reliable the system. Providing a full history of an asset to your technicians can also provide valuable clues that may help them narrow down the source of a problem. Calculating mean time to detect isn't hard at all. (The average time spent solely on the repair process is called mean time to repair, also shortened to MTTR.) If your business provides maintenance or repair services, then monitoring MTTR can help you improve your efficiency and quality of service. MTTA is useful in tracking responsiveness. The MTTA is calculated by applying the mean function over this duration field. A variety of metrics are available to help you better manage and achieve these goals. MTTR values generally include several stages. Note: If the technician does not have the parts readily available to complete the repairs, this may extend the total time between the issue arising and the system becoming available for use again. The next step is to arm yourself with tools that can help improve your incident management response.
MTTR (mean time to respond) is the average time it takes to recover from a product or system failure from the time when you are first alerted to that failure. Mean time to failure is an arithmetic average, so you calculate it by adding up the total operating time of the products you're assessing and dividing that total by the number of devices. Determining the reason an asset broke down without failure codes can be labour-intensive and include time-consuming trial and error. A lot of experts argue that these metrics aren't actually that useful on their own because they don't ask the messier questions of how incidents are resolved, what works and what doesn't, and how, when, and why issues escalate or de-escalate. It is measured from the moment that a failure occurs until the point where the equipment is repaired, tested, and available for use. Business executives and financial stakeholders question downtime in the context of the financial losses incurred due to an IT incident. Benchmarking your facility's MTTR against best-in-class facilities is difficult. As you know from prior Metric of the Month articles, service levels at level 1, including average speed of answer and call abandonment rate, are relatively unimportant.

In this case, the MTTR calculation would look like this:

MTTR = 44 hours ÷ 6 breakdowns
MTTR = 7.33 hours

When you calculate MTTR, it's important to take into account the time spent on all elements of the work order and repair process, which includes notifying technicians, diagnosing the issue, and fixing the issue. Downtime (the period during which a piece of equipment or system is unavailable for use) can be very expensive to a business, so minimizing MTTR is essential.
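The worked example above (44 hours of repair time across 6 breakdowns) can be reproduced with a small helper. This is a minimal sketch; the function name and its parameters are illustrative rather than part of any maintenance system's API.

```python
def mean_time_to_repair(total_repair_hours: float, repair_count: int) -> float:
    """Return MTTR in hours: total corrective maintenance time / number of repairs."""
    if repair_count <= 0:
        raise ValueError("repair_count must be positive")
    return total_repair_hours / repair_count


# 44 hours of corrective maintenance spread over 6 breakdowns
mttr = mean_time_to_repair(44, 6)
print(f"MTTR = {mttr:.2f} hours")  # MTTR = 7.33 hours
```

The guard against a zero repair count matters in practice: a period with no breakdowns has no meaningful MTTR, and it is better to surface that than to divide by zero.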
The service desk is a valuable ITSM function that ensures efficient and effective IT service delivery. This post outlines everything you need to know about mean time to repair (MTTR), from how to calculate MTTR, to its benefits, and how to improve it. MTBF (mean time between failures) is the average time between repairable failures of a technology product. Mean Time to Failure (MTTF): This is the average time between non-repairable failures and is generally used for items that cannot be repaired, such as a light bulb or a backup tape. Reliability refers to the probability that a service will remain operational over its lifecycle. MTTR calculation (mean time to repair), Example 3: a simple manufacturing process consisting of a single machine. Over the last year, it has broken down a total of five times. Keep in mind that MTTR is highly dependent on the specific nature of the asset, the age of the item, the skill level of your technicians, how critical its function is to the business, and more. Keep in mind, too, that MTTR is most frequently calculated using business hours (so, if you recover from an issue at closing time one day and spend time fixing the underlying issue first thing the next morning, your MTTR wouldn't include the 16 hours you spent away from the office). Mean time to resolve is useful when compared with mean time to recovery. To find mean time to detect, take the average of the time passed between the start and actual discovery of multiple IT incidents. Does it take too long for someone to respond to a fix request? Based on how New Relic deals with incidents, these 10 best practices are designed to help teams reduce MTTR by helping you step up your incident response game.
MTBF comes to us from the aviation industry, where system failures have particularly major consequences, not only in terms of cost but in human life as well. MTBF is helpful for buyers who want to make sure they get the most reliable product, fly the most reliable airplane, or choose the safest manufacturing equipment for their plant. MTTR is the average time required to complete an assigned maintenance task. Actual individual incidents may take more or less time than the MTTR. The R can stand for repair, recovery, respond, or resolve, and while the four metrics do overlap, they each have their own meaning and nuance. MTTR (mean time to recovery or mean time to restore) is the average time it takes to recover from a product or system failure. For example, Amazon Prime customers expect the website to remain fast and responsive for the entire duration of their purchase cycle, especially during the holiday season. And since it wouldn't make much sense to write a whole post about a metric without teaching how to calculate it, we'll also show you how to calculate MTTD in practice. Once you've established a baseline for your organization's MTTR, it's time to look at ways to improve it. Is your team suffering from alert fatigue and taking too long to respond? Or the problem could be with repairs. If they're taking the bulk of the time, what's tripping them up? When the cause of an incident is unknown, different tests and repairs may be necessary. Now we'll create a donut chart which counts the number of unique incidents per application.
The opposite is also true: taking too long to discover incidents isn't bad only because of the incident itself. Think about it: if an organization has a great incident management strategy in place, including solid monitoring and observability capabilities, it shouldn't have trouble detecting issues quickly. On the other hand, MTTR, MTBF, and MTTF can be a good baseline or benchmark that starts conversations that lead into those deeper, important questions. Computers take your order at restaurants so you can get your food faster. Let's look at what Mean Time to Repair is, how to calculate it, and how to put it to good use in your business. When you calculate MTTR, you're able to measure future spending on the existing asset and the money you'll throw away on lost production. For that, you'll need to measure the stages of the repair process in a more granular fashion. Also remember that the MTTR you calculate is only as good as the data it is based on, so make it easy for technicians to log maintenance task time using specially designed service software, rather than manually entering data or filling out paperwork. Mean time to resolve includes not only the time spent detecting the failure, diagnosing the problem, and repairing the issue, but also the time spent ensuring that the failure won't happen again. For example, if you spent a total of 10 hours (from outage start to deploying a fix of the root cause) on 2 separate incidents during the course of a month, the mean time to resolve would be 5 hours. For example: if you had four incidents in a 40-hour workweek and spent one total hour on them (from alert to fix), your MTTR for that week would be 15 minutes. This expression uses more advanced Elasticsearch SQL functions, including PIVOT. What is considered world-class MTTR depends on several factors, like the kind of asset you're analyzing, how old it is, and how critical it is to production.
But what happens when we're measuring things that don't fail quite as quickly? This metric extends the responsibility of the team handling the fix to improving performance long-term. Why it's a good ITSM KPI metric to track: low MTTR and reopen rates are key indicators of effective customer service. In the second blog, we implemented the logic to glue ServiceNow and Elasticsearch together through alerts and transforms, as well as some general Elasticsearch configuration. MTTR formula: total maintenance time (or total breakdown time) divided by the total number of failures. The first assumption is that repair tasks are performed in a consistent order. Mean time to detect refers to the mean amount of time it takes for the organization to discover, or detect, an incident. The best way to do that is through failure codes.