As a follow-up to his presentation at the 2013 itSMF Norway conference, Stuart Rance of HP has kindly donated some practical advice for those struggling with availability.
Many IT organizations define availability for IT services using a percentage (e.g. 99.999% or “five 9s”) without any clear understanding of what the number means, or how it could be measured. This often leads to dissatisfaction, with IT reporting that they have met their goals even though the customer is not satisfied.
A simple calculation of availability is based on agreed service time (AST) and downtime (DT):
Availability (%) = (AST - DT) / AST x 100
If AST is 100 hours and downtime is 2 hours, then availability would be (100 - 2) / 100 x 100 = 98%.
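For readers who want to try the arithmetic themselves, here is a minimal Python sketch of the simple calculation. It assumes availability is the share of agreed service time not lost to downtime, expressed as a percentage; the function name `availability_percent` and its parameter names are made up for illustration and do not come from any standard tool.

```python
def availability_percent(agreed_service_time_hours, downtime_hours):
    """Return availability as a percentage of agreed service time (AST).

    Assumes the simple formula (AST - DT) / AST x 100, with both
    values measured in the same unit (hours here).
    """
    if agreed_service_time_hours <= 0:
        raise ValueError("agreed service time must be positive")
    if not 0 <= downtime_hours <= agreed_service_time_hours:
        raise ValueError("downtime must be between 0 and the agreed service time")
    # Multiply before dividing to keep the example free of rounding noise.
    return 100 * (agreed_service_time_hours - downtime_hours) / agreed_service_time_hours


# The example from the text: 100 hours of AST with 2 hours of downtime.
print(availability_percent(100, 2))  # 98.0
```

Note that the result says nothing about what the customer could or could not do during those 2 hours, which is exactly the limitation discussed next.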
Customers are interested in their ability to use IT Services to support business processes. Availability reports will only be meaningful if they describe things the customer cares about, for example the ability to send and receive emails, or to withdraw cash from ATMs.
Number and duration of outages
A service that should be available for 100 hours and has 98% availability has 2 hours of downtime. This could be a single 2-hour incident, or many shorter incidents. The relative impact of a single long incident or many shorter incidents is different for different business processes. For example, a billing run that has to be restarted and takes 2 days to complete will be seriously impacted by each outage, but the outage duration may not be important. A web-based shopping site may not be impacted by a 2-minute outage, but after 2 hours the loss of customers could be significant. Table 1 shows some examples of how an SLA might be documented to show this varying impact.
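To make the point concrete, here is a small Python sketch that reports the number of outages and the longest outage alongside the percentage. The function `summarize_outages` and the two outage lists are hypothetical illustrations, not data from a real service or from the SLA examples mentioned above.

```python
def summarize_outages(agreed_service_time_hours, outage_durations_hours):
    """Summarize a list of outage durations against agreed service time."""
    total_downtime = sum(outage_durations_hours)
    return {
        "availability_percent": 100 * (agreed_service_time_hours - total_downtime)
        / agreed_service_time_hours,
        "outage_count": len(outage_durations_hours),
        "longest_outage_hours": max(outage_durations_hours, default=0),
    }


# Two made-up outage profiles, each adding up to 2 hours in a 100-hour period.
single_long = summarize_outages(100, [2])
many_short = summarize_outages(100, [0.25] * 8)

print(single_long)
print(many_short)
```

Both profiles report the same availability percentage, yet one has a single 2-hour outage and the other has eight 15-minute outages. The billing run would suffer more from the second profile, the shopping site from the first.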
I promised a second blog based on the English-language presentations at the itSMF Norway annual conference but then I had a better idea … rather than just giving you something akin to Twitter highlights, I decided to be cheeky and ask a couple of the presenters to write blogs based on their presentations. Smart or lazy, I think it is better for you, the reader.
Here is the first from Paul Wilkinson of GamingWorks – no stranger to writing blogs for my Forrester blog roll. The second is by Stuart Rance of HP and this will appear soon. Paul’s topic?
“How to improve the Return On Value (ROV) of an IT service management training initiative”
To quote Paul: “Hardly an innovative, exciting, sexy subject when everybody wants to hear about cloud, BYOD, social media, and all that new stuff.” BUT Paul was asked to present the same session he delivered in 2012, given that it was one of the top 3 best-received sessions the previous year. I personally thoroughly enjoyed it – Paul is good at making you believe that there is “a better way” when it comes to changing the way we think about IT service delivery.
What were Paul’s key messages?
What was so important? Why should you read on? What should YOU now do differently?
Paul set the scene nicely. In his words (with a little editing by yours truly):