How To Communicate SLAs In The Cloud?

Today Informatica announced the availability of a trust site, similar to what other major cloud platforms such as NetSuite (with its availability status) have done before.

Informatica also added more enterprise-level connectivity and 24x7 support to its cloud offering, making it more enterprise ready than ever.

Let’s have a look at these sites for a minute and analyze the value of this new way of communicating availability:






It actually looks like the industry is moving away from traditional service-level agreement (SLA) communication, with its well-defined statistical availability numbers such as 99.9% or 99.995%. I believe that this makes a lot of sense for most cloud computing platforms in the SaaS and PaaS categories, as I noted in my recent blog on cloud computing taxonomy.

  • SLAs in the cloud are simply different. The service level of a cloud platform is not automatically better or worse than that of a traditionally hosted on-premise system. It is simply different! While a dedicated on-premise server gives you a stable level of performance, the cloud operates as a highly scaled-out, elastic environment. The shared pool of capacity can handle much bigger load peaks than the traditional approach, but, at the same time, all “your” spare capacity is directly used by other tenants on the same platform. A cloud provider can therefore commit to a high average performance, while individual response times can fluctuate if some systems are being re-provisioned at the time of access. The same applies to the total availability of a business app: A traditional server might have outages, just like a single server in a cloud platform, but the effect is totally different. On premise, the failure leads to a direct outage of the business application, while the cloud might simply show a temporary reduction in performance if single machines fail. The bottom line is that availability and average performance could be higher, while the performance of an individual request might fluctuate.
  • Cloud applications are perceived by business people, not by IT people. IT people closely monitor the performance of the systems that they are “responsible” for. In contrast, performance outages of software-as-a-service (SaaS) applications in many cases reach IT’s awareness only through complaints from end users. Thus, the target group to be convinced about reliability and performance is no longer primarily the IT operations staff of a corporate environment; it is the business user. And many of these business users struggle to understand a statistical availability number like 99.x%. Trust is actually created more by examples of continuous performance throughout this person’s workday and time zone.
  • Issue communication is proactive. A great way to create business user trust is to communicate issues even before the user is aware of them. The above screenshots show that all of the platforms had minor issues. The big difference from traditional application performance management is that all this information is public. You can look at the past issues even if you are not a user of that cloud platform (yet).
  • Traditional SLAs stay around for most cloud platforms. Cloud computing platforms that follow more of an infrastructure-as-a-service (IaaS) approach are replacing some real hardware on premise, and the subscribing IT operations pros are used to statistical availability figures. But developers who subscribe to an IaaS offering for test and development purposes might look more at the visualized availability over the past weeks than at a statistical number. For business applications deployed as SaaS, only the IT staff involved in the initial purchase of the subscriptions look at the availability figure. While the 99.9x% type of figure will need to stay around, it will become a secondary figure.
  • Beware of faked data! Once business users experience a problem connecting to their application in the cloud, they will immediately look at the trust site to figure out whether the problem is caused locally, by the local-area network (LAN), the PC, or a post-PC device, or whether it is actually an outage of the platform. If it is the latter, unlikely as that may be, and the trust page does not report a problem, the customer’s confidence in the accuracy of the information may be lost quickly and will be difficult to recover. Vendors need to report very honestly and sensitively to avoid falling into this trap.
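
To make the statistical figures mentioned above (99.9%, 99.995%) a bit more tangible for a business reader, here is a minimal Python sketch that converts an availability percentage into the downtime it would allow. The function name `allowed_downtime_minutes` and the non-leap-year default are assumptions made for illustration only; real SLAs define their own measurement windows and exclusions, so treat the output as a rough orientation rather than a contractual number.

```python
# Illustrative sketch: translate a statistical availability figure
# into the amount of downtime it permits over a given period.
# The function name and the default period are assumptions, not part
# of any vendor's SLA definition.

HOURS_PER_YEAR = 365 * 24  # assumes a non-leap year


def allowed_downtime_minutes(availability_percent: float,
                             period_hours: float = HOURS_PER_YEAR) -> float:
    """Return the downtime in minutes that an availability figure permits."""
    if not 0 <= availability_percent <= 100:
        raise ValueError("availability must be between 0 and 100 percent")
    unavailable_fraction = 1 - availability_percent / 100
    return unavailable_fraction * period_hours * 60


if __name__ == "__main__":
    for figure in (99.9, 99.995):
        minutes = allowed_downtime_minutes(figure)
        print(f"{figure}% availability allows roughly "
              f"{minutes:.0f} minutes of downtime per year")
```

Under these assumptions, 99.9% works out to somewhat under nine hours of downtime per year and 99.995% to less than half an hour, which may explain why a timeline of actual incidents is easier for a business user to relate to than the percentage itself.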

A public availability page is the right way to build trust in a utility computing grid. In my opinion, Informatica is absolutely following the right trend here. But vendors need to keep an actual technical availability figure prepared, at least on demand, for the more technical buying personas.

Please post a comment on how, as a cloud user, you like to have availability reported on a trust page, or how you have experienced it as a cloud provider.



there are no statistics

Hi Stefan,

Thanks for that post, which added a few worthwhile thoughts to the topic "trust in SaaS".
First, it is important to remember that classic on-premise or ASP solutions never really had a 99.xx% SLA (many exclusions were written into ASP contracts).

But taking a closer look at these trust sites, it becomes obvious that this isn't monitoring; it is just communication in a timeline instead of in blogs.

At AWS, for example, they reported on 2009/12/3 a problem that had been reported one month earlier, on 11/7. So they reported their problem one month after it was reported, just when they solved it.
Looking at the AWS blog in comparison to the Salesforce table also shows that nearly nothing happened this year. (Excellent? Or do they have some reported problems that they will publish once solved?)

Anyway, I agree with you that these kinds of platforms help to build customer confidence, and I am curious about the further development in this area.

Have a nice day!

keep challenging the vendors

Hi Anselm,

Thanks for your comment. A delay like the one you mention at AWS obviously does not create trust but rather raises more concerns about credibility. That's why customers should keep demanding accurate and current data. Hopefully the trust pages will become a competitive differentiator and their quality will improve in the future.