It should come as no surprise that websites thrive on traffic, so driving traffic to your site is a strong motivation for any company looking to grow its web presence. Ironically, however, that traffic can be a double-edged sword if your infrastructure is not prepared to handle the load: popularity itself can become the cause of an outage.
Yesterday, popular Internet forum and message board Reddit discovered this firsthand. In an interesting campaign move, President Barack Obama graced the site with his presence by doing an “Ask Me Anything” (AMA) thread, a message thread in which commenters submit questions and the original poster responds. Word about this rare opportunity to send the President of the United States a direct message spread across social media like wildfire, leading to a massive spike in traffic that brought down Reddit mere minutes into the life of the thread. Current figures show that unique connections and pageviews both more than tripled compared to typical traffic. Eventually the site came back online and the AMA progressed as usual.
During the past three years, you may have noticed that security and risk professionals have added a new term to their lexicon: business resiliency. Is this just an attempt by vendors to rebrand business continuity (BC) and IT disaster recovery (DR), in much the same way that vendors rebranded information security as cybersecurity to make it seem sexier and to sell more of their existing products? Some of it certainly is rebranding. However, just as the shift in the threat landscape from lone hackers to well-funded crime syndicates and state-sponsored agents precipitated the use of the term cybersecurity, a real shift has also taken place in BC/DR.
If you look up the term “resiliency” in the dictionary, it’s defined as “an occurrence of rebounding or springing back”. Thus, business resiliency refers to the ability of a business to spring back from a disruption to its operations. Historically, BC/DR focused on the ability of the business to recover from a disruption. Recovery implies that there was in fact a disruption: for some period of time, business operations were unavailable, and there was downtime as the business strove to recover. Resiliency, on the other hand, implies that an event may have affected the business’s operations, and perhaps the business operated in a diminished state for some period of time, but operations were never completely unavailable. The business was never down.
The current state of business continuity management (BCM) standards? Abysmal. According to a joint Forrester/DRJ study, 69% of respondents said that British Standard (BS) 25999 did not influence or only somewhat influenced BCM at their company. It’s not much better for NFPA 1600: 70% of respondents said that it did not influence or only somewhat influenced BCM at their company. I find this shocking. BS 25999 is one of the most widely recognized standards for BCM worldwide, and NFPA 1600 has been popular in the US for years. In addition, the U.S. Department of Homeland Security’s Private Sector Preparedness Program (PS‑Prep) recognizes both of these standards for assessing preparedness. If you’re wondering what standards respondents named in the “Other” category, it was mostly the Federal Financial Institutions Examination Council (FFIEC) and NIST. This is not surprising, but it is a little disheartening: it’s clear that unless compelled to do so, most BC professionals would not adopt or follow a BCM standard.
Even if you don’t intend to certify to these standards, they should strongly influence your BCM program. Why? Because:
They provide a foundation and a common vocabulary for BCM best practices and processes. This is important if you need to implement BCM across a geographically dispersed enterprise or you have to work with a multitude of global partners on joint preparedness.
In a recent Forrester/DRJ joint survey on BC preparedness, of organizations that have invoked a BC plan in the last five years, 37% said that their BC plans had not adequately addressed communication. In my experience, I’ve found that many organizations:
Don’t appreciate the importance of effective communication. Many organizations focus the content of their BC plans and the goals of their BC exercises on the details of recovery procedures but don’t focus on how they will contact and coordinate response teams, employees, partners, first responders and customers. If you can’t communicate, you can’t respond to anything.
Rely on manual procedures like call lists or email alone. By themselves, manual procedures are unreliable, they don’t scale for organizations with thousands of employees (or citizens), and they don’t provide any kind of reporting.
Underestimate the difficulty of communicating effectively under stress. The middle of an incident is not the time to craft effective communication messages or to look for a secondary mode of communication because your first mode of communication (land lines and email) is no longer available.
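To make the last two points concrete, here is a minimal sketch of an automated notification routine that tries several channels in order and keeps a record of every attempt for reporting. It is only an illustration: the function name `notify_with_fallback`, the channel names, and the stub senders are all hypothetical, and a real deployment would plug in an actual notification service instead.

```python
from typing import Callable, Dict, List, Tuple

# A sender takes (recipient, message) and returns True if delivery succeeded.
# The senders used here are stand-ins; real ones would call an email/SMS/voice provider.
Sender = Callable[[str, str], bool]


def notify_with_fallback(
    recipient: str,
    message: str,
    channels: List[Tuple[str, Sender]],
) -> Dict:
    """Try each channel in order until one delivers; record every attempt."""
    attempts = []
    for name, send in channels:
        try:
            delivered = send(recipient, message)
        except Exception:
            # Treat a failing channel as undelivered and move on to the next one.
            delivered = False
        attempts.append((name, delivered))
        if delivered:
            return {"recipient": recipient, "delivered_via": name, "attempts": attempts}
    return {"recipient": recipient, "delivered_via": None, "attempts": attempts}


# Hypothetical stubs: email is down during the incident, SMS still works.
def email_down(recipient: str, message: str) -> bool:
    return False


def sms_ok(recipient: str, message: str) -> bool:
    return True


# The message is written ahead of time, not drafted during the incident.
PREPARED_MESSAGE = "Site A is unavailable. Report to the alternate site and await instructions."

report = notify_with_fallback(
    "response-team-lead",
    PREPARED_MESSAGE,
    [("email", email_down), ("sms", sms_ok)],
)
print(report["delivered_via"], report["attempts"])
```

The returned record is what a manual call list cannot give you: who was reached, through which channel, and which channels failed along the way.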
There has been a lot of buzz around using the cloud for disaster recovery lately, and with good reason: it's a new and compelling approach to fast recovery. However, along with any hype comes a certain amount of confusion, so I set out to get some clarity on what cloud-based disaster recovery really is. The core feature of any cloud-based recovery is the ability to actually recover at the provider's location using its cloud assets. Just copying data there is not true recovery. I also realized that the term "cloud-based disaster recovery" was too broad, and that solutions actually fall into one of three categories:
Do-it-yourself (DIY): Using the public cloud to architect a custom failover solution leveraging the agility and speed of the cloud.
DR-as-a-service (DRaaS): Prepackaged services that provide a standard DR failover to a cloud environment that you can buy on a pay-per-use basis, with varying rates based upon your recovery point objective (RPO) and recovery time objective (RTO). Data is sent using either backups or replication.
Cloud-to-cloud disaster recovery (C2C DR): The ability to fail over infrastructure from one cloud data center to another, either within a single vendor's environment or across multiple vendors.
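As a rough illustration of how the RPO relates to the choice between backups and replication, the sketch below filters data transfer options by the worst-case data loss each one allows. The option names and intervals are made-up examples, not figures from any provider; the only assumption is that worst-case loss is about as long as the gap between copies reaching the provider.

```python
from dataclasses import dataclass
from datetime import timedelta
from typing import List


@dataclass(frozen=True)
class TransferOption:
    name: str
    # Longest gap between copies reaching the provider, i.e. worst-case data loss.
    worst_case_loss: timedelta


def options_meeting_rpo(options: List[TransferOption], rpo: timedelta) -> List[str]:
    """Return the names of options whose worst-case data loss fits within the RPO."""
    return [o.name for o in options if o.worst_case_loss <= rpo]


# Illustrative numbers only.
options = [
    TransferOption("nightly backup", timedelta(hours=24)),
    TransferOption("hourly backup", timedelta(hours=1)),
    TransferOption("replication", timedelta(seconds=30)),
]

print(options_meeting_rpo(options, timedelta(hours=4)))
# prints ['hourly backup', 'replication']
```

A tighter RPO rules out the less frequent backup schedules, which is one reason rates can vary with the objectives you choose.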
Right now, the internet probably seems like the Wild West. Hackers are roaming around, seemingly attacking websites on a whim. Most recently, groups like Anonymous, the Jester, and Lulz Security (LulzSec, now supposedly disbanded) have been attacking and successfully taking down websites of all types. Government or corporate, public or private, anybody can seemingly be a target. The attackers' reasons range from making a political statement to simply having fun, but whether they are hacktivists or black hat troublemakers, the end result is the same: hacking is now a real cause of downtime.
Disaster recovery-as-a-service (DRaaS), in my opinion, is one of the most exciting areas I look at. To me, using the cloud for disaster recovery (DR) purposes makes perfect sense: the cloud is an on-demand resource that you pay for as you need it (i.e., during a disaster or testing). Up until now, there haven't been many solutions out there that truly offered DRaaS--replicating physical or virtual servers to the cloud and failing over production to the cloud provider's environment (you can read more about my definition of DRaaS in my recent TechRadar report). But so far today, we've seen TWO new DRaaS platforms announced, from VMware and SunGard! Here's a quick roundup of what was announced today:
VMware. VMware announced at VMworld that they will be making their popular Site Recovery Manager (SRM), a DR automation tool, available as a service through hosting and cloud partners. At launch, participating partners are FusionStorm, Hosting.com, iland, and Veristor. Benefits: Built into the VMware platform. Limitations: VMware specific.
A major European company recently issued an RFP for consulting services regarding strategic platforms for SAP. Among other things, it requested historical and forecast data for all the relevant platforms, broken down by region and a couple of other factors. That got me thinking about the whole subject of the use and abuse of market share histories and forecasts.
The merry crew of I&O elves here at Forrester do a lot of consulting for companies all over the world on major strategic technology platform decisions – management software, DR and HA, server platforms for major applications, OS and data center migrations, etc. As you can imagine, these are serious decisions for the client companies, and we always approach these projects with an awareness of the fact that real people will make real decisions and spend real money based on our recommendations.
The client companies themselves usually treat these projects as serious due diligence and usually have very specific items they want us to consider, almost always centered on things that matter to them and are germane to their decision.
The one exception is market share history and forecasts for the relevant vendors under consideration. For some reason, some companies (my probably not statistically defensible impression is that it is primarily European and Japanese companies) think that there is some magic implied by these numbers. As you can probably guess from this elaborate lead-in, I have a very different take on their utility.
I've got backup on the brain. I guess this isn't an unusual occurrence for me, but it's been bolstered by a week at Symantec Vision, a week at EMC World, and backup announcements about IBM's data protection hardware and CommVault's PC backup enhancements, not to mention the flurry of cloud backup news this week from Trend Micro, CA Technologies, and Carbonite. All of this has gotten me thinking about the future of backup... we've come a long way from simple agent-based backup and recovery. Backup is just one piece in an increasingly complicated puzzle we call continuity. If backup software vendors want to stay relevant, they're going to need to offer a lot more than just backup in their "data protection" suites.