Welcome to the I&O Transformation Playbook.

To complement the brilliant introduction to the “Infrastructure Transformation Playbook for 2015” by my friend Glenn O’Donnell, the operations analyst team, the “O” in I&O, would like to welcome you to the “Infrastructure And Operations Transformation Playbook for 2015”.

In this playbook, we do not predict the future of technology; rather, we try to understand how, in the age of the customer, I&O must transform to support businesses by accelerating the speed of service delivery, enabling capacity when and where it is needed, and improving customer and employee experience.

All industries mature towards commoditization and abstraction of the underlying technology because knowledge and expertise are cumulative. Our industry will follow the same trajectory, resulting in technology that is ubiquitous and easier to implement, manage, and change.

Read more

BMC Software Goes Private

Yesterday, BMC Software announced that it has signed a definitive agreement to be acquired by a private investor group led by Bain Capital and Golden Gate Capital together with GIC Special Investments Pte Ltd (“GIC”) and Insight Venture Partners (collectively, the “Investor Group”).

Under the terms of the agreement, affiliates of the Investor Group will acquire all outstanding BMC common stock for $46.25 per share in cash, or approximately $6.9 billion.

This is one of the largest M&A transactions in a long time. Significantly, it had been in preparation for quite some time, culminating in a restructuring a month ago that merged the five product groups operating under BMC Software into one. Instead of several categories reporting their gains (or losses), we now have one happy family in which the gain of one member balances the loss of another. There is also a unique opportunity for these former product lines to work together toward better integration of BMC Software solutions, with the corollary prospect of more R&D investment in previously “weak” categories. Freedom from the short-term obligation to post “good results to satisfy the Street” will also contribute to building a better BMC Software.

Although fourth-quarter results came in below Street expectations by a hair (-$0.06 per share and -0.04% in revenue), BMC Software bookings grew 14% from a year ago, with an encouraging result for ESM, which was up 9% from a year ago.

Over the past ten years, BMC Software has made its mark on the IT Management Software (ITMS) market and is today second only to CA Technologies. From what we can see, the privatization of BMC Software provides an opportunity to invest in the future of ITMS and to become a serious contender for first place in the years to come.

Read more

If You Don’t Manage Everything, You Don’t Manage Anything

I’m always surprised to see that the Citroen 2CV (CV: Cheval Vapeur, hence the name Deux Chevaux) has such a strong following, even in the United States. Granted, this car was the epitome of efficiency: It used minimal gas (60 miles to the gallon), was eminently practical, and its interior could be cleaned with a garden hose. Because the car was minimalist in the extreme, the gas gauge on the early models was a simple dipstick, with marks to show how many liters of gas were left in the tank. For someone like me, who constantly forgot to consult the dipstick before leaving home, it meant that I would run out of gas somewhere far from a station almost every month. A great means of transportation failed regularly for lack of instrumentation. (Later models had a gas gauge.)

This shows how the failure to monitor one element leads to the failure of the complete system, and that if you don’t manage everything, you don’t manage anything, since the next important issue can develop in blissful ignorance.

The point is that we often approach application performance management from the same angle Citroen used to create the 2CV: providing monitoring for only the most critical elements in the name of cost cutting. This has proved time and again to be fraught with risk. Complex, multitier applications are composed of a myriad of hardware and software components, any of which can fail.

In application performance management, I see a number of IT operations teams focus their tools on some critical elements and ignore the others. But even though many of the critical hardware and software components have become extremely reliable, that doesn’t mean they are impervious to failure: There is simply no way to guarantee the life of a specific electronic component.
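
To make the point concrete, here is a minimal sketch, with invented component names and statuses, of the difference between watching only the “critical” elements and knowing which components are left unmanaged:

```python
# Minimal sketch: why monitoring only the "critical" elements leaves blind spots.
# Component names and statuses are illustrative, not taken from any real tool.

application_components = {
    "web_server": "ok",
    "app_server": "ok",
    "database": "ok",
    "message_queue": "degraded",  # the piece nobody thought critical enough to watch
    "storage_array": "ok",
}

monitored = {"web_server", "app_server", "database"}  # "critical elements" only

def unmanaged_risks(components, monitored):
    """Return components whose failures would develop in blissful ignorance."""
    return {name: status for name, status in components.items()
            if name not in monitored}

print(unmanaged_risks(application_components, monitored))
# {'message_queue': 'degraded', 'storage_array': 'ok'}
```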

Read more

Is Infrastructure & Operations Vulnerable To Job Market Trends?

A couple of weeks ago, I read that one of the largest US car makers was trying to buy out several thousand machinists and welders. While we have grown accustomed to bad news in this economy, what I found significant was that these were skilled workers. Personally, I find it a lot easier to write code than to weld two pieces of steel together, and I have tried both.

For the past 20 years, the job market in industrialized countries has shown a demand increase at the high and low ends of the wage and skill scale, to the detriment of the middle. Although it’s something that we may have intuitively perceived in our day-to-day lives, a 2010 paper by David Autor of MIT confirms the trend:

“. . . the structure of job opportunities in the United States has sharply polarized over the past two decades, with expanding job opportunities in both high-skill, high-wage occupations and low-skill, low-wage occupations, coupled with contracting opportunities in middle-wage, middle-skill white-collar and blue-collar jobs.”

One of the reasons for this bipolarization of the job market is that most of the tasks in the middle market are based on well-known and well-documented procedures that can be easily automated by software (or simply offshored). This leaves, at the high end, jobs that require analytical and decision-making skills usually based on a solid education, and at the low end, “situational adaptability, visual and language recognition, and in-person interactions. . . . and little in the way of formal education.”

Can this happen to IT? As we fast-forward to an industrial IT, we tend to replicate what other industries did before us, that is, remove the person in the middle through automation and thus polarize skill and wage opportunities at both ends of the scale.

Read more

BSM Rediscovered

I have in the past lamented the evolution of BSM into more of an ITIL support solution than the pure IT management project that we embarked on seven years ago. In the early years of BSM, we were all convinced of the importance of application dependency discovery: It was the bridge between the user, who sees an application, and IT, which sees infrastructure. We were all convinced that discovery tools should be embedded in infrastructure management solutions to improve them. I remember conversations with product managers at all of the big four, and we all agreed at the time that the “repository” of dependencies, later to become the CMDB, was not a standalone solution. How little we knew!

What actually happened was that the discovery tools showed a lot of limitations, and the imperfect CMDB that resulted became the center of the ITIL v2 universe. The two essential components that we saw in BSM for improving the breed of system management tools were quietly forgotten: 1) real-time dependency discovery, because last month’s application dependencies are as good as yesterday’s newspaper when it comes to root cause analysis or change detection, and 2) the reworking of tools around these dependencies, because it added a level of visibility and intelligence that was sorely lacking in the monitoring and management solutions of the time. But there is hope on the IT operations horizon.

These past few days, I have been briefed by two new companies that are actually going back to the roots of BSM.

Neebula has introduced a real-time discovery solution that continuously updates itself and is embedded into an intelligent event and impact analysis monitor. It also discovers applications in the cloud.
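
To illustrate why a continuously updated dependency map matters for event and impact analysis, here is a minimal sketch; the dependency graph, service names, and logic are invented for illustration and do not represent Neebula’s actual implementation:

```python
# Minimal sketch: impact analysis over an application dependency map.
# A real BSM tool would keep this graph current through continuous,
# real-time discovery; the map below is invented.

dependencies = {
    "online_ordering": ["web_farm", "order_service"],
    "order_service": ["app_cluster", "orders_db"],
    "billing": ["app_cluster", "billing_db"],
}

def impacted(failed_component, graph):
    """Return every mapped service or tier that depends, directly or
    indirectly, on the failed component."""
    def depends_on(node, seen):
        if node == failed_component:
            return True
        return any(depends_on(child, seen | {node})
                   for child in graph.get(node, []) if child not in seen)
    return [name for name in graph if depends_on(name, frozenset())]

print(impacted("app_cluster", dependencies))
# ['online_ordering', 'order_service', 'billing']
```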

Read more

How Complexity Spilled The Oil

The Gulf oil spill of April 2010 was an unprecedented disaster. The National Oil Spill Commission’s report summary shows that this could have been prevented with the use of better technology. For example, while the Commission agrees that the monitoring systems used on the platform provided the right data, it points out that the solution used relied on engineers to make sense of that data and correlate the right elements to detect anomalies. “More sophisticated, automated alarms and algorithms” could have been used to create meaningful alerts and maybe prevent the explosion. The Commission’s report shows that the reporting systems used have not kept pace with the increased complexity of drilling platforms. Another conclusion is even more disturbing, as it points out that these deficiencies are not uncommon and that other drilling platforms in the Gulf of Mexico face similar challenges.

If we substitute “drilling platform” with “data center,” this sounds awfully familiar. How many IT organizations are relying on relatively simple data collection coming from point monitoring of networks, servers, or applications while trying to manage the performance and availability of increasingly complex applications? IT operations engineers sift through mountains of data from different sources trying to make sense of what is happening and usually fall short of finding meaningful alerts. The consequences may not be as dire as the Gulf oil spill, but they can still translate into lost productivity and revenue.

The fact that many IT operations have not (yet) faced a meltdown is not a valid counterargument: There is, for example, a good reason to purchase hurricane insurance when one lives in Florida, even though destructive storms are not that common. As with the weather, there are so many variables at play in today’s business services that mere humans can’t be expected to make sense of them.
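
As a rough illustration of what “more sophisticated, automated alarms and algorithms” could mean in a data center, here is a minimal sketch that correlates raw point-monitoring samples into a single meaningful alert; the metric names, thresholds, and correlation rule are invented:

```python
# Minimal sketch: correlating raw point-monitoring samples into one meaningful alert.
# Metric names, thresholds, and the correlation rule are invented for illustration.

samples = [
    {"source": "network", "metric": "latency_ms", "value": 180},
    {"source": "server", "metric": "cpu_pct", "value": 45},
    {"source": "application", "metric": "response_time_ms", "value": 2400},
]

thresholds = {"latency_ms": 100, "cpu_pct": 90, "response_time_ms": 1000}

def correlate(samples, thresholds):
    """Raise one composite alert when related symptoms breach their thresholds together."""
    breaches = [s for s in samples if s["value"] > thresholds[s["metric"]]]
    symptoms = {s["metric"] for s in breaches}
    if {"latency_ms", "response_time_ms"} <= symptoms:
        return "ALERT: slow application response correlated with high network latency"
    return [f"warning: {s['source']} {s['metric']}={s['value']}" for s in breaches] or None

print(correlate(samples, thresholds))
# ALERT: slow application response correlated with high network latency
```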

Read more

Evaluating Complexity

We’re starting to get inquiries about complexity. The key questions are how to evaluate complexity in an IT organization and, consequently, how to evaluate its impact on the availability and performance of applications. Evaluating complexity wouldn’t be like evaluating the maturity of IT processes, which is about fixing what’s broken; it would be more like preventive maintenance: understanding what’s going to break soon and taking action to prevent the failure.

The volume of applications and services certainly has something to do with complexity. Watts Humphrey observed that code size (in KLOC: thousands of lines of code) doubles every two years, largely due to increases in hardware capacity and speed, and this is easily validated by the evolution of operating systems over the past years. If defect density (errors per KLOC) stays roughly constant, it stands to reason that the total number of errors in the code also doubles every two years.
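
To make the arithmetic explicit, here is a small sketch that compounds that observation; the starting code size and the defect density are assumptions chosen purely for illustration:

```python
# Minimal sketch of the compounding claim: if code size doubles every two years
# and defect density per KLOC stays constant, latent errors double as well.
# The starting size and defect density are assumptions made for illustration.

kloc = 1_000          # assumed starting code base, in thousands of lines of code
defects_per_kloc = 5  # assumed constant defect density

for year in range(0, 11, 2):
    size = kloc * 2 ** (year // 2)
    print(f"year {year:2d}: {size:8,} KLOC, ~{size * defects_per_kloc:10,} latent defects")
```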

But code is not the only cause of error: Change, configuration, and capacity are right there, too. Intuitively, the chance of an error in change and configuration would depend on the diversity of infrastructure components and on the volume of changes. Capacity issues would also be dependent on these parameters.

There is also a subjective aspect to complexity: I’m sure that my grandmother would have found an iPhone extremely complex, but my granddaughter finds it extremely simple. There are obviously human, cultural, and organizational factors in evaluating complexity.

Can we define a “complexity index,” should we turn to an evaluation model with all its subjectivity, or is the whole thing a wild goose chase?
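
One way to explore the question, without claiming it is the answer, is a toy “complexity index” that weights the factors discussed above: volume of applications and services, diversity of infrastructure components, and the rate of change. The weights, normalization, and sample values below are entirely hypothetical:

```python
# Toy "complexity index" sketch. The factors come from the discussion above;
# the weights, normalization references, and sample values are hypothetical.

def complexity_index(num_services, component_types, changes_per_month,
                     weights=(0.4, 0.3, 0.3)):
    """Combine volume, diversity, and change rate into a single 0-100 score."""
    # Normalize each factor against an arbitrary "large environment" reference.
    volume = min(num_services / 500, 1.0)
    diversity = min(component_types / 50, 1.0)
    churn = min(changes_per_month / 1000, 1.0)
    w_vol, w_div, w_chg = weights
    return round(100 * (w_vol * volume + w_div * diversity + w_chg * churn), 1)

print(complexity_index(num_services=220, component_types=35, changes_per_month=400))
# 50.6
```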

Read more

Consider The Cloud As A Solution, Not A Problem

It’s rumored that the Ford Model T’s track dimension (the distance between the wheels of the same axle) could be traced from the Conestoga wagon to the Roman chariot by the ruts they created. Roman roads forced European coachbuilders to adapt their wagons to the Roman chariot track, a measurement they carried over when building wagons in America in the 19th and early 20th centuries. It’s said that Ford had no choice but to adapt his cars to the rural environment created by these wagons. This cycle was finally broken by paving the roads and freeing the car from the chariot legacy.

IT has also carried over a long legacy of habits and processes that contrast with the advanced technology it uses. While many IT organizations are happy to manage 20 servers per administrator, some Internet service providers are managing 1 or 2 million servers and achieving ratios of 1 administrator per 2,000 servers. The problem is not how to use the cloud to gain 80% savings in data center costs; the problem is how to multiply IT organizations’ productivity by a factor of 100. In other words, don’t try the Model T approach of adapting the car to the old roads; think about building new roads so you can take full advantage of the new technology.

Gains in productivity come from technology improvements and economy of scale. The economy of scale is what the cloud is all about: cookie-cutter servers using virtualization as a computing platform, for example. The technology advancement that paves the road to this economy of scale is automation. Automation is what will abstract diversity, mask the management differences between proprietary and commodity platforms, and eventually make the economy of scale possible.
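
As a sketch of what “abstracting diversity” through automation might look like in practice, here is a minimal example in which one provisioning interface hides the differences between platforms; the platform names and provisioning steps are invented, not a real tool’s API:

```python
# Minimal sketch: one provisioning interface hiding platform differences.
# Platform names and provisioning steps are invented; this is not a real tool's API.

def provision(platform: str, count: int) -> list:
    """Provision `count` servers the same way, whatever sits underneath."""
    steps_by_platform = {
        "proprietary_unix": ["allocate partition", "install OS image", "register in CMDB"],
        "x86_virtualized": ["clone VM template", "apply config profile", "register in CMDB"],
    }
    steps = steps_by_platform[platform]
    return [f"server-{i}: " + ", ".join(steps) for i in range(1, count + 1)]

# The administrator issues the same request either way; automation absorbs the diversity.
for line in provision("x86_virtualized", count=3):
    print(line)
```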

Read more

The Cloud Technology Challenge

Most, if not all, technology improvements need what is commonly referred to as “complementary inputs” to yield their full potential. For example, Gutenberg's invention of movable type wouldn't have been viable without progress in ink, paper, and printing press technology. IT innovations likewise depend on complements to take hold. The differences introduced by internal clouds will affect applications, configuration, monitoring, and capacity management. External clouds will need attention to security and to performance issues related to network latency. The availability of financial data is also an important cloud adoption criterion and must be addressed. Without progress in these complementary technologies, the benefits of cloud computing cannot be fully realized.

Internal cloud technology is going to offer embedded physical/virtual configuration management, VM provisioning, orchestration of resources, and, most probably, basic monitoring or data collection in an automated environment, with a highly abstracted administration interface. This has the following impacts:

More than ever, we need to know where things are. Real-time discovery and tracking of assets and applications is now essential: As configurations can be easily changed and applications easily moved, control of the data center requires complete visibility. Configuration management systems must adapt to this new environment.

Applications must be easily movable. To take advantage of the flexibility offered by orchestration, provisioning, and configuration automation, applications must be easy to load and configure. This assumes that there is, upstream of the application release, an automated process that will “containerize” the application, its dependencies, and its configuration elements. This will affect the application life cycle (see figure).
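
As an illustration of what such an automated “containerization” step upstream of the release might capture, here is a minimal sketch of an application package descriptor; the field names and values are hypothetical and do not represent any specific product’s format:

```python
# Minimal sketch of a "containerized" application descriptor: the application,
# its dependencies, and its configuration elements packaged together so the
# cloud's orchestration and provisioning layer can load and move it freely.
# Field names and values are hypothetical.

application_package = {
    "name": "order-entry",
    "version": "2.4.1",
    "artifacts": ["order-entry.war"],
    "dependencies": {
        "runtime": "java-7",
        "services": ["orders-db", "message-bus"],
    },
    "configuration": {
        "jvm_heap_mb": 2048,
        "db_connection_pool": 20,
    },
    "placement": {"min_instances": 2, "movable": True},
}

def validate(package):
    """Check that the descriptor carries everything provisioning needs."""
    required = {"name", "version", "artifacts", "dependencies", "configuration"}
    missing = required - package.keys()
    return "ok" if not missing else f"missing: {sorted(missing)}"

print(validate(application_package))
# ok
```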

Read more

Service Oriented Organizations

A few days ago I read an interesting article about how organizations need to adapt to virtualization to take full advantage of it.

If we consider that this is, in fact, the first step toward the industrialization of IT, we should look at how the organization of industry evolved over time, from its beginnings to the mass-production era. I think IT will reach the mass-production stage within a few years. If we replicate this evolution in IT, it will go through these phases:

  • The craftsperson era. At the early stage of any industry, we find a solitary figure in a shop, soon complemented by similarly minded associates (this is me, 43 years ago). They create valuable and innovative products, but productivity is low and the cost per unit of production is usually through the roof. This is where IT was at the end of the 1960s and the beginning of the 1970s. The organizational landscape was dominated by “gurus” who seemed to know everything and were loosely coupled within some kind of primitive structure.
  • The bureaucratic era. As IT was getting more complex, an organizational structure started to appear that tended to “rationalize” IT into a formal, hierarchical structure. In concept, it is very similar to what Max Weber described in 1910: a structure that emphasizes specialization and standardization in pursuit of a common goal. Tasks are split into small increments, mated to skills, and coordinated by a strong hierarchical protocol. The coordination within the organization is primarily achieved through bureaucratic controls. This is the “silo” concept.
Read more