Complex Event Processing And IT Automation

Events are, and have been for quite some time, the fundamental elements of real-time IT infrastructure monitoring. Any status change, threshold crossed in device usage, or step performed in a process generates an event that needs to be reported, analyzed, and acted upon by IT operations.

Historically, the lower layers of the IT infrastructure (i.e., network components and hardware platforms) have been regarded as the most prone to hardware and software failures and have therefore been the object of all the attention and of most management software investments. In reality, today’s failures are much more likely to come from applications and from the management of platform and application updates than from the hardware platforms. The increased infrastructure complexity has resulted in a multiplication of events reported on IT management consoles.

Over the years, several solutions have been developed to extract the truth from the clutter of event messages. Network management pioneered solutions such as rule engines and codebook correlation. The idea was to determine, among a group of related events, the root cause: the original straw that broke the camel’s back. We then moved on to more sophisticated statistical and pattern analysis: using historical data, we could determine what was normal at any given time for a group of parameters. This not only reduces the number of events; it also eliminates false alerts and provides predictive analysis based on how parameter values evolve over time.
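
To make the baseline idea concrete, here is a minimal sketch, not any vendor's implementation: it learns what is "normal" for a metric at each hour of the day from historical samples and raises an event only when the current value deviates significantly. The metric name, tolerance, and minimum history size are assumptions chosen for illustration.

```python
# Minimal sketch of baseline-driven event reduction (illustrative only):
# learn an hourly mean/stddev per metric from history, then report an
# anomaly event only when a sample strays well outside its usual range.
from collections import defaultdict
from statistics import mean, stdev

class Baseline:
    def __init__(self, k=3.0):
        self.k = k                          # tolerance in standard deviations (assumed)
        self.history = defaultdict(list)    # (metric, hour) -> past values

    def train(self, metric, hour, value):
        self.history[(metric, hour)].append(value)

    def is_anomaly(self, metric, hour, value):
        samples = self.history[(metric, hour)]
        if len(samples) < 10:               # not enough history: stay silent
            return False
        mu, sigma = mean(samples), stdev(samples)
        return abs(value - mu) > self.k * max(sigma, 1e-9)

baseline = Baseline()
for day in range(30):                       # hypothetical training data
    baseline.train("cpu_util", hour=14, value=60 + day % 5)
print(baseline.is_anomaly("cpu_util", hour=14, value=95))   # True: raise one event
print(baseline.is_anomaly("cpu_util", hour=14, value=62))   # False: suppress noise
```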

The next step, which has been used in industrial process control and in business activities and is now finding its way into IT management solutions, is complex event processing (CEP). 
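
As a rough illustration of what CEP adds, and not a description of any particular product, the sketch below watches a stream of low-level events and emits a single composite event when a defined pattern occurs within a time window. The event names, the 60-second window, and the three-timeout threshold are all invented for the example.

```python
# Minimal sketch of a complex event processing rule (event names, window,
# and thresholds are hypothetical).
from collections import deque

WINDOW = 60.0      # seconds

def correlate(events):
    """events: iterable of (timestamp, name); yields composite events."""
    recent = deque()   # low-level events still inside the window
    for ts, name in events:
        recent.append((ts, name))
        while recent and ts - recent[0][0] > WINDOW:
            recent.popleft()
        db_alarms = [t for t, n in recent if n == "db_latency_high"]
        timeouts = [t for t, n in recent if n == "app_timeout"]
        if db_alarms and len(timeouts) >= 3 and min(timeouts) > min(db_alarms):
            yield (ts, "service_degradation", {"suspected_cause": "database"})
            recent.clear()   # pattern consumed; start over

stream = [(0, "db_latency_high"), (5, "app_timeout"), (12, "app_timeout"),
          (20, "app_timeout"), (90, "app_timeout")]
for composite in correlate(stream):
    print(composite)       # one composite event instead of a burst of raw ones
```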

The Convergence Of IT Automation Solutions

We are sometimes so focused on details that we forget to think clearly. Nothing new there; it’s still a story about trees and forest. A few years ago, this was clearly the case when I met with one of the first vendors of run book automation. My first thought was that it was very similar to workload automation, but I let myself be convinced that it was so different that it was obviously another product family. Taking a step back last year, I started thinking that in fact these two forms of automation complemented each other. In “Market Overview: Workload Automation, Q3 2009,” I wrote that “executing complex asynchronous applications requires server capacity. The availability of virtualization and server provisioning, one of the key features of today’s IT process [run book] automation, can join forces with workload automation to deliver a seamless execution of tasks, without taxing IT administrators with complex modifications of pre-established plans.”

In June of this year, UC4 announced a new feature of its workload automation solution, by which virtual machines or extensions to virtual machines can be provisioned automatically when the scheduler detects a performance issue (see my June 30 blog post “Just-In-Time Capacity”). This was a first sign of convergence. But there is more.

Automation is about processes. As soon as we can describe a process with a workflow diagram and a description of the operation to be performed at each step of the diagram, we can implement software to automate it (as we do in any application or other form of software development). Automation is but a variation of software development that uses pre-developed operations adapted to specific process implementations, as the sketch below illustrates.
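
Here is a minimal, generic sketch of that point, not any vendor's engine: the workflow is data, each step names a pre-built operation, and a small executor walks the diagram in order. The step and operation names are hypothetical.

```python
# Minimal sketch of a run book executor: the workflow is data, the steps are
# pre-developed operations, and the engine walks the diagram in order.
# Step names and operations are hypothetical.

def restart_service(ctx):  ctx["restarted"] = True
def clear_cache(ctx):      ctx["cache_cleared"] = True
def notify_operator(ctx):  print("notify:", ctx)

OPERATIONS = {
    "restart_service": restart_service,
    "clear_cache": clear_cache,
    "notify_operator": notify_operator,
}

def run_workflow(steps, context):
    """Execute each step; stop and escalate on the first failure."""
    for step in steps:
        try:
            OPERATIONS[step](context)
        except Exception:
            context["failed_step"] = step
            OPERATIONS["notify_operator"](context)
            raise
    return context

print(run_workflow(["clear_cache", "restart_service", "notify_operator"],
                   {"service": "billing"}))
```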

Standardize Interfaces, Not Technology

Infrastructure diversity is one important component of many IT infrastructures’ complexity. Even at a time when organizations are standardizing on x86 hardware, they often maintain separate support groups by type of operating system. In the meantime, we see even more technology diversity developing in a relentless pursuit of performance and, ironically, simplification. This raises a simple question: Should we, for the sake of operational efficiency, standardize at the lowest possible level, e.g., the computing platform, or at a much higher level, e.g., the user interface?

In the past few months, I think a clear answer has been provided by the mainframe world. One key element that actually limits mainframe expansion in some data centers is the perception from higher levels of management that the mainframe is a complex-to-operate and obsolete platform, too radically different from the Linux and Windows operating systems. This perception comes from the fact that most mainframe management solutions use an explicit interface for configuration and deployment that requires detailed knowledge of mainframe specifics. Mastering it requires skills and experience that unfortunately do not seem to be taught in most computer science classes. Because mainframe education is lacking, the issue seems to be more acute than in other IT segments. This would eventually condemn the mainframe when all the baby boomers decide that they would rather golf in Florida.

This whole perception was shattered by two major announcements. The most recent one is the new IBM zEnterprise platform, which brings a mix of hardware and software platforms together under a single administration interface. In doing this, IBM provides a solution that actually abstracts the platforms’ diversity and removes the need for different administrators versed in the vagaries of the different operating systems.

Lesson From History

I am starting to see signs of important changes in technology and IT organizations. The increased complexity of IT and business services forces the industry down a new path. In this context, there are signs reminiscent of what happened to the mainframe vendors in the late 1980s and early 1990s, when the transition from proprietary to open systems rarely went well. In fact, the major players of today (with the exception of IBM) were small potatoes in the 80s, while the major players of that time are either gone or dying. And some vendors today seem to be following the same recipe for eventual disaster.

What happens, when a major change of market direction hits a company whose revenue is based on old technology, is what I would call a “sales force failure”: the inability of the sales force to get out of its base of usual customers and compete head to head with new vendors in the new market.

Usually these organizations are technically capable of building up-to-date products, but the sales results often don’t meet expectations. Since the new product created internally does not sell, company management may be tempted to fix the problem (i.e., satisfy the shareholders in the short term) by cutting the cost center, that is, the engineering organization making this new product. With R&D gone, the marketing group may license another product to replace the one that was killed. Of course, the margins are not the same, but the cost is almost nonexistent. Eventually, this product does not sell either (the sales force is still in the same condition), and, when the old legacy products are finally dead, the company is no more than a value-added reseller.

Just-In-Time Capacity

One of the great revolutions in manufacturing of the past decades is just-in-time inventory management. The basic idea is to provision only what is needed for a certain level of operation and to put in place a number of management functions that trigger the provisioning of inventory. This is one of the key elements that allowed manufacturers to contain production costs. We have been trying to adapt the concept to IT for years with little success, but a combination of the latest technologies is finally bringing it to a working level. IT operations often faces unpredictable workloads or large workload variations during peak periods. Typically, the solution is to over-provision infrastructure capacity and use a number of corrective measures: load balancing, traffic shaping, fast reconfiguration and provisioning of servers, etc.
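
A hedged sketch of the alternative, with thresholds, per-server capacity, and the provision/release calls all being assumptions rather than a real API: capacity is added only when observed load crosses a trigger and released when it falls back, in the spirit of just-in-time inventory.

```python
# Minimal sketch of just-in-time capacity: provision servers only when the
# observed load crosses a trigger, release them when it falls back.
# Thresholds and the provision/release functions are hypothetical.

TARGET_UTILIZATION = 0.70    # assumed comfort zone per server
provisioned = 2              # servers currently running

def provision_server():      # placeholder for a real provisioning call
    print("provisioning one more server")

def release_server():        # placeholder for a real decommission call
    print("releasing one server")

def adjust_capacity(current_load, capacity_per_server=100):
    """current_load: e.g., requests/sec observed by the scheduler."""
    global provisioned
    utilization = current_load / (provisioned * capacity_per_server)
    if utilization > TARGET_UTILIZATION:
        provision_server()
        provisioned += 1
    elif provisioned > 1 and utilization < TARGET_UTILIZATION / 2:
        release_server()
        provisioned -= 1

for load in [120, 150, 190, 60]:    # a hypothetical traffic pattern
    adjust_capacity(load)
```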

Of Social Computing And Filtering Through The Information Deluge

While it may have taken humans thousands of years to progress from oral to written to audio and then to video communications, in the past five years the Internet has accelerated at a breakneck pace through all of these communication stages. It started as a way to post and communicate text and still pictures, then moved to digital voice and music, and then took a giant step to video delivery, bringing you news, sports, and movies whenever and wherever you want to view them. The Internet is now the prime platform for distributing video content, effectively replacing your video store and your cable or broadcast distribution.

The Strategic Role Of IT Management Software

Among critical industrial processes, IT is probably the only one where control and management come as an afterthought. Blame it on product vendors or on immature clients, but it seems that IT management always takes a back seat to application functionality.

IT operations is seen as a purely tactical activity, but this should not obscure the need for a management strategy. Acquiring products on a whim and hastily putting together an ad hoc process to use them is a recipe for chaos. When infrastructure management, which is supposed to bring order and control to IT, leads the way to anarchy, a meltdown is a foregone conclusion.

Most infrastructure management products offer a high level of usefulness and innovation. One should, however, be conscious of vendors’ limitations. Vendors spend a lot of time talking about mythical customer needs, while most of them have no experience of IT operations. Consequently, their horizon is limited to the technology they have, and that tree does hide the forest. Clients should carefully select products for the role they play in the overall infrastructure management strategy, not solely on the basis of immediate relief. As the world of IT operations becomes more complex every day, the value of an IT management product lies not only in its capability to resolve an immediate issue but also in its ability to participate in future management solutions. The tactical and strategic constraints should not be mutually exclusive.

Business Services Value Analysis

The choice between the different forms of cloud computing (mostly IaaS and SaaS) and their comparison with internal deployment of IT business services must be based on objective criteria. But this is mostly uncharted territory in IT. Many organizations have difficulty implementing a realistic chargeback solution, and the real cost of business services is often an elusive target. We all agree that IT needs a better form of financial management, even though 80% of organizations will consider it primarily as a means of understanding where to cut costs rather than as a strategy to drive a better IT organization.

Financial management will help IT better understand its cost structure in all dimensions, but this is not enough to make an informed choice between internal and external deployment of a business service. I think that the problem of which deployment model to choose requires a new methodology that draws data from financial management. As I often do, I turned to manufacturing to see how it deals with this type of analysis and cost optimization. The starting point is of course an architectural model of the “product,” and this effectively shows how valuable these models are in IT. The two types of analysis, FAST (Function Analysis System Technique) and QFD (Quality Function Deployment), combine into a “Value Analysis Matrix” that lists the customer requirements against the way these requirements are answered by the “product” (or business service) components. Each of these components has a weight (derived from its correlation with the customer requirements) and a cost associated with it. Analyzing several models (for example, a SaaS model against an internal deployment) would not only lead to an informed decision but also open the door to optimizing the service cost.
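
Below is a toy-sized, hypothetical version of such a value analysis matrix, just to show the mechanics: the requirement weights, correlation scores, and component costs are invented, and a real analysis would use many more rows.

```python
# Toy value analysis matrix: customer requirements (with importance weights)
# versus service components (with correlations and costs). All numbers invented.
requirements = {"availability": 5, "response_time": 4, "data_privacy": 3}

# correlation of each component with each requirement, on a 0-9 scale
components = {
    "hosting":     {"availability": 9, "response_time": 3, "data_privacy": 1},
    "application": {"availability": 3, "response_time": 9, "data_privacy": 3},
    "security":    {"availability": 1, "response_time": 1, "data_privacy": 9},
}
costs = {"hosting": 50, "application": 120, "security": 30}   # e.g., k$/year

def value_weights(reqs, comps):
    raw = {c: sum(reqs[r] * corr[r] for r in reqs) for c, corr in comps.items()}
    total = sum(raw.values())
    return {c: raw[c] / total for c in raw}          # normalized value weight

weights = value_weights(requirements, components)
total_cost = sum(costs.values())
for comp, w in weights.items():
    cost_share = costs[comp] / total_cost
    ratio = w / cost_share                           # value share vs. cost share
    print(f"{comp:12s} value={w:.2f} cost_share={cost_share:.2f} ratio={ratio:.2f}")
# A ratio well below 1 flags a component that costs more than the value it
# delivers: a natural candidate for an alternative (e.g., SaaS) sourcing model.
```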

I think that such a methodology would complement a financial management product and help IT become more efficient.

An Abstraction Layer For IT Operations

Technology growth is exponential. We all know about Moore’s Law, by which the density of transistors on a chip doubles every two years; but there is also Watts Humphrey’s observation that the size of software doubles every two years, Nielsen’s Law, by which the Internet bandwidth available to users doubles every two years, and many others concerning storage, computing speed, and power consumption in the data center. IT organizations, and especially IT operations, must cope with this influx of technology, which brings more and more services to the business, on top of managing the legacy services and technology. I believe that the two most important roadblocks preventing IT from optimizing its costs are in fact diversity and complexity. Cloud computing, whether SaaS or IaaS, is going to add diversity and complexity, as is virtualization in its current form. This is illustrated by the following chart, which compiles answers to the question: “Approximately how many physical servers with the following processor types does your firm operate that you know about?”

[Chart: physical servers installed, by processor type]

While virtualization can potentially reduce the number of servers in each category, it does not address the diversity of servers, nor does it address the complexity of the services running on these diverse technologies.
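
A minimal sketch of what an abstraction layer for IT operations could look like, with adapter names and operations being assumptions: operations code scripts against one interface while platform-specific adapters hide the diversity underneath.

```python
# Minimal sketch of an operations abstraction layer: one interface, several
# platform adapters. Adapter names and operations are hypothetical.
from abc import ABC, abstractmethod

class PlatformAdapter(ABC):
    @abstractmethod
    def deploy(self, workload: str) -> None: ...
    @abstractmethod
    def health(self) -> str: ...

class LinuxX86Adapter(PlatformAdapter):
    def deploy(self, workload): print(f"package/container deploy of {workload}")
    def health(self):           return "ok"

class MainframeAdapter(PlatformAdapter):
    def deploy(self, workload): print(f"z/OS batch deploy of {workload}")
    def health(self):           return "ok"

def roll_out(workload, platforms):
    """Operations code sees one interface, whatever runs underneath."""
    for p in platforms:
        p.deploy(workload)
        print(type(p).__name__, "health:", p.health())

roll_out("billing-service", [LinuxX86Adapter(), MainframeAdapter()])
```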

Really End-To-End Management: Gomez And Compuware

The marriage of Gomez and Compuware is starting to bear fruit. One of the key aspects of web application performance management is end user experience. This is largely approached from the data center standpoint, within the firewall. But the best way to understand the real customer experience is to have an agent sitting on the customer side of the application, outside the firewall, a possibility that is clearly out of bounds for most public-facing applications. The Gomez-Compuware alliance is the first time that these two sides are brought together within the same management application, Compuware Vantage.

What Vantage brings to the equation is the application performance management (APM) view of IT operations: response time collected from the network and correlated with infrastructure and application monitoring in the data center. But it’s not the customer view. What Gomez brings with its recent version, the Gomez Winter 2010 Platform Release, is a number of features that let IT understand what goes on beyond the firewall: not only how the application content was delivered, but how the additional content from external providers was delivered and what the actual performance was at the end user level. The outside-in view of the application is now combined with the inside-out view of IT operations provided by Vantage APM. And this is now spreading beyond the pure desktop/laptop user group to reach the growing mobile and smartphone crowd.

IT used to be able to answer the question “is it the application or the infrastructure?” with Vantage. IT can now answer a broader set of questions, “is it the application, the Internet service provider, or the web services providers?”, for an increasingly broad range of use-case scenarios.
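
To illustrate the reasoning in the abstract (a generic sketch, not the Gomez or Vantage APIs): if the outside-in measurement gives the total time seen by the end user and the inside-out measurement gives the time spent inside the data center, the remainder can be attributed to the network and to third-party content. All numbers and field names below are hypothetical.

```python
# Generic sketch of combining outside-in and inside-out measurements to
# attribute response time. All numbers and field names are hypothetical.
def attribute(outside_in_ms, inside_out_ms, third_party_ms):
    """outside_in_ms: total time measured at the end user's browser.
    inside_out_ms: time spent in the data center (app + infrastructure).
    third_party_ms: time spent fetching external providers' content."""
    network_ms = outside_in_ms - inside_out_ms - third_party_ms
    return {
        "data_center": inside_out_ms,
        "third_party_content": third_party_ms,
        "network_and_isp": max(network_ms, 0),
    }

print(attribute(outside_in_ms=2400, inside_out_ms=600, third_party_ms=900))
# -> {'data_center': 600, 'third_party_content': 900, 'network_and_isp': 900}
```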
