I just received yet another call from a reporter asking me to comment on yet another BI vendor announcing R integration. All leading BI vendors are embedding/integrating with R these days, so I was not sure what was really new in the announcement. I guess the real question is the level of integration. For example:
Since R is a scripting language, does the BI vendor provide a point-and-click GUI to generate R code?
Can R routines take advantage of all of the BI metadata (data structures, definitions, etc.) without having to redefine it just for R?
How easily can the output from R calculations (scores, rankings) be embedded in the BI reports and dashboards? Do the new scores just become automagically available for BI reports, or does somebody need to add them to BI data stores and metadata?
Can the BI vendor import/export R models based on PMML?
Is it a general R integration, or are there prebuilt vertical (industry-specific) or domain (finance, HR, supply chain, risk, etc.) metrics as part of a solution?
What server are R models executed in? Reporting server? Database server? Their own server?
Then there's the whole business of model design, management, and execution, which is usually the realm of advanced analytics platforms. How much of these capabilities does the BI vendor provide?
Did I get that right? Any other features/capabilities that really distinguish one BI/R integration from another? Really interested in hearing your comments.
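To make the PMML question a bit more concrete, here is a minimal sketch of what "importing" a model could involve on the BI side. It assumes a simple linear regression exported as PMML 4.1; the model name, field names, and coefficients are hypothetical, and the helper names `load_regression` and `score` are mine, not any vendor's API. It uses only Python's standard library.

```python
import xml.etree.ElementTree as ET

# PMML 4.1 namespace; the document below is a hypothetical example.
PMML_NS = {"pmml": "http://www.dmg.org/PMML-4_1"}

SAMPLE_PMML = """
<PMML version="4.1" xmlns="http://www.dmg.org/PMML-4_1">
  <Header description="Hypothetical churn score exported from R"/>
  <DataDictionary numberOfFields="3">
    <DataField name="tenure_months" optype="continuous" dataType="double"/>
    <DataField name="support_calls" optype="continuous" dataType="double"/>
    <DataField name="churn_score" optype="continuous" dataType="double"/>
  </DataDictionary>
  <RegressionModel modelName="churn_lm" functionName="regression">
    <MiningSchema>
      <MiningField name="tenure_months"/>
      <MiningField name="support_calls"/>
      <MiningField name="churn_score" usageType="predicted"/>
    </MiningSchema>
    <RegressionTable intercept="0.30">
      <NumericPredictor name="tenure_months" coefficient="-0.01"/>
      <NumericPredictor name="support_calls" coefficient="0.05"/>
    </RegressionTable>
  </RegressionModel>
</PMML>
""".strip()


def load_regression(pmml_text):
    """Return (intercept, {field: coefficient}) from a PMML regression."""
    root = ET.fromstring(pmml_text)
    model = root.find("pmml:RegressionModel", PMML_NS)
    table = model.find("pmml:RegressionTable", PMML_NS)
    intercept = float(table.get("intercept"))
    coefficients = {
        p.get("name"): float(p.get("coefficient"))
        for p in table.findall("pmml:NumericPredictor", PMML_NS)
    }
    return intercept, coefficients


def score(pmml_text, row):
    """Apply the linear model to one row of input values."""
    intercept, coefficients = load_regression(pmml_text)
    return intercept + sum(c * row[name] for name, c in coefficients.items())


example_row = {"tenure_months": 10, "support_calls": 2}
print(round(score(SAMPLE_PMML, example_row), 4))
```

Even in this toy case, the field names in the model have to line up with fields the BI layer already knows about, which is exactly where the metadata question above comes in.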
I’m excited to announce that our new research on how firms use customer analytics was just published today. The new research reveals some interesting findings:
Customer analytics serves the customer lifecycle, but measurement is restricted to marketing activities. While customer analytics continues to drive acquisition and retention goals, firms continue to measure the success of customer analytics using easy-to-track marketing metrics as opposed to deeper profitability or engagement measures.
Finding the right analytics talent remains challenging. It’s not just the data. It’s not just the technology that hinders analytics success. It’s the analytical skills required to use the data in creative ways, ask the right questions of the data, and use technology as a key enabler to advance sophistication in analytics. We’ve talked about how customer intelligence (CI) professionals need a new breed of marketing scientist to elevate the consumption of customer analytics.
CI professionals are keen to use predictive analytics in customer-focused applications. Forty percent of respondents to our Global Customer Analytics Adoption Survey tell us that they have been using predictive analytics for less than three years, while more than 70% of respondents have been using descriptive analytics and BI-type reporting for more than 10 years. CI professionals have not yet fully leveraged the strengths of predictive analytics in customer applications.
Wanted to run the following two questions and my answers by the community:
Q. What is the average age of reporting applications at large enterprises?
A. Reporting apps typically involve source data integration, data models, metrics, reports, dashboards, and queries. I'd rate the longevity of these in descending order (data sources being most stable and queries changing all the time).
Q. What is the percentage of reporting applications that are homegrown versus custom built?
A. These are by no means solid data points but rather my off-the-cuff, albeit educated, guesses:
The majority (let's say >50%) of reports are still being built in Excel and Access.
Very few (let's say <10%) are done in non-BI-specific environments (programming languages).
The other 40% I'd split 50/50 between:
off-the-shelf reports and dashboards built into ERP or BI apps,
and reports custom-coded in BI tools
Needless to say, this differs greatly by industry and business domain. Thoughts?
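For anyone who wants the arithmetic spelled out, here is a quick back-of-the-envelope sketch of the split above. It treats my rough guesses as round numbers (50%, 10%, and the remaining 40%), which is an assumption made for the sake of the calculation, not measured data.

```python
# Rough shares of reporting applications, in percent. These are the
# off-the-cuff guesses from the answer above, rounded for arithmetic.
excel_access = 50            # "the majority (let's say >50%)"
programming_languages = 10   # "very few (let's say <10%)"
remainder = 100 - excel_access - programming_languages  # the other 40%

# The remainder is split 50/50 between the two BI-centric categories.
off_the_shelf = remainder / 2
custom_in_bi_tools = remainder / 2

shares = {
    "Excel and Access": excel_access,
    "Programming languages": programming_languages,
    "Off-the-shelf in ERP or BI apps": off_the_shelf,
    "Custom-coded in BI tools": custom_in_bi_tools,
}

for name, pct in shares.items():
    print(f"{name}: {pct:.0f}%")
```

In other words, under these assumptions only about one report in five is custom-built in a BI tool.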
As Edward Tufte, one of the industry's renowned data visualization experts, once said, “The world is complex, dynamic, multidimensional; the paper is static, flat. How are we to represent the rich visual world of experience and measurement on mere flatland?” Indeed, there’s just too much information out there for all categories of knowledge workers to visualize it effectively. More often than not, traditional reports using tabs, rows, and columns do not paint the whole picture or, even worse, lead an analyst to a wrong conclusion. Firms need to use data visualization because information workers:
Cannot see a pattern without data visualization. Simply seeing numbers on a grid often does not convey the whole story, and in the worst case it can even lead to a wrong conclusion. This is best demonstrated by Anscombe’s quartet, where four seemingly similar groups of x/y coordinates reveal very different patterns when represented in a graph.
Cannot fit all of the necessary data points onto a single screen. Even with the smallest reasonably readable font, single-line spacing, and no grid, one cannot realistically fit more than a few thousand data points on a single page or screen using numerical information only. When using advanced data visualization techniques, one can fit tens of thousands (an order-of-magnitude difference) of data points onto a single screen. In his book The Visual Display of Quantitative Information, Edward Tufte gives an example of more than 21,000 data points effectively displayed on a US map that fits onto a single screen.
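The Anscombe's quartet point is easy to check for yourself. The sketch below uses the quartet's published values and only Python's standard library; the helper names `pearson` and `summarize` are my own. It prints the summary statistics a grid-based report would typically show for each of the four data sets.

```python
from statistics import mean, variance

# Anscombe's quartet: data sets I-III share the same x values.
X123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
QUARTET = {
    "I": (X123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96,
                 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (X123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10,
                  6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (X123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84,
                   6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
           [6.58, 5.76, 7.71, 8.84, 8.47, 7.04,
            5.25, 12.50, 5.56, 7.91, 6.89]),
}


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)


def summarize(xs, ys):
    """Summary statistics that a tabular report might display."""
    return {
        "mean_x": mean(xs),
        "mean_y": mean(ys),
        "var_x": variance(xs),   # sample variance
        "var_y": variance(ys),
        "r": pearson(xs, ys),
    }


for name, (xs, ys) in QUARTET.items():
    s = summarize(xs, ys)
    print(f"{name:>3}: mean_x={s['mean_x']:.2f} mean_y={s['mean_y']:.2f} "
          f"var_x={s['var_x']:.2f} var_y={s['var_y']:.2f} r={s['r']:.2f}")
```

All four rows come out essentially identical, yet a scatter plot of each set shows a linear trend, a curve, a line with one outlier, and a vertical cluster with one extreme point, respectively. The numbers alone give no hint of that.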
Forrester's global Marketing Technology Adoption survey investigates:
What technologies do marketers currently use, and what do they plan to use?
How much do marketers budget for technology acquisition and operations?
What are the users' top goals for and pain points from marketing technology?
You can use the survey results to:
Provide justification for a business case in your 2013 technology road map.
Compare your spend levels and technology use to those of other marketing professionals.
Spot trends and see best practices to incorporate into your technology strategy.
The survey will close on Friday, August 3, and the completed research report will be published in early September. Once the research is published, I will also present the findings in a Forrester Webinar and in advisory sessions to interested clients.
I get the following question very often. What are the best practices for creating an enterprise reporting policy as to when to use what reporting tool/application? Alas, as with everything else in business intelligence, the answer is not that easy. The old days of developers versus power users versus casual users are gone. The world is way more complex these days. In order to create such a policy, you need to consider the following dimensions:
Historical (what happened)
Operational (what is happening now)
Analytical (why did it happen)
Predictive (what might happen)
Prescriptive (what should I do about it)
Exploratory (what's out there that I don't know about)
Looking at static report output only
Lightly interacting with canned reports (sorting, filtering)
Fully interacting with canned reports (pivoting, drilling)
Assembling existing reports, visualizations, and metrics into customized dashboards
Full report authoring capabilities
External (customers, partners)
Report latency, as in when the report is needed:
In a few days
In a few weeks
Strategic (a few complex decisions/reports per month)
Tactical (many less-complex decisions/reports per month)
Operational (many complex/simple decisions/reports per day)
Traditional BI approaches and technologies — even when using the latest technology, best practices, and architectures — almost always have a serious side effect: a constant backlog of BI requests. Enterprises where IT addresses more than 20% of BI requirements will continue to see the snowball effect of an ever-growing BI requests backlog. Why? Because:
BI requirements change faster than an IT-centric support model can keep up. Even with by-the-book BI applications, firms still struggle to turn BI applications on a dime to meet frequently changing business requirements. Enterprises can expect a life span of at least several years out of enterprise resource planning (ERP), customer relationship management (CRM), human resources (HR), and financial applications, but a BI application can become outdated the day it is rolled out. Even with implementation times of just a few weeks, the world may have changed completely due to a sudden merger or acquisition (M&A) event, a new competitive threat, a new management structure, or new regulatory reporting requirements.
If you think the term "Big Data" is wishy-washy waste, then you are not alone. Many struggle to find a definition of Big Data that is anything more than awe-inspiring hugeness. But Big Data is real if you have an actionable definition that you can use to answer the question: "Does my organization have Big Data?" Proposed here is a definition that takes into account both the measure of data and the activities performed with the data. Be sure to scroll down to calculate your Big Data Score.
Big Data Can Be Measured
Big Data exhibits extremity across one or many of these three alliterative measures:
I said last year that this would happen sometime in the first half of this year, but for some reason my colleagues and clients have kept asking me exactly when we would see a real ARM server running a real OS. How about now?
To copy from Calxeda’s most recent blog post:
“This week, Calxeda is showing a live Calxeda cluster running Ubuntu 12.04 LTS on real EnergyCore hardware at the Ubuntu Developer and Cloud Summit events in Oakland, CA. … This is the real deal; quad-core, w/ 4MB cache, secure management engine, and Calxeda’s fabric all up and running.”
This is a significant milestone for many reasons. It proves that Calxeda can indeed deliver a working server based on its scalable fabric architecture. Having HP sign up as a partner meant that this was essentially a non-issue, but proof is still good. It also establishes that at least one Linux distribution provider, in this case Ubuntu, is willing to provide a real supported distribution. My guess is that Red Hat and CentOS will jump on the bus fairly soon as well.
Most importantly, we can get on with the important work of characterizing real benchmarks on real systems with real OS support. HP’s discovery centers will certainly play a part in this process as well, and I am willing to bet that by the end of the summer we will have some compelling data on whether the ARM server will deliver on its performance and energy efficiency promises. It’s not a guaranteed slam-dunk win: Intel has been steadily ratcheting up its energy efficiency, and the latest generation of x86 servers from HP, IBM, Dell, and others shows promise of much better throughput per watt than its predecessors. Add to that the demonstration of a Xeon-based system by SeaMicro (ironically now owned by AMD) that delivered Xeon CPUs at a 10 W per CPU power overhead, an unheard-of efficiency.
In the latest evolution of its Linux push, IBM has added to its non-x86 Linux server line with the introduction of new dedicated Power 7 rack and blade servers that only run Linux. “Hah!” you say. “Power already runs Linux, and quite well according to IBM.” This is indeed true, but when you look at the price/performance of Linux on standard Power, the picture is not quite as advantageous, with the higher cost of Power servers compared to x86 servers offsetting much if not all of the performance advantage.
Enter the new Flex System p24L (Linux) Compute Node blade for the new PureFlex system and the IBM PowerLinux 7R2 rack server. Both are dedicated Linux-only systems with two Power 7 processors (6 or 8 cores each, 4 threads per core) and are shipped with unlimited licenses for IBM’s PowerVM hypervisor. Most importantly, these systems, in exchange for the limitation that they will run only Linux, are priced competitively with similarly configured x86 systems from major competitors, and IBM is betting on the improvement in performance, shown by IBM-supplied benchmarks, to overcome any resistance to running Linux on a non-x86 system. Note that this is a different proposition than Linux running on an IFL in a zSeries, since the mainframe is usually not the entry point for the customer. IBM typically sells IFLs to customers with existing mainframes, whereas with Power Linux it will also be attempting to sell to net new customers as well as established accounts.