I just received yet another call from a reporter asking me to comment on yet another BI vendor announcing R integration. All leading BI vendors are embedding/integrating with R these days, so I was not sure what was really new in the announcement. I guess the real question is the level of integration. For example:
Since R is a scripting language, does the BI vendor provide a point-and-click GUI to generate R code?
Can R routines take advantage of all of the BI metadata (data structures, definitions, etc.) without having to redefine it just for R?
How easily can the output from R calculations (scores, rankings) be embedded in the BI reports and dashboards? Do the new scores just become automagically available for BI reports, or does somebody need to add them to BI data stores and metadata?
Can the BI vendor import/export R models based on PMML?
Is it a general R integration, or are there prebuilt vertical (industry-specific) or domain (finance, HR, supply chain, risk, etc.) metrics as part of a solution?
Which server are R models executed on? The reporting server? The database server? Their own server?
Then there's the whole business of model design, management, and execution, which is usually the realm of advanced analytics platforms. How many of these capabilities does the BI vendor provide?
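To make the metadata and PMML questions above a bit more concrete, here is a minimal sketch of the kind of check I have in mind: read the DataDictionary from a PMML document and see which model fields the BI layer already knows about. The PMML fragment, the field names, and the BI_METADATA dictionary are all hypothetical and made up for illustration; a real BI product would expose its metadata through its own interfaces, which this sketch does not attempt to model.

```python
import xml.etree.ElementTree as ET

# Hypothetical PMML fragment; only the DataDictionary is shown.
PMML_DOC = """<PMML xmlns="http://www.dmg.org/PMML-4_4" version="4.4">
  <Header description="hypothetical churn model"/>
  <DataDictionary numberOfFields="3">
    <DataField name="tenure_months" optype="continuous" dataType="double"/>
    <DataField name="region" optype="categorical" dataType="string"/>
    <DataField name="churn_score" optype="continuous" dataType="double"/>
  </DataDictionary>
</PMML>"""

# Hypothetical BI semantic-layer metadata: field name -> data type.
BI_METADATA = {"tenure_months": "double", "region": "string"}


def pmml_fields(pmml_text):
    """Return {field name: dataType} from a PMML DataDictionary."""
    root = ET.fromstring(pmml_text)
    fields = {}
    for elem in root.iter():
        # Strip the XML namespace so the check works across PMML versions.
        if elem.tag.rsplit("}", 1)[-1] == "DataField":
            fields[elem.get("name")] = elem.get("dataType")
    return fields


def compare_with_bi(pmml_text, bi_metadata):
    """Split model fields into known, missing, and type-mismatched lists."""
    model = pmml_fields(pmml_text)
    known = sorted(n for n, t in model.items() if bi_metadata.get(n) == t)
    missing = sorted(n for n in model if n not in bi_metadata)
    mismatched = sorted(
        n for n, t in model.items() if n in bi_metadata and bi_metadata[n] != t
    )
    return known, missing, mismatched


known, missing, mismatched = compare_with_bi(PMML_DOC, BI_METADATA)
print("already in BI metadata:", known)
print("must be added to BI metadata:", missing)
print("type mismatches:", mismatched)
```

In this made-up example the score field is the one the BI layer does not know about yet, which is exactly the situation where somebody has to add it to the BI data stores and metadata by hand unless the integration does it for you.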
Did I get that right? Any other features/capabilities that really distinguish one BI/R integration from another? Really interested in hearing your comments.
Do you think you are ready to tackle Big Data because you are pushing the limits of your data Volume, Velocity, Variety and Variability? Take a deep breath (and maybe a cold shower) before you plunge full speed ahead into the uncharted territories and murky waters of Big Data. Now that you are calm, cool and collected, ask yourself the following key questions:
What’s the business use case? What are some of the business pain points, challenges and opportunities you are trying to address with Big Data? Are your business users coming to you with such requests, or are you in the doomed-to-fail realm of technology looking for a problem?
Are you sure it’s not just BI 101? Once you identify specific business requirements, ask whether Big Data is really the answer you are looking for. In the majority of my Big Data client inquiries, after a few probing questions I typically find out that it's really BI 101: data governance, data integration, data modeling and architecture, org structures, responsibilities, budgets, priorities, etc. Not Big Data.
Why can’t your current environment handle it? Next comes another sanity check. If you are still thinking you are dealing with Big Data challenges, are you sure you need to do something different, technology-wise? Are you really sure your existing ETL/DW/BI/Advanced Analytics environment can't address the pain points in question? Would just adding another node, another server, more memory (if these are all within your acceptable budget ranges) do the trick?