Data scientists don’t work in isolation. Like all scientists, they rely on a wide range of people in adjacent roles to help them do their jobs as effectively as possible.
Think about science generally. Over the historical development of modern science, the specialization of roles has steadily proliferated, but today’s professional science establishment is a relatively recent phenomenon. Back in the Middle Ages — and even well into the modern era — scientists often had to be jacks of all trades in order to carry on their investigations. Until the 19th century, there were few professional scientists, research universities, or commercial labs. There were no eager, underpaid graduate students to press into service. Until the 20th century, most scientists had to build and maintain their own laboratories, invent and calibrate their own instruments, painstakingly record their own observations, and concoct and promote their own theories.
Today’s professional scientists — of which data scientists are a key category — have it much easier. Whether they work with particle accelerators or linear regression models, scientists know they don’t need to be their own chief cooks and bottle washers. They can make science their day job and rely on a host of others for all of the necessary supporting tools and infrastructure. We find the following broad division of labor in all of today’s scientific disciplines, including data science:
Enterprise laptops are on the shopping list for many I&O professionals I speak with every week, with some asking if Netbooks are the antidote to the MacBook Air for their people. Well, on the menu of enterprise laptops, I think of Netbooks as an appetizer -- inexpensive, but after an hour my stomach is growling again. Garden-variety ultraportables on the other hand are like a turkey sandwich -- everything I need to keep me going, but they make me sleepy halfway through the afternoon.
Ultrabooks are a new class of notebook promoted by Intel and are supposed to be a little more like caviar and champagne -- light and powerful, but served on business-class china with real silverware and espresso. At least that's what I took away after being briefed by Intel on the topic. I had the chance to sample HP's new Ultrabook fare in San Francisco a few weeks ago while they were still in the test kitchen, and it seems they took a little different approach. Not bad, just different.
It struck me that rather than beluga and Dom Perignon, HP has created more of a Happy Meal -- a tasty cheeseburger and small fries with a Diet Coke, in a lightweight, easy-to-carry package for a bargain price. It has everything the road warrior needs to get things done, and like a Happy Meal, they can carry it on the plane and set it on the tray table…even if the clown in front of them reclines. The Folio offers a Core i5-2467M processor, 4GB of RAM, a 13.3" LED display, 128GB of SSD storage, a 9-hour battery, and USB 3.0 and Ethernet ports as highlights, all for $900. It's a true bargain. I think I will call it the McUltrabook.
It's not the business leaders, and it's not the rank-and-file employees either. While all of these choices make sense, and all of these groups are, according to our Forrsights surveys, more excited by it than infrastructure & operations professionals, it's the person who supposedly has the most to gain from it -- your CFO.
One of the things that truly differentiates Forrester Research from other analyst firms is the breadth of conversations we have with clients and the range of employees in an enterprise that we survey. We serve not only those in IT roles across the world but those in marketing, sales and strategy roles too. And our Forrsights surveys complete the picture by talking with CFOs, CEOs, workplace professionals and of course consumers. This breadth gives us the ability to present 360-degree views on certain key topics such as mobility and cloud computing. And it is through this holistic view that you get a real psychograph of the enterprise and can examine and respond to these differences in opinion and consumption.
This week AMD finally released its Opteron 6200 and 4200 series CPUs. These are the long-awaited server-oriented Interlagos and Valencia CPUs, based on the new “Bulldozer” core, offering up to 16 x86 cores in a single socket. The announcement was targeted at (drum roll, one guess per customer only) … “The Cloud.” AMD appears to be positioning its new architectures as the platform of choice for cloud-oriented workloads, focusing on highly threaded, throughput-oriented benchmarks that take full advantage of its high core count and unique floating-point architecture, along with what look like excellent throughput-per-watt metrics.
At the same time it is pushing the now seemingly mandatory “cloud” message, AMD is not ignoring the meat-and-potatoes enterprise workloads that have been the mainstay of server CPU sales – virtualization, database, and HPC, where the combination of many cores, excellent memory bandwidth, and large memory configurations should yield excellent results. In its competitive comparisons, AMD targets Intel’s 5640 CPU, which it claims represents Intel’s most widely used Xeon CPU, and shows very favorable comparisons on performance, price, and power consumption. Among the features that AMD cites as contributing to these results are:
Advanced power and thermal management, including the ability to power off inactive cores, contributing to an idle power of less than 4.4W per core. Interlagos also offers a capability called TDP Power Cap, which allows I&O groups to set the total power threshold of the CPU in 1W increments, allowing fine-grained tailoring of power consumption across server racks.
Turbo CORE, which allows boosting the clock speed of cores by up to 1 GHz for half the cores or 500 MHz for all the cores, depending on workload.
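To make the Turbo CORE trade-off concrete, here is a toy calculation of aggregate clock capacity under the two boost modes. The 2.1 GHz base clock is an assumed figure for illustration only, not a quoted AMD spec:

```python
cores = 16
base_ghz = 2.1  # assumed base clock for illustration; not a quoted AMD spec

# Half-core boost: half the cores run up to 1 GHz faster
# (the remaining cores are lightly loaded or idle in this scenario).
max_turbo_aggregate = (cores // 2) * (base_ghz + 1.0)

# All-core boost: every core gains up to 500 MHz.
all_core_aggregate = cores * (base_ghz + 0.5)

print(f"Half-core boost: {cores // 2} cores x {base_ghz + 1.0:.1f} GHz = "
      f"{max_turbo_aggregate:.1f} GHz aggregate")
print(f"All-core boost:  {cores} cores x {base_ghz + 0.5:.1f} GHz = "
      f"{all_core_aggregate:.1f} GHz aggregate")
```

Even with these invented numbers, the point stands: for the throughput-oriented workloads AMD is targeting, the smaller all-core boost delivers far more aggregate clock than the larger half-core boost.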
Last week, I presented at the itSMF UK’s annual conference on the subject of value or, more specifically, I hid an awful lot of IT financial management-related content behind the title: “Anybody Questioning Your Value?” Importantly, this is not IT value; I am referring to business value.
It was a surprising session in many ways. Firstly, the number of attendees (I didn’t count them but I would guesstimate about 80 ... I’m sure my IT service management peers in attendance will now quickly tell me it was a lot, lot less). Secondly, that they all seemed to stay to the end (well, bar one worried-looking lady who left in a rush early on ... I assumed a Sev1 incident or an upset tummy, or both).
The third surprise was the response to a simple question I posed:
If your CEO or CFO stopped you in the corridor and asked, “I like the look of this Gmail-for-business thingy; how does it compare cost-wise with our internal email service?” would you know the per-unit cost of delivering your corporate email service?
The surprise? Not one person in the room admitted to knowing what their corporate email service costs. I expected to see a low number of raised hands but not a wave-less sea of hands-in-laps. Unfortunately, being unable to answer such questions about costs and value, whether off-the-cuff or formal, can only expose I&O’s lack of business savvy and cost awareness. This is not a place I&O wants to be in right now (or ever).
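For readers who want to start answering that corridor question, the back-of-the-envelope unit-cost calculation looks something like the sketch below. Every cost pool and figure here is invented for illustration; your own chart of accounts will differ:

```python
# Hypothetical annual cost pools for an internal corporate email service.
# All figures are invented for illustration.
annual_costs = {
    "server_hardware_amortized": 120_000,
    "software_licenses": 80_000,
    "storage_and_backup": 45_000,
    "admin_staff_allocation": 150_000,
    "data_center_share": 55_000,
}
mailboxes = 5_000  # hypothetical user count

total = sum(annual_costs.values())
per_mailbox_year = total / mailboxes
per_mailbox_month = per_mailbox_year / 12

print(f"Total: ${total:,}/year")
print(f"Unit cost: ${per_mailbox_year:.2f}/mailbox/year "
      f"(${per_mailbox_month:.2f}/mailbox/month)")
# With these invented figures: $450,000/year total,
# $90.00/mailbox/year, $7.50/mailbox/month
```

A per-mailbox-per-month figure is exactly the unit a CEO or CFO can hold up against a Gmail-for-business price sheet, which is why it's worth having ready.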
Data scientists are a curious breed. The term encompasses a wide range of specialties, all of which rely on statistical algorithms and interactive exploration tools to uncover nonobvious patterns in observational data.
Who belongs in this category? Clearly, the “quants” are fundamental. Anybody who builds multivariate statistical models, regardless of the tool they use, might call themselves a data scientist. Likewise, data mining specialists who look for hidden patterns in historical data sets — structured, unstructured, or some blend of diverse data types — may certainly use the term. Furthermore, a predictive modeler or any analyst who builds fact-based what-if simulations is a data scientist par excellence. We should also include anybody who specializes in constraint-based optimization, natural language processing, behavioral analytics, operations research, semantic analysis, sentiment analysis, and social network analysis.
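As a minimal illustration of the first item on that list, here is a multivariate model fit by ordinary least squares with NumPy. The data is synthetic and the predictor names are invented; this is a sketch of the technique, not anyone's production model:

```python
import numpy as np

# Synthetic observational data: 100 rows, two invented predictors
# (say, ad spend and price), with a small amount of noise.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))
y = 3.0 * X[:, 0] - 1.5 * X[:, 1] + rng.normal(scale=0.1, size=100)

# Add an intercept column and solve the least-squares problem.
A = np.column_stack([np.ones(len(X)), X])
coef, *_ = np.linalg.lstsq(A, y, rcond=None)

print(coef)  # close to [0.0, 3.0, -1.5], the coefficients used above
```

Tools like SAS, SPSS, and R wrap this kind of fit in much richer diagnostics, but the algebra underneath is the same.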
But these jobs are only one-half of the data-science equation. The “suits” are also fundamental. Any business domain specialist who works with any of the tools and approaches listed above may consider him- or herself a data scientist. In fact, if one and the same person is a black belt in SAS, SPSS, R, or other statistical tools, and also an expert in marketing, customer service, finance, supply chain, or other business specialties, they are a data scientist par excellence.
Both of these skill sets are fundamental to high-quality data science. Lacking statistical expertise, you can’t determine which algorithms and approaches are the most appropriate foundation for your statistical models. Lacking business domain expertise, you can’t identify the most valid variables and the most appropriate data sets to build your models around.
Step 6 of my 10-step program on how to master your service experience is to make your agent tool set more usable. This is because the work environment of a customer service agent is pretty awful. Agents use dozens — sometimes hundreds — of disconnected tools and technologies like CRM systems, billing systems, ERP, transactional systems, knowledge bases, information in email correspondence, and training manuals to find answers to customer questions. Have a look at the customer service IT ecosystem from a North American telecom company to internalize this complexity.
Most applications that agents use lack intuitive navigation, have cluttered screens that contain too much information, and have overly complex process flows that rely too heavily on agents to navigate. Moreover, agents don’t always navigate through their set of disconnected systems in the same way to find the answer they are looking for.
All these usability issues lead to variable handle times and inconsistent customer experiences. There is no way for managers to make sure that agents are complying with regulations or company policy. Knowledge exists on an island of its own, disconnected from the rest of the customer service ecosystem, and is sometimes duplicated for each communication channel that the company supports — which leads to inconsistent answers that are sometimes just plain wrong. In addition, agents don’t have access to a consolidated view of a customer’s purchase history or prior interactions and thus cannot personalize the conversation to the customer.
The shift towards the empowered consumer and employee is nowhere more obvious than in Asia - particularly in Singapore, where a recent Google study showed that smartphone penetration is a whopping 62% (compared to 31% in the US). In fact, of the 11 countries in Asia surveyed, four of them (Singapore, Australia - 37%, Hong Kong - 35%, Urban China - 35%) had higher smartphone penetration rates than the US (and amongst 18-29 year olds, 84% of Singaporeans had smartphones, compared to 47% in the US!). With many of the more populous countries having young populations (average age: Philippines - 22.9, China - 35.5, India - 26.2, Indonesia - 28.2 - see World Factbook), the gen Y factor is driving employees to question whether the current way of working makes the most sense.
With so many young, mobile and connected employees, it is no surprise that CIOs across the region regularly complain about the company staff self-deploying devices, applications and services from the web or from app stores. The attitude of many IT shops is to shut it down - interestingly, the whole concept of "empowered employees" is quite "taboo" in some countries across the Asia Pacific region. A CIO recently told me that "smartphones and social media have come five years too soon" - referring to the fact he is planning to retire in five years, and that these technology-centric services are proving to be quite a headache for his IT department!
My colleague John Rymer expects platform-as-a-service (PaaS) technology to “cross the chasm” into mainstream status over the next three years (2012-2014). Today, PaaS solutions, which provide application development and deployment tools abstracted from the underlying cloud infrastructure that runs your apps, fall into four types: 1) pure cloud integrated development environments (IDEs); 2) traditional IDEs that offer the option of cloud deployment; 3) IDE-neutral cloud runtimes that can run apps built by multiple types of IDEs; and 4) PaaS solutions designed for use by business developers. John sees all four of these categories aiming to cross the chasm in this timeframe but doesn’t expect all four segments to succeed in making that transition.
Why does this matter? PaaS is one of the easiest and most productive ways to take advantage of cloud economics, and the elasticity of the cloud, by providing an easily consumable elastic app platform. Today, most apps for the cloud either lack the ability to automatically scale up or down in their use of cloud resources, based on demand, or else gain that ability through complex programming to low-level APIs and frameworks. PaaS provides access to the cloud without all the drama. Only through taking full advantage of these attributes of the cloud can your business realize the full benefits the cloud theoretically provides.
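To see what "complex programming to low-level APIs" means in miniature, here is a hypothetical sketch of the scale-up/scale-down decision that a PaaS makes on your behalf. The thresholds, bounds, and function name are invented and not tied to any real platform's API:

```python
def desired_instances(current: int, cpu_utilization: float,
                      low: float = 0.3, high: float = 0.7,
                      min_n: int = 1, max_n: int = 10) -> int:
    """Target instance count for a naive threshold-based autoscaler.

    Adds an instance when average CPU exceeds `high`, removes one when
    it falls below `low`, and clamps the result to [min_n, max_n].
    All thresholds are illustrative only.
    """
    if cpu_utilization > high:
        return min(current + 1, max_n)
    if cpu_utilization < low:
        return max(current - 1, min_n)
    return current

print(desired_instances(3, 0.85))  # busy: scale up to 4
print(desired_instances(3, 0.10))  # quiet: scale down to 2
print(desired_instances(3, 0.50))  # steady: stay at 3
```

A PaaS buries this loop, plus the health checks, cooldown timers, and metering around it, behind the platform, which is the "without all the drama" part.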
Demands by users of business intelligence (BI) applications to "just get it done" are turning typical BI relationships, such as business/IT alignment and the roles that traditional and next-generation BI technologies play, upside down. As business users demand more control over BI applications, IT is losing its once-exclusive control over BI platforms, tools, and applications. It's no longer business as usual: For example, organizations are supplementing previously unshakable pillars of BI, such as tightly controlled relational databases, with alternative platforms. Forrester recommends that business and IT professionals responsible for BI understand and start embracing some of the latest BI trends — or risk falling behind.
Traditional BI approaches often fall short for the two following reasons (among many others):
BI hasn't fully empowered information workers, who still largely depend on IT
BI platforms, tools, and applications aren't agile enough