The Forrester Blog For Information & Knowledge Management Professionals

March 2008

March 24, 2008

SharePoint For The Enterprise

By Craig Le Clair

Forrester recently surveyed 233 IT decision-makers who have plans to implement or upgrade to at least some part of MOSS 2007 and asked: "Which of the following best describes your organization's time line for implementing or upgrading to Microsoft Office SharePoint Server?" The results? 21% will upgrade immediately and 41% will do so within six months.

With this level of adoption, the issue of scalability comes up more and more. In one sense you have architectural concerns with any solution that scales horizontally and uses banks of load-balanced Web servers, application servers, and clusters of SQL servers on the back end. Add high availability and you quickly get a complex environment. To Microsoft's credit, there is quite a bit available on performance guidelines. But looking through these, and coping with notions of site collections, lists, file arrangements, performance of folder hierarchies versus flat files, and automatic versus manual partitioning, the bottom line seems to be that even on the new 64-bit architecture with four screaming Intel processors and SQL Server 2005, the upper limit of the content repository is 500GB.

Assuming allocation for index data and other overhead, this really allows fewer than 20 million objects to be reasonably managed. For enterprise solutions that may ingest 50,000 documents a day, this would last about a year. Basically, these are not huge numbers. Contrast this -- just for kicks -- with the DocHarbor repository, which was originally built for indexing computer output like customer statements. For statement archiving, ingestion runs done at month-end can be 10 million. Meeting the index management requirement meant building configurable hashing algorithms that can create up to 64K database tables. Anacomp, owner of the DocHarbor platform, claims 1 billion objects managed in their high-end IBM Unix environment.
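To make the arithmetic and the hashing idea concrete, here is a minimal Python sketch. It is not DocHarbor's or SharePoint's actual implementation: the function names, the `idx_` table prefix, the choice of SHA-256, and reading "64K" as 65,536 are all assumptions made for illustration.

```python
import hashlib

OBJECT_LIMIT = 20_000_000   # "fewer than 20 million objects" from the post
DAILY_INGEST = 50_000       # "50,000 documents a day" from the post
TABLE_COUNT = 64 * 1024     # "up to 64K database tables"; 65,536 is an assumption


def days_until_full(object_limit: int, daily_ingest: int) -> int:
    """Whole days of ingestion before the object limit is reached."""
    return object_limit // daily_ingest


def table_for(document_key: str, table_count: int = TABLE_COUNT) -> str:
    """Map a document key to one of `table_count` index tables by hashing it.

    The hash spreads keys evenly, so no single table grows without bound.
    The "idx_" naming scheme is hypothetical.
    """
    digest = hashlib.sha256(document_key.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % table_count
    return f"idx_{bucket:05d}"


if __name__ == "__main__":
    print(days_until_full(OBJECT_LIMIT, DAILY_INGEST))
    print(table_for("customer-0001/2008-03"))
```

With the post's numbers, `days_until_full` works out to 400 days of continuous ingestion, which is where "about a year" comes from. The same key always lands in the same table, so a lookup needs only the key and the table count.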

The clear message from Microsoft is that MOSS 2007 can be an enterprise solution but firms contemplating this should plan carefully. The point for companies adopting SharePoint to consider is the degree of complexity and attendant soft costs required to manage this environment as it grows. I fear that firms will be seduced by the ease of rolling out SharePoint and at the end of the day may prematurely push this platform too far. But let's be fair. All platforms at the "enterprise" level of performance are complex and require strong engineering to maintain, whether it be EMC, IBM, OpenText, Hyland Software, or others. Yet, these products have had years dealing with approaches to keep table sizes manageable, optimize database schemas, index documents at high rates, and maintain business continuity.

Companies that have these mature environments in place must compare the complexity of scaling with SharePoint to leveraging the back-end investment and skilled people they already have. Many will choose to link that back end to the elements of SharePoint that make sense.

I'm doing an AIIM Webinar on April 9 that will discuss this. I'd be very interested in any comments from enterprises -- or suppliers -- with experience wrestling with this decision or that have had success or issues with trying to scale SharePoint. Our recent adoption data makes this a very topical issue for many firms.

March 21, 2008

Virtual Worlds Show Promise For Collaborative Work -- With Hurdles To Be Overcome

By Erica Driver

The Virtual-Worlds Consortium for Innovation and Learning and SRI Consulting Business Intelligence today released the results of an online survey conducted early in March 2008 titled "Virtual Worlds and Collaborative Work: Survey Results." The organization surveyed 81 people who are active users of virtual worlds (e.g., Second Life) about the use of virtual worlds for collaborative work. Most survey respondents (about 85%) were in North America; the rest were in Europe and Asia. Fewer than 20% of respondents are using virtual worlds mostly for pleasure and fun; 58% have a strong interest in how these technologies can serve for work. Some of the key findings:

  • Most respondents believe virtual worlds present a great opportunity for collaborative work. When asked about the potential benefits of virtual worlds for collaborative work, most respondents said they see benefits like: Being together in a virtual world enables a wider range of potentially beneficial interactions with collaborators; presence via avatars brings a new and useful dimension that enables better connections with collaborators; and virtual worlds can give remote workers an opportunity to have informal "water cooler" sessions that they miss by not being co-located with their coworkers (see Figure 1). (For Forrester thoughts on the water cooler conversations see the Feb. 4th blog post Virtual Offices For All: Return Of The Serendipitous Interaction.) When asked for which types of work situations or scenarios virtual worlds could bring the most benefit, 93% of respondents said when project teams are highly distributed across geographies and/or time zones and 69% said projects that benefit from use of various types of media.
  • But we’ve got some major adoption hurdles to overcome. About 34% of respondents said that major technology and/or worker attitude issues must be addressed before virtual worlds will play much of a role in collaborative work in most organizations. The biggest hurdles will be getting management to recognize the potential benefits of using virtual worlds for collaborative work; making sure every project or team member is comfortable with the use of virtual worlds to work effectively in this new environment; finding ways to integrate virtual worlds with other collaborative tools and technologies that may still be used to complement virtual worlds; and deciding which virtual worlds will best meet the organization's evolving needs in the area of collaborative work (see Figure 2).

Competitive Business Intelligence, Harnessed Through Collaboration And CEP, Harvested Across The Cloud

By James Kobielus

Sometimes ideas for blog posts flow out of everyday conversations with colleagues. I want to thank Leslie Owens and Matt Brown for stimulating the following thought train.

The external competitive environment is the cloud where opportunities and threats hang, sometimes latent, sometimes looming. So it only makes sense that enterprises will outsource more of the competitive surveillance to the cloud of external resources, such as analyst firms, third-party market intelligence subscription feeds, social networking, Web 2.0, etc.

Of course, enterprises realize they don't dare outsource the competitive intelligence function entirely. That explains why they maintain research staff, tools, portals, and informational resources in-house. Mostly, these competitive intelligence teams monitor the prevailing market conditions that impact on their companies' core businesses. But, to a great degree, they also serve as an early-warning system helping their organization respond to specific breaking events -- i.e., the "disrupters" -- that threaten to capsize the corporate boat.

Recognizing this perennial "disrupter pre-emption" requirement, enterprises are concerned with best practices for setting up event-driven competitive intelligence operations. These best practices should help them survey the external horizon more comprehensively and proactively. Best practices should also help them foster, harness, and harvest internal collaboration among competitive-intelligence subject matter experts.

Essentially, competitive intelligence operations of this sort practice CEP in the following senses that I described in a previous post:

  • Each event may be quite complex in its own right, standing for a linked set of data updates, application state transitions, and process status changes (NEW NOTE: a "disrupter" is any extremely complex, perhaps way too vague, but still undeniably important "event" -- disrupters aren't "tagged" as such -- and they may not be easily identified before, during, or after the fact -- maybe, hey Leslie, on-the-fly social tagging is the best way to approach this squishiness).
  • Each event-reliant decision agent (e.g., end user) may access, interact with, and/or consume events through a complex interface (dashboards, analytics, semantic layer, etc.), across multiple devices (desktop, laptop, Blackberry, etc.) and have a complex event-enriched streaming "experience" (NEW NOTE: competitive intelligence groups make use of the full range of portals, e-mail, IM, search engines, social networking, wikis, blogs, podcasts, workflow, alerts/notifications, etc.).
  • Each event-reliant decision agent may be a complex creature in its own right with its own complex, convoluted, squishy decision-making methodology (i.e., an individual human being with their own habits and cognitive/psychological dispositions; or a group making decisions collectively and collaboratively through workflow or social networking; or a half-human/half-automated workflow behaving in the herky-jerky manner one would expect from a split-personality decision agent; or a completely automated orchestration of applications triggered by rules engines, etc.) (NEW NOTE: competitive intelligence teams are very human teaming environments, at heart -- everybody is a "sentinel" on the "lookout" for critical events while others are "sleeping" or attending to something else).

Tying in another observation from that earlier post, I expect that CEP for I&KM (i.e., real-time, event-triggered, under-deadline, continuously-refreshing dashboard-sharing collaboration applications) will play a key role in event-driven competitive intelligence everywhere. CEP, hence BI, will be used to beef up organizations' in-house competitive intelligence/surveillance function, supplementing (hopefully not replacing) the outsourced competitive intelligence/surveillance they get from analyst firms such as Forrester.

In such an environment, the self-service, in-house research portal is the chief presentation layer, and BI operates as an intelligence source and/or target accessible via the portal, with real-time/near-real-time data integration approaches (e.g., ESP, CDC, MOM, trickle-feed ETL, etc.) providing the low-latency plumbing to deliver those feeds to the DW/BI/portal/preso front-end.

The BI vendors are laying the foundation for this emerging best practice. If you look at what vendors such as Business Objects are doing, they're making more external, commercial, competitive intelligence feeds accessible, via partnerships with content aggregators/publishers, from their platforms (e.g., http://www.businessobjects.com/news/press_release.asp?id=20070521_006524). They're also providing text mining/analytics-integrated tools (e.g., http://www.businessobjects.com/news/press_release.asp?id=20070924_006494) for searching across internal and external, unstructured/semi-structured data sources. And they're expanding the social networking and other collaborative features, and mashup offerings, for bringing together real-time feeds of internal/external data/events (e.g., http://www.businessobjects.com/news/press_release.asp?id=20080313_00001).

Business Objects is a bit ahead of the industry curve on all these things. But it's clear that, as market leader, they've laid down the chief challenge for all BI vendors, to make their offerings more pervasive in competitive intelligence use cases, and also to harvest the informational resources of the Web 2.0 cloud to the max.

March 19, 2008

Oh No, Not Another 2.0 -- Database 2.0? Data Warehousing In The Cloud!

By James Kobielus

Boris Evelson's latest post on free BI got me thinking about another type of freedom.

Boris commented on the newly announced beta of a gratis, lightweight, Panorama-powered BI/OLAP-engine add-on to Google's hosted apps. You know, whenever anybody mentions BI/OLAP, I think of analytical databases, hence data warehousing (DW). And when my thoughts turn to DW, I often wonder when these dimensional data stores will be let loose from their earthly tethers and begin to float free in the SaaS cloud. This is no blue-sky speculation, but rather an inevitability in a world shifting to subscription-based SaaS for on-demand delivery of all infrastructure and application services. Where database services are concerned, this trend even has a name in popular circulation: Database 2.0 (aka "cloud databases").

Let it be known that Google is one of the pioneers in Database 2.0, though they haven't tooted their horn or done anything particularly special in this regard (smaller SaaS solution providers such as Trackvia, DabbleDB, and Zoho have more full-featured Database 2.0 offerings than Google, albeit not particularly BI/OLAP/DW-focused). A year or two ago, Google went open beta (still in that phase, actually) with a hosted database service called GoogleBase. Now, from what I've seen, GoogleBase is not a general-purpose transactional or analytical database. And it’s certainly not a DW or data mart in the clouds. Instead, GoogleBase seems to be an online repository -- or rather, depository -- into which external parties submit structured data for Google to crawl and index deeply for access from Google's big whompin' search engine.

Even more noteworthy is Microsoft's recent foray into the Database 2.0 space -- a move that some might consider a "validation" of this approach in the eyes of enterprise I&KM professionals. Microsoft has just rolled out a beta of its hosted SQL Server Data Services. The vendor has started to host services that have heretofore been available only from SaaS partners. This is, of course, a key piece of the Redmond, WA-based vendor's begrudging effort to push more solutions into a Microsoft-hosted SaaS cloud. However, from what I can see so far, Microsoft is simply hosting a subset of the functionality of its general-purpose RDBMS platform for OLTP and OLAP. It has not specifically optimized SQL Server Data Services for OLAP, as any truly scalable BI/OLAP/DW platform would be.

Back to Google for a sec. What I fully expect from them in the coming year or two -- and from every SaaS cloud everywhere before long -- are feature-complete, hosted, subscription-based DW services for high-performance, high-volume, complex analytics. Naturally, this cloud should be called DW 2.0. It should leverage the full virtualized, distributed, scalable, grid-enabled computing fabric that the Googles of this world can bring to bear on the very largest structured data sets, most resource-intensive query-processing tasks, and richest visualizations imaginable. Per Boris's suggestion, it could even serve as a supremely scalable BI, data mining, or predictive analytics "sandbox" for developers and power users who have no other speedy, cost-effective alternatives for procuring the necessary horsepower for various projects and production requirements.

I second Boris’s challenge: Google should consider integrating the Panorama OLAP-engine add-on (remember, it's just a beta) with a more analytics-enabling future version of GoogleBase (which is also still a beta). In so doing, Google -- if it eventually decides to go into full production with all this -- would be able to offer full-featured DW and BI services on a hosted platform that is as infinitely scalable as the concatenated string of Os inside the ever-extensible company name that displays within its multipage search-result screens. I also share Boris's concern that whatever hosted OLAP/BI/DW services Google eventually offers may lack enterprise-grade metadata management, data cleansing, data-source connectivity, security, and other key features.

I also expect Microsoft to evolve SQL Server Data Services in the DW 2.0 direction, an effort that no doubt would intensify if Mr. Ballmer succeeds in grabbing Yahoo. I'd like to see Microsoft cross-synthesize SQL Server Data Services with any hardware-partner-powered OLAP-acceleration approaches it may or may not be developing under its DW appliance initiative. At the very least, I'd like to see Microsoft provision some seriously scalable DW horsepower in its data center, perhaps through a partnership with Teradata.

Clearly, DW 2.0 services will need to be an order of magnitude more powerful than what we've come to expect under the first generation of SaaS-based BI/DW offerings on the market. Whether dedicated to a single customer's requirements or divvied up on a shared-tenant basis, DW 2.0 could be the biggest, baddest, most virtual DW "appliance" of them all. And it would be another key step in the progressive virtualization of the entire SOA stack, apps, middleware, hardware, and data services across the Enterprise 2.0 or Web 2.0 fabric.

Oh yes, yet another 2.0 -- or two -- for you. Wouldn’t it be interesting if Google and/or Microsoft acquired a DW appliance vendor? I would not be at all surprised if announcements such as these precipitated from the cloud of pregnant possibilities.

And is it too far-fetched to imagine that Microsoft might turn around and acquire Teradata if the Yahoo takeover falls through? My crystal ball's still a bit cloudy on the matter.

But, hey, I'm free to speculate.

Free BI!

By Boris Evelson

Now that I've caught your attention with the title -- it's not what you think. It's not about freeing BI from the constraints and limitations of corporate politics, organizational silos, and lack of proper data governance -- although that's a very worthy topic to write about.

This morning, Google will unveil a beta version of its spreadsheet application with some new advanced features, such as Pivot Table. The Pivot Table is a product developed by Panorama, a small but up-and-coming BI vendor (currently being evaluated in detail in the Forrester BI Wave '08), which was, interestingly enough, the original inventor of the Microsoft Analysis Services OLAP (Online Analytical Processing) engine. So now, part of Panorama's code will be inside products from two of the biggest software companies in the world!

With this new feature, every Google spreadsheet user will have access to powerful OLAP, as a free BI SaaS add-on to Google Docs. In my opinion -- a very wise move by Google to continue to push Google Docs into enterprises.

When I wrote a research document on the intersection of BI and spreadsheets, I had to defend my thesis that Excel is here to stay in front of several Forrester analysts and research directors. Some of them questioned the long-term viability of Excel vs. the free SaaS model of Google Apps. I still stand by what I said and wrote -- business users are much more concerned with the rich functionality, familiar interface, and ubiquity of Excel, even given the low- to no-cost Google option and its out-of-the-box collaboration capabilities. With close to 400 million Excel copies out there, Excel and, specifically, Excel for BI is here to stay.

To compete with Excel, Google spreadsheets lacked, among many other features, the specific business functionality needed to analyze vast amounts of data stored in these spreadsheets. With the free Panorama add-on, Google spreadsheet users can now take advantage of this lightweight -- but still very respectable and powerful -- OLAP engine. Yes, out-of-the-box native pivot tables in spreadsheets can do basic analysis on rows and columns of data, but Panorama takes it much further with more powerful OLAP functions like the following (I don't actually know yet what subset of full Panorama OLAP functionality is available as a Google add-on):

  • Drill up/down
  • Drill across/Drill through/Drill anywhere
  • Allocations
  • Combine/merge/split dimensions
  • Exception handling and conditional formatting
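For readers unfamiliar with the first item in the list, drilling up and down amounts to re-aggregating the same facts at a coarser or finer level of a dimension hierarchy. The sketch below illustrates that idea in plain Python. It is not Panorama's or Google's API; the sample rows, the `region > country > product` hierarchy, and the `aggregate` function are all hypothetical.

```python
from collections import defaultdict

# Hypothetical fact rows: (region, country, product, amount)
ROWS = [
    ("EMEA", "France", "Widgets", 120),
    ("EMEA", "France", "Gadgets", 80),
    ("EMEA", "Germany", "Widgets", 200),
    ("APAC", "Japan", "Widgets", 150),
    ("APAC", "Japan", "Gadgets", 50),
]
# Assumed hierarchy, coarsest level first
DIMENSIONS = ("region", "country", "product")


def aggregate(rows, depth, filters=None):
    """Sum amounts grouped by the first `depth` dimensions.

    `filters` maps a dimension name to the single value to keep,
    which is how a drill-down narrows to one member.
    """
    filters = filters or {}
    totals = defaultdict(int)
    for row in rows:
        *dims, amount = row
        if any(dims[DIMENSIONS.index(name)] != value
               for name, value in filters.items()):
            continue
        totals[tuple(dims[:depth])] += amount
    return dict(totals)


if __name__ == "__main__":
    print(aggregate(ROWS, 1))                      # top level: totals by region
    print(aggregate(ROWS, 2, {"region": "EMEA"}))  # drill down into EMEA
    print(aggregate(ROWS, 0))                      # drill up to the grand total
```

Increasing `depth` drills down, decreasing it drills up, and a filter restricts the view to the member being explored. A real OLAP engine would precompute or index these aggregates rather than rescan every row.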

Are there any immediate takeaways for our audience, I&KM pros? Probably not, since this offering is far from a large enterprise-grade reporting and analytics solution, lacking such necessary features as end-to-end metadata management, data cleansing, connectivity to all corporate data sources, and many others. However, I do hear of an increasing demand among power users for a BI "sandbox" environment, where one could perform all sorts of data analysis without being constrained by corporate security, priorities, system availability, and many other issues that typically keep IT lagging in delivering on the barrage of never-ending business requests for BI enhancements. As a result, I always advise all BI vendors to include BI "sandbox" functionality in their product offerings. An outside-the-firewall, sandbox SaaS BI application could be just the answer.

What's next? Will Google buy www.elastra.com and offer Data Warehouse and BI SaaS services on this infinitely scalable platform? Not sure, given security and WAN bandwidth concerns. But watch for a blog from my colleague, Jim Kobielus, with more on these interesting possibilities.

Regardless of what Google does next or what other BI vendors will do in response, BI users now have yet another free BI option (yes, limited, but you can't beat the price) in addition to Open Source BI tools available from Eclipse BIRT and other Open Source BI projects sponsored by Actuate, Jaspersoft, and Pentaho.

March 12, 2008

A New Kind Of Meeting Space: Agree/Disagree Floor

By Erica Driver

Have you ever been involved in meetings at which a group of people is trying to make a decision (e.g., go/no go on projects, which person to hire, which of several options is the best strategic fit, what a list of priorities should look like, etc.)? You've got 5 or 10 or more people in the room, or on the phone, and you are discussing your options and trying to reach a decision. People make points and raise their opinions and objections, and the group tries to move toward a decision or consensus. You probably know where some people stand throughout the conversation -- the extroverts, most likely. At least you do when you are together in person, rather than on the phone, because you have body language and other visual cues in addition to people's spoken words. When you're on the phone, it's a lot harder. The result of all this? It takes a long time to make decisions, and the decision is imperfect because undoubtedly not everyone's point of view was incorporated. And sometimes the train of reasoning is lost as the conversation flows.

The MIT Media Lab has developed a solution as part of the work it is doing on new kinds of meeting environments (which the lab calls Information Spaces). Information Spaces take advantage of what's new and different about 3D immersive workspaces -- such as people representing themselves with avatars and being able to move around in space. We're not talking about a 3D replica of a corporate conference room, with a dining room-style table in the middle, surrounded by a bunch of cushy chairs. Check this out: One of the lab's efforts is an agree/disagree meeting space. It looks a bit like a football field, with a red-to-green agree/disagree space in the middle, where avatars can stand to express their viewpoint, and a gray area around the outside where avatars can stand without having to express an opinion.

A few highlights: As avatars move through the space, their position is visualized by a trail left behind that shows where they've come from and how long ago they moved. The space above the floor shows the text chat history. Messages float up and the older ones are the farthest off the ground. To capture the flow of opinion during the course of the meeting, an "average vote bar" rises up above the floor, like the chat messages -- with oldest views being highest off the ground (the red bars in the sky in this snapshot). A cylinder appears beneath an avatar that has been standing in the same place for a long time, and grows slowly over time. When the avatar moves, the cylinder slowly fades away. My take: it would take a few minutes for a Second Life resident to figure out how to use this new meeting space, but once you're over that hurdle the value could be enormous.

Thanks to Richard Hackathorn of Bolder Technology, Inc. for pointing me to this Second Life location. And here is the Second Life URL, if you want to check out a demo yourself.

March 10, 2008

AIIM Show: Still Serving Core Imaging Needs

By Craig Le Clair

I went to the AIIM conference in Boston last week. My first AIIM show was in 1993, when the ratio of demos to production systems was about a billion to one. For the historians out there, the 1993 show in Chicago had over 33,000 attendees. New optical disk jukeboxes and digital scanners were the rage. So it was good to see how far the industry has come in providing mature and productive solutions. Yet AIIM is still something of a chaotic, disorganized, vendor-feeding frenzy that seems to somehow work for most attendees.

It's probably the Boston convention center and not AIIM's fault, but is it really so hard to have something available to eat before 11 AM? I gave a talk on ECM Strategy Tuesday morning and wrongly assumed some protein would be available. I was not looking for something as complicated as an egg sandwich, just perhaps a donut. The Dunkin' Donuts cart seemed to draw more interest than any booth, with an impossible line and very poor inventory.

The AIIM show has many moving parts, but looking around it was easy to see some differences from last year. There were more search vendors, more focus on content analytics, a stronger records management presence, and, amazingly, a host of new or repositioned imaging and workflow solutions from small companies. It's clear not everyone got the memo on consolidation in the industry.

There was a better overall balance and synergy between the On-demand and the AIIM side than in previous years. Perhaps it was just being in the same room, or perhaps fewer overall exhibitors. Whatever the reason, the big iron was drawing a crowd, with the noise keeping anyone from having a civil conversation nearby. DOM software suppliers seemed to have good traffic. Overall the DOM crowd had a lift in their step.

It also seemed that the focus for many ECM software providers was as much on VARs and integrators as on end users, perhaps reflecting the importance of providing more complete ECM solutions. For the ECM mid-market, the show was more an opportunity to retain and recruit resellers than to lure end users into the booth. The Microsoft partnership pavilion looked pretty quiet whenever I looked. Perhaps partners or users were at the SharePoint conference awkwardly run at the same time.

I also noticed a lot of new providers for the little things, such as document conversions and new twists on OCR, along with smallish ECM providers and a stronger emphasis on ECM outsourcing: SaaS providers and even outsourcers like SourceCorp. Apps like invoice processing were more prominent. ReadSoft had a large presence and now has 2,500 invoice processing customers. Kofax and others are ramping up on a winning application.

All in all, a pretty typical AIIM show which refreshingly lacked the normal technical buzz. This is — at the end of the day — the AIIM crowd where BPM is still a bit abstract. These are mostly real folks solving real problems created by paper. The printers on one side of the show crank it out faster every year while folks on the other side try to keep up with it — running scan centers, getting productivity from OCR, managing records and wondering why the pay scales are so much higher in IT.

The AIIM show always brings out hundreds of people who know very little about the industry. They wander around, looking like George Bush at a press conference, and wonder what all the fuss is about and why everything seems to work so well on the show floor. I'll give AIIM credit for not losing sight of these roots. Despite content analytics, enterprise search, social networks, Web 2.0 influences, and other things that will make ECM a lot better in the next decade, most AIIM attendees get up every morning to fight paper-based processes that seem to defy automation. And for these heroes, AIIM is still a relevant experience.
