SharePoint For The Enterprise

By Craig Le Clair

Forrester recently surveyed 233 IT decision-makers who have plans to implement or upgrade to at least some part of MOSS 2007 and asked: "Which of the following best describes your organization's time line for implementing or upgrading to Microsoft Office SharePoint Server?" The results? 21% will upgrade immediately and 41% will do so within 6 months.

With this level of adoption, the issue of scalability comes up more and more. In one sense, you have architectural concerns with any solution that scales horizontally and uses banks of load-balanced Web servers, application servers, and clusters of SQL servers on the back end. Add high availability and you quickly get a complex environment. To Microsoft's credit, there is quite a bit available on performance guidelines. But looking through these, and coping with notions of site collections, lists, file arrangements, performance of folder hierarchies versus flat files, and automatic versus manual partitioning, the bottom line seems to be that even on the new 64-bit architecture with 4 screaming Intel processors and SQL Server 2005, the upper limit of the content repository is 500 GB.

Assuming allocation for index data and other overhead, this really allows fewer than 20 million objects to be reasonably managed. For enterprise solutions that may ingest 50,000 documents a day, this would last about a year. Basically, these are not huge numbers. Contrast this -- just for kicks -- with the DocHarbor repository, which was originally built for indexing computer output like customer statements. For statement archiving, ingestion runs done at month-end can be 10 million. Meeting the index management requirement meant building configurable hashing algorithms that can create up to 64K database tables. Anacomp, owner of the DocHarbor platform, claims 1 billion objects managed in their high-end IBM Unix environment.
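
As a rough check on the figures above, here is a minimal Python sketch. The 20 million object ceiling, the 50,000 documents a day, and the 64K table count come from the text. Everything else is an assumption for illustration: the function names, ingestion on every calendar day, and the CRC32-based routing. In particular, this is not DocHarbor's actual hashing scheme, which is not described here.

```python
import zlib

TABLE_COUNT = 65_536  # "64K" index tables, per the text


def days_until_full(max_objects: int, docs_per_day: int) -> float:
    """Days of steady ingestion before the repository holds max_objects."""
    if docs_per_day <= 0:
        raise ValueError("docs_per_day must be positive")
    return max_objects / docs_per_day


def index_table_for(document_key: str, table_count: int = TABLE_COUNT) -> str:
    """Route a document key to one of table_count index tables.

    Hypothetical scheme: CRC32 of the key, modulo the table count.
    The same key always lands in the same table.
    """
    bucket = zlib.crc32(document_key.encode("utf-8")) % table_count
    return f"idx_{bucket:05d}"


# Figures from the text: ~20 million manageable objects, 50,000 documents a day.
runway_days = days_until_full(20_000_000, 50_000)
print(f"{runway_days:.0f} days, or about {runway_days / 365:.1f} years")
# prints: 400 days, or about 1.1 years

# Example routing for a made-up statement key.
print(index_table_for("statement-2008-03-account-000123"))
```

Spreading index rows over many small tables keeps each table manageable, which is the general idea the hashing approach points at; the right table count and key choice would depend on the workload.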

The clear message from Microsoft is that MOSS 2007 can be an enterprise solution but firms contemplating this should plan carefully. The point for companies adopting SharePoint to consider is the degree of complexity and attendant soft costs required to manage this environment as it grows. I fear that firms will be seduced by the ease of rolling out SharePoint and at the end of the day may prematurely push this platform too far. But let's be fair. All platforms at the "enterprise" level of performance are complex and require strong engineering to maintain, whether it be EMC, IBM, OpenText, Hyland Software, or others. Yet, these products have had years dealing with approaches to keep table sizes manageable, optimize database schemas, index documents at high rates, and maintain business continuity.

Companies that have these mature environments in place must compare the complexity of scaling with SharePoint to leveraging the back-end investment and skilled people they already have. Many will choose to link that back end to the elements of SharePoint that make sense.

I'm doing an AIIM Webinar on April 9 that will discuss this. I'd be very interested in any comments from enterprises -- or suppliers -- with experience wrestling with this decision or that have had success or issues with trying to scale SharePoint. Our recent adoption data makes this a very topical issue for many firms.

Comments

re: SharePoint For The Enterprise

Plenty of content, tools, and guidance for SharePoint capacity planning and performance management at: http://blogs.msdn.com/sharepoint/archive/tags/Capacity+Planning/default.... and http://blogs.msdn.com/sharepoint/archive/tags/Performance/default.aspx

re: SharePoint For The Enterprise

There are no hard limits set on the size of a SharePoint 2007 content database. The key considerations relate to the manageable size of a database from an administration perspective; this includes issues such as the length of time taken to back up or restore a large content database. Check this blog posting for further information: http://blogs.msdn.com/joelo/archive/2006/08/01/how-large-for-a-single-sharepoint-content-database.aspx

re: SharePoint For The Enterprise

The point you bring up about a 50,000-document day (which translates easily into an annual volume of, say, 13 million) is your indication of an "Enterprise". So I'm assuming that this level and up is what you’re talking about. You have another gentleman from EMC pointing at this opinion as evidence (implied, I would say) of SharePoint not being "Enterprise" enough. I would expect this from EMC, which is probably the most threatened by the increasing popularity of the latest SharePoint release, and so this gentleman is protecting his turf by injecting a little thought leadership. Because existing long-time favorites like Documentum were never focused on the collaborative activities surrounding the production of document-based content (their job was to address storage and indexing at the levels you allude to), SharePoint has now repositioned the selling points for document systems to cover both collaboration and storage, along with integrating very good development environments for workflow, applications, and business intelligence. Companies that are EMC customers are also facing a demand for SharePoint that they can no longer parry (the potential productivity and organizational benefits are far too compelling). So EMC is now offering up conceptual “Architectures” where SharePoint is viewed only as a means to get documents produced and eventually into EMC’s Documentum product (which, as anyone can easily figure out, also drives demand for storage and storage management). As analyst firms do, they cater to the IT soul who needs simple answers to very complex questions and has neither the background (technical or problem-domain) nor the time to develop the answer himself.
So the 500 gig thought is an easy one to digest and remember, even though it is misleading for several reasons:
• Even in products such as Documentum, there is never really a single physical repository, for the same reasons SharePoint is not architected that way.
• As per Microsoft’s documentation, the 500 gig thing is an administrative limit, not a physical one. SharePoint document systems are usually architected to spread document storage over multiple “content” databases. This is driven by many factors, both technical and business-related.
• This is really a “non-answer”: how document repository solutions arrive at a physical storage architecture (whatever the underlying technologies are) is a very complex, multi-part question, and the conclusion an opinion like this invites skips the thought required to come up with a suitable answer for any given document solution.
There’s this notion of “Enterprise” out there in the analyst and industry press that is used to fend off consideration of other technologies or products. It is mostly based on past experience, adulation of one vendor over another, or widely held beliefs that build up over time. The trouble is (particularly in what I would call the typical IT marketplace) that our world changes so frequently, due to new technologies, innovation, and the never-ending downward trend in price points, that comfortable human notions like widely held beliefs and accumulated wisdom need to be tossed out and rebuilt almost on an annual basis.
So, I think you give SharePoint short shrift as a document platform, while conveniently not mentioning the almost total lack, in Documentum and similar products, of a productive front end for collaboration and information delivery, which is what is driving the lust for SharePoint.
The gentleman from EMC identifies some viable approaches for companies faced with supporting a legacy document store while addressing the demand for SharePoint, but misleads one into believing that a document's stay in SharePoint is transient or short-term. The potential here is that documents that are no longer useful from a content standpoint (but need to be retained and locatable from a historical or legal standpoint) will persist in the “Enterprise” document store, while the usable, “working level” content remains close to the collaboration facilities and mechanisms. That would be a useful message to deliver, rather than only pointing out the scale issue for a small population of companies that have a massive document management problem to solve, and anointing that as the “Enterprise” example.

re: SharePoint For The Enterprise

MS's recommendation for an upper limit to a site collection is 100. Each of these can have its own content database. Each DB is recommended to be 100 GB, but we have heard of databases of 300-400 GB. The limit here is SQL, not MOSS: performance and backup times to meet SLAs. So you could theoretically have 100 * 100 GB = 10 terabytes of content stored across a scaled-out farm.
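
The multiplication above can be sketched in a few lines of Python. The 100 databases and the 100 GB and 400 GB sizes come from this comment; the function name and the decimal conversion (1 TB = 1,000 GB) are assumptions for illustration.

```python
def farm_capacity_tb(database_count: int, gb_per_database: float) -> float:
    """Total content capacity in decimal terabytes (1 TB = 1,000 GB)."""
    return database_count * gb_per_database / 1000


# Recommended sizing from the comment: 100 databases of 100 GB each.
print(farm_capacity_tb(100, 100))  # prints: 10.0

# Upper end of the database sizes reportedly seen in practice.
print(farm_capacity_tb(100, 400))  # prints: 40.0
```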