What is ECM?
Written by: Margriet Bruggeman, Nikander Bruggeman.
May 1, 2010
When we first started working with SharePoint, we were working with the beta version of SharePoint Portal Server 2001 to implement an intranet for a medium-sized law firm. At that time, SharePoint consisted of a digital dashboard, the newest version of MS Search and a bunch of document management features (and a really groovy Windows shell extension that made it possible to interact with SharePoint using Windows Explorer, which we still kind of miss to this day). Our customers used SharePoint Portal Server 2001 mainly as a document management system, migrating documents from their file servers to a SharePoint repository, although it wasn't a very scalable one at the time.
After working intensively with SharePoint Portal Server 2001, we had the honour of being among the first eleven SharePoint Most Valuable Professionals (MVPs) worldwide. Imagine our surprise when we were invited to the MVP summit in Redmond and learned about the new features of SharePoint Portal Server 2003: it had arguably fewer document management features than SharePoint Portal Server 2001 and its architecture had been overhauled completely, switching its repository from a light version of the Exchange Web Storage System (yes, that was what WSS meant originally) to SQL Server. Another thing became quite clear as well: SharePoint Portal Server wasn't just a document management product, it was a portal product, and that comprises a lot more. A portal offers a collection of content and application services to employees, customers, partners, and/or suppliers via internet, intranet, and/or extranet. Or to put it another way: the goal of a portal is to enhance collaboration and efficiency. A funny thing happened to end users too. Suddenly they weren't normal employees anymore, they became information workers. A typical portal product delivers on its promise by offering features in the following areas:
- Legacy application life cycle. Portals may extend the life cycle of legacy applications, because they offer a centralized place where legacy applications may be reused.
- Consistent user interface. A consistent user interface throughout an entire portal lowers training costs and makes it easier for end users to find information, which makes end users more productive.
- Web-based access. Web-based access is an ideal starting point for reaching the goal of working anywhere, anyhow.
- Personalization. Portals typically offer personalized user experiences and content.
- Customization. A portal product should be (and normally is) highly customizable according to the customer's needs.
- Syndication. Portals reuse content from other locations, a concept known as syndication.
- Aggregation. Closely related to Syndication, portals allow the aggregation of content and services via a universal access point.
- Search. Portals offer advanced search mechanisms that allow information workers to search, at a minimum, within the portal. Usually a portal search mechanism will allow you to search other locations within the enterprise as well.
- Taxonomy. Stemming from the Greek word "taxis", meaning "order", taxonomy allows you to structure unstructured or semi-structured content.
- Business intelligence. Business intelligence is about handing companies advanced tooling for analyzing business data.
- Integrating external content. This refers to any tooling a portal product has to integrate external content that has to stay external, but at the same time can be manipulated via the portal.
- Collaboration. A portal absolutely has to provide tools that make it easier for information workers to work together and share information with each other.
- Publishing. This refers to tooling offered by the portal to publish content.
- Subscription. In today's world of information overload, it is essential that a portal offers some type of subscription mechanism that allows information workers to stay up to date without requiring them to log in to the portal just to check if something is new.
- Workflow. Portals handle semi-structured content, and workflows related to this content (such as a simple approval workflow) are a normal part of the activities of information workers.
- Business Process Integration. Business Process Integration is all about connecting enterprise applications by streamlining the transfer of business information. Business Process Integration in itself doesn't belong to the realm of a portal, but it is important that a portal is able to integrate with a Business Process Integration product. In the case of SharePoint, this product is BizTalk Server.
- Multimedia distribution. Any multimedia assets owned by an enterprise can be shared via a portal. Ideally, a portal offers some specific features for working with specific multimedia types as well.
- Security. Information should be disclosed via a portal in a secure way, using fine-grained authorization techniques.
- Administration. A portal that offers a centralized and unified way of administering it can cut costs tremendously.
- Synergy. The term synergy may be a bit vague. It is about the whole (the portal) being more than the sum of its separate parts (the portal features). Bringing all separate portal features together as a whole adds value.
As you have seen, a document management product is quite a different beast from a portal product. In its 2007 release, the contours of something else became visible. SharePoint Portal Server became more than a portal product: it was the first release that was able to compete in the Enterprise Content Management (ECM) market. This has become even clearer in SharePoint Server 2010. Not only does it show in the feature set that is available out of the box, even the word "Portal" is dropped from its name.
Most contemporary sources about SharePoint acknowledge that SharePoint is an ECM platform and that its goal is to deliver ECM for the masses, but they fail to provide a good explanation of what ECM is exactly and why ECM is something more than a portal. This has led us to believe that a discussion of what ECM is really all about would be a good idea for the blog post you are reading right now.
So: what is Enterprise Content Management?
First of all, let's be clear about this: the functionality offered by a portal has great overlap with the functionality offered by an ECM product. You're not entering a completely new realm of solutions here.
Now, let's take a step back and get down to the basics… Most companies store data in a way that matches the existing structure or architecture of the business applications in use. Ultimately, this leads to a data structure that matches the structure of the company itself. The sales department then uses sales data, the software department uses data about software design, and so on. The result is a set of department-oriented data structures. This means that data consumption is optimized for departmental use, but not for multi-department use. This leads to a fragmented data landscape, which in turn leads to data redundancy. A prominent trait of redundant data is that it isn't synchronized. In general this means that when a document is modified in one location, the copies of that document stored elsewhere are not modified along with it, which leads to a loss of information quality. And, needless to say (but we're going to do it anyway), that's a bad thing.
To battle the loss of information quality, companies typically create procedures intended to prevent it. That leads to hidden costs that result from information fragmentation and process redundancy. It's not rare for this to end in a "fix it back" scenario, where one procedure requires data to be fixed, whereas another procedure demands the opposite and results in the data being changed back to its original state.
You can certainly say that companies are managing digital information and have been trying to do this for years. One of the things that has become quite clear is that integrating data per application is not the best long-term strategy. Doing this forces companies to keep data within the company semantically consistent, which leads to sky-high data maintenance costs. On the other hand, companies increasingly need to have information available within the entire company, not just within a department. Defining information metadata plays a crucial role in doing this, although it's not the only way.
EIM, which stands for Enterprise Information Management, is a framework that offers the possibility to develop and execute an enterprise-wide strategy for information management. EIM is a collection of disciplines, technologies, and solutions that help to create and maintain a consistent interpretation of business data that can be consumed by everyone (users, applications, and services). EIM focuses on structured data, the data stored in business applications, as well as unstructured data, such as Word or Excel documents, pictures, videos, and paper documents. Usually this type of information is scattered across a company like a diaspora. The management of unstructured data is also known as ECM, which stands for Enterprise Content Management.
As a side note, software is usually the primary (and more suitable) consumer of structured data, whereas humans are usually the primary (and more suitable) consumer of unstructured data.
If we need to come up with a definition of ECM, it would look something like this: ECM is an approach that tries to control all unstructured data within a company, which is accomplished by adding structure to unstructured data, most notably in the form of metadata. ECM is more than just a collection of products and technologies; it's also an architecture that allows companies to manage and reuse content.
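To make the idea of "adding structure through metadata" a bit more tangible, here is a minimal Python sketch. It is only an illustration, not how SharePoint or any other ECM product stores metadata: the ContentItem class, the find_by_metadata helper, the file paths, and the customer names are all hypothetical and made up for this example.

```python
from dataclasses import dataclass, field


@dataclass
class ContentItem:
    """An unstructured document plus the metadata that gives it structure."""
    path: str                                   # hypothetical file location
    metadata: dict = field(default_factory=dict)


def find_by_metadata(items, **criteria):
    """Return the items whose metadata matches every given key/value pair."""
    return [
        item for item in items
        if all(item.metadata.get(key) == value for key, value in criteria.items())
    ]


# Hypothetical content from two departments, described with shared metadata fields.
repository = [
    ContentItem("sales/offer-q2.docx", {"customer": "Contoso", "type": "offer"}),
    ContentItem("legal/contract-17.pdf", {"customer": "Contoso", "type": "contract"}),
    ContentItem("sales/offer-q3.docx", {"customer": "Fabrikam", "type": "offer"}),
]

# One query now cuts across departments, because the metadata is shared.
contoso_items = find_by_metadata(repository, customer="Contoso")
print([item.path for item in contoso_items])
```

The documents themselves stay unstructured; it is the shared metadata fields that make them findable and reusable outside the department that created them.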
As we've seen previously, the goal of a portal is to make collaboration easier and more efficient by offering a collection of content and application services. This goal is very similar to the goal of ECM, which tries to make it possible to collaborate on a multi-departmental level, do this in a more efficient way, and offer access to unstructured data within the company. This leads to a repeating pattern: a portal strategy and an ECM strategy are usually implemented at the same time using the same platform. Or, to put it in other words, most companies don't really distinguish between their portal and ECM strategies.
To debunk one of the myths you may have heard surrounding ECM: your ultimate ECM goal shouldn't be to collect all enterprise content within a single, centralized system, it should be to create a uniform interface that allows you to access all information.
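The difference between "one central system" and "one uniform interface" can be sketched in a few lines of Python. This is a simplified illustration under our own assumptions: the Repository protocol, the InMemoryRepository stand-in, the UnifiedAccess class, and the sample titles are hypothetical names invented for this post, not part of any product API.

```python
from typing import Protocol


class Repository(Protocol):
    """What the uniform interface expects from every content source."""
    name: str

    def search(self, term: str) -> list[str]: ...


class InMemoryRepository:
    """Stand-in for a real system such as a file share or a document library."""

    def __init__(self, name, titles):
        self.name = name
        self._titles = titles

    def search(self, term):
        return [title for title in self._titles if term.lower() in title.lower()]


class UnifiedAccess:
    """One entry point that queries every registered repository in place."""

    def __init__(self, repositories):
        self._repositories = list(repositories)

    def search(self, term):
        # Content stays where it is; only the query is fanned out.
        return [
            (repo.name, title)
            for repo in self._repositories
            for title in repo.search(term)
        ]


portal = UnifiedAccess([
    InMemoryRepository("file-share", ["Invoice 2009-118", "Holiday schedule"]),
    InMemoryRepository("document-library", ["Invoice template", "Project plan"]),
])
print(portal.search("invoice"))
```

Nothing is migrated or copied here. Each repository keeps its own content, and the caller only ever talks to the single access point.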
ECM, similar to what we saw when discussing portals, has many facets, and we will discuss them all:
- DM, stands for Document Management and addresses features such as check in/check out, document security, and versioning.
- WCM, stands for Web Content Management, an approach to manage content for web applications. This also used to be known as: "automating the web master".
- RM, stands for Records Management. Records management technology allows you to manage the life cycle of a piece of information. RM allows you to manage content from the moment of creation up to the moment of destruction, addressing topics like archiving and relationships between content (such as compound documents). Companies use RM to be able to comply with laws and regulations. RM can also apply to "soft information" like e-mail and instant messages.
- Document capture and imaging, this facet is about processing paper documents. For instance, you can scan paper documents, have them processed by OCR (Optical Character Recognition), and store the result in SharePoint.
- Document-centric collaboration, this facet focuses on sharing documents within project teams and stimulating intra-team activities. Examples of document-centric collaboration features are calendars, forums, whiteboards, instant messaging, presence awareness, surveys, and project management.
- Workflows and work process management, this facet is about automating business processes. Document-centric workflows, like sending a document for review and then publishing it, are ideal for SharePoint.
- Information retrieval, also known as ESS (Enterprise Search Strategy), focuses on searching enterprise content.
- ECI, stands for Enterprise Content Integration and acts as a software layer between applications and content repositories that allows companies to integrate heterogeneous environments.
- Federation, also known as Single Sign On (SSO), allows end users to log in only once and access every piece of information they want without ever having to log in again.
- IDARS, stands for Integrated Document Archive & Retrieval Systems. IDARS systems are specialized in maintaining static information like invoices and reports (documents that have been finalized, not the dynamic kind you can generate using technology such as Reporting Services).
- Forms capture and processing, basically an extension to Document capture and imaging. This facet also uses recognition technologies to process information, only it goes a step beyond simple recognition. Forms capture and processing technologies recognize content parts and are able to extract key fields such as a customer ID which is compared automatically with a customer database.
- DAM, stands for Digital Asset Management and concentrates on managing rich media, particularly the image assets within an organization. Typical DAM features include showing a thumbnail of an image asset in an image library, allowing thin clients such as a browser to perform simple graphical actions such as resizing or cropping images, and running batch jobs that convert a set of images to another format.
- MAM, stands for Media Asset Management, a subset of DAM. MAM focuses on video and audio rich media assets.
- COLD/ERM, stands for Computer Output to Laser Disk/Electronic Reports Management. This facet focuses on high-volume computer-generated reports. COLD/ERM typically offers indexing and report mining capabilities.
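Of all these facets, forms capture and processing is probably the easiest one to illustrate in code. The Python sketch below assumes the recognition step has already turned a scanned form into plain text, and it only shows the "extract a key field and compare it with a customer database" part. The ID layout ("C-" followed by five digits), the KNOWN_CUSTOMERS dictionary, and the extract_customer function are assumptions we made up for this example; a real product would use far more robust recognition and a real database.

```python
import re

# Hypothetical lookup table standing in for a customer database.
KNOWN_CUSTOMERS = {"C-10042": "Contoso", "C-20817": "Fabrikam"}

# Assumed ID layout for this sketch: "C-" followed by five digits.
CUSTOMER_ID_PATTERN = re.compile(r"Customer ID:\s*(C-\d{5})")


def extract_customer(recognized_text):
    """Pull the customer ID out of recognized text and look it up.

    Returns (customer_id, customer_name); either part is None when not found.
    """
    match = CUSTOMER_ID_PATTERN.search(recognized_text)
    if match is None:
        return None, None
    customer_id = match.group(1)
    return customer_id, KNOWN_CUSTOMERS.get(customer_id)


# Hypothetical output of the recognition step for one scanned order form.
scanned_form = "Order form\nCustomer ID: C-10042\nAmount: 120"
print(extract_customer(scanned_form))
```

An ID that is recognized but unknown comes back with None as the name, which is the kind of case you would typically route to a person for manual review.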