Microsoft SharePoint can be a dumping ground for content, like a file share, or it can be a powerful tool for giving content structure. When documents have structure (terms that define the content, also known as metadata), SharePoint provides powerful capabilities to improve findability, content navigation, content presentation, and content usability.

We find that many companies have thousands, if not hundreds of thousands, of documents sitting in file shares and other legacy applications. Typically, these documents sit in folders nested several levels deep. This puts organizations at risk, because the documents may be out of policy relative to compliance standards and retention schedules, and they are typically hard to navigate and search.

Bringing structure to content is not simply a matter of adding metadata to documents. To truly get value out of unstructured data, there is a model that can help not only automatically tag and define content, but also move it into SharePoint. There are tools, such as Smart Logic, that support this activity.

However, it all starts with governance. The taxonomy and the process for classifying, tagging, and moving content into SharePoint have to be built on a strong governance model. That model defines who owns each element of the effort, such as the taxonomy itself, the confirmation of classification and tagging results, and the provisioning and management of SharePoint sites.

The five key steps for bringing unstructured content into SharePoint and applying structure are:

  1. Set Up Governance Processes and Standards
  2. Define Taxonomy
  3. Discover Content
  4. Classify, Tag, and Confirm Content
  5. Migrate into SharePoint with Tags

Tools can help automate this process by consuming the taxonomy and essentially querying content to look for exact or like tags. The tools then provide a report to examine. We find that when tools “learn” and improve their accuracy during this iterative process, classification gets better with each pass.
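As a rough illustration of what "exact or like" matching against a taxonomy could look like, here is a minimal Python sketch. It is not how any particular product works. The taxonomy contents, the `suggest_tags` function, and the 0.8 similarity threshold are all assumptions made for the example, and the "like" match uses simple string similarity from the standard library rather than anything a commercial classifier would rely on.

```python
import re
from difflib import SequenceMatcher

# Hypothetical taxonomy: term -> labels that indicate the term (illustrative only).
TAXONOMY = {
    "Invoice": ["invoice", "bill"],
    "Contract": ["contract", "agreement"],
    "Retention Policy": ["retention policy", "retention schedule"],
}


def suggest_tags(text, taxonomy=TAXONOMY, threshold=0.8):
    """Return {term: (match_type, score)} for taxonomy terms found in text.

    match_type is "exact" when a label appears verbatim and "like" when a
    word sequence is merely similar to a label (score >= threshold).
    """
    words = re.findall(r"[a-z]+", text.lower())
    results = {}
    for term, labels in taxonomy.items():
        best = None
        for label in labels:
            n = len(label.split())
            # Compare the label against every window of n consecutive words.
            for i in range(len(words) - n + 1):
                candidate = " ".join(words[i:i + n])
                if candidate == label:
                    kind, score = "exact", 1.0
                else:
                    kind = "like"
                    score = SequenceMatcher(None, candidate, label).ratio()
                if score >= threshold and (best is None or score > best[1]):
                    best = (kind, round(score, 2))
        if best is not None:
            results[term] = best
    return results


if __name__ == "__main__":
    sample = "Please find the attached invoise and the signed agreement."
    # Print a small report a reviewer could examine.
    for term, (kind, score) in suggest_tags(sample).items():
        print(f"{term}: {kind} match (score {score})")
```

In this sketch the misspelled "invoise" is reported as a "like" match for Invoice, while "agreement" is an exact match for Contract. The report is what a person would review and confirm before any tags are applied.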

Additionally, some tools can intercept the SharePoint publishing process, query a document, and recommend tags for the user to accept or change.
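The accept-or-change step can be pictured with a small Python sketch. It does not call any SharePoint API. The `review_tags` function and the shape of the `decisions` mapping are hypothetical, chosen only to show how recommended tags and a user's choices might be combined before a document is published.

```python
def review_tags(recommended, decisions):
    """Combine recommended tags with a user's decisions.

    decisions maps a recommended tag to True (accept), False (reject),
    or a string (replace with that tag). Tags with no decision are accepted.
    """
    final = []
    for tag in recommended:
        decision = decisions.get(tag, True)
        if decision is True:
            final.append(tag)
        elif isinstance(decision, str):
            final.append(decision)  # user changed the tag
        # decision is False: the recommended tag is dropped
    return final


if __name__ == "__main__":
    recommended = ["Invoice", "Contract", "Draft"]
    decisions = {"Contract": "Agreement", "Draft": False}
    print(review_tags(recommended, decisions))
```

Accepting by default keeps the user's effort low, which matters if the recommendation appears on every publish. A stricter governance model might instead require an explicit decision for each tag.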

The ultimate goal is to provide structure to content so that the content exists in the context of how business users might access it. Users might use faceted browsing, keyword search, or, in SharePoint terms, views that group data by some model (role-based, status-based, activity-based, etc.).
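To make the payoff of tagging concrete, the following Python sketch shows how metadata enables grouped views and facet counts. The documents, the field names (`status`, `role`), and both helper functions are invented for illustration. In SharePoint itself this would be handled by views and search refiners rather than custom code.

```python
from collections import Counter, defaultdict

# Hypothetical tagged documents (illustrative metadata only).
DOCUMENTS = [
    {"name": "q1-report.docx", "status": "Final", "role": "Finance"},
    {"name": "q2-report.docx", "status": "Draft", "role": "Finance"},
    {"name": "onboarding.pdf", "status": "Final", "role": "HR"},
    {"name": "notes.txt", "role": "HR"},  # no status tag
]


def group_by(docs, field):
    """Group document names by a metadata field, like a grouped view."""
    groups = defaultdict(list)
    for doc in docs:
        groups[doc.get(field, "(untagged)")].append(doc["name"])
    return dict(groups)


def facet_counts(docs, field):
    """Count documents per value of a field, like a search facet."""
    return Counter(doc.get(field, "(untagged)") for doc in docs)


if __name__ == "__main__":
    print(group_by(DOCUMENTS, "status"))
    print(facet_counts(DOCUMENTS, "role"))
```

The "(untagged)" bucket shows why the classification and confirmation steps matter: documents without metadata drop out of every meaningful group.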

The great thing is that we have seen organizations discover, classify, tag, and migrate vast amounts of content into SharePoint successfully. This brings incredible value to documents that were previously buried inside file shares.