What's happening in data archiving? It's exciting to see that archiving structured data alongside unstructured, document-centric records is becoming a reality for many organizations - and delivering tangible benefits back to their bottom line.
This value is well articulated in a recent Paragon Business Solutions blog post, 5 Key Ways Structured Archiving Delivers Enterprise Value. The post mentions several approaches to archiving structured data that are worth bringing to the forefront again for discussion.
Full Schema Archiving
Structured data is often considered to be information captured in a complex relational database. Full schema archiving essentially takes the entire database, with its data, and transforms it into a format that the archive system supports. Depending on the enterprise archive solution, the data may be searchable via business queries, and a search may return both structured and unstructured records. Such mixed results can be alarming to archive users who were never familiar with the line-of-business application. That is the trick in preserving and archiving records: they must be understood as records if they are to have meaning and deliver that value.
When assessing systems and records for archiving, it is imperative to define the business record or records alongside how users and archivists will pull those records out of the system. With full schema and even partial schema archiving, organizations must archive the meaning of the data. The meaning of the data is a representation and explanation of the backend tables, fields, relationships, links and other tricks DBAs have used to make the business user's life easier when capturing or entering data. When done well, full schemas may be flattened to CSV or XML files and stored alongside instructions for reconstituting the records when required in support of legal, regulatory, financial or other audits.
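As a rough illustration of flattening a full schema while keeping its meaning, the sketch below exports every table of a SQLite database to CSV and writes the table definitions next to the data. SQLite, the function names archive_full_schema and build_demo_db, and the equipment and work_order tables are all assumptions made for this example; a real archive platform will have its own export format and tooling.

```python
import csv
import sqlite3
import tempfile
from pathlib import Path


def archive_full_schema(conn, out_dir):
    """Flatten every user table to CSV and save the schema DDL alongside it."""
    out_dir.mkdir(parents=True, exist_ok=True)
    # sqlite_master holds the CREATE statements: tables, keys, relationships.
    tables = conn.execute(
        "SELECT name, sql FROM sqlite_master "
        "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' ORDER BY name"
    ).fetchall()
    # The DDL stands in for the "meaning of the data" needed to rebuild records.
    ddl = ";\n".join(sql for _, sql in tables) + ";\n"
    (out_dir / "schema.sql").write_text(ddl, encoding="utf-8")
    for name, _ in tables:
        cursor = conn.execute(f'SELECT * FROM "{name}"')
        with open(out_dir / f"{name}.csv", "w", newline="", encoding="utf-8") as f:
            writer = csv.writer(f)
            writer.writerow([col[0] for col in cursor.description])  # header row
            writer.writerows(cursor)
    return [name for name, _ in tables]


def build_demo_db():
    """Hypothetical maintenance database, used only for illustration."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE equipment (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE work_order (
            id INTEGER PRIMARY KEY,
            equipment_id INTEGER REFERENCES equipment(id),
            summary TEXT
        );
        INSERT INTO equipment VALUES (1, 'Pump A');
        INSERT INTO work_order VALUES (10, 1, 'Calibrate sensor');
        """
    )
    return conn


if __name__ == "__main__":
    out = Path(tempfile.mkdtemp())
    print(archive_full_schema(build_demo_db(), out), "archived to", out)
```

The schema.sql file here is only the minimum; in practice the reconstitution instructions would also explain business rules and relationships that the table definitions alone do not capture.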
Table or Partial Schema Archiving
Table or partial schema archiving is what you would expect: defining records based on a specific table or slice of the full database. In supply chain manufacturing you might have an instrument calibration and maintenance system such as Maximo, ProCal or a similar product. Perhaps the record is defined in terms of plant floor equipment and work orders that can be extracted as a partial schema, or cut, of the whole database. Partitioning, both horizontal (selected rows) and vertical (selected columns), limits the size of an otherwise behemoth database by taking only what is needed, in a way that still allows queries to run on the data while it is in the archive. Just as in a full schema approach, the record and data model need to be well defined and preserved alongside the data and records themselves.
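To make the horizontal and vertical cuts concrete, here is a minimal sketch that copies only selected columns of selected rows from a source database into a separate archive database that can still be queried. It assumes SQLite on both sides; the archive_partition function, the work_order table and its columns are hypothetical names chosen for the example.

```python
import sqlite3


def archive_partition(source, archive, table, columns, where, params=()):
    """Copy chosen columns (vertical cut) of matching rows (horizontal cut)."""
    col_list = ", ".join(f'"{c}"' for c in columns)
    rows = source.execute(
        f'SELECT {col_list} FROM "{table}" WHERE {where}', params
    ).fetchall()
    # SQLite accepts untyped columns, which keeps the sketch short.
    archive.execute(f'CREATE TABLE IF NOT EXISTS "{table}" ({col_list})')
    placeholders = ", ".join("?" for _ in columns)
    archive.executemany(
        f'INSERT INTO "{table}" ({col_list}) VALUES ({placeholders})', rows
    )
    archive.commit()
    return len(rows)


def build_demo_source():
    """Hypothetical work-order table, used only for illustration."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE work_order "
        "(id INTEGER, equipment_id INTEGER, closed_on TEXT, notes TEXT)"
    )
    conn.executemany(
        "INSERT INTO work_order VALUES (?, ?, ?, ?)",
        [
            (1, 7, "2018-03-01", "replaced seal"),
            (2, 7, "2019-11-15", "recalibrated"),
            (3, 9, "2023-06-30", "inspection"),
        ],
    )
    return conn


if __name__ == "__main__":
    archive = sqlite3.connect(":memory:")
    copied = archive_partition(
        build_demo_source(),
        archive,
        "work_order",
        ["id", "equipment_id", "closed_on"],  # vertical cut: drop free-text notes
        "closed_on < ?",                      # horizontal cut: older work orders
        ("2020-01-01",),
    )
    print(copied, "rows archived")
```

Because the archived cut lands in an ordinary table, the same business queries can still run against it while it sits in the archive.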
Print Streaming or Report-based Archiving
Print streaming, sometimes referred to as "report-based archiving", is just that: defining the record in terms of the reports the business typically runs, whether for business-as-usual queries or when audits, investigations and other challenges come into play. Consider this the 'print to PDF' method, where the record could be reports by time, equipment, location or some other category, and the unit being archived is not a table or partial schema but a report based on standard queries and system templates. The value in print streaming is that the reports are unstructured records archived alongside the other records in the archive, and as such are easy to search, retrieve and use by anyone in the organization with a legitimate need for the information. As an example, in some pharmaceutical and other life sciences companies, reports on Sunshine Act types of spending can be generated from an aggregate spend system and archived as PDF reports by country, state and time.
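The report-based approach can be sketched as running one standard query and rendering a document per category. The example below assumes a hypothetical spend table in SQLite and produces plain-text reports by country; the plain text is only a stand-in for the PDF rendering a real reporting tool would provide, and the names REPORT_QUERY, render_reports and build_demo_spend are invented for illustration.

```python
import sqlite3

# Standard business query behind the report (table and columns are hypothetical).
REPORT_QUERY = """
    SELECT country, year, SUM(amount) AS total
    FROM spend
    GROUP BY country, year
    ORDER BY country, year
"""


def render_reports(conn):
    """Return one plain-text report per country, keyed by country."""
    lines_by_country = {}
    for country, year, total in conn.execute(REPORT_QUERY):
        lines = lines_by_country.setdefault(
            country, [f"Aggregate spend report: {country}"]
        )
        lines.append(f"{year}  {total:>12.2f}")
    return {c: "\n".join(lines) + "\n" for c, lines in lines_by_country.items()}


def build_demo_spend():
    """Hypothetical aggregate spend data, used only for illustration."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE spend (country TEXT, year INTEGER, amount REAL)")
    conn.executemany(
        "INSERT INTO spend VALUES (?, ?, ?)",
        [("US", 2014, 100.0), ("US", 2014, 50.5), ("US", 2015, 10.0), ("FR", 2014, 20.0)],
    )
    return conn


if __name__ == "__main__":
    for country, text in render_reports(build_demo_spend()).items():
        print(text)
```

Each rendered report is a self-contained document, so it can be stored and searched like any other unstructured record without access to the source database.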
Keep In Mind
The method selected is determined by the archive solution available, the types of records identified with their associated retention requirements, the office of record for the structured data system and its record stakeholders, as well as the type of database and the options for extraction and report printing. All of these together should be used in the appraisal process of the Open Archival Information System (OAIS) standard. Regardless of the methodology chosen, archiving structured data brings cost savings and other benefits, and numerous platforms today will accommodate all three of these smart approaches.