Signate

Signate Document Management

Metadata Changes & Versioning

Daniel Antion has an interesting and well-thought-out article called “Can Records Change” at the Association for Information and Image Management. His question is what we should do about changes to the data about a document, or metadata. I’m thrilled that he has brought up this topic, because it’s one I’m passionate about. Let’s think about some of the reasons this information changes, and maybe we can shed some light on his question:

  • The underlying document changed. This is probably one of the most common reasons for metadata changing; people make changes to documents all the time. The contents may have been modified, the subject changed, authors added, review information updated and so on.
  • Linked information changed. This is less common, and many document management systems don’t handle it correctly or at all. Consider a situation where we link to a Person record on our line of business system. We may store some of the fields from that record in the document management system, such as Surname or City: things that may make it easier to find the document down the line. So we capture an Application form for a “Ms. Jones”, but 6 months later we find out that she has got married and her new name is “Mrs. Smith”. Do we leave the original record data as it is? Curse ourselves for storing line of business data in our DM system? Change the data, accepting that a search for “Ms. Jones” now won’t find a document that plainly says “Ms. Jones” on it?
  • Information captured incorrectly. This is depressingly common. We obviously want the correct information; however, our auditors and lawyers will quite possibly also want the original metadata, especially if processing or business decisions were made using that information.
  • Extra information added. Our processing workflow might well add metadata to the document, storing information about the processing steps undertaken, approvals gained, signatures affixed and so on. This doesn’t change the original document or metadata, but it must be accessible as well.
  • Our metadata schema changes. This is also depressingly common: we change which fields can or must be captured against a document type. Much as we all like to think we can plan perfectly, and much as our clients love to believe they understand their requirements fully, the truth is different. Think about a scenario where we’ve been in operation for 3 months when the client comes in and tells us that they need a “Category” field added to the document type. Great, we can add it, but what about the existing documents that don’t have it? Does this mean that we have to add it as an optional field? In too many systems the answer is yes. Now, a couple of months later, they change their mind. “Get rid of it”, the client commands. What happens to the documents captured with the data? If we restored the field sometime in the future, would their data have been lost? Again, too many systems have “yes” as the answer to that question.
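To make the “Category” scenario concrete, here is a minimal sketch in Python of one way a system could avoid losing data when a field is removed and later restored. It is an illustration only, not a description of how any particular product works: the names `DocumentType`, `Document`, `active_fields` and `stored_values` are all hypothetical. The idea is simply that removing a field changes the schema, not the stored values.

```python
from dataclasses import dataclass, field


@dataclass
class DocumentType:
    """Hypothetical document type whose set of fields can change over time."""
    name: str
    active_fields: set[str] = field(default_factory=set)

    def add_field(self, field_name: str) -> None:
        self.active_fields.add(field_name)

    def remove_field(self, field_name: str) -> None:
        # Only the schema changes; values already stored are left untouched.
        self.active_fields.discard(field_name)


@dataclass
class Document:
    doc_type: DocumentType
    # Every value ever captured, keyed by field name, regardless of schema.
    stored_values: dict[str, str] = field(default_factory=dict)

    def visible_metadata(self) -> dict[str, str]:
        """Return only the values for fields the schema currently exposes."""
        return {
            name: value
            for name, value in self.stored_values.items()
            if name in self.doc_type.active_fields
        }


application = DocumentType("Application", {"Surname"})
older = Document(application, {"Surname": "Jones"})   # captured before Category existed

application.add_field("Category")                     # the client asks for Category
newer = Document(application, {"Surname": "Smith", "Category": "Loans"})

application.remove_field("Category")                  # "Get rid of it"
hidden = newer.visible_metadata()

application.add_field("Category")                     # restored later
restored = newer.visible_metadata()
```

In this sketch the older document simply has no Category value, so the field does not have to be declared optional just to accommodate it, and the newer document gets its Category back as soon as the field is restored.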

Okay, so now that we’ve had a look at some of the reasons that metadata can change, we can see some requirements coming out. Our hypothetical metadata system must keep a version history, and must keep it in such a way that the data in previous versions is still accessible in searches. Needless to say, audit information about who, what, when and why must be stored against each metadata change. The system must be flexible to schema changes, allowing fields to be added later - even if mandatory - as well as allowing them to be removed and even restored.
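As a rough sketch of the version history requirement, assuming an append-only list of versions per document, something like the following Python would keep the audit details with each change and let a search match on earlier values. The class names, the user names and the dates are all made up for illustration.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class MetadataVersion:
    values: dict[str, str]  # what the metadata was in this version
    changed_by: str         # who made the change
    changed_at: datetime    # when it was made
    reason: str             # why it was made


class VersionedMetadata:
    """Hypothetical append-only metadata history for a single document."""

    def __init__(self) -> None:
        self._versions: list[MetadataVersion] = []

    def update(self, values: dict[str, str], changed_by: str,
               changed_at: datetime, reason: str) -> None:
        # Append only: earlier versions are never overwritten or deleted.
        self._versions.append(
            MetadataVersion(dict(values), changed_by, changed_at, reason)
        )

    def current(self) -> MetadataVersion:
        return self._versions[-1]

    def matches(self, field_name: str, value: str,
                include_history: bool = True) -> bool:
        """True if the value appears in the current or (optionally) any earlier version."""
        versions = self._versions if include_history else self._versions[-1:]
        return any(v.values.get(field_name) == value for v in versions)


form = VersionedMetadata()
form.update({"Surname": "Jones"}, "capture-clerk",
            datetime(2011, 1, 10), "initial capture")
form.update({"Surname": "Smith"}, "records-admin",
            datetime(2011, 7, 18), "name changed after marriage")

found_by_old_name = form.matches("Surname", "Jones")
found_current_only = form.matches("Surname", "Jones", include_history=False)
```

With history included, a search for “Jones” still finds the application form even though the current surname is “Smith”; restricting the search to the current version does not.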

Additionally, when we keep a version history, we must also consider whether we want a bitemporal system: a system which stores not only what did happen, but also what should have happened. For example, we only updated “Ms. Jones” to “Mrs. Smith” yesterday, but she sent us the documentation 2 months ago and we should have done it then. A bitemporal system caters for such a situation, allowing you to see both the “Operational Truth” of how events actually occurred and the “Business Truth” of how events were supposed to happen.
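Here is a small Python sketch of the bitemporal idea, using the “Ms. Jones” example. Each fact carries two dates: when it became true for the business and when we actually recorded it (roughly what the database literature calls valid time and transaction time). The names `SurnameFact`, `effective_from`, `recorded_on` and `surname_as_of`, and the specific dates, are assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class SurnameFact:
    surname: str
    effective_from: date  # when the change applied in the real world
    recorded_on: date     # when we actually entered it into the system


def surname_as_of(facts: list[SurnameFact], effective: date,
                  known: date) -> Optional[str]:
    """Surname that applied on `effective`, using only what was recorded by `known`."""
    candidates = [
        f for f in facts
        if f.recorded_on <= known and f.effective_from <= effective
    ]
    if not candidates:
        return None
    # Prefer the most recently effective fact; break ties by the latest recording.
    latest = max(candidates, key=lambda f: (f.effective_from, f.recorded_on))
    return latest.surname


facts = [
    SurnameFact("Jones", effective_from=date(2011, 1, 10), recorded_on=date(2011, 1, 10)),
    # She told us two months before we got round to updating the record.
    SurnameFact("Smith", effective_from=date(2011, 5, 18), recorded_on=date(2011, 7, 18)),
]

# Operational truth: what the system said on 1 June about 1 June.
operational = surname_as_of(facts, effective=date(2011, 6, 1), known=date(2011, 6, 1))

# Business truth: what we now know should have applied on 1 June.
business = surname_as_of(facts, effective=date(2011, 6, 1), known=date(2011, 7, 19))
```

The same question about 1 June gives “Jones” when asked from the system’s point of view at the time, and “Smith” when asked with today’s knowledge. Both answers stay available because neither fact is ever overwritten.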

As you can see, what seems like a simple topic of changing information becomes complicated very quickly. It’s important that your document management system handles these complexities in an intuitive manner. Almost every system I’ve ever seen falls over when it comes to metadata. The usual reason is that most systems are designed around their underlying database, and that database doesn’t handle one or more of the scenarios I’ve outlined above. For example, a relational database like SQL Server can’t cater for schema changes correctly without a great deal of work that frankly isn’t worth the effort. Other systems use a more hierarchical store which handles the schema changes nicely, but struggles with efficient bitemporal access and, most importantly, tends to have rotten performance.

Do you know of other systems that can efficiently handle all of the above reasons for metadata changing? What about scenarios I’ve left out?

Want to change your metadata reliably, accurately and quickly? Signate 2010 handles all of the above scenarios well due to its unique and innovative design.

Posted on Tuesday, July 19, 2011 8:34 PM | Filed Under [ Technology Metadata ]
