Signate Document Management

The Truth About Folders: a rebuttal

In AIIM, Laurence Hart makes a number of comments about the Search vs Folders debate.

Claim 1: People are used to folders, Rebuttal: They are used to search as well

That is not a good enough reason to stick with folders. By that logic we shouldn’t use search engines to access the Web; we should organise it into folders instead, just as Yahoo used to do. People are also used to search engines: they use them every day of the week. In fact, more and more people are using search-based idioms rather than folder-based idioms to access their systems. Look at the search box built into the Windows 7 Start menu and into every Windows Explorer window.

Claim 2: Search Engines fail, Rebuttal: So does everything else

He claims that we should have folders as a fallback position in case the search engine doesn’t work. Well, if your technology is based around an unreliable bolt-on search engine (looks meaningfully at SharePoint), then yes, this is a valid concern. If your entire system is designed around search, then the search engine is the core and any folder-based view would be the bolt-on, and thus the part more likely to fail. All systems fail from time to time, but that is not a good reason to reject an entire class of technology.

“Cars sometimes break down, so we should all use horses.”

This is a classic example of the false dilemma, also known as the fallacy of the excluded middle.

Claim 3: Folders help you organise, Rebuttal: Why manually organise?

I’m actually not sure what point he is trying to make here exactly. He goes into taxonomies, and how folders help users create a “well-executed taxonomy”, and how creating a taxonomy without folders sacrifices performance and simplicity. Not one person I’ve ever spoken to about their requirements from a document management system has ever mentioned the word taxonomy. Not one. Ever.

I will not deny that a proper taxonomy is easier to do with folders than without. I will even admit that if a system has a bolt-on taxonomy, it will likely be slower and less simple than a system designed around taxonomies. What I deny is the need for taxonomies at all. Search and metadata are all that is required to search billions of documents, and they require zero extra effort.

He then admits that these taxonomies change, and that systems must be put in place to manage these transitions. I’ve never once had to redesign search; it’s search, for goodness’ sake! If your data changes, just reindex it. Want to add a metadata field? Reindex. There is no manual effort; let the computer do it for you.
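To make the reindexing point concrete, here is a minimal sketch of a toy inverted index in Python. It is only an illustration: the `build_index` function, the field names and the sample documents are all made up for this example, and a real search engine does far more (stemming, ranking, incremental updates). The point is simply that adding a metadata field means rebuilding the index with one more field name, not reorganising anything by hand.

```python
from collections import defaultdict


def build_index(documents, fields):
    """Build a toy inverted index: (field, token) -> set of document IDs.

    `documents` maps a document ID to a dict of metadata values.
    `fields` lists the metadata fields to index (hypothetical names).
    """
    index = defaultdict(set)
    for doc_id, metadata in documents.items():
        for field in fields:
            # Missing fields are treated as empty rather than as errors.
            value = str(metadata.get(field, ""))
            for token in value.lower().split():
                index[(field, token)].add(doc_id)
    return dict(index)


# Hypothetical sample data, purely for illustration.
documents = {
    "d1": {"title": "Annual report", "author": "alice"},
    "d2": {"title": "Travel invoice", "author": "bob"},
}

# Index on title only...
index = build_index(documents, ["title"])

# ...then "add a metadata field" by rebuilding with one more field name.
index = build_index(documents, ["title", "author"])
```

Rebuilding from scratch is the bluntest possible approach, but it shows why the change costs computer time rather than human time.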

Claim 4: Not using folders cripples systems, Rebuttal: Only if the developers were idiots

This claim boggles my mind. Let me quote: “One of the problems that you get when you don’t use folders is that you can cripple most systems. While few systems claim a limit to the number of documents that can reside in one location, there is a practical limit.” I’m pretty sure that what he’s talking about here is the well-known reality that operating systems struggle when directories fill up with more than a few thousand files.

He seems to be conflating the experience of the system from the outside (i.e. the user sees no folders) with the implementation details on the inside (i.e. the assumption that the system must therefore store every document in one huge directory). This is utter rubbish. Signate, for example, creates an internal directory structure to which documents are routed in a balanced fashion, ensuring that no directory winds up with too many documents. This structure is internal to the system and is a performance and management implementation detail. It is not exposed outside the system at all.
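For readers wondering what such balancing can look like, here is a minimal sketch in Python. It is not Signate's actual scheme, which is not described here; it assumes one common technique, namely hashing the document ID and using the first few hex characters of the hash as nested directory names. The `storage_path` function and its `levels` and `width` parameters are hypothetical.

```python
import hashlib
from pathlib import PurePosixPath


def storage_path(doc_id: str, levels: int = 2, width: int = 2) -> PurePosixPath:
    """Map a document ID to a nested internal directory derived from its hash.

    Assumes `doc_id` contains no path separators. With the defaults there
    are 256 * 256 = 65,536 leaf directories, and because the hash spreads
    IDs roughly uniformly, no single directory fills up much faster than
    the others.
    """
    digest = hashlib.sha256(doc_id.encode("utf-8")).hexdigest()
    # Slice `levels` chunks of `width` hex characters to use as directory names.
    parts = [digest[i * width:(i + 1) * width] for i in range(levels)]
    return PurePosixPath(*parts, doc_id)


# The same ID always maps to the same location, so no lookup table is needed.
path = storage_path("invoice-0001.pdf")
```

The user never sees these directories; they only ever ask the system for a document by searching, and the system works out where the bytes live.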

In fact, this argument of his is an excellent example of why folder-based systems don’t work as well as search-based ones. While Signate automatically balances files across a directory structure designed to allow billions of documents per node, no such balancing can be applied when humans are involved. Every folder-based system I’ve ever seen winds up with a “dump” location, sometimes more than one, where documents which don’t fit the taxonomy neatly are placed. This can swiftly grow to thousands of documents, resulting in the very problem that Laurence claims search-based systems suffer from. Sure, if the taxonomy were perfect, this would not arise; but a dump location is also a sign that the taxonomy may need to change, resulting in a great deal of manual work. In a search-based system with balancing, this situation never arises. This is not just a punt for Signate; I’ve never seen a search-based document store without balancing, and I struggle to comprehend that anyone would ever conceive of designing such a system.

He then claims that “You can swear that nobody will ever browse to [the internal storage] location, but unless you remove that capability, someone will do it”. Well, of course we remove it! I consider it a massive security breach if people are able to access the internal document location of the system without passing through the interface to the system. Does he allow users to access his internal company databases directly? Of course not.

Claim 5: Search Engines can’t read your mind reliably, Rebuttal: Nothing can

Neither can folders. Search engines help you find what you’re looking for; folders let you know where you’re looking. Which would you rather have? We don’t need perfect reliability; you can refine your search terms based on the results you see. Too many results? Add search terms. Too few? Remove some. Make some more approximate, tighten up others. Signate allows an enormous range of searching options, including approximate search, where words similar to the specified word are found.
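As a rough illustration of what approximate search means, the sketch below uses `get_close_matches` from Python's standard `difflib` module to find indexed words similar to a mistyped search term. This is not how Signate implements it; the `approximate_terms` helper, the sample vocabulary and the cutoff value are assumptions chosen for the example.

```python
from difflib import get_close_matches


def approximate_terms(term, vocabulary, cutoff=0.75):
    """Return up to five indexed words that resemble `term`.

    `vocabulary` is the list of (lower-case) words known to the index.
    `cutoff` is a similarity score between 0 and 1: lower it to make the
    search more approximate, raise it to tighten the search up.
    """
    return get_close_matches(term.lower(), vocabulary, n=5, cutoff=cutoff)


# Hypothetical index vocabulary, purely for illustration.
vocabulary = ["invoice", "invoices", "contract", "receipt", "report"]

# A typo such as "invoise" can still lead to documents about invoices.
similar = approximate_terms("invoise", vocabulary)
```

The cutoff is the knob that corresponds to "make some more approximate, tighten up others": a searcher who gets too few results loosens it, and one who gets too many tightens it.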

Conclusion

Clearly, I’m biased. I’m so convinced of the value of search-based document management that we created one ourselves. Laurence is a specialist in Documentum, a prominent folder-based document management system. So, we’re both biased. But read his article, read mine, and then ask yourself which approach:

  • Will get the benefits of document management into my users’ hands faster?
  • Will result in the lowest ongoing administration whilst delivering excellent results?
  • Will adapt to my changing business needs?

Folder-based systems are great for rigorously defining the information content your organisation needs. If you’re working in a top-down company with an IT department that can easily define a data dictionary for your entire business and enforce its consistent usage, then I’d strongly suggest you look at systems that support such an approach. If, however, you work in the remaining 99% of businesses, where change is constant, time is precious, and flexibility and turnaround are more important than rigour, then look at systems that support that approach.

Posted on Wednesday, July 06, 2011 9:23 AM | Filed under [ General Search ]

Feedback

# Rebutting the Rebuttal

Sean, I wrote a rebuttal to your rebuttal of my rebuttal of the original post. It is located on my primary blog http://wordofpie.com. (http://wp.me/p4OLk-nc)

Oh, and for the record, while I am a Documentum specialist, I have experience with other systems such as FileNet, OpenText, eDocs (also from OpenText yet separate), Nuxeo, Alfresco, and SharePoint to name a few. Documentum was not my first CMS.

-Pie
7/7/2011 12:18 AM | Pie
# re: The Truth About Folders: a rebuttal

Fair enough. I suspected that was the case, but wasn't sure.
7/7/2011 12:15 PM | sean
# re: The Truth About Folders: a rebuttal

Inigo has an excellent post on this subject at bigmenoncontent.com/.../folders-arent-born-bad/
7/10/2011 8:31 AM | sean
