Wednesday, October 1, 2014

The Metadata Bubble

In an ideal world, scholars deposit their papers in an Open Access repository, because they know it will advance their research, support their students, and promote a knowledge-based society. A few disciplinary repositories, like ArXiv, have shown that it is possible to close the virtuous cycle where scholars reinforce each other's Open Access habits. In these communities, no authority is needed to compel participation.

Institutional repositories have yet to build similar broad-based enthusiastic constituencies. Yet, many Open Access advocates believe that the decentralized approach of institutional repositories creates a more scalable system with a higher probability for long-term survival. The campaign to enact institutional deposit mandates hopes to jump start an Open Access virtuous cycle for all scholarly disciplines and all institutions. The risk of such a campaign is that it may backfire if scholars should experience Open Access as an obligation with few benefits. For long-term success, most scholars must perceive their compelled participation in Open Access as a positive experience.

It is, therefore, crucial that repositories become essential scholarly resources, not dark archives to be opened only in case of emergency. The Open Archives Initiative (OAI) repository design provided what was thought to be the necessary architecture. Unfortunately, we are far from realizing its anticipated potential. The Protocol for Metadata Harvesting (OAI-PMH) allows service providers to harvest any metadata in any format, but most repositories provide only minimal Dublin Core metadata, a format in which most fields are optional and several are ambiguous. Extremely few repositories enable Object Reuse and Exchange (OAI-ORE), which allows for complex inter-repository services through the exchange of multimedia objects, not just metadata about them. As a result, OAI-enabled services are largely limited to the most elementary kind of searches, and even these often deliver unsatisfactory results, like metadata-only placeholder records for works restricted by copyright or other considerations.

In a few years, we will entrust our life and limb to self-driving cars. Their programs have just milliseconds to compute critical decisions based on information that is imprecise, approximate, incomplete, and inconsistent: all maps are outdated by the time they are produced, GPS signals may disappear, radar and/or lidar signatures are ambiguous, and video or images provide obstructed views in constantly changing environments. When we can extract so much actionable information from such "dirty" information, it seems quaint to obsess about metadata.

Databases automatically record user interactions. Users fill out forms and effectively crowdsource metadata. Expert systems can extract, from any document in any format and in any language, author information, citations, keywords, DNA sequences, chemical formulas, mathematical equations, etc. Other expert systems have growing capabilities to analyze sound, image, and video. Technology is evaporating the pool of problems that require human intervention at the transaction level. The opportunities for human metadata experts to add value are disappearing fast.

The metadata approach is obsolete for an even more fundamental reason. Metadata are the digital extension of a catalog-centered paper-based information system. In this kind of system, today's experts organize today's information so tomorrow's users may solve tomorrow's problems efficiently. This worked well when technology changed slowly, when experts could predict who the future users would be, what kind of problems they would like to solve, and what kind of tools they would have at their disposal. These conditions no longer apply.

When digital storage is cheap, why implement expensive selection processes for an archive? When search technology does not care whether information is excruciatingly organized or piled in a heap, why spend countless hours organizing and curating content? Why agonize over potential future problems with unreadable file formats? Preserve all the information about current software and standards, and start developing the expert systems to unscramble any historical format. Think of any information-management task. How reasonable is the proposition that this task will require direct human intervention in two years? In five years? In ten years?

For content, more is more. We must acquire as much content as possible, and store it safely.

For content administration, less is more. Expert systems give us the freedom to do the bare minimum and to make a mess of it. While we must make content useful and enable as many services as possible, it is no longer feasible to accomplish that by designing systems for an anticipated future. Instead, we must create the conditions that attract developers of expert systems. This is remarkably simple: Make the full text and all data available with no strings attached.

Real Open Access.

67 comments:

  1. Here is the details of B.Sc Respiratory Care Technology colleges in Bangalore. If you are looking to study BSc Respiratory Care Technology in Bangalore, the below will help you to find the best Respiratory Technology colleges in Bangalore.
    BSc Respiratory Care Technology Colleges in Bangalore | Respiratory Care Colleges in Bangalore |

    ReplyDelete
  2. As we know there are many companies which are converting into Big Data Solutions Developer with the right direction we can definitely predict the future.

    ReplyDelete
  3. Writing in style and getting good compliments on the article is hard enough, to be honest, but you did it so calmly and with such a great feeling and got the job done. This item is owned with style and I give it a nice compliment. Better!. Tableau Course in Bangalore

    ReplyDelete
  4. Great post. Articles that have meaningful and insightful comments are more enjoyable, at least to me.

    Java Training in Chennai

    Java Course in Chennai

    ReplyDelete
  5. I am a new user of this site, so here I saw several articles and posts published on this site, I am more interested in some of them, hope you will provide more information on these topics in your next articles.
    Data Analytics Courses in Bangalore

    ReplyDelete

  6. This is an excellent post I saw thanks for sharing it. It is really what I wanted to see. I hope in the future you will continue to share such an excellent post.

    Best Institutes For Digital Marketing in Hyderabad

    ReplyDelete
  7. Wonderful blog found to be very impressive to come across such an awesome blog. I should really appreciate the blogger for the efforts they have put in to develop such an amazing content for all the curious readers who are very keen of being updated across every corner. Ultimately, this is an awesome experience for the readers. Anyways, thanks a lot and keep sharing the content in future too.

    Digital Marketing Training in Bangalore

    ReplyDelete
  8. Truly mind blowing blog went amazed with the subject they have developed the content. These kind of posts really helpful to gain the knowledge of unknown things which surely triggers to motivate and learn the new innovative contents. Hope you deliver the similar successive contents forthcoming as well.

    Cyber Security Course

    ReplyDelete
  9. This comment has been removed by a blog administrator.

    ReplyDelete
  10. Good Information,
    The demand for Digital Marketing is growing big every day. With more and more users coming online, the demand for digital marketing is expected to grow even further. Digital Marketing Course with Internship at Digital Brolly.

    ReplyDelete
  11. Awesome blog. I enjoyed reading your articles. This is truly a great read for me. I have bookmarked it and I am looking forward to reading new articles. Keep up the good work!

    business analytics course

    ReplyDelete
  12. Wonderful illustrated information. Thank you. It will certainly be very useful for my future projects. I would love to see more articles on the same topic!
    Digital Marketing Course in Bangalore

    ReplyDelete
  13. This comment has been removed by a blog administrator.

    ReplyDelete
  14. This comment has been removed by a blog administrator.

    ReplyDelete
  15. This comment has been removed by a blog administrator.

    ReplyDelete
  16. This comment has been removed by a blog administrator.

    ReplyDelete
  17. This comment has been removed by a blog administrator.

    ReplyDelete
  18. This comment has been removed by a blog administrator.

    ReplyDelete
  19. This comment has been removed by a blog administrator.

    ReplyDelete
  20. This comment has been removed by a blog administrator.

    ReplyDelete
  21. This comment has been removed by a blog administrator.

    ReplyDelete
  22. This comment has been removed by a blog administrator.

    ReplyDelete
  23. This comment has been removed by a blog administrator.

    ReplyDelete
  24. This comment has been removed by a blog administrator.

    ReplyDelete
  25. This comment has been removed by a blog administrator.

    ReplyDelete
  26. This comment has been removed by a blog administrator.

    ReplyDelete
  27. This comment has been removed by a blog administrator.

    ReplyDelete
  28. This comment has been removed by a blog administrator.

    ReplyDelete
  29. This comment has been removed by a blog administrator.

    ReplyDelete
  30. This comment has been removed by a blog administrator.

    ReplyDelete
  31. This comment has been removed by a blog administrator.

    ReplyDelete
  32. This comment has been removed by a blog administrator.

    ReplyDelete
  33. This comment has been removed by a blog administrator.

    ReplyDelete
  34. This comment has been removed by a blog administrator.

    ReplyDelete
  35. This comment has been removed by a blog administrator.

    ReplyDelete
  36. This comment has been removed by a blog administrator.

    ReplyDelete
  37. This comment has been removed by a blog administrator.

    ReplyDelete
  38. This comment has been removed by a blog administrator.

    ReplyDelete
  39. This comment has been removed by a blog administrator.

    ReplyDelete
  40. This comment has been removed by a blog administrator.

    ReplyDelete
  41. This comment has been removed by a blog administrator.

    ReplyDelete
  42. This comment has been removed by a blog administrator.

    ReplyDelete
  43. This comment has been removed by a blog administrator.

    ReplyDelete
  44. This comment has been removed by a blog administrator.

    ReplyDelete
  45. This comment has been removed by a blog administrator.

    ReplyDelete
  46. This comment has been removed by a blog administrator.

    ReplyDelete
  47. This comment has been removed by a blog administrator.

    ReplyDelete
  48. This comment has been removed by a blog administrator.

    ReplyDelete
  49. This comment has been removed by a blog administrator.

    ReplyDelete
  50. This comment has been removed by a blog administrator.

    ReplyDelete
  51. This comment has been removed by a blog administrator.

    ReplyDelete
  52. This comment has been removed by a blog administrator.

    ReplyDelete
  53. This comment has been removed by a blog administrator.

    ReplyDelete
  54. This comment has been removed by a blog administrator.

    ReplyDelete
  55. This comment has been removed by a blog administrator.

    ReplyDelete
  56. This comment has been removed by a blog administrator.

    ReplyDelete
  57. This comment has been removed by a blog administrator.

    ReplyDelete
  58. This comment has been removed by a blog administrator.

    ReplyDelete
  59. This comment has been removed by a blog administrator.

    ReplyDelete
  60. This comment has been removed by a blog administrator.

    ReplyDelete
  61. This comment has been removed by a blog administrator.

    ReplyDelete
  62. This comment has been removed by a blog administrator.

    ReplyDelete
  63. This comment has been removed by a blog administrator.

    ReplyDelete
  64. This comment has been removed by a blog administrator.

    ReplyDelete
  65. This comment has been removed by a blog administrator.

    ReplyDelete
  66. This comment has been removed by a blog administrator.

    ReplyDelete