I’ve been troubled lately by what I perceive as a fundamental misunderstanding of the nature of our transition from record-based bibliographic metadata to linked data. Although this misunderstanding can be expected, given how long our profession has been invested in a record-based infrastructure and standards, it is potentially disastrous should we prove not up to the task of overcoming it.
Let me break this down for you.
- All of the power of linked data derives from the links. For some reason I think it bears saying that linked data without useful links is simply data. It isn’t linked data.
- Links are only useful if they lead to an authoritative source that has something useful to provide. Some libraries have been creating “linked data” by minting identifiers that lead nowhere worth following. This also is not linked data.
- Simply translating bibliographic data from one format (MARC) to another (BIBFRAME or Schema.org, for example) does not create useful links. This is one of the essential bits that everyone needs to understand. Our bibliographic transition is not one of translating records from one format to another. It will instead involve processes that are devilishly complex and difficult to carry out well. Given this, only some of us in the library world will be capable of doing it as well as it should be done to be truly effective.
- To achieve true library linked data, individual MARC elements must be turned into actionable entities. By this I mean that an individual MARC element such as “author” must be translated into an assertion that includes an actionable URI that leads to an authoritative source that has something useful to say about that author.
- Creating actionable entities will require new kinds of processes and services that mostly don’t yet exist. These are still early days in this transition, but we are already beginning to see the kinds of services that will be required to take our static, inert library data and turn it into a living part of what we are beginning to call the Bibliographic Graph. At OCLC Research, where I work, we call this process “entification.” This might encompass the creation of your own linked data entities or the use of those created by others by using a “reconciliation” service.
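To make the "author" example above concrete, here is a minimal sketch in Python of what entification might look like for a single heading. Everything in it is an assumption made for illustration: the `AUTHORITY_INDEX` dictionary stands in for a real reconciliation service, the `example.org` URIs are placeholders, and the function names are hypothetical. Only the `http://schema.org/author` property comes from a real vocabulary. It is not how OCLC or anyone else actually does this.

```python
from typing import NamedTuple, Optional

SCHEMA_AUTHOR = "http://schema.org/author"

# Hypothetical stand-in for a reconciliation service: normalized heading -> URI.
# A real service would query an authority file and handle ambiguous matches.
AUTHORITY_INDEX = {
    "austen, jane, 1775-1817": "http://example.org/authority/person/1",
}


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str
    obj_is_uri: bool  # True when obj is a followable link, False for a bare string


def normalize(heading: str) -> str:
    """Lowercase, trim trailing punctuation, and collapse whitespace."""
    return " ".join(heading.lower().strip(" .,").split())


def reconcile(heading: str) -> Optional[str]:
    """Return an authority URI for the heading, or None if nothing matches."""
    return AUTHORITY_INDEX.get(normalize(heading))


def entify_author(work_uri: str, heading: str) -> Triple:
    """Turn an author heading into an assertion about the work."""
    uri = reconcile(heading)
    if uri is not None:
        return Triple(work_uri, SCHEMA_AUTHOR, uri, True)
    # No match: the name survives, but only as a string. This is "simply data".
    return Triple(work_uri, SCHEMA_AUTHOR, heading, False)


def to_ntriples(t: Triple) -> str:
    """Serialize one assertion as an N-Triples line."""
    if t.obj_is_uri:
        obj = f"<{t.obj}>"
    else:
        escaped = t.obj.replace("\\", "\\\\").replace('"', '\\"')
        obj = f'"{escaped}"'
    return f"<{t.subject}> <{t.predicate}> {obj} ."


if __name__ == "__main__":
    work = "http://example.org/work/1"  # placeholder work URI
    print(to_ntriples(entify_author(work, "Austen, Jane, 1775-1817.")))
    print(to_ntriples(entify_author(work, "Unknown, Author")))
```

The two branches in `entify_author` are the whole point: a format translation alone can only ever produce the second kind of statement, a name carried over as a string. The first kind, with a URI someone can follow, requires the reconciliation step, and that step is where the hard work lives.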
We need to shake off the shackles of our record-based thinking and think in terms of an interlinked Bibliographic Graph. As long as we keep talking about translating records from one format to another, we simply don’t understand the meaning of linked data: neither the transformative potential it has for our workflows and user interfaces nor the plain difficult, time-consuming work that will be required to get us there.
Sure, we at OCLC are a long way down a road that should do a lot to help our member libraries make the transition, but there will be plenty of work to go around. The sooner we fully grasp what that work will be, the better off we will all be in this grand transition. No, let’s call it what it really is: a bibliographic revolution. Before this is over there will be broken furniture and blood on the floor. But at least we will be free of the tyrant.
Note: portions of this post originally appeared as a message on the BIBFRAME discussion.
Does OCLC have an example of a Bibliographic Graph that it can share?
Rocki, sure thing. There are a couple ways to experience the graph. One way is to go here and follow your nose: http://worldcat.org/entity/work/id/2406166 . Another way is to go here, choose a collection, do a search and then follow your nose: http://experiment.worldcat.org/entityjs/
Roy, thank you for articulating this. There is so much misguided talk about linked data in the bibliographic world. Ultimately, it will not be a matter of libraries populating their metadata with URIs, but simply pointing their holdings to a single URI which itself is an authoritative representation of a bibliographic entity (ideally, at the manifestation level, in FRBR-speak). Libraries should only have to generate data for material unique to their collection. Of course, generating and exposing this data is complex, and should not be something metadata creators should have to worry about – it is the responsibility of the metadata system. Thanks again.
What is the “transformative potential it has for our workflows and user interfaces”? And what does that even mean for the users?
What are bibliographic records other than contextualizations of atomic (linked) data?