In an attempt to free scholarship from the bindings of Adobe Acrobat, an ad hoc group is working to define a “scholarly” version of HTML5. Why does there need to be a special version of HTML5, you ask? Good question. The goal is to mark up the data in an article with semantic information, so that software can understand and re-purpose that data.
For example, markup like this tells you that the text between the tags is the name of the piece's creator, as the Dublin Core community uses that term:
<span property="http://purl.org/dc/terms/creator">Author</span>
Here is an example of what a “full” ScHTML document might look like:
<html property="http://scholarly-html.org/schtml">
<body>
<span property="http://purl.org/dc/terms/title">Title</span>
<span property="http://purl.org/dc/terms/creator">Author</span>
</body>
</html>
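To make "understood and re-purposed using software" concrete, here is a minimal sketch of how a program might pull the Dublin Core values back out of markup like the example. It uses only Python's standard `html.parser` module. The names `PropertyExtractor` and `extract_properties` are made up for this illustration and are not part of any ScHTML specification; the sketch also assumes flat, un-nested `span` elements, whereas a real RDFa processor handles far more cases.

```python
from html.parser import HTMLParser

# Sample markup modeled on the example in the post (straight quotes, closed tags).
SAMPLE = """<html property="http://scholarly-html.org/schtml">
<body>
<span property="http://purl.org/dc/terms/title">Title</span>
<span property="http://purl.org/dc/terms/creator">Author</span>
</body>
</html>"""


class PropertyExtractor(HTMLParser):
    """Collect (property URI, text) pairs from <span property="..."> elements.

    Hypothetical helper for illustration; assumes spans are not nested.
    """

    def __init__(self):
        super().__init__()
        self.pairs = []        # extracted (property, text) tuples, in document order
        self._current = None   # property URI of the span being read, if any
        self._text = []        # text fragments seen inside the current span

    def handle_starttag(self, tag, attrs):
        prop = dict(attrs).get("property")
        if tag == "span" and prop:
            self._current = prop
            self._text = []

    def handle_data(self, data):
        if self._current is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "span" and self._current is not None:
            self.pairs.append((self._current, "".join(self._text).strip()))
            self._current = None


def extract_properties(html_text):
    """Return the list of (property URI, text) pairs found in html_text."""
    parser = PropertyExtractor()
    parser.feed(html_text)
    parser.close()
    return parser.pairs


metadata = dict(extract_properties(SAMPLE))
print(metadata["http://purl.org/dc/terms/creator"])  # prints "Author"
```

Because the property URIs are shared vocabulary terms, the same few lines would work on any article marked up this way, without knowing anything about how that article is laid out on the page.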
- ScHTML is declarative about the information it contains and not imperative about the way the information is displayed or consumed by the scholar.
- ScHTML is scholarly because it addresses the everyday problems scholars have in conveying the outputs of their work and providing education. Fundamental to the standard is a community-led process of creating a broad range of tools for producing or consuming ScHTML for each specific requirement.
- ScHTML is a domain-specific application of the W3C HTML standard and tracks that standard so long as it supports the requirements of communicating knowledge in a declarative way within the scholarly community.
- ScHTML is not owned by anyone but is developed by the community through a democratic process. The community will be continually involved in developing vocabularies. ScHTML will define a very small core – the absolute minimum infrastructure (perhaps author and date) and how to define conventions.
It will be interesting to see where this group goes, and how rapidly it gets there.

What happens when the black hat folks get a hold of the spec and exploit it?
Viagra
Not to sound like an old beard, but couldn’t this be done with LaTeX? http://www.latex-project.org/
That is markup that leads to non-proprietary PDFs, or HTML, or PostScript for that matter. Scholarly HTML is a good idea, but couldn’t it be just another output from a more basic markup? After all, we are still stuck with the “bindings” of Microsoft Word.
@edward
Scholarly HTML is a format, so of course you could create it with LaTeX. I look forward to seeing what the LaTeX community can contribute to this effort. Another back-end format that is perfect for creating Scholarly HTML would be the NLM XML schema.
I’m just starting to learn about RDA – but isn’t that what that schema is trying to solve also?