I'm going to try to avoid the philosophical questions and just provide
some feedback on what we have been planning and starting to do at
Perseus as pertains to CTS URNs and their use in HTTP URIs. I really do
think CTS can be compatible with Linked Data concepts.
urn:cts:greekLit:tlg0012.tlg001.perseus-grc1
Identifies the unique resource which is Perseus' TEI XML version of
Homer's Iliad that is identified in the Perseus CTS inventory as
'perseus-grc1'. (This TEI XML version was based on the Oxford 1920
edition. That information can currently be found in the description of
the version in the GetCapabilities response, but may soon also be
reported in the header of the GetPassage response. See more below on
this topic.)
urn:cts:greekLit:tlg0012.tlg001.perseus-grc1:1.1 identifies line 1.1 of
this version.
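
To make the anatomy of these URNs concrete, here is a minimal Python sketch that splits one into its parts. It is only an illustration, not Perseus code: the class name, field names and function name are my own choices, and it assumes the colon- and dot-separated shape shown in the examples above (with an optional 4th dot-separated component, which I come back to later in this message).

```python
from typing import NamedTuple, Optional


class CtsUrn(NamedTuple):
    # Field names are illustrative, not taken from any CTS library.
    namespace: str            # e.g. "greekLit"
    textgroup: str            # e.g. "tlg0012"
    work: Optional[str]       # e.g. "tlg001"
    version: Optional[str]    # e.g. "perseus-grc1"
    exemplar: Optional[str]   # optional 4th component of the work part
    passage: Optional[str]    # e.g. "1.1"


def parse_cts_urn(urn: str) -> CtsUrn:
    """Split a CTS URN of the shape urn:cts:namespace:work[:passage]."""
    parts = urn.split(":")
    if len(parts) not in (4, 5) or parts[:2] != ["urn", "cts"]:
        raise ValueError(f"not a CTS URN: {urn!r}")
    namespace, work_component = parts[2], parts[3]
    passage = parts[4] if len(parts) == 5 else None
    # The work component is dot-separated: textgroup.work.version.exemplar
    pieces = work_component.split(".")
    if len(pieces) > 4 or not pieces[0]:
        raise ValueError(f"unexpected work component: {work_component!r}")
    padded = pieces + [None] * (4 - len(pieces))
    return CtsUrn(namespace, padded[0], padded[1], padded[2], padded[3], passage)
```

For the URN above, parse_cts_urn would report "perseus-grc1" as the version and "1.1" as the passage; for urn:cts:greekLit:tlg0012.tlg001:1.1 the version would be None.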
If you make a GetPassage request directly to the Perseus CTS API (at
http://www.perseus.tufts.edu/hopper/CTS) for this resource you will get
back a CTS GetPassage response which contains the TEI XML for line 1.1
of this version.
If you make a GetPassage request directly to the Perseus CTS API (at
http://www.perseus.tufts.edu/hopper/CTS) for line 1.1 of the notional
work the Iliad (identified by urn:cts:greekLit:tlg0012.tlg001:1.1) you
will also get back a CTS GetPassage response which contains the TEI XML
for line 1.1 of this same edition, because Perseus has decided to make
this specific Greek edition the default version it returns if a version
hasn't specifically been requested. We don't currently identify the
specific version returned in the GetPassage response, but I agree with
Hugh and Neel that we should and so we will make that change.
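
For anyone who wants to try the two requests just described, they might be built roughly as follows. This is a sketch under assumptions: the endpoint is the one given above, but the query parameter names "request" and "urn" are my assumption about how the call is spelled, so please check them against the CTS API documentation before relying on them. The function name is hypothetical.

```python
from urllib.parse import urlencode

# Endpoint as given in this message.
CTS_ENDPOINT = "http://www.perseus.tufts.edu/hopper/CTS"


def get_passage_url(urn: str) -> str:
    """Build a GetPassage request URL for a CTS URN.

    ASSUMPTION: the parameter names "request" and "urn" are illustrative.
    """
    # Keep ":" unescaped so the URN stays readable in the query string.
    query = urlencode({"request": "GetPassage", "urn": urn}, safe=":")
    return f"{CTS_ENDPOINT}?{query}"


# Version-specific request, then the notional-work request that
# (per the text above) falls back to the default Greek edition.
specific = get_passage_url("urn:cts:greekLit:tlg0012.tlg001.perseus-grc1:1.1")
notional = get_passage_url("urn:cts:greekLit:tlg0012.tlg001:1.1")
```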
The stable URI for this specific line of this resource is
http://data.perseus.org/citations/urn:cts:greekLit:tlg0012.tlg001.perseus-grc1:1.1
If your request for this resource includes an HTTP Accept header which
includes text/html then the response is currently an HTTP 302 redirect
to the HTML display for the page that contains this line of text in the
Perseus interface.
If your request for this resource does NOT indicate via its HTTP Accept
header that it accepts text/html, then currently the response is an HTTP
200 that contains the results of the CTS GetPassage request, but we will
be changing this to be an HTTP 302 redirect to the GetPassage response,
in order to be consistent with linked data recommendations.
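
The decision just described can be summarised in a few lines of Python. This is a simplified sketch of the logic, not our actual server code: the function names and the placeholder target labels are hypothetical, and it deliberately ignores q-values and wildcards such as */* in the Accept header.

```python
from typing import Tuple


def accepts_html(accept_header: str) -> bool:
    """True if any media range in the Accept header is exactly text/html."""
    for item in accept_header.split(","):
        media_type = item.split(";")[0].strip().lower()
        if media_type == "text/html":
            return True
    return False


def respond(accept_header: str, planned: bool = False) -> Tuple[int, str]:
    """Return (HTTP status, placeholder label for what is served).

    planned=False models the current behaviour described above,
    planned=True the intended change.
    """
    if accepts_html(accept_header):
        return 302, "html-display"          # redirect to the HTML page
    if planned:
        return 302, "getpassage-response"   # planned: redirect to GetPassage
    return 200, "getpassage-body"           # current: GetPassage results inline
```

So a browser sending "text/html,application/xhtml+xml" gets the redirect to the HTML display, while a client sending only "application/xml" gets the 200 today and would get a redirect after the change.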
We also intend *soon* to support the following alternative request
syntax which doesn't require use of the HTTP Accept header:
http://data.perseus.org/citations/urn:cts:greekLit:tlg0012.tlg001.perseus-grc1:1.1/html
http://data.perseus.org/citations/urn:cts:greekLit:tlg0012.tlg001.perseus-grc1:1.1/xml
As well as URIs like the following, which will return the entire XML for
the text (or the starting HTML page for the text, depending on the suffix):
http://data.perseus.org/texts/urn:cts:greekLit:tlg0012.tlg001.perseus-grc1/html
http://data.perseus.org/texts/urn:cts:greekLit:tlg0012.tlg001.perseus-grc1/xml
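
Since these URIs are built mechanically from the URN, a client could generate them with something like the sketch below. The helper names are hypothetical, and the /html and /xml suffixes are the planned syntax described above, so they may not resolve yet.

```python
from typing import Optional

BASE = "http://data.perseus.org"
FORMATS = ("html", "xml")


def citation_uri(urn: str, fmt: Optional[str] = None) -> str:
    """URI for a cited passage; fmt=None gives the stable URI."""
    if fmt is not None and fmt not in FORMATS:
        raise ValueError(f"unsupported format: {fmt!r}")
    uri = f"{BASE}/citations/{urn}"
    return f"{uri}/{fmt}" if fmt else uri


def text_uri(urn: str, fmt: str) -> str:
    """URI for a whole text in the requested format."""
    if fmt not in FORMATS:
        raise ValueError(f"unsupported format: {fmt!r}")
    return f"{BASE}/texts/{urn}/{fmt}"
```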
Now, what happens if you issue a request for a resource via CTS URN that
Perseus does NOT have? Currently you get an error, but our plan is to
redirect the user (via an HTTP 303 See Other redirect) if at all
possible to a bibliographic resource for the requested work and/or
edition, if we have it. That bibliographic resource (which again would
be available in both HTML and XML) could potentially contain links to
places where the user might find that resource.
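
Here is one way the planned fallback could work, as a sketch only. The lookup table, the function name and the example.org URL in the usage note are all hypothetical, and returning a 404 when nothing is found is my assumption (the behaviour in that case isn't decided above). The idea is to try the most specific bibliographic record first (the edition), then the notional work.

```python
from typing import Dict, Optional, Tuple


def bibliographic_fallback(
    urn: str, bibliography: Dict[str, str]
) -> Tuple[int, Optional[str]]:
    """Return (status, redirect target) for a URN whose text is not held.

    `bibliography` maps work- or edition-level URNs to the URI of a
    bibliographic resource (a hypothetical lookup table).
    """
    parts = urn.split(":")
    if len(parts) < 4 or parts[:2] != ["urn", "cts"]:
        raise ValueError(f"not a CTS URN: {urn!r}")
    pieces = parts[3].split(".")
    # Drop the passage, then try edition level before work level.
    for n in range(len(pieces), 1, -1):
        candidate = ":".join(parts[:3] + [".".join(pieces[:n])])
        if candidate in bibliography:
            return 303, bibliography[candidate]   # 303 See Other
    return 404, None                              # ASSUMPTION: nothing known
```

For example, with a table containing only "urn:cts:greekLit:tlg0086.tlg003" mapped to "http://example.org/bib/tlg0086.tlg003", a request for urn:cts:greekLit:tlg0086.tlg003.ap03:1 would be redirected to that work-level record.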
Adding a subreference (e.g. @μῆνιν[1]) to any of the above requests
doesn't change the way the redirect to the response happens.
Now, special characters like [] and the Greek Unicode in the URI may
need to be escaped. This is a little ugly, but it doesn't invalidate
their use in a URI. Hugh, I understand your desire to break the CTS
components up into a more path-friendly syntax, but I'm still not
persuaded that it's necessary.
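
To show what the escaping looks like in practice, here is a small sketch using Python's standard urllib.parse. The function name is hypothetical, and the choice to leave ":" and "@" unescaped is mine, made for readability; a stricter client could escape those as well.

```python
from urllib.parse import quote, unquote

BASE = "http://data.perseus.org/citations/"


def escape_urn_for_uri(urn: str) -> str:
    """Percent-encode a CTS URN (UTF-8) for use in an HTTP URI path."""
    # ":" and "@" are kept as-is; "[", "]" and the Greek are escaped.
    return quote(urn, safe=":@")


urn = "urn:cts:greekLit:tlg0012.tlg001.perseus-grc1:1.1@μῆνιν[1]"
escaped = escape_urn_for_uri(urn)
uri = BASE + escaped

# The escaping is lossless: decoding gives back the original URN.
roundtrip = unquote(escaped)
```

The escaped form is longer and harder to read, but unquote recovers the original URN exactly, which is the sense in which the escaping doesn't invalidate the identifier.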
I do think we need to work out how we want to make clear the relationships
between different versions of texts which may be based on the same
print-published edition. I think it was becoming clear that the use of
the exemplar component of a CTS URN for this introduced more problems
than it solved, and we are now planning on using that 4th component to
enable us to support different versions of a version (e.g.
urn:cts:greekLit:tlg0012.tlg001.perseus-grc1 actually identifies
whatever is the currently published version of that TEI XML edition, and
iterations on that TEI XML will be identified by versions
urn:cts:greekLit:tlg0012.tlg001.perseus-grc1.1,
urn:cts:greekLit:tlg0012.tlg001.perseus-grc1.2, etc.). I *think* the best
way to handle this question of provenance may be via RDF relationships
to things like WorldCat URIs, etc., but this is something that I think
definitely needs further discussion.
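
To illustrate the "versions of a version" idea, the sketch below separates the 4th component from a URN so that a client can get back to the URN for the currently published version. The function name is hypothetical and this only reflects the plan as described above, not anything we have implemented.

```python
from typing import Optional, Tuple


def split_revision(urn: str) -> Tuple[str, Optional[str]]:
    """Return (URN without the 4th component, 4th component or None)."""
    parts = urn.split(":")
    if len(parts) < 4 or parts[:2] != ["urn", "cts"]:
        raise ValueError(f"not a CTS URN: {urn!r}")
    pieces = parts[3].split(".")
    revision = None
    if len(pieces) == 4:
        # textgroup.work.version.revision -> drop the revision
        revision = pieces[3]
        pieces = pieces[:3]
    parts[3] = ".".join(pieces)
    return ":".join(parts), revision
```

Given urn:cts:greekLit:tlg0012.tlg001.perseus-grc1.2, this yields the URN ending in perseus-grc1 together with the revision "2"; a URN that has no 4th component comes back unchanged with None.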
Hugh, what do you think? Does the above address any of your concerns?
Bridget
On 04/23/2013 10:47 AM, Hugh Cayless wrote:
> On Apr 22, 2013, at 7:54 , Neel Smith wrote:
>
>> Part of the appeal of URN notation is that they do not need to refer to digital resources: I can cite Sandys's reading specifically whether or not I know of an online version reachable via the CTS protocol. This is, IMHO, an important separation of concerns: the scholar correctly citing evidence with URNs does not have to address the question of how URNs are to be resolved, today or in the future. It is possible that in 2013, the way I resolve urn:cts:greekLit:tlg0086.tlg003.ap03:1 is to walk to my shelf, pull a volume off, and page through to chapter 1. (This is what I meant in an earlier post when I said that URNs pass the "paper napkin test".) If, in the future, a digital version of Sandys' edition is recognizable by a CTS resolver, then my digital references to that URN suddenly become even more valuable.
>>
> Well, HTTP URIs don't need to refer to digital resources either. They do implicitly offer a way to get information about resources whether they're digital or not (which URNs don't absent a resolver), but they might not point to anything. XML Namespaces are usually HTTP URIs and may not point to any resource. The URI RFC (http://tools.ietf.org/html/rfc3986#section-1.2.2) is quite clear on this point:
>
>> A common misunderstanding of URIs is that they are only used to refer
>> to accessible resources. The URI itself only provides
>> identification; access to the resource is neither guaranteed nor
>> implied by the presence of a URI. Instead, any operation associated
>> with a URI reference is defined by the protocol element, data format
>> attribute, or natural language text in which it appears.
> http://example.com/cts/greekLit/tlg0086/tlg003/ap03/1 (if a text wasn't available) might resolve (via a redirect) to a bibliographic citation or to worldcat, enabling me to find the physical book. Or it might 404 and I'd have to search for it to discover what it meant. And HTTP URIs can be plugged into resolvers too. See http://web.archive.org/web/19990429154513/http://www.stoa.org/ for example.
>
> Apart from the authority part of the URI (the domain + port), it would be simple to devise an HTTP URI scheme that's isomorphic to CTS URNs (and that's a solvable problem—just buy a domain and make it a CTS registry). So I don't see a killer advantage to URNs over URIs, only a theoretical one (HTTP URIs might go away). But the means to resolve a CTS URN might go away too. Impermanency isn't a problem that can be solved just by not using current technology. What would be lost if CTS were broadened to use URIs?
>
> Thanks for being willing to discuss this at such length!
>
> Hugh