Thanks for the thoughtful reply. I want to emphasize two of your comments that are spot on.
>
> I appreciate, this is not the perspective of Perseus, or the CTS project, that you're much closer to the perspective of the data and the format of its representation.
Yes, CTS and CTS URNs were probably designed with long-term data preservation given priority. Closely related to that...
>
> I guess I'm just trying to impress upon you the fact that the choices which are made over data formats have real consequences to the way they can be used. You may think they are trivial, however, what I am saying is that currently they are not. I don't of course expect anyone to fall over themselves to correct themselves just for my convenience, but what I am reporting, is what I flatly regard as a *bug* in the data specification: no URN field.
I agree that GetCapabilities is an ugly thing that's not well adapted to dynamic use. That's due, I think, to the way CTS evolved. As an example of how I think we're trying to get to the same end: I've recently written an implementation of CTS that uses a SPARQL endpoint as its back end. This requires [1] converting the information in the GetCapabilities reply to RDF; [2] converting all the information in all the inventoried texts to RDF. This is a one-time batch process: after that, all interaction with the service works directly against the SPARQL endpoint. No XSLT database, XPath, or anything similar.
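To make step [1] concrete, here is a minimal Python sketch of the idea, not the actual implementation. The inventory fragment is a simplified, hypothetical stand-in for a real GetCapabilities reply (which is namespaced and much richer), and the `http://example.org/cts/` predicates are placeholders rather than any published vocabulary.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified inventory: NOT the real GetCapabilities schema.
SAMPLE_INVENTORY = """
<TextInventory>
  <textgroup urn="urn:cts:greekLit:tlg0012">
    <groupname>Homer</groupname>
    <work urn="urn:cts:greekLit:tlg0012.tlg001">
      <title>Iliad</title>
    </work>
  </textgroup>
</TextInventory>
"""

# Placeholder vocabulary, assumed for illustration only.
EX = "http://example.org/cts/"


def literal(text):
    """Quote a string as a plain literal, escaping backslash, quote and line breaks."""
    escaped = (
        text.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )
    return f'"{escaped}"'


def inventory_to_triples(xml_text):
    """Walk textgroups and works, returning (subject, predicate, object) tuples."""
    root = ET.fromstring(xml_text)
    triples = []
    for group in root.findall("textgroup"):
        group_urn = group.get("urn")
        name = group.findtext("groupname", default="").strip()
        triples.append((f"<{group_urn}>", f"<{EX}label>", literal(name)))
        for work in group.findall("work"):
            work_urn = work.get("urn")
            title = work.findtext("title", default="").strip()
            triples.append((f"<{work_urn}>", f"<{EX}belongsTo>", f"<{group_urn}>"))
            triples.append((f"<{work_urn}>", f"<{EX}label>", literal(title)))
    return triples


def to_ntriples(triples):
    """Serialize the tuples one statement per line, N-Triples style."""
    return "\n".join(f"{s} {p} {o} ." for s, p, o in triples)


if __name__ == "__main__":
    print(to_ntriples(inventory_to_triples(SAMPLE_INVENTORY)))
```

The output of a batch run like this would be loaded into the triple store once; from then on a client asks the SPARQL endpoint for works by URN instead of parsing the inventory XML itself.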
In other words, I see TEI XML as a good archival format for texts, and the GetCapabilities text inventory as a reasonable way to catalog them; neither is a very practical format to work with directly in a running service. The CTS protocol nicely abstracts away any need to work with archival XML to retrieve passages of text: that was a major goal of its design. It doesn't provide the same abstraction for the capabilities metadata. This is probably a good moment to rethink what catalog data people need from a CTS service, and in what formats they could get it for practical use.