From: OXVAXD::LOU "Lou Burnard" 4-DEC-1996 14:46:20.93
To: MX%"[log in to unmask]"
CC: LOU
Subj: RE: The Element Set: One Page Version
Although I see there are a dozen comments by others on this already, I think
I'll send mine in before I read them and get my opinion altered!
| * TITLE
|
| The name given to the work by the creator or publisher.
|
Trivial suggestion: capitalize or otherwise make explicit the fact that
"creator" and "publisher" are defined below
| * CREATOR
|
| The person(s) primarily responsible for the intellectual content
| of the object. For example, authors in the case of written documents,
| artists, photographers, or illustrators in the case of visual
| resources.
|
| * DESCRIPTION
|
| The topic of the work, or keywords that describe the subject or
| content of the resource, whether text-based or visual.
"Description" is sooo vague! It's what the whole collection of DC elements
provides. For my money, the "description" of a book is just as likely to be
concerned with where it was published or how many pages it has as it is with
what it's about. I suggest "Keywords" since that makes explicit the probable
content of the field. (I have more to say on the subject of syntax here,
but I'll reserve it for the next phase)
| * PUBLISHER
|
| The organization responsible for making the resource available in its
| present form. Generally a publisher, an institution (university
| department, for example) or a corporate entity. The intent of this field
| is to identify organizations that fulfill a publishing role, rather than
| individuals that simply provide informal access to a resource.
Hmm, so the "publisher" is the agency that "publishes" -- *now* I see! I think
we will be fighting the current if we try to stop Joe Home Page claiming to be
his own publisher. Since however lots of people think that a "publisher" is a
(usually commercial) institution, using this term could be a recipe for
confusion. And I'm not at all clear what is meant by "individuals that simply
provide informal access": is the implication that they are providing a link to
something really "published" by someone else, or that they are primarily
responsible for its dissemination but that it's only informally distributed?
For what it's worth: the TEI makes the following 3-way distinction:
The <publisher> is the person or institution by whose authority a given edition
of a file is made public. The <distributor> is the person or institution from
whom copies of it may be obtained. If a file is not considered formally
published, but is nevertheless available for circulation by some individual
or organization, this person or organization is termed a release authority, for
which an <authority> tag is proposed. All three tags behave identically within
the <publicationStmt> where they can be repeated ad lib. I think what you want
here is Authority.
| * OTHER AGENT (CONTRIBUTOR?)
|
| The person(s) other than author(s) who have made significant
| intellectual contributions to the resource but whose contribution is
| secondary to the individuals specified in the CREATOR field (for
| example, editors, transcribers, illustrators, convenors).
I don't like these terms much, but the concept is essential. This is what
librarians call the secondary statement of responsibility, so how about
"RESPONSIBLE"?
| * DATE
|
| The date the resource was made available in its present form. [If possible,
| a default format of wide international acceptance should be specified
| here. Any suggestions?]
I thought we weren't discussing syntax here, but so long as we are: ISO 8601
dates please.
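As a minimal sketch of what an ISO 8601 calendar date looks like in practice
(Python's standard datetime module is assumed here purely for illustration;
nothing in the draft prescribes a tool), the date of this message would be
written as follows:

```python
from datetime import date

# The date in this message's header, held as a calendar date.
posted = date(1996, 12, 4)

# ISO 8601 extended calendar format: YYYY-MM-DD.
iso_text = posted.isoformat()

# Round trip: parse the ISO 8601 string back into a date object.
parsed = date.fromisoformat(iso_text)

print(iso_text)
```

The year-month-day ordering means the strings sort chronologically as plain
text, which is one practical argument for it as a default DATE format.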
| * RESOURCE TYPE [used to be TYPE]
|
| The genre of the resource, such as home page, novel, poem, working
| paper, technical report, essay, dictionary, etc. It is expected that
| RESOURCE TYPE will be chosen from an enumerated list of types.
This is really tricky to label, partly because TYPE is used everywhere, but
also because the domain it's meant to cover is so ill defined. Things like
"novel" and "poem" are (fairly) well defined literary genres, but I don't know
that I'd be able to distinguish a "tech report" from an "essay", even if I
could think of a reason for doing so. It might be more useful to distinguish
original work from derivative, or continuous prose text from fragmentary
records, but we don't have names for these categories. I'm still leaning to
dropping this category altogether.
| * FORMAT [used to be FORM]
|
| The data representation of the object, such as text/html, ASCII,
| Postscript file, Windows executable file, JPEG image, etc. The
| intent of this element is to provide information necessary to
| allow people or machines to make decisions about the usability of
| the encoded data (what hardware and software might be required to
| display it, for example). As with RESOURCE TYPE, FORMAT will be
| assigned from enumerated lists such as registered Internet Media
| Types (MIME types).
Why not gladden the hearts of SGML geeks everywhere and call this NOTATION
(since that's what it now is)
| * IDENTIFIER (RESOURCE IDENTIFIER?)
|
| String or number used to uniquely identify the object. Examples
| for networked resources include URLs, URNs (when implemented). For
| non-networked objects, one might have an ISBN, Library of Congress
| Catalog Number, or other formal name.
Yes. Stick with IDENTIFIER please.
| * SOURCE
|
| Object, either print or electronic, from which this object is
| derived, if applicable. For example, an html encoding of a
| Shakespearean sonnet might identify the paper version of the
| sonnet from which the electronic version was transcribed.
For real brownie points, SOURCE should be able to recurse, so that pages
derived from other pages could say so. As per the TEI Header...
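One way to picture a recursive SOURCE is as a record whose source field is
itself a record. The sketch below is only an illustration: the Record class,
its field names, and the provenance helper are all hypothetical, and neither
the draft element set nor the TEI Header defines them in this form.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Record:
    """Hypothetical metadata record with a nested SOURCE."""
    title: str
    source: Optional["Record"] = None  # the record this one was derived from


def provenance(record: Record) -> List[str]:
    """Follow SOURCE links back to the original, newest first."""
    chain = []
    current: Optional[Record] = record
    while current is not None:
        chain.append(current.title)
        current = current.source
    return chain


paper = Record("Sonnet 18 (printed edition)")
html = Record("Sonnet 18 (HTML transcription)", source=paper)
mirror = Record("Sonnet 18 (mirror page)", source=html)

for title in provenance(mirror):
    print(title)
```

Each page only has to describe its immediate source; the full derivation
chain falls out of following the links.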
| * LANGUAGE
|
| Language of the intellectual content of the resource. The default
| expression of natural languages is according to the ISO 639 two
| letter language codes.
I think these have now been superseded by 3 letter codes.
| * RELATION
|
| Relationship to other resources. The intent is to provide a means
[snip]
| A formal specification of RELATION is
| currently under development. Users and developers should
| understand that use of this element should be currently considered
| experimental.
I agree with the second bit. Given that there will probably be other, much
better, ways of linking and relating HTML and XML documents/document fragments
together momentarily, I suspect that this is best left as a placeholder, or
used solely as a bucket into which people can drop notes like "There's a better
version of this page at http://foo.bar/"
| * COVERAGE
|
| The spatial locations and temporal durations characteristic of the
| resource. Formal specification of COVERAGE is currently under
| development. Users and developers should understand that use of
| this element should be currently considered experimental.
This one snuck in when I wasn't reading the list. Serves me right.
| * RIGHTS MANAGEMENT [need a snappy, single word element name here]
|
| The content of this field is intended to be a pointer (a URL or
| other suitable URI as appropriate) to a rights management
| statement or a server that would provide such information in a
| dynamic way. The intent of this field is to allow providers a
| means to associate terms and conditions or copyright statements
| with a resource or collection of resources. No assumptions
| should be made by users if such a field is empty or not present.
|
For a snappier name, I suggest AVAILABILITY. I suggest that we ought at the
very least to be able to specify some minimal set of explicit values (the OTA
has one possible model, very similar to that in use at the Essex Data Archive
and other AHDS service providers) and rely on the Warwick Framework approach to
link to the more detailed and complex rights management cases. If we are going
to have the option to add a pointer to some other more complex package of info
here, why can't we have it everywhere? Indeed, don't we have it already?
Ooops, I seem to be syntactisizing again.
[OTA (= Oxford Text Archive) has four categories for availability
P : public domain
U : generally available for academic use
A : available only under conditions specified by the owner
X : not generally available elsewhere
(there is also a fifth, 0, which means "we're not even admitting this exists"
but that's another story)
]
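A "minimal set of explicit values" of this kind could be modelled as a small
enumeration. The sketch below uses the four OTA codes listed above; the
Python names and the parse_availability helper are hypothetical and are not
part of any OTA or Dublin Core specification.

```python
from enum import Enum


class Availability(Enum):
    """The four OTA availability categories listed above."""
    PUBLIC_DOMAIN = "P"
    ACADEMIC_USE = "U"
    OWNER_CONDITIONS = "A"
    NOT_AVAILABLE_ELSEWHERE = "X"


def parse_availability(code: str) -> Availability:
    """Map a one-letter code to a category; raises ValueError if unknown."""
    return Availability(code.strip().upper())


print(parse_availability("u").name)
```

Anything outside the enumerated set is rejected, which is where a pointer to
a fuller rights statement would take over.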
Right, that's my tuppenuth. Now to see whether anyone agrees with me.
Lou