Thanks very much for the comments.
Just to stress here that by "article citation information", we meant the
bibliographic record for a journal article, not the metadata for any
citation record *to* that article. We also concentrated on where in the
bibliographic record you would indicate the provenance of the article in
the sense of its placement in an issue
within a volume of a journal. The full DC record for an article would look
something like this (note: I'm not going to get into syntax here, and apart
from the Relation qualifiers, I'm keeping it to DC Simple; also, the
identifiers are not necessarily the real ones pertaining to this article):
dc.title = What do service planners and policy-makers need from research?
dc.creator = Mary Marshall
dc.subject = geriatric psychiatry
dc.description = A study of dementia care policies in Europe.
dc.publisher = John Wiley & Sons Ltd
dc.contributor = M.G. Downs
dc.contributor = S.H. Zarit [these last two are Editors of the special
issue of which the article is part]
dc.date = 19990218 [but this is arguable - see below]
dc.type = text
dc.format = application/pdf
dc.identifier = 0885-6230(1999)14:2<86:WDSPAP>2.0.TX;2-X
dc.identifier = 10.1002/(SICI)0885-6230(1999)14:2<86:WDSPAP>2.0.TX;2-X
dc.identifier = http://journals.wiley.com/0885-6230/v14n2p86.html
dc.source = [N/A]
dc.language = en
dc.coverage = [N/A]
dc.relation = IsPartOf "International Journal of Geriatric Psychiatry,
Volume 14, Issue 2, pp. 86-96"
dc.relation = IsPartOf "0885-6230(1999)14:2<>1.0.TX;2-X"
dc.relation = IsPartOf "10.1002/(SICI)0885-6230(1999)14:2<>1.0.TX;2-X"
dc.relation = IsPartOf "http://journals.wiley.com/0885-6230/v14n2.html"
Since this is an article in a special issue, we could also have:
dc.relation = IsPartOf "Proceedings of the 'What Works in Dementia Care'
Symposium held in Stirling, Scotland, June 1998 - Part 1"
and if this special issue was to be sold as a separate item, we could have:
dc.relation = IsPartOf "0123456789" [i.e. the ISBN].
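For anyone who wants to experiment with a record like this, here is a minimal Python sketch of one way to hold it. This is not a recommended syntax: the names `record` and `values`, and the dotted key `dc.relation.IsPartOf`, are purely illustrative assumptions. It uses a list of element/value pairs because DC elements are repeatable and a plain dictionary would lose the repeats.

```python
# Illustrative only: the record as ordered (element, value) pairs.
# A list is used because elements such as dc.identifier repeat.
SICI = "0885-6230(1999)14:2<86:WDSPAP>2.0.TX;2-X"

record = [
    ("dc.title", "What do service planners and policy-makers need from research?"),
    ("dc.creator", "Mary Marshall"),
    ("dc.identifier", SICI),
    # In the example above, the DOI is the SICI with a publisher prefix.
    ("dc.identifier", "10.1002/(SICI)" + SICI),
    ("dc.relation.IsPartOf",
     "International Journal of Geriatric Psychiatry, Volume 14, Issue 2, pp. 86-96"),
    ("dc.relation.IsPartOf", "0885-6230(1999)14:2<>1.0.TX;2-X"),
]

def values(rec, element):
    """Return every value recorded for one element, in record order."""
    return [v for (e, v) in rec if e == element]
```

Calling `values(record, "dc.relation.IsPartOf")` returns both the text string and the issue SICI, which is the point of allowing the Relation to repeat.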
**************************
The WG deliberately didn't address the issue of citation-as-reference
metadata because we felt there were already enough other committees looking
into this, including the NISO/DLF/SSP/NFAIS one that you mention. But I am
intrigued by your statement that the recommended minimum set of data
elements to be used for reference lookup have to be "present in both the
citation and [the] reference lookup database". Citations often don't
include article titles (especially prevalent in certain disciplines such as
chemistry); they never contain ISSNs or CODENs and often abbreviate the
journal title in one of a variety of quasi-standard ways; they don't
usually contain a chronology beyond the year of publication; they rarely
include issue enumeration unless each issue starts with page 1. Thus, the
bibliographic record as above contains much information that is not often
contained in a standard reference citation. Surely, the *minimum* metadata
you need to identify an article uniquely is i) journal title (possibly
inferred from an abbreviation), ii) volume, and iii) page number. All other
data elements are either useful as extra checks (e.g. if you have the
volume number, you don't need the year as well, but this could be a useful
confirmation), or are only needed in the relatively rare cases where the
(i) to (iii) set above is insufficient (e.g. when two articles start on the
same page).
So, the WG's DC record is not specifically designed for reference matching,
but I would argue that it contains everything you need for reference
matching.
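To illustrate that the minimum set (journal title, volume, first page) can be recovered from the IsPartOf text string, here is a rough Python sketch. It assumes the string follows the layout used in the examples ("Journal, Volume V, Issue I, pp. A-B" or "Page A"); the pattern and the function name `match_key` are my own illustration, not part of any recommendation, and other layouts (including abbreviated journal titles) are simply not handled.

```python
import re

# Assumed layout: "<journal>, Volume V[, Issue I], pp. A-B" or "... Page A".
PATTERN = re.compile(
    r"^(?P<journal>.+?), Volume (?P<volume>\d+)"
    r"(?:, Issue (?P<issue>\d+))?"
    r", (?:pp\.|Pages?) (?P<first>\d+)(?:-(?P<last>\d+))?$"
)

def match_key(is_part_of):
    """Return (journal title, volume, first page), or None if unparseable."""
    m = PATTERN.match(is_part_of)
    if m is None:
        return None
    # Lower-casing the title is a crude normalisation, nothing more.
    return (m.group("journal").lower(), int(m.group("volume")), int(m.group("first")))
```

A citation that supplies the same three values could then be compared against this key; resolving a journal abbreviation to the full title would be a separate step that this sketch does not attempt.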
Let's go back to dc.date. What are the options when it comes to journal
articles? The date that I've indicated above is the date that the issue was
published, i.e. 18 February 1999. I know that because of our internal
records, but no-one outside the company knows it. As far as anybody else is
concerned, it's the February issue. But this month-indicator is an *issue
attribute*, not a date in any real sense. This issue was indeed published
in February, but if it had been published in January or March or April, or
whenever, it would *still* be the February issue - that's just the name of
the issue. Think of issues that are identified by season - Spring, Summer,
etc. - what do we put in the publication date for these?
I assume that dc.date is meant to refer to a time when something actually
happened, but I contend that journal issue dates tell you no such thing.
"Chronology", in the SICI sense of the term, is not a date field, in the DC
sense of the term.
If I'm way off-beam on this, and you all think that issue chronology *is* a
date, then I guess it could be expressed as "199902". There are a whole
load of other dates that I might want to list in the DC record of an
article - e.g. received date, revised date, accepted date, published date -
but I can't distinguish these in DC Simple. (BTW, the publication date is
also open to varying interpretation, often governed by internal audit
recommendations. For example, is an issue published on the day that the
publishing house approves it, or the day the printing house despatches it
to the expediter, or the day the expediter despatches it, or the day the
first subscriber gets it, or the day the last subscriber gets it, or the
day the mean or median subscriber gets it? These are all possible meanings
of *publication date* in the print world; in the electronic world, is it
the day the print was released or the day the electronic was released? Or
the day the electronic issue was loaded into staging after final approval?
Throw in the possibility of article-by-article pre-release and you've got
another load of possibilities. The only "date" that is objectively known is
the cover date, and the one thing this doesn't tell you is when
anything actually happened!)
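One way to picture the distinction I am drawing is the Python sketch below. It keeps the cover chronology as a label and only accepts a value as a date if it names an actual day. The field names `cover_label` and `published` and the function `parse_compact` are hypothetical, invented for this illustration only.

```python
from datetime import date

def parse_compact(value):
    """Parse 'YYYYMMDD' into a date; return None for anything else."""
    if len(value) == 8 and value.isdigit():
        try:
            return date(int(value[:4]), int(value[4:6]), int(value[6:]))
        except ValueError:
            # Eight digits that do not form a real calendar day.
            return None
    return None

# 'cover_label' is just the name of the issue; 'published' is a day
# on which something actually happened (if we know one at all).
issue = {"cover_label": "February 1999", "published": parse_compact("19990218")}
seasonal = {"cover_label": "Spring 1999", "published": parse_compact("Spring 1999")}
```

Here "199902" and "Spring 1999" both come back as `None`, which reflects the argument above: they identify an issue but do not name a day.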
Finally, you were right to pick me up on whether we were recommending page
ranges or first page number only. In the example, I didn't actually have
the page range, so that's why I used first page number only. (You can get
the latter from the SICI, but you can't get the former.) In practice, I
think it best to indicate the whole range since that gives more
information, and it may be critical, not to citation-matching but to
resource retrieval. (For example, you might decide not to retrieve a review
article if you knew that it was over 50 pages long; also, if you were using
the metadata to generate a document delivery order, the extent would be
important.) As for non-page indicators such as article numbers, we did
recommend "the page range (or equivalent locational information in a
non-page-based resource)".
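As a small illustration of why the full range carries more information than the first page alone, here is a Python sketch; the function name is my own, and non-page locators such as article numbers are deliberately not handled.

```python
def first_page_and_extent(pages):
    """Split a page string into (first page, extent in pages).

    '86-96' gives (86, 11); a bare first page such as '37' gives
    (37, None), because the extent cannot be known from it.
    """
    first, sep, last = pages.partition("-")
    if not sep:
        return int(first), None
    return int(first), int(last) - int(first) + 1
```

The first page is always recoverable from the range, but the extent needed for, say, a document delivery order is only available when the range is given.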
Regards
Cliff Morgan
Priscilla Caplan on 02/07/99 20:20:20
To: Cliff Morgan/Chichester/Wiley
Subject: Re: Draft Proposal from the Working Group on Bibliographic
Citations
This is a very sensible proposal and I am very happy to see it. We dearly
need some standardization of the way article citation information is
represented in the DC.
Our own local project guidelines at the University of Chicago specify "When
describing an article from an electronic journal, use DC.Relation.IsPartOf
for a citation to the journal, including when available the title of the
journal, enumeration and chronology, and the pagination of the article."
This is very close to what the workgroup has come up with, except that you
omit chronology.
Over the last several months NISO, the Digital Library Federation, the
Society of Scholarly Publishers, and the National Federation of Abstracting
and Indexing Services have co-sponsored a series of workshops on reference
linking (specifically, getting from a citation to a journal article). The
results will appear soon in an article in D-Lib Magazine.
There is a less comprehensive workshop report at
http://www.lib.uchicago.edu/Annex/pcaplan/reflink.html.
One of the products of that discussion is a recommended minimum set of data
elements to be used for reference lookup (meaning they must be present in
both the citation and in any reference lookup database). This set is
consistent with what is being done for the DOI lookup service that the IDF
is currently developing. There is a great desire on the part of the
participants that the DC community accommodate these elements in
well-defined places in the Dublin Core.
Your Proposal goes a long way toward doing this. The current draft
recommendation describes these elements (this is taken from the workshop
report noted above):
-- Title: In this case, Title of the journal article.
-- Creator(s): In this case, Author(s) of the journal article. The first
author at a minimum should be included; subsequent authors may be included
at the discretion of the metadata provider.
-- Journal Title: Title or title equivalent of the journal in which the
article is published. An unambiguous key number, such as ISSN or CODEN,
could function as a title equivalent.
-- Date: Publication date of the article or the official chronology of the
journal issue containing the article. Chronology is the published
designation or "issue date" (e.g. May/June 1999).
-- Enumeration: The numbering designation of the journal issue containing
the article. Enumeration generally includes volume and issue number, and
may include other designations such as Part, Series, etc. This can be
omitted only if the journal itself has no official enumeration, as is the
case with a currently small number of electronic-only journals.
-- Location: Starting page number of the article, or, if there is no
pagination, assigned article number.
-- Type: Type of material, in this case probably "journal article".
It looks to me that there are only a few discrepancies between the Proposal
and the reference linking recommendation above:
1. In the recommendation, a key number like ISSN uniquely identifying the
journal can be used instead of the journal title. This could be
accommodated by adding JournalIdentifier to the recommended subelements for
this Relation in the Proposal.
2. In the recommendation, date (either publication date or issue
chronology) is required. There is no place for date in the Proposal. I
suppose one could make the case this date would go in DC.Date so does not
need to be in Relation. If that is the case, then it might be worth
explaining that explicitly in the Proposal, and also addressing whether one
can put issue chronology (e.g. May/June 1999) in DC.Date.
3. In the recommendation, location is either starting page number or
article number. In the Proposal, you refer to page ranges (e.g. 37-40)
instead of starting page number (though the example shows starting number)
and don't include article number. I think the former is not much of a
problem, because the starting page number could be derived from a page
range. However, it would be nice if the definition of JournalPages could
be broadened to include article number when there is no pagination.
Thanks for your consideration,
Priscilla Caplan
At 03:08 PM 7/2/99 +0100, you wrote:
>I understand from the Guidelines for Dublin Core Working Groups (draft 1.4)
>that once a Chair has established consensus within the WG, the Group's
>recommendations should be circulated to the dc-general list for comment.
>Below, you will find our Recommendations. (Please refer to
>http://www.mailbase.ac.uk/lists/dc-citation if you would like to see who is
>in the Working Group and the discussions behind the recommendations.)
>Comments should be posted to this list by Monday 19 July.
>
>i) SCOPE OF THE WORKING GROUP
>
>We agreed to focus on the metadata of the bibliographic record of the
>resource, not the metadata of citations (references) to the resource.
>
>ii) MATTERS TO BE ADDRESSED
>
>We agreed that we should limit ourselves to two specific questions raised
>in the meta2 discussion lists last year, namely a) how to indicate journal
>article metadata in a bibliographic record, covering the article's location
>within a journal title, volume, issue and pages, and b) how to indicate
>edition/version/release information in a resource's bibliographic record.
>
>iii) RECOMMENDATIONS FOR JOURNAL ARTICLE INFORMATION
>
>We recommend that the most appropriate place for this information is
>DC.Relation.
>
>We also considered Title, Description, Identifier and Source, but these
>were rejected in favour of Relation. DC.Title should contain the article
>title but no other locational information. DC.Identifier should contain one
>or more identifiers for the article itself (e.g. the article SICI, PII,
>DOI, URL, etc.) but should not contain identifiers to the issue, volume or
>journal.
>
>The major discussion centred around whether the most appropriate DC tag was
>Relation or Source. Some arguments were put forward that, for the
>electronic version of an article, DC.Source could be used to identify the
>print "original" (i.e. with Journal, Volume, Issue and Pages), and this is
>a common implementation practice, but we rejected this argument on the
>basis that you couldn't say which version was derived from which other
>version. The electronic version *may* be derived from the print (e.g. by a
>process of back-conversion from typeset files to HTML) or the print may
>derive from the electronic: how do you know what processes have taken
>place? The print may be released before the electronic, or vice versa, or
>they may be released simultaneously. And what if there is only the one
>version - only print or only electronic?
>
>Some implementers made a distinction between using Source when the
>electronic was derived from print and Relation when the resource only ever
>appeared in an electronic version. However, we regarded this distinction as
>essentially arbitrary and reliant upon information that wouldn't always be
>available, so we recommend that Relation is used, whether the material is
>published first in print or not.
>
>The Working Group was not constrained into considering DC Simple (DC 1.0)
>solutions only, which would be very restrictive as far as specifying
>Relations go. On the other hand, DC Qualified is of course not yet stable,
>so any recommendations we make that use qualifiers are subject to future
>stabilisation.
>
>We recommend using the "IsPartOf" construct. The full location information
>should be given as both a text string and one or more identifiers *to the
>resource that the article is a part of*. The text string should include the
>page range (or equivalent locational information in a non-page-based
>resource) - even though it could be argued that logically the article is
>not a *part* of a page range (it *spans* a page range rather than is
>subsumed within it), we recommend this practice because a) the page range
>appears naturally at the end of journal bibliographic information, b) we
>suspect implementers will put it there anyway, and c) they'll do this
>because there's nowhere else for it to go.
>
>For example, let's say we have an article in the Journal of the American
>Society for Information Science, Volume 47, Issue 1, starting on Page 37.
>The SICI for this article is 1097-4571(199601)47:1<37::AID-ASI4>3.0.CO;2-3.
>The DOI is the SICI preceded by 10.1002/(SICI). The URL is the DOI preceded
>by http://doi.wileynpt.com/.
>
>In the DC record for the article, we would have the above SICI, DOI and URL
>all entered under DC.Identifier (with the appropriate Schemes indicated in
>DC Qualified).
>
>The text string for DC.Relation "IsPartOf" would be "Journal of the
>American Society for Information Science, Volume 47, Issue 1, Page 37".
>(The complete page range could also be included.) DC Qualified might break
>this down into subelements. (We would recommend explicit subelements such
>as JournalTitle, JournalVolume, JournalIssue, and JournalPages.)
>
>The identifiers within DC.Relation "IsPartOf" could be (again with
>appropriate Scheme designations): "1097-4571(199601)47:1<>1.0.CO;2-T" (for
>the SICI of the Issue that the article is a part of);
>"10.1002/(SICI)1097-4571(199601)47:1<>1.0.CO;2-T" (for the DOI of the Issue
>that the article is part of); and "http://doi.wileynpt.com/10.1002 [and so
>on]" for the URL of the Issue that the article is part of.
>
>iv) RECOMMENDATIONS FOR EDITION/VERSION/RELEASE
>
>We recommend that this information should go into DC.Title.
>
>Other options that we considered were Description, Identifier, and Relation
>"IsVersionOf".
>
>Recommended subelement would be DC.Title.Release, whether we were referring
>to editions, versions or releases, since this was felt to be the most
>generic term.
>
>DC.Identifier should contain the relevant identifier of the release itself,
>e.g. the ISBN of a 2nd edition of a title, but would not indicate release
>enumeration (e.g. it would not say "2nd edition" or "edition 2" or "2" in
>the Identifier field - this goes into Title).
>
>DC.Relation "IsVersionOf" can be used to refer back to previous versions
>but not to indicate the edition/version of the current resource.
>
>
>v) CONCLUSIONS
>
>a) We limited our scope to bibliographic records.
>
>b) We concentrated on two issues that had been specifically raised in
>previous discussion groups, and for which no conclusions had been reached.
>
>c) We recommend the use of DC.Relation "IsPartOf" for journal article
>placement information, i.e. for indicating which journal, volume, issue and
>pages an article belongs to. This tag should be used whether the article
>started life as a print product or as an electronic one. The Relation can
>also refer to various Identifiers of the journal issue of which the article
>is part.
>
>d) Edition/version/release information ought to be part of DC.Title. As a
>subelement, we recommend DC.Title.Release (which recommendation has already
>been passed on to the Title Working Group).
>
>If there are no further comments by 19 July, I will pass the
>recommendations on to TAC (or, more correctly, the recently constituted
>DC-AC).
>
>Thanks
>
>Cliff Morgan
>
>Publishing Technologies Director
>
>John Wiley & Sons Ltd
>
>Chichester, UK
>