DC-GENERAL Archives


DC-GENERAL@JISCMAIL.AC.UK


Subject:

Names in Dublin Core

From:

Andrew Waugh <[log in to unmask]>

Reply-To:

dc-general

Date:

Tue, 03 Feb 1998 15:48:52 +1100



Dear all,

Before Christmas the issue of how to represent names came up. Stu
suggested that I might like to draft something on names. I didn't want
to disturb the important discussion of DC Simple, but that seems to
have died down now, so...

(I'd be particularly interested in comments from people involved with
name authorities, and with AACR.)

andrew waugh

-------------------------8<-----------------------8<-------------------

Representing People's Names in Dublin Core

This note provides some guidance on representing people's names in
metadata.

1. The Problem

While most people only have one name, that name may be written down in
many different ways. The name may be written in full (e.g. 'John Stuart
Mills'). Components of the name can be abbreviated (e.g.
'John S. Mills', or 'J.S. Mills'), or omitted (e.g. 'John Mills').
Names may be extended by titles or honorifics (e.g. 'Mr. John Mills').
The components of the name may be reordered (e.g. 'Mills, John Stuart').

Complexity is added by the fact that people frequently do not use their
'official' name. People often prefer to use shortened or alternative
forms of their name (e.g. 'Kathy' for 'Kathryn', and using 'Jack' for
'John' was once common in Australia). Some people prefer to use their
second name instead of their given name (e.g. 'Margaret Read' instead of
'Frances M. Read').

The final dimension of complexity occurs when names from many cultures
must be handled. Appendix A summarises the name forms of some of the
cultures commonly found in Australia. The range of different components
which can be found in names is astounding, as is the number of ways
these components can be ordered. To make handling names even more
difficult, it is common for migrants to alter their name when
integrating into another culture. In the booklet from which Appendix A
was drawn, it was noted, for example, that people with names which start
with the family name often move the family name to the end to fit with
the dominant name form in Australia.

2. The Uses of Names

Given that name forms are not internationally consistent, and that
individuals often vary their name to suit themselves, what is the best
way of representing names in metadata? To answer this question, it is
worth considering how names are used in a metadata system:

*	As a piece of information. Often, the user is interested in
	using the name as a piece of information in its own right. A
	user might ask, for example, "Who wrote 'The Lion, the Witch,
	and the Wardrobe'?". The user has some reason for wanting this
	information, such as to check that the returned entry is the
	correct one, or to carry out further work.

	When using the name as a piece of information, it does not
	matter how the name is expressed, *provided the user understands
	the convention*. For example, a library catalog would return
	the name 'Lewis, Clive Staples, 1898-' and the user is expected
	to understand (by convention) that the first part is the family
	name, and that the string '1898-' is not part of the name at
	all.

*	As a search key. The user is interested in searching for entries
	associated with the name.

	Most current search engines search the full text of the entry,
	so it is irrelevant *to the search engine* how the components
	of the name are ordered.

	In some situations, however, it is relevant to the user. Full
	text searching will match any occurrence of the search string in
	the names. A search for the surname 'Andrew', for example, will
	match all names with 'Andrew' in any part of the name (the
	family name, or any of the given names). This will return
	significantly more matches than if it had been possible to
	limit the search to just the family names. In English derived
	names it is not common to use family names as given names. But
	this may not be true for names from other cultures.

*	As a sorting key. Names are often used to sort a list of results,
	and there is usually a convention on how names are to be sorted
	within a culture. With Australian names, for example, the
	convention is to sort by family name.

	Unfortunately, it is difficult to construct a general algorithm
	to extract the primary and secondary sort keys, particularly
	when the system must handle names from many cultures. A common
	approach (used, for example, in library catalogs) is to
	re-order the components so that the primary key comes at the
	start of the name.
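
The search-key point can be illustrated with a minimal sketch in
Python. The record layout (separate 'family' and 'given' fields), the
sample names, and the function names are assumptions made for this
illustration only.

```python
# Hypothetical records: each name is stored with the family name split out.
NAMES = [
    {"family": "Andrew", "given": "Mary"},
    {"family": "Read", "given": "Andrew"},
    {"family": "Mills", "given": "John Andrew"},
]


def full_text_search(names, term):
    """Match the term against any word of the name, as a full-text search would."""
    term = term.lower()
    return [
        n for n in names
        if term in (n["given"] + " " + n["family"]).lower().split()
    ]


def family_name_search(names, term):
    """Match the term against the family name only."""
    term = term.lower()
    return [n for n in names if n["family"].lower() == term]
```

With these three records, a full-text search for 'Andrew' matches all
of them, while the search limited to the family name matches only one.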

The different approaches to representing names trade off the various
uses. For example, representing a name in the 'natural' order (i.e. the
order in which it is spoken) is probably best if the name is being used
as a piece of information, particularly if names from many cultures are
going to be mixed in together. But such a representation would make it
difficult to sort Anglo Saxon names which should be sorted by family
name.
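
The trade-off can be seen in a small Python sketch. It assumes, purely
for illustration, that the family name is the last word of the name;
as Appendix A shows, that holds for only some name forms. The helper
name 'invert' and the name 'Frances Andrew' are hypothetical.

```python
# Names in 'natural' (spoken) order.
natural = ["John Stuart Mills", "Frances Andrew", "Clive Staples Lewis"]


def invert(name):
    """Move the last word to the front: 'A B C' becomes 'C, A B'.

    Assumes the last word is the family name, which is not true in general.
    """
    given, _, family = name.rpartition(" ")
    return f"{family}, {given}" if given else family


# Sorting the natural form orders by the first given name.
by_natural_order = sorted(natural)

# Sorting on the inverted form orders by family name instead.
by_family_name = sorted(natural, key=invert)
```

The displayed form stays in natural order in both cases; only the key
used for sorting changes.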

3. Approaches to Expressing Names

It is unlikely that there will be agreement on a single common way of
representing names. The following are the preferred methods, in order
of preference.

a.	Use whatever you already have

	In many cases, the metadata will be a view on an existing
	database (e.g. a library catalog or HR database). Simply
	adopting the name representation policy used in that database
	has the following advantages:
	*	The names are compatible with other databases that
		share the same format.
	*	You do not have to expend resources in entering or
		maintaining the names. (This is a very significant
		cost.)

	The disadvantages are:
	*	The existing database might not be designed with
		international scope in mind. Does it, for example,
		assume that every name has a given name, an initial, and
		a family name?
	*	The names may not be compatible with other databases
		that you wish to work with (e.g. library catalogs).

b.	Adopt an existing naming authority

	If you don't have existing data, or it is not appropriate to
	use the existing format, it is possible to adopt an existing
	naming authority. These are simply long lists of names in a
	standardised representation. An example is the (US) Library of
	Congress Name Authority File, but most national libraries would
	maintain a similar name authority file.

	The advantages of using an existing name authority file are:
	*	Compatibility with other databases. Standard name
		authority files are very widely used.
	*	Consistency in application.

	The disadvantages of using an existing name authority file are:
	*	The names you need may not be in the authority. This is
		particularly true if the authority tends to specialise.
		For example, the Library of Congress Name Authority File
		would have a very good coverage of US authors, but might
		not cover Japanese authors or US union leaders as well.
		To extend the authority, you *must* fully
		understand the rules that were used to produce the
		authority (otherwise names will be inconsistent).
	*	Name authorities must be purchased.
	*	To be effective, name authorities must be used. That is,
		when it is necessary to add an entry, the name authority
		must be consulted to obtain the official representation.
		This obviously takes time, and will not be economic for
		some applications.

c.	Adopt existing naming guidelines

	If there is no suitable naming authority for you to use, you may
	be able to adopt existing guidelines on how to represent names.
	Appendices B and C contain summaries of two such guidelines.

	The advantages of adopting existing guidelines are:
	*	Compatibility with other databases (both existing and
		future) that share these guidelines.
	*	Completeness of rules. There are many complex issues in
		representing names, and a widely adopted set of
		guidelines is more likely to address these issues than
		one developed in house.
	*	Lower cost of development of the guidelines.

	The disadvantages are:
	*	The guidelines may be more complex than is actually
		required in your application. For example, the AACR
		include sections on titles of nobility and terms of
		honour.
	*	The guidelines may require considerable training and
		resource material to apply consistently. The naming
		rules for naming non Anglo/US names in the AACR, for
		example, assume access to reference books in the native
		language of the person being named. Most organisations
		are unlikely to have access to such resources, nor would
		staff be skilled in using those resources.

d.	Develop your own naming guidelines

	If there are no suitable naming guidelines that you can use, you
	will have to develop your own.

	If you can, simplify an existing naming scheme. At least you
	will know what flexibility and power you are removing from your
	scheme.

	If you have to develop your own naming scheme, think carefully
	about:
	*	Which of the three uses for names (section 2) are
		important to you. 
	*	What will be the cost of collecting the names in the
		format you choose.

	The conventional method in the US, UK, and Australia is to store
	the family name separately from the given names. This can cause
	problems with non Anglo Saxon names, as different data entry
	staff may enter the name in different ways, thus fragmenting
	your records.

	An alternative method is to enter the preferred name in one
	field and the sort key in a second. The preferred name is
	often easily obtainable from the person (indeed, they may be much
	happier to give it than their full official name). The sort
	key is the part of the name used as the primary sort key
	(usually the family name), again normally easily obtainable
	from the person. The preferred name is presented when the
	entry is used, and the sort key when the entry is sorted with
	other entries. If necessary, this approach can be extended to
	include a full official name.
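
A minimal Python sketch of this two-field approach follows. The class
and field names are hypothetical, as are the sample people other than
'Margaret Read'; the sketch only illustrates the idea and is not a
recommended schema.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PersonName:
    preferred: str                  # shown whenever the entry is displayed
    sort_key: str                   # used only when ordering entries
    official: Optional[str] = None  # optional extension: full official name


people = [
    PersonName("Margaret Read", "Read", official="Frances M. Read"),
    PersonName("Kathy Mills", "Mills"),
    PersonName("Jack Andrew", "Andrew"),
]

# Order by the sort key, falling back to the preferred name to break ties.
ordered = sorted(
    people, key=lambda p: (p.sort_key.casefold(), p.preferred.casefold())
)

# Present the preferred name, never the sort key.
display = [p.preferred for p in ordered]
```

Because the sort key is supplied by the person rather than derived by
an algorithm, the same two fields work whether the family name is
written first or last.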

Acknowledgements

Elizabeth Cherhal started the ball rolling with two very sensible
questions. Stu Weibel, Simon Cox, Ann Apps, Daniel Brickley, Jon Knight,
Michael Jost, Karen M. Hsu, and John A. Kunze chimed in with helpful
suggestions.

Appendix A: National Name Forms

The purpose of this appendix is not to give the definitive list of name
forms (in particular, all cultures will have names that don't conform)
but to give readers an idea of the wide range of name forms in use in
the world. Hopefully, this will encourage metadata designers to move
away from designs which assume that names can be crammed into
<first name><initial><family name>.

This summary of national name forms is based on the Australian booklet
'Naming systems of ethnic groups: A guide for Social Security staff and
community workers', Department of Social Security, 1990,
ISBN 0 644 12167 X

Many cultures use the common name form of one or more Given Names
followed by a family name. These include: Armenian, Cypriot, Estonian,
Finnish, Greek, some Indian (Hindi, Gujarati, and Bengali), Latvian,
Lithuanian, Macedonian, Maltese, Maori, Russian, Slovenian, Tongan, and
Ukrainian. Arabic is similar, but the name may include a prefix
(e.g. 'El') which is not part of the family name. (The guide did not
discuss British, American, French, German, or Dutch names.)

The second most common name form is where the family name precedes the
given names. Such names are found in the following cultures: Chinese,
Croatian, Czech, Hungarian, Italian, Khmer, Korean, Laotian, Polish,
and Serbian. Some of these cultures only have one given name. Where
two given names are present, some cultures use the first given name
as the primary name, others the last.

Many cultures use neither of these two name forms. Examples of the 
more complex name forms include:

Assyrian

<Personal Name><Father's Personal Name><Grandfather's Personal Name>

In Iran it is customary to put the village name before the family name
(Grandfather's name) on all official documents. This is *not* part of
their name.

Chinese

<Family Name><Generational Name><Given Name>

The name order must always be checked; sometimes names have been
reversed to suit English custom. Most women attach their husband's name
before their own upon marriage. Family names may be composed of two
components.

Fijian

<Honorary Title> <Given Name>+ <Family Name>

Filipino

<Baptismal Name><Given Name><Mother's Family Name><Father's Family Name>

Baptismal name is not often used. Names are often abbreviated (both
dropping components, and shortening components). Married women usually
drop their maternal family name and add their husband's paternal family
name after their own. Widows usually add 'Vedova' (abbreviated 'Vda')
before their husband's family name.

Hungarian

<Family Name> <First Given Name> <Second Given Name>

When widowed, women may add 'ozvegy' (abbreviation 'ozv') before family
name.

Indian (Sikh)

<Given Name> Singh <Family Name> (Male)

<Given Name> Kaur <Family Name> (Female)

'Singh' and 'Kaur' are religious names. Some Sikhs may include this as
part of their family name (perhaps hyphenated). 'Singh' and 'Kaur' may
be abbreviated to an initial.

Indian (South)

<Father's Given Name> <Given Name>

Father's given name may be written as an initial. The father's given
name may be replaced by (or supplemented by) birthplace, mother's house
name, or patronymic name depending on region (and may be abbreviated
as initials).

Indonesian

<Given Name>+ <Clan Name>*

There may be no clan name (i.e. the name is just a single given
name). The clan name may be their father's name, or it may be shared by
the whole community. Name components may be abbreviated as initials.

Italian

<Family Name> <Given Name>

Married women may add their husband's family name to their name:
<Maiden Family Name> <Given Name> in <Husband's Family Name>

Korean

<Family Name> <Given Name>

Every given name has two parts (syllables) written as two words which
may be hyphenated. Both parts must be used (it is not correct to
abbreviate or drop the second). Some Koreans use an English given name
for everyday use.

Laotian

<Given Name> <Family Name>

Some given names may be used for either sex, so the name may be
preceded by the titles 'Thao' or 'Chao' (Male) or 'Nang' (Female) to
indicate sex.

Malaysian

<Given Name>+ Bin <Family Name> (Male)

<Given Name>+ Binte <Family Name> (Female)

'Bin' and 'Binte' mean 'Son/Daughter of' and will not be present for a
non Muslim Malay. Married women traditionally add 'Puan' before their
given names.

Portuguese

<Given Name>+ <Mother's Family Name> <Father's Family Name>

Nearly every woman has 'Maria' as her first name, and the second is
used in everyday use. In Australia, many Portuguese have dropped one
family name and added 'Da', 'Das', 'Dos', or 'De' to the other family
name. Married women add their husband's paternal family name to the
end of their name.

Sinhalese

<Father's names> <Given name>

Children are given their father's first two names at birth. The father's
names are usually abbreviated as initials.

Spanish

<Given Names> <Father's Family Name> <Mother's Family Name>

Married women traditionally drop their maternal family name and add the
husband's paternal family name prefixed by 'De'.

Turkish

<Middle Name> <Given Name> <Family Name>

The middle name is not used on a day to day basis.

Vietnamese

<Family Name> <Sex Indicator> <2nd Given Name> <1st Given Name>

The sex indicator is normally 'Thi' for a female, and 'Van' for a male.
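
To show why a fixed <first name><initial><family name> layout breaks
down, a few of the forms above can be written as data. This Python
sketch is illustrative only: the dictionary, the role labels, the
function name, and the sample names used with it are hypothetical, and
it covers only forms with a fixed number of components.

```python
# A few of the name forms listed above, as ordered lists of component roles.
NAME_FORMS = {
    "Hungarian": ["family", "given1", "given2"],
    "Italian": ["family", "given"],
    "Turkish": ["middle", "given", "family"],
    "Vietnamese": ["family", "sex_indicator", "given2", "given1"],
}


def label_components(culture, name):
    """Pair each word of a name with its role in the given name form.

    Returns None when the number of words does not fit the template,
    which happens often in practice (dropped or added components).
    """
    form = NAME_FORMS[culture]
    words = name.split()
    if len(words) != len(form):
        return None
    return dict(zip(form, words))
```

For example, label_components("Turkish", "Ali Veli Demir") puts the
family name last, while label_components("Hungarian", "Nagy Anna
Maria") puts it first, so no single position can be assumed.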

Appendix B: EULER Project name conventions

Euler (European Libraries and Electronic Resources in Mathematical
Sciences) is a European project to provide network access to
mathematical publications (see http://www.emis.de/projects/EULER/).
The following text describing naming practices was provided by
##.

Author(s), Editors, Author References

Author names have been implemented in a form common to all
STN databases, i.e. last name, first name, middle name. First
and middle names can or cannot be abbreviated. When
searching for an author's name, it is recommended to use only
the first initial. You will get all forms of the first names, because
the system adds automatically a truncation symbol. Thus, the
following forms of implementation are possible: Examples:

  Friedrich Wilhelm Mahle    as    Mahle, F.
                                   Mahle, F. W.
                                   Mahle, Friedrich
                                   Mahle, Friedrich W.
                                   Mahle, Friedrich Wilhelm
  and as an Editor (e.g.)          Mahle, Friedrich W. (ed.)

The recommended search form is:

                                   Mahle, F

The system will automatically search for

                                   Mahle, F*
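
The effect of the automatic truncation can be sketched in Python as a
simple prefix match. This is not taken from the EULER system; the
function name is hypothetical, and the record 'Mahler, G.' is an
invented non-matching entry added for contrast.

```python
RECORDS = [
    "Mahle, F.",
    "Mahle, F. W.",
    "Mahle, Friedrich",
    "Mahle, Friedrich W.",
    "Mahle, Friedrich Wilhelm",
    "Mahler, G.",  # invented record that should not match
]


def truncated_search(records, query):
    """Return the records that begin with the query, i.e. behave like 'query*'."""
    return [r for r in records if r.startswith(query)]


matches = truncated_search(RECORDS, "Mahle, F")
```

Searching with only the first initial therefore finds every listed
form of the name, whether the first name was abbreviated or not.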

Names containing a preposition (von, van), article (le),
combination of article and preposition (du, vander),
relationships or attributes (Fitz, Mac, Jun., III) have the format
common to their country of origin. It is therefore recommended
to search for names with the supplements placed in front of
the name and after the name.

     Document                            Database Record

     Peter von der Muehl         as      Muehl, Peter von der
     Fritz von Heyden (DE)       as      Heyden, Fritz von
     Fritz Von Heyden (US)       as      Von Heyden, Fritz
     A. De L'Aigle               as      L'Aigle, A de
     C. M. Di Bari               as      Di Bari, C.M.
     Michel Del Pedro            as      Del Pedro, Michel
     L. C. MacLean               as      MacLean, L.C.
     John Fitz Gerald            as      Fitz Gerald, John
     A. Miller jun.              as      Miller, A.jun.
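
Since the record form depends on the country of origin, a searcher
usually has to try both placements of the prefix. The following Python
sketch generates the two candidates; the function name is hypothetical,
and the prefix is kept exactly as typed, so any case-insensitive
matching is left to the search system.

```python
def candidate_forms(given, prefix, family):
    """Return the two record forms a prefixed name may take.

    The first files the name under the family name, with the prefix
    after the given names; the second files it under the prefix.
    """
    prefix_after = f"{family}, {given} {prefix}"
    prefix_in_front = f"{prefix} {family}, {given}"
    return [prefix_after, prefix_in_front]
```

For instance, candidate_forms("Fritz", "von", "Heyden") gives
'Heyden, Fritz von' and 'von Heyden, Fritz', which correspond to the
(DE) and (US) rows of the table above apart from capitalisation.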

Names in Cyrillic letters have been transcribed according to the
ISO standard. In some cases this transliteration will differ from
the one on the translated document or in the western journal.
In that case we include the different form of spelling as an
author reference displayed in braces in the author field.

Appendix C: Summary of Naming Rules in AACR

The Anglo American Cataloging Rules [1] is the standard which describes
how objects in Canadian, US and UK libraries are cataloged. It is also
used in other countries (e.g. Australia). Part of these rules
describes how people's names are represented in catalog entries. This
appendix summarises those naming rules.

The general principles are that:

1.	The name used should be the one the person is commonly known by.
	For example, 'Mark Twain' not 'Samuel Clemens'. (In library
	catalogs, other names are added as cross references.)

	Accents and diacritical marks should be included, as should
	hyphens between given names if they are used by the person.

	Normally, the preferred name is obtained from the works the
	person authored, but it may be obtained from references issued
	in the person's language or country of residence or activity.

	If the name is from a non-roman script, and there exists a
	well known English language version, use the English language
	version. For example, 'Confucius' not 'K`ung-tzu'. Other
	versions are added as cross references as necessary. (This
	rule would almost certainly not be adopted in libraries whose
	language is other than English!)

2.	The names are arranged so that the components used to sort the
	name (the 'entry element') appear first. In Anglo Saxon names,
	for example, names are sorted by family name. The 'entry
	element' of 'Clive Staples Lewis' is 'Lewis' and the name would
	be represented as 'Lewis, Clive Staples'.

	An authoritative alphabetic list in the language of the person's
	country of residency or activity is used to determine the
	entry elements of a name. An authoritative alphabetic list is a
	'Who's who' (or similar), not a telephone directory. (The
	difference seems to be that one is sorted by humans, or at least
	checked by humans, while the other is sorted mechanically).

	If the entry element is a family name (surname) it is followed
	by a comma *even if the family name normally comes first*, as in
	Chinese names.

	Some special rules:
	*	For compound family names (names which contain two or
		more name components), the following rules apply (in
		order):
		
		1. The entry element is the name the person prefers to
		be listed even if this is longer than the family name
		(e.g. 'Lloyd George, David' even though his father's
		family name was 'George').

		2. If the compound names are normally hyphenated, the
		name is entered under the full compound name. 

		3. Unless the person is Portuguese OR a woman whose
		family name consists of her maiden name and husband's
		family name, enter under the first element of the
		compound surname.

		4. If the person is Portuguese, enter under the second
		element of the compound surname.

		5. If the person is a woman whose family name consists
		of her maiden name and husband's family name, enter
		under the first element of the name if the woman is
		Czech, French, Hungarian, Italian, or Spanish. Otherwise
		enter under the husband's surname.

		If the name appears to be a compound name, but cannot
		be checked, it should be treated as a compound name
		unless the language is English or one of the
		Scandinavian languages. For a Scandinavian name, add a
		cross-reference under the compound name.

	*	A place name connected to the surname by a hyphen is
		considered to be part of the surname.

	*	Relationship terms (Jnr, fils) are not considered part
		of the surname unless it is a Portuguese name. If it is
		necessary to distinguish between two identical names,
		the term is appended to the name after a comma (e.g.
		'Smith, John, Jnr').
	
	*	If the surname includes an article or preposition as
		a prefix (e.g. van, du, le) enter it under the article
		or preposition if this is the way it is sorted in the 
		person's language or country of residence or activity.

		If the surname includes a prefix which is not an
		article or preposition (e.g. 'Ap' in Welsh names or
		'Mac'), enter it under the prefix.
	
	*	If the name does not include a surname, list under
		the given name.

		If the name does not contain a surname, but does include
		a patronymic (a name derived from their father's name),
		do not consider the patronymic as a surname; list the
		name under the given name. If the patronymic comes first
		(e.g. in Mongolian names), rearrange the name so that
		the given name comes first.
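
The five rules for compound family names can be read as an ordered
series of checks. The Python sketch below follows the summary in this
appendix, not the full text of the AACR. The function and parameter
names are hypothetical, it handles only two-part compounds, it assumes
that a maiden-plus-husband compound lists the maiden name first, and
it applies the Portuguese rule before the married-name rule, which the
summary does not specify.

```python
FIRST_ELEMENT_LANGUAGES = {"Czech", "French", "Hungarian", "Italian", "Spanish"}


def compound_entry_element(first, second, language, *, preferred=None,
                           hyphenated=False, maiden_plus_husband=False):
    """Pick the entry element of a two-part compound family name."""
    if preferred is not None:        # rule 1: the person's own preference
        return preferred
    if hyphenated:                   # rule 2: keep the full compound name
        return f"{first}-{second}"
    if language == "Portuguese":     # rule 4: second element
        return second
    if maiden_plus_husband:          # rule 5: depends on the language
        return first if language in FIRST_ELEMENT_LANGUAGES else second
    return first                     # rule 3: the default case
```

For example, compound_entry_element("Lloyd", "George", "English",
preferred="Lloyd George") returns 'Lloyd George', matching the example
given under rule 1.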


References

[1] Anglo-American Cataloging Rules, Second Edition, 1988 Revision,
Amended 1993, Michael Gorman & Paul W. Winkler (eds), published by
American Library Association, Canadian Library Association, and
Library Association Publishing Ltd.
