One other thought on this: A couple years ago the Linguistic Society of
America's Ethics Committee came up with an ethics statement, and a chunk
of the discussion centered around issues related to this (including the
IRB stuff discussed elsewhere in this thread).
The statement itself is available as a PDF at
The discussions on the draft of the statement are available at
On 13/Jun/11 8:40 AM, Damien Hall wrote:
> I admit to not having read this through thoroughly; but it seems to be
> an online database of linguistic data and analysis tools to which
> linguists are invited to submit their data. It would then be available
> to be shared by the scholarly community under a Creative Commons licence.
> I feel that initiatives like this are an excellent idea in principle, to
> be supported wholeheartedly. However, in practice it seems to me that
> there will have to be a big change in linguists’ usual practices before
> we can make much data available under such schemes, because, in
> sociolinguistics at least, the default position in the past has been
> (hasn’t it?) that data recorded from people will be kept confidential
> and only listened to by the researcher and people working with them.
> Thousands, if not millions, of permissions to record must have been
> signed with those provisions. This means, for example, that the vast
> majority of the data I have recorded myself cannot be shared under these
> What do people here think about this kind of commonly-accessible
> database? They’re becoming increasingly popular; should we adapt our
> practices? I think that in principle we should, but only on condition,
> of course, that informants are happy with it. This might mean offering
> them the option to be in the publicly-accessible database or not to be
> there. In other words, the principle is good, and ideally to be
> followed, but the practice might well mean that can’t be done.
> This message is forwarded from the Speech Prosody list simply because
> that’s where the person who sent it to me got it from. I don’t think
> that means that they want only prosodic tools and data: the name and
> first paragraph refer to ‘oral and linguistic’ data generally.
> Let’s have a debate! In the next week I’m not going to be at my
> computer that much to contribute, but I think it’s an interesting and
> currently-relevant problem for variationists.
> Damien Hall
> University of Kent (UK)
> Leverhulme Early Career Fellow, 'Towards a New Linguistic Atlas of France'
> English Language and Linguistics, School of European Culture and Languages
> ---------- Forwarded message ----------
> From: Bernard Bel <[log in to unmask]>
> Date: Fri, Jun 10, 2011 at 4:55 AM
> Subject: [speech_prosody] Submit prosodic data/tools for
> To: speech-prosody <[log in to unmask]>
> Dear colleagues,
> I am pleased to announce that CRDO-Aix, a centre for the long-term
> preservation and sharing of oral/linguistic resources, is now fully
> operational. We invite laboratories and individual scholars to share
> their work on a non-commercial basis with the scientific
> community at large. CRDO is dedicated to resources worth promoting in
> terms of scientific relevance and/or cultural heritage.
> (Please skip the following report and jump to 'INVITATION' if you feel
> ready to join this venture!)
> REPORT AND STRIKING FEATURES
> Currently, CRDO (Centre de Ressources pour la Description de l'Oral, i.e.
> Resource Centre for the Description of Spoken Language) operates through two
> 'submission sites' in an OAIS framework (Open Archival Information
> System) initiated by TGE-Adonis (a CNRS/INSHS unit in France).
> Submission site CRDO-Paris is operated by LACITO
> (http://lacito.vjf.cnrs.fr) whereas submission site CRDO-Aix is operated
> by LPL (http://lpl-aix.fr) under CNRS and two regional Universities.
> In January 2011 we published a report on the pilot project coordinated
> by TGE-Adonis:
> We recommend reading a slideshow exposing features and implementation of
> Our report and proposal deal only with CRDO-Aix (http://crdo.fr).
> Although we were eager to merge CRDO-Aix and CRDO-Paris before putting
> long-term preservation into production, LACITO ignored our
> technical recommendations (summarized in
> <http://crdo.fr/wiki/Developpement/PointSurPassageEnProduction>) and
> initiated this irreversible process without prior notice on 22 June
> 2010. We had no other option than initiating our own on 16 July 2010!
> The OAIS framework implies two major computing centres:
> 1) An institutional archive hosted by CINES (Centre informatique
> national de l'Enseignement supérieur, Montpellier, France,
> http://www.cines.fr) under an agreement with CNRS and the French
> National Archive (SIAF);
> 2) A distribution site hosted by CC-IN2P3 (Centre de calcul de
> l'Institut national de physique nucléaire et de physique des particules,
> Lyon, France, http://cc.in2p3.fr).
> CRDO-Aix's multi-tier architecture would make it possible to deal with
> several archive and distribution sites in addition to the existing ones.
> In addition, any site can get its queries redirected by CRDO-Aix to a
> distribution site and process/display results via the activated
> datastreams. CRDO-Aix can in fact remain 'invisible' in this process.
> In the current framework, CRDO-Aix is a unique submission site handling
> generic 'items' of unlimited size. The nature of a linguistic item
> (corpus/resource/tool/collection...) is solely determined by its
> descriptive metadata.
> The whole structure of folders and files is encoded to meet the
> technical requirements of the archive and distribution sites. Original
> file names are restored at the time they are downloaded by users.
> 'Exotic' file names in non-European languages are supported. Large items
> (beyond 50 Gb and/or 60,000 files) are automatically split into several
> segments, transparently to the user.
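(An aside on the chunking described just above: the announcement does not say how CRDO-Aix actually splits large items, so the following is only a minimal sketch of one plausible approach, greedy packing in file order. The names `FileEntry` and `chunk_item` are hypothetical, and reading "50 Gb" as 50 * 10**9 bytes is an assumption.)

```python
from dataclasses import dataclass
from typing import List

# Limits quoted in the announcement. Interpreting "50 Gb" as 50 * 10**9 bytes
# is an assumption, not something the announcement states.
MAX_BYTES = 50 * 10**9
MAX_FILES = 60_000


@dataclass
class FileEntry:
    """Hypothetical record for one file in an item."""
    name: str
    size: int  # size in bytes


def chunk_item(files: List[FileEntry],
               max_bytes: int = MAX_BYTES,
               max_files: int = MAX_FILES) -> List[List[FileEntry]]:
    """Greedily pack files, in their given order, into segments that respect
    both the byte limit and the file-count limit."""
    segments: List[List[FileEntry]] = []
    current: List[FileEntry] = []
    current_bytes = 0
    for f in files:
        too_big = current_bytes + f.size > max_bytes
        too_many = len(current) + 1 > max_files
        # Close the current segment if adding this file would break a limit.
        if current and (too_big or too_many):
            segments.append(current)
            current, current_bytes = [], 0
        current.append(f)
        current_bytes += f.size
    if current:
        segments.append(current)
    return segments
```

In this sketch a single file larger than the byte limit simply ends up alone in its own segment; a real archive would need a policy for that case.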
> Descriptive metadata are designed to cover all categories defined by
> OLAC, including information about detailed tables of contents and
> version history. Metadata can be entered in the four navigation
> languages of CRDO-Aix (English/Chinese/Spanish/French) plus an optional
> language of the producer's choice. Confidential metadata can be kept
> safe until a date stated by the owner of the resource (or by Law).
> We are now planning to implement access to ISOcat and extend our
> metadata model to the CMDI approach:
> At this stage it is important for us to demonstrate the versatility of
> this device by accommodating a great diversity of 'real-size' projects.
> We welcome speech corpora (including audio, video and all sorts of
> primary documents collected in the field or lab experiments) and their
> annotations, including prosodic analyses, lexica, frequency tables,
> grammars... We also preserve/distribute research tools, preferably the
> ones shared in open access under a Creative Commons license.
> CRDO-Aix is designed for on-going projects. Several versions of an item
> can be sent to the archive and distribution sites. Further, descriptive
> metadata and access rights (on a whole item or individual file) can be
> modified via simple metadata updates (no versioning required). Any file
> can be declared open-access irrespective of access restrictions on the
> item it belongs to, and the URL pointing at this file on the
> distribution site can be made persistent (independent of versioning and
> location of the site).
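(To make the per-file access rule above concrete: a minimal sketch, assuming access rights are stored as an item-level default plus optional per-file overrides. The `Item` class, its fields, and the "open"/"restricted" labels are hypothetical illustrations, not CRDO-Aix's actual metadata model.)

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Item:
    """Hypothetical item with an item-level access right and per-file overrides."""
    access: str  # item-level default, e.g. "restricted" or "open"
    file_access: Dict[str, str] = field(default_factory=dict)

    def set_file_access(self, filename: str, access: str) -> None:
        # A plain metadata update: no new version of the item is created.
        self.file_access[filename] = access

    def effective_access(self, filename: str) -> str:
        # A file-level declaration wins over the item-level restriction.
        return self.file_access.get(filename, self.access)


item = Item(access="restricted")
item.set_file_access("sample.wav", "open")
```

Here `item.effective_access("sample.wav")` gives "open" while every other file in the item stays "restricted", which is the behaviour the announcement describes.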
> Once an item has become 'stable' (new versions becoming unlikely) it can
> be deleted from CRDO-Aix as we implemented a safe procedure for
> retrieving the entire set of files from any version on the distribution
> site. Consequently we have almost no space limitation for accepting data
> up to the (extendable) approx. 100 Tbytes managed by the distribution site.
> Please get in touch with [log in to unmask] <mailto:webmaster%40crdo.fr>
> if you are keen to preserve and share linguistic data of some relevance
> to the research community. We would be delighted to face the challenge
> of adapting our model to project specifications that did not come up
> during its 2-year test phase.
> To start with, sign up on CRDO-Aix <http://crdo.fr>, read CRDO
> guidelines and create a record describing your data (primary data,
> resource, tool). Our team will assist you in completing descriptions,
> deciding on access rights and uploading the files.
> With best regards
> Bernard Bel <[log in to unmask] <mailto:bernard.bel%40lpl-aix.fr>>
> Tél. +33 (0)4 42 95 36 39
> Laboratoire Parole et Langage
> UMR 6057 CNRS - Université de Provence
> 5, avenue Pasteur
> BP 80975
> 13604 Aix-en-Provence Cedex 1 (France)
> Founder sec'y, Special Interest Group on Speech Prosody (SProSIG)
> Centre de Ressources pour la Description de l'Oral (CRDO-Aix)
> Bol Processor project
> Co-Ed., Communication Processes (3 volumes), Sage Publications
The Variationist List - discussion of everything related to variationist sociolinguistics.
To send messages to the VAR-L list (subscribers only), write to:
[log in to unmask]
To unsubscribe from the VAR-L list, click the following link: