I'd like to add my support to what Catherine has just said about the importance of the content of GL reports (I quote the extract from her posting to which I am responding below).
 
And as a further comment, I feel we need to think about GL reports as we do databases. If we use standardised terminology to describe the things that we most want to access from the reports, then data retrieval, and therefore access, is made much easier.
 
When recording data in Historic Environment Records, for example, it is second nature to use nationally agreed thesauri from the INSCRIPTION list of wordlists (http://www.fish-forum.info/i_lists.htm). However, how many of those writing GL reports would think to describe what they are recording or writing about using the same thesauri?
 
For my own Master's research into archaeological grey literature a few years ago, before the NLP project Catherine refers to, I carried out a case study to demonstrate how, through the application of CSS and XSL stylesheets, reports and their content may be displayed in different ways, and how selected data may be extracted from the report text and repurposed for input into other systems, such as Historic Environment Records and the OASIS Project database. I found not only that the absence of certain data within report content was a limiting factor, but also that the extraction would have been a lot quicker and easier if standardised terminology had been used in the report text when describing types of monument, event and artefact, as well as dates/periods etc. (see Falkingham, G, 2005, 'A Whiter Shade of Grey: A new approach to archaeological grey literature using the XML version of the TEI Guidelines', Internet Archaeology 17 - http://intarch.ac.uk/journal/issue17/5/falkingham_summary.html).
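
As a rough illustration of the kind of extraction described above, here is a minimal sketch. It uses Python rather than the XSL stylesheets of the case study, and the sample report, the choice of tags and the function name extract_index_terms are all invented for illustration; they are not taken from the published markup. The point is only that once terms are tagged and drawn from an agreed wordlist, pulling them out for another system is trivial.

```python
import xml.etree.ElementTree as ET

# Invented sample: a TEI-like fragment with no namespace (an assumption made
# to keep the sketch short). The tagged terms stand in for thesaurus entries.
SAMPLE_REPORT = """<TEI>
  <text>
    <body>
      <p>An evaluation at <placeName>Church Lane</placeName> recorded a
      <term type="monument">DITCH</term> containing
      <term type="artefact">SHERD</term> fragments of
      <date type="period">MEDIEVAL</date> date.</p>
    </body>
  </text>
</TEI>"""


def extract_index_terms(xml_text):
    """Collect tagged places, monument/artefact terms and periods."""
    root = ET.fromstring(xml_text)
    record = {"place": [], "monument": [], "artefact": [], "period": []}

    for el in root.iter("placeName"):
        record["place"].append((el.text or "").strip())

    # Only keep <term> elements whose type is one we index.
    for el in root.iter("term"):
        kind = el.get("type")
        if kind in ("monument", "artefact"):
            record[kind].append((el.text or "").strip())

    for el in root.iter("date"):
        if el.get("type") == "period":
            record["period"].append((el.text or "").strip())

    return record


record = extract_index_terms(SAMPLE_REPORT)
print(record)
```

If the report author had written "a linear cut feature" instead of a thesaurus term, there would be nothing consistent to match against, which is exactly the difficulty I ran into.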

So, let us all call a spade a spade! If the profession as a whole can see, and reap, the benefits of using the same terminology so that we all describe things in a similar way, whether in metadata, reports, HERs, or published articles, then perhaps we will be a few steps closer to achieving the enhanced access we all desire.
 
Gail

 

________________________________________
Gail Falkingham,
Historic Environment Team Leader
Countryside Service
Economic and Rural Services
Business and Environmental Services
North Yorkshire County Council
County Hall
Northallerton  DL7 8AH

Direct Dial:  01609 532839
Customer Service Centre: 08458 727374
Office Fax: 01609 532558
www.northyorks.gov.uk/archaeology
 
 

Quoting from >>> csh3, 17/08/2010 11:09 >>>

"The second area where a standard structure to reports may be prescient is in the use of technologies to create meaningful rich indices to grey literature, thereby reducing the need for manual input. We have recently undertaken an R&D project looking at the use of Natural Language Processing (NLP) to index grey literature and the results have been promising. For those of us who have a limited handle on technology this means that the computer programme trawls through digital versions of the grey lit records (in volume) and extracts indexing information in an ‘intelligent way’. The computer learns that, for example, when it comes across the phrase ‘Church Lane’ it indexes this under location information rather than monument type because the word Church is followed by Lane. Similarly, though, the position of information in the report is important for meaningful indexing using NLP; we can teach the programme to give greater importance to location information in the title rather than the body of the text, so that if in the grey lit report the author is comparing the finds from two different sites in the text, that particular report is not incorrectly indexed under the site used for comparison. So you can see that if you want to use NLP technology to index grey lit (perhaps especially important for indexing scans of grey lit) then the structure (standards) within the report is quite important. But that’s a big if."
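
To make the 'Church Lane' example in the quote concrete, here is a minimal hand-written sketch in Python. It is not the NLP project's method, which learns such patterns rather than having them spelled out, and the two short word lists and the function name classify_terms are assumptions made purely for illustration.

```python
import re

# Assumed, deliberately tiny word lists; a real system would use agreed thesauri.
MONUMENT_TERMS = ("Church", "Mill", "Castle")
STREET_SUFFIXES = ("Lane", "Street", "Road")

# A monument word, optionally followed by a street suffix.
PATTERN = re.compile(
    r"\b(" + "|".join(MONUMENT_TERMS) + r")\b"
    r"(?:\s+(" + "|".join(STREET_SUFFIXES) + r")\b)?",
    re.IGNORECASE,
)


def classify_terms(text):
    """Return (matched phrase, index category) pairs found in the text."""
    results = []
    for match in PATTERN.finditer(text):
        if match.group(2):
            # Monument word followed by a street suffix: treat as a location.
            results.append((match.group(0), "location"))
        else:
            results.append((match.group(1), "monument type"))
    return results


found = classify_terms("The site lies off Church Lane, north of the church.")
print(found)
```

Here the first match is indexed as a location and the second as a monument type, which is the distinction the quote describes.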

 

