Hi all,
Looking at it from a different angle, surely this is a similar situation
to libraries (or any published material), i.e. how do users find
information pertinent to what they are interested in? If we consider
libraries as an example, the standards exist for the cataloguing and not
the format and presentation. If I am searching for information on 'x'
then I'm not really interested in whether it's a CD, book, film or image.
Similarly, we do not expect the whole of the library's resources to be in a
form that allows textual searching; we have to trust the indexing and
cataloguing to have done that. So maybe in the first instance the
standards should be about the minimum of cataloguing / indexing that
should be associated with any grey literature produced by outside
contractors. This could then be captured within the HER and searched as
part of any process.
cheers
Simon
>>> Crispin Flower <[log in to unmask]> 18/08/2010 23:06 >>>
Hi Nick
Units do use complex software tools for collecting and manipulating
digital data, and usually have significant skills in this area. However
what they lack is any standardised means of preparing the reports and
gazetteers and piping the data back to the HERs and digital archives. So
much of this data gets converted back into relatively useless document
formats or paper... the grey lit. Equally there is no widely adopted
standard way of taking in digital data from the HERs at project start,
or tools for viewing/enhancing/analysing this data during the project.
Each unit does its own thing here, and sometimes each project manager
within each unit does their own thing. And they work with the
capabilities (for providing/receiving digital data) of the HERs in their
patch, which are also pretty variable (even among those who use HBSMR).
This situation prevails from the smallest watching brief up to massive
regional EH-funded projects worth £LOTS - I've rarely seen this dealt
with efficiently, and many current large projects still seem to get
stuck on this when it comes to transferring the results, undermining
their stated purpose.
So perhaps rather than just saying this is a long way off, one of the
outcomes of this discussion might be to agree that this situation can
and should be improved for the benefit of all sides. We are discussing
standards for grey lit, but (I feel) only because we see these as the
way of getting at the real data, which in turn is because we cannot get at
that real data before it gets put into a document format; so we could
expand the debate to cover the entire means of transmission of data from
fieldwork to HER/archive.
I'm not advocating a single imposed solution for all units (whether a
desktop software package or a web site like Oasis), but a clear
definition of expectations/standards of what data should go where and in
what structure (from event spatial data through to summary synthesis
texts), in order to achieve the intended outcomes of the fieldwork
(which often involves destruction of the primary resource). Then useful
software tools will be required to achieve these standards (on all
sides), and such tools could also make life easier and more productive
for the unit staff in other ways, but that's another matter (and I'm not
saying this would be easy to achieve by the way).
As an aside, although it is an interesting topic, I'm finding
computerised document language processing a bit of a red herring here.
These datasets and documents all go through intensive expert human
mediation and indexing, and the backlogs are really not that enormous.
Do we need a computer to learn that "Church Lane" means a street and not
a church, when HER staff already know that instantly, plus where to find
it? And pushing an NLP approach might even undermine the role of us
humans (oops now at serious risk of being labelled a luddite).
Clarify... if a digital fieldwork dataset has an element for the event
location, and that contains "Church Lane" - fine, let the computer
import that into the right place, geocode it, and make it searchable* as
locational info (once it has been validated by a human), but do we
really need the computer to try to figure this out from raw waffle and
get it wrong half the time? Isn't that only needed when you have a true
mountain of digital documents with no humans to process them?
Over-stretched HER officers might say "yes please" at this stage, I'm
not saying don't give it a whirl, but I suspect this phase of having
digital documents being the main transport mechanism for fieldwork data
is going to be mercifully brief.
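
As a rough illustration of the "Church Lane" point above, here is a minimal
Python sketch. It assumes a hypothetical record layout with an
`event_location` element and a made-up gazetteer lookup; none of these names
or coordinates come from HBSMR, OASIS or MIDAS. The location is read from its
own element, geocoded, and only becomes searchable once a human has validated
it.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

# Hypothetical gazetteer: street name -> (easting, northing).
# The coordinates are invented purely for illustration.
GAZETTEER = {"church lane": (412345.0, 298765.0)}


@dataclass
class EventLocation:
    name: str
    grid_ref: Optional[Tuple[float, float]] = None
    validated: bool = False  # set to True only by a human reviewer


def import_event(record: dict) -> EventLocation:
    """Read the location from its own named element; no free-text parsing."""
    name = record["event_location"].strip()
    return EventLocation(name=name, grid_ref=GAZETTEER.get(name.lower()))


def validate(location: EventLocation, approved_by: str) -> EventLocation:
    """Mark a location as checked; a reviewer name is required."""
    if not approved_by:
        raise ValueError("a human reviewer must approve the location")
    location.validated = True
    return location


def searchable(locations: list) -> list:
    """Only human-validated locations are exposed to searches."""
    return [loc for loc in locations if loc.validated]
```

For example, `import_event({"event_location": "Church Lane"})` gives a
geocoded but unvalidated location, and it only appears in `searchable(...)`
after `validate(...)` has been called on it.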
Anyway that's now more than enough from me,
yours
Crispin
* on semantic web, domestic web, desktop system etc.
________________________________
From: The Forum for Information Standards in Heritage (FISH) on behalf
of Nick Boldrini
Sent: Wed 18/08/2010 13:30
To: [log in to unmask]
Subject: Re: [FISH] HEGEL - access and standards
I think Crispin's comments are interesting, but this scenario is a way
off. Not all units (especially the one man bands) have fancy data
management tools, which undermines the vision of a brave new world
somewhat. This is likely to become increasingly useful, though.
The problem, though, is that the more diffused the data sources are, the
harder it is going to be to ensure standardisation.
The idea of automating the indexing, with, VERY importantly, a human
interface is a good one. However, leaving this to contractors may be
problematic - experience from OASIS on how well this works would be very
important to draw on, e.g. how well contractors use thesauri etc. and how
comprehensively they index.
I also think Martin Locock hits the nail on the head - what are HERs'
and curators' needs and how would a standard meet those needs?
Bearing in mind that PPS5 good practice is meant to apply to all
archaeological endeavours (as I understand it) and not just DC work, and
that PPS5 refers repeatedly to HERs (and notably not NMRs, ADS, OASIS
etc etc) then it would seem to me fairly clear where the focus of
finding out user needs should lie. And yes I am an HER officer so that
helps guarantee my post, but it is also Government Policy.
-----Original Message-----
From: The Forum for Information Standards in Heritage (FISH)
[mailto:[log in to unmask]] On Behalf Of Leif Isaksen
Sent: 18 August 2010 12:18
To: [log in to unmask]
Subject: Re: [FISH] HEGEL - access and standards
Hi all
I'm generally very much in agreement with Crispin's point here. I
suspect that the limitations of print-bound literature (space,
linearity, etc.) will see it ultimately replaced by more flexible
digital formats. General social trends suggest that it's likely to be
a question of how long this process will take rather than whether it
will happen. (For the horrified, no doubt a vestigial printed volume
will be XSLT'd, printed and filed as well ;-) ). I'm not suggesting that
there will be no editorial process but its concerns may differ
significantly from those today.
At risk of moving to the technical however (apologies Ed, I know you
want to hold that discussion tomorrow) I'd recommend strongly that the
emphasis should be on making the grey-lit-slash-data directly
available, ideally as XML (albeit with server access restrictions
where appropriate). There will be a vital need for tools and
mechanisms which can index, parse, search, browse, visualize and
analyze what will inevitably become a digital mountain, but we should
try to avoid walled gardens that require specialist technical
knowledge or software to use them. Grey literature is valuable
precisely because anyone can engage with and understand it without
additional apparatus. This would also have the additional benefit of
making persistent HTTP identifiers (URIs) easier to introduce which
are more or less fundamental to making any of the ontology/semantic
approaches mentioned machine-readable and thus viable on a large
scale.
Best
Leif
On Wed, Aug 18, 2010 at 11:35 AM, Crispin Flower <[log in to unmask]>
wrote:
> Hi all
> I agree with Martin's comments and similar from other writers,
and will forward some remarks I posted to Ed off-list yesterday, but
with apologies that I've only had time to read a small proportion of the
messages, so may be behind the curve.
> I'd ask whether the unpublished/able grey lit report is a useful
thing at all here. Is it the correct target for this debate, or just a
by-product of the process? Of course the report is necessary at the
point in time of assessing and signing off a project, and it fulfils an
essential purpose for producer/clients at that time, but for the medium
and longer term, as the means of communicating the results of a project
from those who undertook it (the contractor), to those who need the data
both within and beyond the immediate casework scenario, it is rather
inefficient. Perhaps instead of trying to promote the importance of
this stuff with beefed-up technical standards etc, we could acknowledge
how ephemeral it is, and find better ways of moving the real data
around; we could aim for a scenario in which the grey lit thing can be
dropped in the bin without loss, or perhaps retained only as part of the
planning or project management history, because all the significant data
it contained has been transmitted to the HER/NMR (or other accessible
repository) in a more efficient manner (by digital transfer with human
quality control and enhancement of indexing). We have in the UK a very
strong network of organisations and professional staff positioned to do
this essential human part, and this would work even better if the
spadework could happen automatically, rather than them wasting time
retyping stuff and piling up backlog. Then we achieve truly accessible
data, without having to worry about the medium. And to see this from
another angle, the grey lit report can be generated almost automatically
from the tools the contractor is using to manage their data, as a glossy
by-product that brings out the essentials for the primary consumers
(e.g. planning archaeologists, EH project managers, etc).
> I do agree there must be standards governing what should be the
output from fieldwork, and IfA is a good place for this particularly if
it can truly encompass built heritage recording. But for making the
primary data available where it's needed, I think it may be more useful
to improve direct data transfer mechanisms between HERs and fieldworkers
(in both directions). Incidentally, I don't know if anyone has mentioned
the Scottish "ASPIRE" project, which aims to do precisely this. I'm not
sure new standards are needed here, just new tools (after all the data
content is all covered by MIDAS isn't it?).
> And then keep up the progress on getting all HERs online and
cross-searchable (which has come on in leaps and bounds recently).
> Yours
> Crispin
>
>
>
>
> -----Original Message-----
> From: The Forum for Information Standards in Heritage (FISH)
[mailto:[log in to unmask]] On Behalf Of Martin Locock
> Sent: 18 August 2010 10:00
> To: [log in to unmask]
> Subject: Re: [FISH] HEGEL - access and standards
>
> There has been some overlap in the discussions between metadata about
> grey literature (for cross-searching etc) and data: the bulk of GL
> contents is data, not metadata.
>
> For metadata we can fairly freely identify elements that might promote
> searchability and re-use, but for data, we must accept that the prime
> determinant of a project report contents will be the *project's* purpose
> not the *report's.*
>
> One concern I would have from the GLADE user comments is that they
> assume that searching a corpus of grey literature is the best way to
> find out about archaeological data. We should, I hope, recognise that
> this is a workaround arising from the ease with which GL can be added to
> OASIS. In the long term, the best way to find archaeological data
> should be by examining the structured, consistent and validated data
> sets comprising the HERs, online or not. If there is currently a
> problem that needs fixing, I would say the problem is that HERs have
> backlogs of published and unpublished sources which have not been
> analysed and added to the record, of which GL is only a subset, if the
> most visible. Therefore we should be looking to HERs to tell us what
> *they* find most troublesome about current GL reports.
>
>
> Martin
>
>
>
>
> --
> Martin Locock
> Rheolwr Cymorth y Project Project Support Manager
>
> Llyfrgell Genedlaethol Cymru National Library of Wales
> [log in to unmask] Ffôn / Phone 01970 632885
>
> Un o lyfrgelloedd mawr y byd One of the great libraries of the world
> http://www.llgc.org.uk/
>