On 21/08/13 19:48, Christopher J. Walker wrote:
> 2) On file transfer, actual checksum at destination should match (b) at
> source and also (c) - providing it exists, and (d) if provided by the
> experiment. This should be turned on by default in the FTS. If there is
> a failure, the FTS should flag this (which it presumably does) and
> something sensible should happen.
>
> The something sensible is probably (but I'm open to suggestions here):
> i) Gather stats to see how frequent this is.
> ii) Try the transfer again, and try from other copies in the LFC????
> (thoughts from others welcome here).
> iii) Check integrity of source file (compare (a) and (b) and (c)).
> iv) Tell someone - somehow.
Writing this has made me think Jens had a point about checksums with the
FTS and using Hadoop to mine them.
If you have this information on transfers that fail due to a checksum
problem, you can probe the extent of data corruption. A one-off
corruption may be down to the network transfer, but repeated corruption
of the same file suggests a problem with the source file.
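As a rough sketch of that distinction (not anything the FTS does today):
count checksum failures per source file and flag the files that fail
more than once. The input format, the function name and the threshold
of 2 are all assumptions made for illustration.

```python
from collections import Counter


def classify_sources(failed_sources, threshold=2):
    """Label each source file by how often its transfers failed a checksum.

    failed_sources: iterable of source file identifiers, one entry per
    transfer that failed with a checksum mismatch (hypothetical input,
    not the real FTS log format).
    threshold: number of failures at which we suspect the source file
    itself rather than the network (an arbitrary assumption).
    """
    counts = Counter(failed_sources)
    return {
        src: "suspect source file" if n >= threshold else "possible one-off"
        for src, n in counts.items()
    }
```

Anything labelled "suspect source file" would be a candidate for the
integrity check in (iii) above.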
Mining this information would probably tell us something quite
interesting about the prevalence of corrupt files on the storage:
where they are, and whether they come from a particular site (or even
a particular gridftp server). Hadoop would probably be a good way of
extracting that information from the logs.
Data should probably be normalised against successful transfers.
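To illustrate the normalisation, here is a minimal sketch in plain
Python rather than Hadoop. It assumes each log entry has already been
reduced to a (server, outcome) pair; that record shape and the outcome
strings are hypothetical, not the actual FTS log format.

```python
from collections import defaultdict


def checksum_failure_rates(records):
    """Return the fraction of transfers per server that failed a checksum.

    records: iterable of (server, outcome) pairs, where outcome is
    "ok" or "checksum_mismatch" (assumed labels for illustration).
    """
    totals = defaultdict(int)
    failures = defaultdict(int)
    for server, outcome in records:
        totals[server] += 1
        if outcome == "checksum_mismatch":
            failures[server] += 1
    # Dividing by all transfers stops a busy server looking worse than
    # a quiet one purely because it moves more files.
    return {server: failures[server] / totals[server] for server in totals}
```

The same grouping could be done per site or per source file; in a
map-reduce job the key would be the server and the reducer would do the
division.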
Chris