Hi Jens,

I need to add one more piece of functionality to the script, and then I am
ready. That will happen tomorrow. Then I will keep on going ....

Lydia


On Wed, 23 Dec 2015, Jensen, Jens (STFC,RAL,SC) wrote:

> Hi Lydia,
>
> Great stuff. So you will be moving data over the Christmas break? This
> will be good...
>
> ... we also need to get Jon started; whether he wants to run your
> script, too, or do something else. And we need to clear out the old
> data, but I'd ask Brian to look into that once he's back in the new year.
>
> Merry and Happy to you too.
>
> Cheers
> -j
>
>
>
> On 23/12/2015 15:01, Lydia Heck wrote:
>> Hi Jens and all,
>>
>> I have a script now that tars up directories per DiRAC project into
>> tar files of a specific size. Once one tar file is complete it is
>> archived to RAL. Once the archive is complete the tar file is deleted
>> and the next set of files is archived.
>>
>> The chunk size at present is 256 GByte, or slightly bigger depending
>> on the size of the files.
>>
>> The transfer of such a file takes ~15 minutes.
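
For scale, the two figures quoted above imply a sustained transfer rate that can be worked out directly. The sketch below is only illustrative arithmetic: it assumes decimal units (1 GByte = 10^9 bytes) and treats "~15 minutes" as exactly 900 seconds, neither of which is stated in the thread. The function name is hypothetical.

```python
def transfer_rate_mb_per_s(chunk_bytes: float, seconds: float) -> float:
    """Average rate in MB/s (decimal megabytes) for one chunk transfer."""
    return chunk_bytes / seconds / 1e6


# Assumed inputs: one 256 GByte chunk, transferred in 15 minutes.
chunk_bytes = 256e9
seconds = 15 * 60

rate = transfer_rate_mb_per_s(chunk_bytes, seconds)
print(f"{rate:.0f} MB/s, about {rate * 8 / 1000:.1f} Gbit/s")
```

Under those assumptions the rate comes out at roughly 284 MB/s; a chunk that is "slightly bigger" or a transfer that takes a little longer shifts the figure accordingly.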
>>
>> The script needs some polishing and once I am totally happy I can run
>> it non-interactively. I currently still have an interactive element in
>> the script as the last debugging and other idea stages are not fully
>> completed.
>>
>> Merry Christmas and a Happy New Year.
>>
>> Lydia
>>
>>
>>
>>
>>  On Tue, 22 Dec 2015, Jensen, Jens (STFC,RAL,SC) wrote:
>>
>>> Hooray for downgrading!
>>>
>>> On 22/12/2015 13:49, Lydia Heck wrote:
>>>>
>>>> Done it. I have down-graded to 3.3.3-x and now the lot works.
>>>>
>>>> Lydia
>>>>
>>>>
>>>> On Tue, 22 Dec 2015, Jensen, Jens (STFC,RAL,SC) wrote:
>>>>
>>>>> Weird! Which version are you using?
>>>>>
>>>>> We seem to have fts-rest-3.3.3-2 and fts-rest-cli-3.3.3 and
>>>>> fts-rest-cloud-storage-3.3.3 and python-fts-3.3.3 but every other fts
>>>>> package on the server is 3.3.2. (There is both a python-fts and an
>>>>> fts-python - weird).
>>>>>
>>>>> Cheers
>>>>> --jens
>>>>>
>>>>> On 22/12/2015 12:22, Lydia Heck wrote:
>>>>>> Hi Jens,
>>>>>>
>>>>>> I have a script that I could test. However I now have an issue that
>>>>>> the fts-transfer command does not work anymore, with the error message
>>>>>>
>>>>>> fts client is connecting using the gSOAP interface. Consider changing
>>>>>>           your configured fts endpoint port to select the REST interface
>>>>>>
>>>>>> I am currently rebooting the system, but have you seen something
>>>>>> similar once before?
>>>>>>
>>>>>> Best wishes,
>>>>>> Lydia
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Fri, 18 Dec 2015, Jens Jensen wrote:
>>>>>>
>>>>>>> Right, so the find suggestion at least would do a depth first listing
>>>>>>> of files-to-add, and tar I am guessing would also add files depth first,
>>>>>>> which I think meets your requirement, or close enough, of putting
>>>>>>> related files into the same chunk.
>>>>>>>
>>>>>>> Using find-and-then-tar you could avoid building the following archive
>>>>>>> until the current one has been sent off to RAL. You'd just need space
>>>>>>> for the filelist.
>>>>>>>
>>>>>>> What I am thinking is:
>>>>>>> 1. find <folder to be backed up> -newer <timestamp file> |
>>>>>>>    <list size and full filename> >filelist
>>>>>>> 2. Walk through filelist one line at a time adding up sizes and
>>>>>>>    filenames till a certain threshold size has been exceeded (say 20GB
>>>>>>>    or 100,000 files, whichever comes first) or adding the next file
>>>>>>>    would take us above a higher threshold (say 50GB)
>>>>>>> 3. Once a list has been found, tar it up, compress it, optionally store
>>>>>>>    the contents (list) somewhere, send the tarball to RAL, and then
>>>>>>>    delete it.
>>>>>>> 4. Go back to step 2 until the filelist has been completed.
>>>>>>> 5. Touch the timestamp file
>>>>>>> 6. sleep 24 hours (or whatever) and go to step 1.
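
The grouping logic in steps 2 to 4 above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the script discussed in this thread: the function name `chunk_filelist`, its parameter names, and the (size, filename) input format are all hypothetical, and the defaults simply reuse the example thresholds from the list (20GB, 100,000 files, 50GB). Tarring, compressing and sending each chunk (step 3) is left out.

```python
from typing import Iterable, Iterator, List, Tuple

GB = 10**9  # assumption: decimal gigabytes


def chunk_filelist(
    entries: Iterable[Tuple[int, str]],
    soft_bytes: int = 20 * GB,
    max_files: int = 100_000,
    hard_bytes: int = 50 * GB,
) -> Iterator[List[str]]:
    """Group (size, filename) pairs into chunks, preserving input order.

    A chunk is closed once its total size exceeds soft_bytes or it holds
    max_files files. It is also closed early if adding the next file would
    push it above hard_bytes. A single file larger than hard_bytes ends up
    in a chunk of its own.
    """
    chunk: List[str] = []
    total = 0
    for size, name in entries:
        # Close the current chunk first if this file would break the hard cap.
        if chunk and total + size > hard_bytes:
            yield chunk
            chunk, total = [], 0
        chunk.append(name)
        total += size
        # Close the chunk once either soft threshold has been reached.
        if total > soft_bytes or len(chunk) >= max_files:
            yield chunk
            chunk, total = [], 0
    if chunk:
        yield chunk  # remaining files form the final, smaller chunk


# Small demonstration with toy thresholds (sizes in bytes).
demo = [(6, "a"), (6, "b"), (30, "c"), (1, "d"), (60, "e"), (2, "f")]
for files in chunk_filelist(demo, soft_bytes=10, max_files=3, hard_bytes=25):
    print(files)
```

Because the function is a generator, each chunk can be tarred and sent before the next one is assembled, so only the filelist needs to be held in addition to the archive currently being built.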
>>>>>>>
>>>>>>> This would meet all our requirements and would be stupidly easy
>>>>>>> to do.
>>>>>>>
>>>>>>> Cheers
>>>>>>> --jens
>>>>>>>
>>>>>>>
>>>>>>> On 17/12/2015 12:43, Lydia Heck wrote:
>>>>>>>>
>>>>>>>> Hi Jens,
>>>>>>>>
>>>>>>>> it took longer than I thought to tidy up the results from the meeting
>>>>>>>> last week (I spent a full day on a spreadsheet :-) )
>>>>>>>>
>>>>>>>> However I am now going to look at the transfers again.
>>>>>>>>
>>>>>>>> I looked over the presentation you shared with us. And yes, that is
>>>>>>>> the way it should go. There are some provisos:
>>>>>>>>
>>>>>>>> If I create 3 TB chunks, I need to have space for several of them:
>>>>>>>>
>>>>>>>> One being transferred, one in waiting and one being prepared. This
>>>>>>>> will add 10 TB to the storage that is not available for the users;
>>>>>>>> can be done, but needs to be factored in.
>>>>>>>>
>>>>>>>>
>>>>>>>> If there is indeed a failure, then I need to identify where the data
>>>>>>>> are that have been deleted, corrupted or whatever. If I "just" chunk
>>>>>>>> the whole filesystem, that would be difficult, if not impossible, to
>>>>>>>> find. So I would need to arrange transfers by project, and even then
>>>>>>>> the retrieval might physically not be possible, depending on how many
>>>>>>>> of the chunks I would have to retrieve.
>>>>>>>>
>>>>>>>> I believe that currently the biggest top folder is ~500 TB.
>>>>>>>>
>>>>>>>> There would not be lots of jobs running, simply because there is not
>>>>>>>> enough space to chunk that much.
>>>>>>>>
>>>>>>>> On the storage that I would like to archive there are more than 64M
>>>>>>>> files.
>>>>>>>>
>>>>>>>> So would a "flat" chunking tar of all the filesystem be a "good"
>>>>>>>> idea? I am not sure.
>>>>>>>>
>>>>>>>> I need to think about this a bit more.
>>>>>>>>
>>>>>>>> Lydia
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>  On Thu, 10 Dec 2015, Jensen, Jens (STFC,RAL,SC) wrote:
>>>>>>>>
>>>>>>>>> Hi Lydia,
>>>>>>>>>
>>>>>>>>> That's great. I am actually on leave tomorrow (travelling) and out
>>>>>>>>> Monday (at Royal Holloway) but the others on the list can follow up.
>>>>>>>>>
>>>>>>>>> Thanks
>>>>>>>>> --jens
>>>>>>>>>
>>>>>>>>> On 10/12/2015 10:21, Lydia Heck wrote:
>>>>>>>>>>
>>>>>>>>>> Dear all,
>>>>>>>>>>
>>>>>>>>>> sorry for my silence. I had a meeting in London on Tuesday and
>>>>>>>>>> attended CIUK yesterday. Just back and I have to tidy up some
>>>>>>>>>> spreadsheets from Tuesday's meeting and I will be busy today as well
>>>>>>>>>> with local tasks. So I should get back to this tomorrow.
>>>>>>>>>>
>>>>>>>>>> Best wishes,
>>>>>>>>>> Lydia
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>  On Wed, 9 Dec 2015, Jensen, Jens (STFC,RAL,SC) wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi,
>>>>>>>>>>>
>>>>>>>>>>> Here is the proposal for the third option. It would also be worth
>>>>>>>>>>> looking into. It is written in Python AFAIK.
>>>>>>>>>>>
>>>>>>>>>>> Overall we are trying to deploy something that meets the requirements
>>>>>>>>>>> and saves us time in the long run.
>>>>>>>>>>>
>>>>>>>>>>> Thanks
>>>>>>>>>>> --jens
>>>>>>>>>>>
>>>>>>>>>>>