Hi Jens,

I need to add one more piece of functionality to the script, then I am
ready. That will happen tomorrow. Then I will keep on going ....

Lydia

On Wed, 23 Dec 2015, Jensen, Jens (STFC,RAL,SC) wrote:

> Hi Lydia,
>
> Great stuff. So you will be moving data over the Christmas break? This
> will be good...
>
> ... we also need to get Jon started; whether he wants to run your
> script, too, or do something else. And we need to clear out the old
> data, but I'd ask Brian to look into that once he's back in the new year.
>
> Merry and Happy to you too.
>
> Cheers
> -j
>
> On 23/12/2015 15:01, Lydia Heck wrote:
>> Hi Jens and all,
>>
>> I have a script now that tars up directories per DiRAC project into
>> tar files of a specific size. Once one tar file is complete it is
>> archived to RAL. Once the archive is complete the tar file is deleted
>> and the next set of files is archived.
>>
>> The chunk size at present is 256 GByte or slightly bigger, depending on
>> the size of the files.
>>
>> The transfer of such a file takes ~15 minutes.
>>
>> The script needs some polishing and once I am totally happy I can run
>> it non-interactively. I currently still have an interactive element in
>> the script as the last debugging and other idea stages are not fully
>> completed.
>>
>> Merry Christmas and a Happy New Year.
>>
>> Lydia
>>
>> On Tue, 22 Dec 2015, Jensen, Jens (STFC,RAL,SC) wrote:
>>
>>> Hooray for downgrading!
>>>
>>> On 22/12/2015 13:49, Lydia Heck wrote:
>>>>
>>>> Done it. I have downgraded to 3.3.3-x and now the lot works.
>>>>
>>>> Lydia
>>>>
>>>> On Tue, 22 Dec 2015, Jensen, Jens (STFC,RAL,SC) wrote:
>>>>
>>>>> Weird! Which version are you using?
>>>>>
>>>>> We seem to have fts-rest-3.3.3-2 and fts-rest-cli-3.3.3 and
>>>>> fts-rest-cloud-storage-3.3.3 and python-fts-3.3.3 but every other fts
>>>>> package on the server is 3.3.2. (There is both a python-fts and an
>>>>> fts-python - weird).
>>>>>
>>>>> Cheers
>>>>> --jens
>>>>>
>>>>> On 22/12/2015 12:22, Lydia Heck wrote:
>>>>>> Hi Jens,
>>>>>>
>>>>>> I have a script that I could test. However, I now have an issue:
>>>>>> the fts-transfer command does not work anymore and fails with the
>>>>>> error message
>>>>>>
>>>>>> fts client is connecting using the gSOAP interface. Consider changing
>>>>>> your configured fts endpoint port to select the REST interface
>>>>>>
>>>>>> I am currently rebooting the system, but have you seen something
>>>>>> similar before?
>>>>>>
>>>>>> Best wishes,
>>>>>> Lydia
>>>>>>
>>>>>> On Fri, 18 Dec 2015, Jens Jensen wrote:
>>>>>>
>>>>>>> Right, so the find suggestion at least would do a depth-first
>>>>>>> listing of files-to-add, and tar I am guessing would also add
>>>>>>> files depth first, which I think meets your requirement, or close
>>>>>>> enough, of putting related files into the same chunk.
>>>>>>>
>>>>>>> Using find-and-then-tar you could avoid building the following
>>>>>>> archive until the current one has been sent off to RAL. You'd
>>>>>>> just need space for the filelist.
>>>>>>>
>>>>>>> What I am thinking is:
>>>>>>> 1. find <folder to be backed up> -newer <timestamp file> |<list
>>>>>>> size and full filename> >filelist
>>>>>>> 2. Walk through filelist one line at a time adding up sizes and
>>>>>>> filenames till a certain threshold size has been exceeded (say
>>>>>>> 20GB or 100,000 files, whichever comes first) or adding the next
>>>>>>> file will take us above a higher threshold (say 50GB)
>>>>>>> 3. Once a list has been found, tar it up, compress it, optionally
>>>>>>> store the contents (list) somewhere, send the tarball to RAL, and
>>>>>>> then delete it.
>>>>>>> 4. Go back to step 2 until the filelist has been completed.
>>>>>>> 5. Touch the timestamp file
>>>>>>> 6. sleep 24 hours (or whatever) and go to step 1.
>>>>>>>
>>>>>>> This would meet all our requirements and would be stupidly easy
>>>>>>> to do.
>>>>>>>
>>>>>>> Cheers
>>>>>>> --jens
>>>>>>>
>>>>>>> On 17/12/2015 12:43, Lydia Heck wrote:
>>>>>>>>
>>>>>>>> Hi Jens,
>>>>>>>>
>>>>>>>> it took longer than I thought to tidy up the results from the
>>>>>>>> meeting last week (I spent a full day on a spreadsheet :-) )
>>>>>>>>
>>>>>>>> However I am now going to look at the transfers again.
>>>>>>>>
>>>>>>>> I looked over the presentation you shared with us. And yes, that
>>>>>>>> is the way it should go. There are some provisos:
>>>>>>>>
>>>>>>>> If I create 3 TB chunks, I need to have space for several of them:
>>>>>>>>
>>>>>>>> One being transferred, one in waiting and one being prepared.
>>>>>>>> This will add 10 TB to the storage that is not available for the
>>>>>>>> users; can be done, but needs to be factored in.
>>>>>>>>
>>>>>>>> If there is indeed a failure, then I need to identify where the
>>>>>>>> data are that have been deleted, corrupted or whatever. If I
>>>>>>>> "just" chunk the whole filesystem, that would be difficult, if
>>>>>>>> not impossible, to find. So I would need to arrange transfers by
>>>>>>>> project, and even then the retrieval might physically not be
>>>>>>>> possible, depending on how many of the chunks I would have to
>>>>>>>> retrieve.
>>>>>>>>
>>>>>>>> I believe that currently the biggest top folder is ~500 TB.
>>>>>>>>
>>>>>>>> There would not be lots of jobs running, simply because there is
>>>>>>>> not enough space to chunk that much.
>>>>>>>>
>>>>>>>> On the storage that I would like to archive there are more than
>>>>>>>> 64M files.
>>>>>>>>
>>>>>>>> So would a "flat" chunking tar of all the filesystem be a "good"
>>>>>>>> idea? I am not sure.
>>>>>>>>
>>>>>>>> I need to think about this a bit more.
>>>>>>>>
>>>>>>>> Lydia
>>>>>>>>
>>>>>>>> On Thu, 10 Dec 2015, Jensen, Jens (STFC,RAL,SC) wrote:
>>>>>>>>
>>>>>>>>> Hi Lydia,
>>>>>>>>>
>>>>>>>>> That's great. I am actually on leave tomorrow (travelling) and
>>>>>>>>> out Monday (at Royal Holloway) but the others on the list can
>>>>>>>>> follow up.
>>>>>>>>>
>>>>>>>>> Thanks
>>>>>>>>> --jens
>>>>>>>>>
>>>>>>>>> On 10/12/2015 10:21, Lydia Heck wrote:
>>>>>>>>>>
>>>>>>>>>> Dear all,
>>>>>>>>>>
>>>>>>>>>> sorry for my silence. I had a meeting in London on Tuesday and
>>>>>>>>>> attended CIUK yesterday. Just back and I have to tidy up some
>>>>>>>>>> spreadsheets from Tuesday's meeting and I will be busy today
>>>>>>>>>> as well with local tasks. So I should get back to this
>>>>>>>>>> tomorrow.
>>>>>>>>>>
>>>>>>>>>> Best wishes,
>>>>>>>>>> Lydia
>>>>>>>>>>
>>>>>>>>>> On Wed, 9 Dec 2015, Jensen, Jens (STFC,RAL,SC) wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi,
>>>>>>>>>>>
>>>>>>>>>>> Here is the proposal for the third option. It would also be
>>>>>>>>>>> worth looking into. It is written in python AFAIK.
>>>>>>>>>>>
>>>>>>>>>>> Overall we are trying to deploy something that meets the
>>>>>>>>>>> requirements and saves us time in the long run.
>>>>>>>>>>>
>>>>>>>>>>> Thanks
>>>>>>>>>>> --jens
>>>>>>>>>>>
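
The chunking walk in step 2 of the find-and-then-tar outline from the 18 Dec message could be sketched in Python roughly as below. This is only an illustration under stated assumptions, not the script discussed in the thread: the function names `parse_filelist` and `chunk_filelist` and the "<size in bytes> <path>" line format of the file list are hypothetical, and the default limits are simply the example thresholds quoted in that message (20 GB or 100,000 files as the soft limit, 50 GB as the hard limit).

```python
from typing import Iterable, Iterator, List, Tuple

GB = 10**9  # decimal gigabyte; the thread does not say which unit is meant


def parse_filelist(lines: Iterable[str]) -> Iterator[Tuple[int, str]]:
    """Parse lines of the assumed form '<size in bytes> <path>'."""
    for line in lines:
        # Split on the first space only, so paths containing spaces survive.
        size, path = line.rstrip("\n").split(" ", 1)
        yield int(size), path


def chunk_filelist(
    entries: Iterable[Tuple[int, str]],
    soft_bytes: int = 20 * GB,
    max_files: int = 100_000,
    hard_bytes: int = 50 * GB,
) -> Iterator[List[str]]:
    """Group (size, path) entries into lists of paths, one list per tarball.

    A chunk is closed when it reaches the soft size limit or the file-count
    limit, or earlier if adding the next file would exceed the hard limit.
    A single file larger than the hard limit ends up in a chunk of its own.
    """
    chunk: List[str] = []
    total = 0
    for size, path in entries:
        # Close the current chunk first if this file would push it past
        # the hard limit.
        if chunk and total + size > hard_bytes:
            yield chunk
            chunk, total = [], 0
        chunk.append(path)
        total += size
        # Close the chunk once the soft size or file-count limit is reached.
        if total >= soft_bytes or len(chunk) >= max_files:
            yield chunk
            chunk, total = [], 0
    # Emit whatever is left over at the end of the file list.
    if chunk:
        yield chunk
```

Each list yielded by `chunk_filelist` would correspond to one tarball in step 3; creating the tar file, sending it to RAL and deleting it afterwards are deliberately left out here. Because the function is a generator, the next chunk is only assembled once the caller asks for it, which fits the idea of not building the following archive until the current one has been sent off. One possible way to produce a matching file list, assuming GNU find, is its `-printf '%s %p\n'` action.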