Ok, that's it working now.
What's the last entry in the srmv1*.gz logs? It might tell us what went
wrong.
Thanks,
Greig
On 03/01/08 14:35, Simon George wrote:
> Ok, I've restarted srmv1, srmv2, srmv2.2, dpm-gsiftp and dpm on se1,
> and dpm-gsiftp on gridraid2 for good measure (although I did this earlier).
>
> /var/log/srmv1/log is now quite active, so that could be a (the?)
> problem solved. Please could you try again Greig?
>
> Cheers,
> Simon
>
> Greig Alan Cowan wrote:
>> It's very strange that there is nothing in the srmv1 logs. This is
>> your head node, right? I can telnet to port 8443, but can you check
>> that the srmv1 server is running? In fact, just restart it. Similarly
>> for the gridftp server.
>>
>> Greig
>>
>> On 03/01/08 13:22, Simon George wrote:
>>> This is what I found in the logs on se1.
>>>
>>> /var/log/dpm/log
>>>
>>> roughly every minute, this is repeated:
>>>
>>> 01/03 13:12:15 3547,24 dpm_srv_getpoolfs: DP092 - getpoolfs request by [log in to unmask] (0,0) from se1.pp.rhul.ac.uk
>>> 01/03 13:12:15 3547,24 dpm_srv_getpoolfs: returns 0
>>>
>>> /var/log/srmv1/log
>>> is empty
>>>
>>> /var/log/dpns/log
>>> has entries for my recent dpns-ls and that's it.
>>>
>>> On gridraid2:
>>> /var/log/dpm-gsiftp/dpm-gsiftp.log is empty.
>>>
>>> Anywhere else I should look?
>>>
>>> Cheers,
>>> Simon
>>>
>>> Greig Alan Cowan wrote:
>>>> Something still isn't right, I can't copy a file into your DPM. What
>>>> are the /var/log/dpm, dpns and srmv1 logs saying?
>>>>
>>>> Greig
>>>>
>>>> On 03/01/08 12:59, Simon George wrote:
>>>>> Yes, I have disabled the pool with the expired cert.
>>>>>
>>>>> On the head node (se1) I can do:
>>>>> > dpns-ls /dpm
>>>>> pp.rhul.ac.uk
>>>>>
>>>>> dpm-qryconf and dpm-modifyfs also work.
>>>>>
>>>>> Greig Alan Cowan wrote:
>>>>>> Hi Simon,
>>>>>>
>>>>>> Have you made the change yet? Something still isn't right. Are you
>>>>>> sure that everything is OK on the DPM head node?
>>>>>>
>>>>>> Can you run commands like
>>>>>>
>>>>>> dpns-ls /dpm
>>>>>>
>>>>>> on it as the root user?
>>>>>>
>>>>>> Cheers,
>>>>>> Greig
>>>>>>
>>>>>> On 03/01/08 12:14, Simon George wrote:
>>>>>>> btw the pool node with the out of date cert is gridraid3 which is
>>>>>>> currently read-only. So the one used by the SAM test should be
>>>>>>> gridraid2 which does have an up-to-date cert.
>>>>>>>
>>>>>>> Do you think this could still cause the error?
>>>>>>>
>>>>>>> I wonder if I should completely disable gridraid3 until the cert
>>>>>>> is fixed.
>>>>>>>
>>>>>>> Greig Alan Cowan wrote:
>>>>>>>> Hi Simon,
>>>>>>>>
>>>>>>>> Best thing to do is re-run the YAIM configuration step.
>>>>>>>>
>>>>>>>> /opt/glite/yaim/bin/yaim -c -s /path/to/site-info.def -n SE_dpm_mysql
>>>>>>>>
>>>>>>>> That should do the trick and make sure all the certificates are
>>>>>>>> in the right place with the right permissions etc.
>>>>>>>>
>>>>>>>> You should also think about upgrading to the latest version of
>>>>>>>> DPM (1.6.7). More about that in a forthcoming email...
>>>>>>>>
>>>>>>>> Cheers,
>>>>>>>> Greig
>>>>>>>>
>>>>>>>> On 03/01/08 11:49, Simon George wrote:
>>>>>>>>> Hi Greig,
>>>>>>>>>
>>>>>>>>> thanks for your quick reply.
>>>>>>>>> I've checked, all the CRLs are up to date. The host
>>>>>>>>> certificates are up to date as in:
>>>>>>>>>
>>>>>>>>> openssl x509 -in /etc/grid-security/hostcert.pem -dates -text | head -2
>>>>>>>>> notBefore=Oct 5 11:25:32 2007 GMT
>>>>>>>>> notAfter=Nov 3 11:25:32 2008 GMT
>>>>>>>>>
>>>>>>>>> But I've noticed that on the pool node, this certificate is not
>>>>>>>>> propagated to /etc/grid-security/dpmmgr/dpmcert.pem
>>>>>>>>> (nor the corresponding key file). The files there are old and
>>>>>>>>> have expired.
>>>>>>>>>
>>>>>>>>> Is it just a case of copying these new certificate files to
>>>>>>>>> dpmmgr/dpmcert.* or do I need to do something else too?
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Simon
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Greig Alan Cowan wrote:
>>>>>>>>>> Hi Simon,
>>>>>>>>>>
>>>>>>>>>> It looks like a security issue. Are the certificates and
>>>>>>>>>> CRLs up to date?
>>>>>>>>>>
>>>>>>>>>> Cheers,
>>>>>>>>>> Greig
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On 03/01/08 10:44, Simon George wrote:
>>>>>>>>>>> Hi,
>>>>>>>>>>>
>>>>>>>>>>> I have the problem described here:
>>>>>>>>>>> http://www.gridpp.ac.uk/wiki/Random_DPM_errors_in_SAM#Error_reading_token_data
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Does anyone know the solution to this?
>>>>>>>>>>>
>>>>>>>>>>> For example:
>>>>>>>>>>> https://lcg-sam.cern.ch:8443/sam/sam.py?funct=TestResult&nodename=ce1.pp.rhul.ac.uk&vo=ops&testname=CE-sft-lcg-rm-cr&testtimestamp=1199352207
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>> Simon
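
For anyone hitting the same error: the thread traces part of the problem to an expired copy of the host certificate under /etc/grid-security/dpmmgr/ on a pool node. A minimal sketch of how that check might be scripted follows. The two paths are the ones mentioned in the thread; the helper name check_cert and its output format are made up for illustration, and the script only reports status. It does not copy or fix anything (re-running YAIM, as suggested above, is what the thread recommends for that).

```shell
#!/bin/sh
# check_cert is a hypothetical helper, not part of DPM or YAIM.
# It reports whether a PEM certificate is missing, expired, or still valid.
# "openssl x509 -checkend 0" exits non-zero if the certificate has
# already expired (or if the file cannot be parsed as a certificate).
check_cert() {
    cert="$1"
    if [ ! -r "$cert" ]; then
        echo "MISSING  $cert"
        return 2
    fi
    if openssl x509 -in "$cert" -noout -checkend 0 >/dev/null 2>&1; then
        echo "OK       $cert"
        return 0
    fi
    echo "EXPIRED or unparsable  $cert"
    return 1
}

# Paths taken from the thread; adjust for your own node.
for cert in /etc/grid-security/hostcert.pem \
            /etc/grid-security/dpmmgr/dpmcert.pem; do
    check_cert "$cert" || true
    # Print the validity window when the file can be parsed.
    openssl x509 -in "$cert" -noout -dates 2>/dev/null || true
done
```

Run it as root on the head node and on each pool node; any line that does not start with OK points at a certificate worth looking at before retrying the SAM test.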