Hi,
I tried to narrow down the problem. It seems that blahpd has
difficulty executing /usr/libexec/pbs_status.sh when blahpd is
started by the Tomcat web server.
However, when the blahpd command is run from a terminal as either the
tomcat or the root user, it returns the correct result without any
problem.
I suspected a race condition: blahpd tries to get the LRMS info from
CREAM (via BLPClient), but since CREAM is not yet ready to accept the
query, it simply rejects the query, and blahpd then reports an error
as well. But this shouldn't be the case, since it keeps retrying the
query to CREAM...
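To compare the two cases mechanically, the reply lines from the quoted
output below can be checked with a small script. This is only a
minimal Python sketch and not part of BLAH or CREAM; the function
names are my own. I am assuming only what the quoted output shows:
fields are separated by spaces, a space inside a field is escaped
with a backslash, and every field after the first two has the form
lrms/host:port. I make no assumption about the meaning of the first
two fields, so they are skipped.

```python
import re


def split_escaped(line):
    """Split on spaces that are not preceded by a backslash, then unescape."""
    parts = re.split(r'(?<!\\) ', line.strip())
    return [p.replace('\\ ', ' ') for p in parts if p]


def parse_hostport_result(line):
    """Map each LRMS name to (host, port), or to None if no host:port was given.

    Assumed layout (taken from the quoted output only):
        <field> <field> <lrms>/<host>:<port> ...
    """
    fields = split_escaped(line)
    if len(fields) < 3:
        raise ValueError("unexpected result line: %r" % line)
    entries = {}
    for item in fields[2:]:
        lrms, _, value = item.partition('/')
        host, sep, port = value.rpartition(':')
        if sep and port.isdigit():
            entries[lrms] = (host, int(port))
        else:
            # e.g. "Error reading host:port" in place of a real endpoint
            entries[lrms] = None
    return entries


if __name__ == "__main__":
    good = "0 0 pbs/khaldun.biruni.upm.my:56554"
    bad = r"0 0 pbs/Error\ reading\ host:port"
    print(parse_hostport_result(good))
    print(parse_hostport_result(bad))
```

Feeding it the line from the terminal session gives a host and port
for pbs, while the line from the CREAM log gives None for pbs, which
is the difference I am trying to explain.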
Regards
On Tue, Nov 10, 2015 at 6:59 PM, Muhammad Farhan SJAUGI
<[log in to unmask]> wrote:
> Dear Jeff,
>
> blahpd runs as user tomcat. User tomcat also has operator
> privileges on the Torque server:
>
> set server operators += [log in to unmask]
>
> However, when I run the blahpd command as user tomcat, it seems to
> return the correct result:
>
> [root@khaldun ~]# su - tomcat
> -sh-4.3$ blahpd
> $GahpVersion: 1.8.0 Mar 31 2008 INFN\ blahpd\ (poly,new_esc_format) $
> BLAH_GET_HOSTPORT 0
> S
> RESULTS
> S 1
> 0 0 pbs/khaldun.biruni.upm.my:56554
>
>
> Regards
>
> On Tue, Nov 10, 2015 at 5:14 PM, Jeff Templon <[log in to unmask]> wrote:
>> Hi
>>
>> as which user does blahpd run? I see you ran it as root by hand, but when run normally, which user? and does this user have operator privileges on the torque server?
>>
>> JT
>>
>>> On 10 Nov 2015, at 01:25, Muhammad Farhan SJAUGI <[log in to unmask]> wrote:
>>>
>>> Greetings,
>>>
>>> I found something interesting: apparently CREAM didn't get the
>>> correct result from BLAH:
>>>
>>> 10 Nov 2015 00:19:18,366 INFO
>>> org.glite.ce.cream.jobmanagement.cmdexecutor.blah.BLParserClient -
>>> initializeConnection: getting info about BLParser (pbs) from BLAH
>>> (retry count=97/100)
>>> 10 Nov 2015 00:20:18,368 DEBUG
>>> org.glite.ce.cream.jobmanagement.cmdexecutor.blah.BLAHExecutor -
>>> BLAH_GET_HOSTPORT 0
>>> 10 Nov 2015 00:20:19,370 DEBUG
>>> org.glite.ce.cream.jobmanagement.cmdexecutor.blah.BLAHExecutor -
>>> getBlahOutput: S
>>> 10 Nov 2015 00:20:20,371 DEBUG
>>> org.glite.ce.cream.jobmanagement.cmdexecutor.blah.BLAHExecutor -
>>> getBlahOutput: S 1
>>> 10 Nov 2015 00:20:20,371 DEBUG
>>> org.glite.ce.cream.jobmanagement.cmdexecutor.blah.BLAHExecutor -
>>> getBlahOutput: 0 0 pbs/Error\ reading\ host:port
>>>
>>> However, when I query BLAH manually, it seems to return the correct answer:
>>>
>>> [root@khaldun ~]# blahpd
>>> $GahpVersion: 1.8.0 Mar 31 2008 INFN\ blahpd\ (poly,new_esc_format) $
>>> BLAH_GET_HOSTPORT 0
>>> S
>>> RESULTS
>>> S 1
>>> 0 0 pbs/khaldun.biruni.upm.my:56554
>>>
>>> Perhaps this is the main issue?
>>>
>>> Regards
>>>
>>> On Tue, Nov 10, 2015 at 7:37 AM, Muhammad Farhan SJAUGI
>>> <[log in to unmask]> wrote:
>>>> Dear Steve,
>>>>
>>>> Thank you for your feedback. I can confirm that the new blparser is
>>>> used instead of the old one.
>>>>
>>>> I'm wondering how CREAM communicates with blparser. Is it via a
>>>> socket, or does it simply call a programming API?
>>>>
>>>> Regards
>>>>
>>>> On Mon, Nov 9, 2015 at 10:13 PM, Stephen Jones <[log in to unmask]> wrote:
>>>>> Hi Muhammad,
>>>>>
>>>>> Here's something to check.
>>>>>
>>>>> http://grid.pd.infn.it/cream/field.php?n=Main.CREAMAndBlparserConfiguration
>>>>>
>>>>> If the "blparser" service is used by the "Old Blah Parser", perhaps you are
>>>>> accidentally starting the wrong parser?
>>>>>
>>>>> Note: I think the "BNotifier" and "BUpdaterPBS" processes belong to the "New
>>>>> Blah Parser". Maybe...
>>>>>
>>>>> So check which parser you are using.
>>>>>
>>>>> Cheers,
>>>>>
>>>>> Steve
>>>>>
>>>>>
>>>>>
>>>>> On 11/08/2015 10:03 AM, Muhammad Farhan SJAUGI wrote:
>>>>>>
>>>>>> Greetings,
>>>>>>
>>>>>> One of the clusters shows strange behavior: CREAM is unable to submit
>>>>>> the job to BLAH because the blparser service is not alive:
>>>>>>
>>>>>> 08 Nov 2015 09:54:06,375 WARN
>>>>>> org.glite.ce.creamapi.jobmanagement.cmdexecutor.AbstractJobExecutor -
>>>>>> submission to BLAH failed [jobId=CREAM524062606; reason=The job cannot
>>>>>> be submitted because the blparser service is not alive; retry
>>>>>> count=3/3]
>>>>>>
>>>>>> I can confirm that the blparser service is up:
>>>>>>
>>>>>> [root@khaldun etc]# ps ax | grep BNotifier
>>>>>> 3155 ? Sl 0:00 /usr/libexec/BNotifier
>>>>>>
>>>>>> [root@khaldun etc]# ps ax | grep BUpdaterPBS
>>>>>> 3167 ? S 0:00 /usr/libexec/BUpdaterPBS
>>>>>>
>>>>>> I also found another message in the CREAM log, shown below (but I'm
>>>>>> not sure whether it is related):
>>>>>>
>>>>>> 08 Nov 2015 09:55:21,782 INFO
>>>>>> org.glite.ce.cream.jobmanagement.cmdexecutor.blah.BLParserClient -
>>>>>> initializeConnection: getting info about BLParser (pbs) from BLAH
>>>>>> (retry count=95/100)
>>>>>>
>>>>>> I have tried restarting the service and even re-running yaim; neither
>>>>>> solved the problem...
>>>>>>
>>>>>> Can anyone help me fix this problem?
>>>>>>
>>>>>> Regards
>>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Steve Jones [log in to unmask]
>>>>> Grid System Administrator office: 220
>>>>> High Energy Physics Division tel (int): 43396
>>>>> Oliver Lodge Laboratory tel (ext): +44 (0)151 794 3396
>>>>> University of Liverpool http://www.liv.ac.uk/physics/hep/
>>>>
>>>>
>>>>
>>>> --
>>>> Muhammad Farhan Sjaugi, S.Kom. M.Sc
>>>>
>>>> Technical Coordinator
>>>> Academic Grid Malaysia
>>>> c/o UNITEN
>>>> email: [log in to unmask]
>>>>
>>>> Lecturer/Programmer
>>>> Perdana University Centre for Bioinformatics
>>>> email: [log in to unmask]
--
Muhammad Farhan Sjaugi, S.Kom. M.Sc
Technical Coordinator
Academic Grid Malaysia
c/o UNITEN
email: [log in to unmask]
Lecturer/Programmer
Perdana University Centre for Bioinformatics
email: [log in to unmask]