Dear List (Sophie?),
we are having trouble with LFC. To support a local VO, an LFC-mysql
node has been installed and configured via yaim (lcg-yaim-2.4.0-4,
LFC-server-mysql-1.2.6-1sec_sl3). Most operations, such as
registering, creating, removing and linking files, work fine, and the lcg-*
commands from various UIs also interact with the LFC correctly.
lcg-infosites also returns the LFC and SE for the VO. There is a
problem with jobs, however: when InputData is specified for some
existing files (see below), the jobs cannot be matched. Sample JDL file:
[
requirements = ( other.GlueCEStateStatus == "Production" );
RetryCount = 3;
MyProxyServer = "grid153.kfki.hu";
Executable = "RemoteInput.sh";
StdOutput = "stdout";
OutputSandbox = { "stdout","stderr" };
VirtualOrganisation = "hungrid";
rank = -other.GlueCEStateEstimatedResponseTime;
StdError = "stderr";
InputSandbox = { "RemoteInput","RemoteInput.sh" };
InputData = { "lfn:/grid/hungrid/szabi/test.txt" };
DataAccessProtocol = { "gsiftp" }
]
With this JDL, edg-job-list-match cannot find a suitable CE. Without the
InputData/DataAccessProtocol attributes, the CEs that support the VO are
returned. The -debug option of edg-job-list-match did not reveal anything,
but the logs on the RB contain the following (this is the relevant part of
/var/edgwl/networkserver/log/events.log):
06 Jul, 17:27:50 -I- checkRequirement:
grid109.kfki.hu:2119/jobmanager-lcgcondor-hungrid, Ok!
06 Jul, 17:27:50 -E- listReplica(): Replica Manager C++ API: InfoService:
No service found in InfoService
06 Jul, 17:27:51 -E- CommandFactoryServerImpl()::ListJobMatch():
ListJobMatch done
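To isolate entries like the listReplica() one in a longer log, a minimal Python sketch such as the following could filter events.log by severity flag. It assumes each entry sits on a single line in the file (the excerpt above is wrapped by the mail client) and follows the "DD Mon, HH:MM:SS -X- message" layout seen above; the function name error_entries is hypothetical.

```python
import re

# Assumed entry layout, inferred from the excerpt: "06 Jul, 17:27:50 -E- message"
ENTRY = re.compile(r"^(\d{2} \w{3}, \d{2}:\d{2}:\d{2}) -([A-Z])- (.*)$")

def error_entries(lines, severity="E"):
    """Return (timestamp, message) pairs for entries carrying the given severity flag."""
    hits = []
    for line in lines:
        match = ENTRY.match(line.rstrip("\n"))
        if match and match.group(2) == severity:
            hits.append((match.group(1), match.group(3)))
    return hits

# The three entries from the excerpt, joined back onto single lines
sample = [
    "06 Jul, 17:27:50 -I- checkRequirement: grid109.kfki.hu:2119/jobmanager-lcgcondor-hungrid, Ok!",
    "06 Jul, 17:27:50 -E- listReplica(): Replica Manager C++ API: InfoService: No service found in InfoService",
    "06 Jul, 17:27:51 -E- CommandFactoryServerImpl()::ListJobMatch(): ListJobMatch done",
]

for stamp, message in error_entries(sample):
    print(stamp, message)
```

Passing an open file object instead of the sample list works the same way, since the function only iterates over lines.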
I searched the web and the documentation (?-)) for the "listReplica()..."
error above, but could not find anything. What could be the problem? I
suspect a misconfiguration on the RB, but have no idea where to look.
For your information, here is the environment of 'edguser' on the RB:
[root@grid151 log]# su - edguser
[edguser@grid151 edguser]$ env
MANPATH=/opt/globus/man::/opt/edg/share/man:/opt/lcg/share/man:/opt/edg/man
GRIDMAP=/etc/grid-security/grid-mapfile
HOSTNAME=grid151.kfki.hu
LCG_LOCATION_VAR=/opt/lcg/var
SHELL=/bin/bash
TERM=rxvt
HISTSIZE=1000
GLOBUS_PATH=/opt/globus
GLOBUS_LOCATION=/opt/globus
EDG_WL_CONFIG_DIR=/opt/edg/etc
QTDIR=/usr/lib/qt-3.1
EDG_TMP=/tmp
GRIDMAPDIR=/etc/grid-security/gridmapdir/
USER=edguser
JAVA_INSTALL_PATH=/usr/java/j2sdk1.4.2_04
LD_LIBRARY_PATH=/opt/lcg/lib:/opt/globus/lib:/opt/edg/lib:/usr/local/lib:/opt/globus/lib:/opt/edg/lib:/opt/globus/lib:/opt/edg/lib:/opt/globus/lib:/opt/edg/lib
LS_COLORS=no=00:fi=00:di=01;34:ln=01;36:pi=40;33:so=01;35:bd=40;33;01:cd=40;33;01:or=01;05;37;41:mi=01;05;37;41:ex=01;32:*.cmd=01;32:*.exe=01;32:*.com=01;32:*.btm=01;32:*.bat=01;32:*.sh=01;32:*.csh=01;32:*.tar=01;31:*.tgz=01;31:*.arj=01;31:*.taz=01;31:*.lzh=01;31:*.zip=01;31:*.z=01;31:*.Z=01;31:*.gz=01;31:*.bz2=01;31:*.bz=01;31:*.tz=01;31:*.rpm=01;31:*.cpio=01;31:*.jpg=01;35:*.gif=01;35:*.bmp=01;35:*.xbm=01;35:*.xpm=01;35:*.png=01;35:*.tif=01;35:
GPT_LOCATION=/opt/gpt
LCG_LOCATION=/opt/lcg
EDG_WL_TMP=/var/edgwl
CLASSADJ_INSTALL_PATH=/usr
LIBPATH=/opt/globus/lib:/usr/lib:/lib
EDG_WL_USER=edguser
MAIL=/var/spool/mail/edguser
PATH=/usr/java/j2sdk1.4.2_08/bin:/opt/lcg/bin:/usr/kerberos/bin:/opt/globus/bin:/opt/globus/sbin:/opt/edg/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/bin:/usr/bin/X11:/usr/X11R6/bin:/opt/gpt/sbin:/opt/edg/bin:/opt/edg/sbin:/opt/edg/bin:/opt/edg/sbin:/opt/edg/bin:/opt/edg/sbin:/home/edguser/bin
EDG_WL_LOCATION=/opt/edg
CONDOR_CONFIG=/opt/condor/etc/condor.conf
CONDORG_INSTALL_PATH=/opt/condor
LCG_TMP=/tmp
EDG_LOCATION=/opt/edg
INPUTRC=/etc/inputrc
PWD=/home/edguser
LANG=C
SASL_PATH=/opt/globus/lib/sasl
PERLLIB=/opt/edg/lib/perl
SSH_ASKPASS=/usr/libexec/openssh/gnome-ssh-askpass
EDG_WL_LOCATION_VAR=/opt/edg/var
SHLVL=1
HOME=/home/edguser
EDG_WL_LOG_DESTINATION=grid151.kfki.hu:9002
GLOBUS_TCP_PORT_RANGE=20000 25000
RGMA_PROPS=/opt/edg/etc/rgma
COG_INSTALL_PATH=/usr
EDG_LOCATION_VAR=/opt/edg/var
PYTHONPATH=/opt/edg/lib:/opt/edg/lib/python
LOGNAME=edguser
RGMA_HOME=/opt/edg
LESSOPEN=|/usr/bin/lesspipe.sh %s
SHLIB_PATH=/opt/globus/lib
LOG4J_INSTALL_PATH=/usr
G_BROKEN_FILENAMES=1
_=/bin/env
And the config for the Workload Manager (/opt/edg/etc/edg_wl.conf):
[
Common = [
DGUser = "${EDG_WL_USER}";
HostProxyFile = "${EDG_WL_TMP}/networkserver/ns.proxy";
UseCacheInsteadOfGris = true;
];
JobController = [
CondorSubmit = "${CONDORG_INSTALL_PATH}/bin/condor_submit";
CondorRemove = "${CONDORG_INSTALL_PATH}/bin/condor_rm";
CondorQuery = "${CONDORG_INSTALL_PATH}/bin/condor_q";
CondorSubmitDag = "${CONDORG_INSTALL_PATH}/bin/condor_submit_dag";
CondorRelease = "${CONDORG_INSTALL_PATH}/bin/condor_release";
SubmitFileDir = "${EDG_WL_TMP}/jobcontrol/submit";
OutputFileDir = "${EDG_WL_TMP}/jobcontrol/cond";
Input = "${EDG_WL_TMP}/jobcontrol/queue.fl";
LockFile = "${EDG_WL_TMP}/jobcontrol/lock";
LogFile = "${EDG_WL_TMP}/jobcontrol/log/events.log";
LogLevel = 5;
ContainerRefreshThreshold = 1000;
];
LogMonitor = [
JobsPerCondorLog = 1000;
LockFile = "${EDG_WL_TMP}/logmonitor/lock";
LogFile = "${EDG_WL_TMP}/logmonitor/log/events.log";
LogLevel = 5;
ExternalLogFile = "${EDG_WL_TMP}/logmonitor/log/external.log";
MainLoopDuration = 10;
CondorLogDir = "${EDG_WL_TMP}/logmonitor/CondorG.log";
CondorLogRecycleDir = "${EDG_WL_TMP}/logmonitor/CondorG.log/recycle";
MonitorInternalDir = "${EDG_WL_TMP}/logmonitor/internal";
IdRepositoryName = "irepository.dat";
AbortedJobsTimeout = 600;
];
NetworkServer = [
II_Port = 2170;
Gris_Port = 2135;
II_Timeout = 30;
Gris_Timeout = 20;
II_DN = "mds-vo-name=local, o=grid";
Gris_DN = "mds-vo-name=local, o=grid";
II_Contact = "grid152.kfki.hu";
ListeningPort = 7772;
MasterThreads = 8;
SandboxStagingPath = "${EDG_WL_TMP}/SandboxDir";
LogFile = "${EDG_WL_TMP}/networkserver/log/events.log";
LogLevel = 5;
BacklogSize = 16;
EnableQuotaManagement = false;
MaxInputSandboxSize = 10000000;
EnableDynamicQuotaAdjustment = false;
QuotaAdjustmentAmount = 10000;
QuotaInsensibleDiskPortion = 2.0;
];
WorkloadManager = [
PipeDepth = 1;
NumberOfWorkerThreads = 1;
DispatcherType = "filelist";
Input = "${EDG_WL_TMP}/workload_manager/input.fl";
LogLevel = 5;
LogFile = "${EDG_WL_TMP}/workload_manager/log/events.log";
MaxRetryCount = 10;
];
]
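Since the error mentions the InfoService, one thing that may be worth checking is what the information index named in the NetworkServer section actually publishes. The sketch below only assembles an ldapsearch command line from the II_Contact, II_Port and II_DN values above and does not run it. It assumes the OpenLDAP ldapsearch client, and it assumes that services are published as GlueService objects, which I have not verified for this release; the helper name ii_query is hypothetical.

```python
import shlex

def ii_query(host, port, base_dn, ldap_filter="(objectClass=GlueService)"):
    """Build an anonymous ldapsearch command for the information index."""
    return [
        "ldapsearch",
        "-x",                           # simple (anonymous) bind
        "-LLL",                         # plain LDIF output without comments
        "-H", f"ldap://{host}:{port}",  # server URI
        "-b", base_dn,                  # search base
        ldap_filter,                    # assumed object class, see note above
    ]

# Values copied from the NetworkServer section of edg_wl.conf
cmd = ii_query("grid152.kfki.hu", 2170, "mds-vo-name=local, o=grid")

# Print a shell-safe version that can be pasted on the RB or a UI
print(" ".join(shlex.quote(part) for part in cmd))
```

If the printed command returns no entry describing the LFC for the VO, that would at least be consistent with the "No service found" message, though it would not prove the cause.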
Sorry for the long post. Any help or hint is appreciated. Thanks,
Cheers
Szabolcs