Hi Cristina,

Cristina Aiftimiei wrote:
> we had a prod-site that had problems with job-submission, solved only by
> removing the old files present in the /opt/globus/tmp/gram_job_state/
> directory.
>
> The symptoms were that a submitted job managed to pass from the WMS to
> CE, on the CE - from globus to the batch-system (LSF 7.3), finished
> correctly,... and everything stopped here, with no error messages to the
> user. The status always presented the job in one of the states
> "Scheduled" or "Running"... but not the "Done" one.
>
> The number of the files accumulated in the directory
> /opt/globus/tmp/gram_job_state/ was ~31000. Once removed... the
> situation improved... but it's still a little slow in presenting the
> status "Done" to the user.
> I checked the communication between the CE-WMS - it's working.
>
> The versions of CE, WMS are the last one released to the production
> (Update 41).
> Is there any way I could understand what happened - why the huge number
> of files in that directory?

Please do the following on your CE node:

Edit the /opt/globus/etc/globus-gma.conf file and add a "debug 1" line
to it (no equals sign, just a space as a separator).

Restart globus-gma with `service globus-gma restart'

Wait for 20-30 minutes and send me (not to the list as it will be
megabytes in size) the log file /opt/globus/var/log/globus-gma.log

Please also include the output of the `ps auxfww' command from your CE.

-- 
Cheers, Andrey Kiryanov.
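
The steps above could be scripted roughly as follows. This is only a minimal sketch: the function names enable_gma_debug and count_state_files and the /tmp/ps-auxfww.txt output path are made up for illustration, and it assumes the config file ends with a newline. The other paths and the service name are the ones given in this mail.

```shell
#!/bin/sh
# Hypothetical helpers; the function names are illustrative only.

# Append a "debug 1" line to the given config file unless it is already there.
# Assumes the file ends with a newline.
enable_gma_debug() {
    conf="$1"
    grep -qx 'debug 1' "$conf" 2>/dev/null || echo 'debug 1' >> "$conf"
}

# Print the number of regular files under the given directory.
count_state_files() {
    find "$1" -type f | wc -l
}

# Intended use on the CE node (as root), with the paths from this mail:
#   count_state_files /opt/globus/tmp/gram_job_state
#   enable_gma_debug /opt/globus/etc/globus-gma.conf
#   service globus-gma restart
#   # wait 20-30 minutes, then collect (output path is an assumption):
#   ps auxfww > /tmp/ps-auxfww.txt
#   ls -lh /opt/globus/var/log/globus-gma.log
```

Checking for the line before appending means the script can be run more than once without adding a second "debug 1" entry.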