Hi Alastair and all,

I think there is a typo in the URL: it should be "http://lcgft-atlas.gridpp.rl.ac.uk:3128/frontierATLAS/Frontier" with a capital "F" in "Frontier", *not* a small "f". Now it works both with and without an http_proxy setting (the proxy-less run is below; the with-proxy variant is noted after the transcript).

[root@farm002 tmp]# unset http_proxy
[root@farm002 tmp]# python fnget.py --url=http://lcgft-atlas.gridpp.rl.ac.uk:3128/frontierATLAS/Frontier --sql="SELECT count(*) FROM ALL_TABLES" 
Using Frontier URL:  http://lcgft-atlas.gridpp.rl.ac.uk:3128/frontierATLAS/Frontier
Query:  SELECT count(*) FROM ALL_TABLES
Decode results:  True
Refresh cache:  False

Frontier Request:
http://lcgft-atlas.gridpp.rl.ac.uk:3128/frontierATLAS/Frontier?type=frontier_request:1:DEFAULT&encoding=BLOBzip&p1=eNoLdvVxdQ5RSM4vzSvR0NJUcAvy91Vw9PGJD3F08nENBgCQ9wjs

Query started:  06/16/10 11:44:15 BST
Query ended:  06/16/10 11:44:16 BST
Query time: 1.34605288506 [seconds]

Query result:
<?xml version="1.0" encoding="US-ASCII"?>
<!DOCTYPE frontier SYSTEM "http://frontier.fnal.gov/frontier.dtd">
<frontier version="3.22" xmlversion="1.0">
 <transaction payloads="1">
  <payload type="frontier_request" version="1" encoding="BLOBzip">
   <data>eJxjY2Bg4HD2D/UL0dDSZANy2PxCfZ1cg9hBbBYLC2NjdgBW1ATW</data>
   <quality error="0" md5="3c31cc5665b2636e8feb209fafa558f6" records="1" full_size="35"/>
  </payload>
 </transaction>
</frontier>


Fields: 
     COUNT(*)     NUMBER

Records:
     8833
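
For reference, the with-proxy variant is just the same query with the squid exported first, as in the instructions below, but using the corrected URL:

> export http_proxy=http://lcgft-atlas.gridpp.rl.ac.uk:3128
> python fnget.py --url=http://lcgft-atlas.gridpp.rl.ac.uk:3128/frontierATLAS/Frontier --sql="SELECT count(*) FROM ALL_TABLES"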

  
Cheers,
Santanu


Alastair wrote:

Hi

The current state of the ATLAS frontier service is not ideal.  The SAM tests:
https://lcg-sam.cern.ch:8443/sam/sam.py?CE_atlas_disp_tests=CE-ATLAS-sft-Frontier-Squid&order=SiteName&funct=ShowSensorTests&disp_status=na&disp_status=ok&disp_status=info&disp_status=note&disp_status=warn&disp_status=error&disp_status=crit&disp_status=maint
show several production sites getting a warning.  This warning is normally caused by the backup squid not being configured correctly.

To remind people: WNs should connect to the local squid (normally at the site), which connects to the Frontier server at RAL. If the local squid is down, the WN will try to connect to a backup squid, which is meant to be at a nearby site and which will then try to connect to the Frontier server. There is a similar backup process at the server end: should the Frontier server at RAL fail, all the squids will try to connect to the Frontier server at PIC.
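
As a rough illustration of that order (the squid hostnames below are placeholders, this is not literally how the client is configured, and it assumes fnget.py exits non-zero when a query fails), a WN effectively walks the proxy chain until one answers:

> for proxy in http://local-squid.example.org:3128 http://backup-squid.example.org:3128 ; do
>   http_proxy=$proxy python fnget.py --url=http://lcgft-atlas.gridpp.rl.ac.uk:3128/frontierATLAS/Frontier --sql="SELECT count(*) FROM ALL_TABLES" && break
> done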

To ease this problem it has been suggested that the default backup for Tier 2 sites should be the squid at RAL (the Tier 1, not the Tier 2!). The squid at the Tier 1 is part of the same installation as the Frontier server, so if the Frontier service goes down so will the backup squid. This does reduce the resilience of the setup slightly, but I think it is worth it given that it should make things significantly simpler to maintain. It also means I will have to get the SAM test modified slightly. If, however, there are sites that are happy with the current setup and with managing firewall access to their squid from other sites' worker nodes, then please feel free to respond.
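
Purely as an illustrative sketch of what this would mean on the client side (the real settings come via Tiersofatlas, the site squid name below is a placeholder, and the exact frontier_client syntax should be checked), it amounts to listing the RAL squid as the second proxy, something like:

> export FRONTIER_SERVER="(serverurl=http://lcgft-atlas.gridpp.rl.ac.uk:3128/frontierATLAS)(proxyurl=http://your-site-squid.example.org:3128)(proxyurl=http://lcgft-atlas.gridpp.rl.ac.uk:3128)"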

Before committing any change to Tiersofatlas I would like sites to run a test to make sure they can indeed successfully access the RAL squid.

To do this:
Log into a WN
> wget http://frontier.cern.ch/dist/fnget.py
> export http_proxy=http://lcgft-atlas.gridpp.rl.ac.uk:3128
> python fnget.py --url=http://lcgft-atlas.gridpp.rl.ac.uk:3128/frontierATLAS/frontier --sql="SELECT TABLE_NAME FROM ALL_TABLES"
This should provide a big list of table names and not a python error!

Could sites please reply with the results of the test; any other comments are also welcome.

Thanks

Alastair