GEO600
The Search for Gravitational Waves using the GRID
  • Content

    • Introduction
    • Submission Host
    • Storage Resources
    • Output Storage Resource
    • Remote Storage Resources
    • Execution Host
    • Job Submission
    • FAQ
    • Introduction
    • The GEO600 Grid Scenario consists of a range of services and resources depending on one another. It is recommended to test these parts in isolation before attempting to submit GEO600 tasks to grid resources.
    • This section tests the GEO600 configuration described in the section Configuration. In the simplest case a single grid resource acts as submission host, storage host and execution host.
    • At this point we assume that you have deployed GEO600 successfully on at least one grid resource and that you have adapted the configuration files to reflect your grid environment.
    • Testing the Submission Host
    • The submission host requires an installation of Globus Toolkit 4.x and a client-side configuration. A valid grid proxy must exist or be created, and the GEO600 MySQL database must be reachable. For convenience you can call check.pl locally on the submission host to test whether all requirements are met:
    • robert@buran:~/GEO600-devel/main/scripts> perl check.pl -submission-host
      2008/03/11 14:21:16 DEBUG> -- checking for external programs ------------------------------------
      2008/03/11 14:21:16 DEBUG> checking for globusrun-ws    = /usr/local/globus-4.0.5/bin/globusrun-ws
      2008/03/11 14:21:16 DEBUG> checking for grid-proxy-info = /usr/local/globus-4.0.5/bin/grid-proxy-info
      2008/03/11 14:21:16 DEBUG> checking for globus-url-copy = /usr/local/globus-4.0.5/bin/globus-url-copy
      2008/03/11 14:21:16 DEBUG> checking for grid-proxy-init = /usr/local/globus-4.0.5/bin/grid-proxy-init
      2008/03/11 14:21:16 DEBUG> checking for gsiscp          = /usr/local/globus-4.0.5/bin/gsiscp
      2008/03/11 14:21:16 DEBUG> checking for gsissh          = /usr/local/globus-4.0.5/bin/gsissh
      
      2008/03/11 14:21:16 DEBUG> load of module DBI succesfull
      2008/03/11 14:21:16 DEBUG> connect to DBI:mysql:database=eah;host=buran.aei.mpg.de;port=24999 ...
      2008/03/11 14:21:16 DEBUG> connected!
      2008/03/11 14:21:16 DEBUG> disconnect from mysql eah robert@buran.aei.mpg.de:24999
      
      2008/03/11 14:21:16 DEBUG> DN = "/O=GermanGrid/OU=AEI/CN=Robert Engel"
      						
    • The first section searches PATH for the Globus client tools. Some of them, such as globusrun-ws and grid-proxy-info, are mandatory; the others are optional. If you receive one or more ERRORs at this point you should install Globus Toolkit 4.x or adjust your PATH variable.
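    • The PATH search described above can be reproduced by hand. The following is a minimal sketch, not the code of check.pl. The tool names are taken from the output shown above; the split into MANDATORY and OPTIONAL beyond globusrun-ws and grid-proxy-info is an assumption.

```python
import shutil

# Tool names taken from the check.pl output above. Only globusrun-ws and
# grid-proxy-info are named as mandatory in the text; treating all remaining
# tools as optional is an assumption made for this sketch.
MANDATORY = ["globusrun-ws", "grid-proxy-info"]
OPTIONAL = ["globus-url-copy", "grid-proxy-init", "gsiscp", "gsissh"]


def find_tools(names):
    """Map each tool name to its absolute path on PATH, or None if absent."""
    return {name: shutil.which(name) for name in names}


def missing_tools(names):
    """Return the tools from `names` that cannot be found on PATH."""
    return [name for name, path in find_tools(names).items() if path is None]


if __name__ == "__main__":
    for name, path in find_tools(MANDATORY + OPTIONAL).items():
        print(f"checking for {name:<16} = {path or 'NOT FOUND'}")
    if missing_tools(MANDATORY):
        print("ERROR: install Globus Toolkit 4.x or adjust PATH")
```

    • A non-empty result from missing_tools(MANDATORY) corresponds to the ERROR case described above.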
    • The following section tries to connect to the GEO600 MySQL database. An ERROR at this point might indicate that an outbound connection to buran.aei.mpg.de:24999 could not be established. In this case a simple telnet buran.aei.mpg.de 24999 will help to verify this assumption. If telnet fails as well, you should ask your site administrator to open access to buran.aei.mpg.de:24999.
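    • Where telnet is not installed, the same reachability test can be done with a plain TCP connection attempt. This is a minimal sketch; the function name can_connect and the default timeout are illustrative choices, not part of GEO600.

```python
import socket


def can_connect(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host name and opens the socket;
        # the with-block closes it again immediately.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False
```

    • Calling can_connect("buran.aei.mpg.de", 24999) from the submission host returns False if the outbound connection is blocked or the host cannot be resolved. Like telnet, it only shows that the port is reachable, not that the database login works.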
    • The last section checks the existence and validity of your grid proxy. Upon receiving an error, the program will try to call grid-proxy-init to create a valid proxy. If you receive an ERROR at this point you should consider installing your grid certificate on the submission host in ~/.globus.
    • Testing the Output Storage Resource
    • The output host is a remote storage location used for staging out the stdout, stderr and log files after the GEO600 application has exited. You must be able to access the machine and directory you specified in output.conf using gridftp. To test the output host you can execute:
    • robert@buran:~/GEO600-devel/main/scripts> perl check.pl -output-host
      2008/03/11 16:14:42 DEBUG> -- checking for external programs ------------------------------------
      2008/03/11 16:14:42 DEBUG> checking for globusrun-ws    = /usr/local/globus-4.0.5/bin/globusrun-ws
      2008/03/11 16:14:42 DEBUG> checking for gsissh          = /usr/local/globus-4.0.5/bin/gsissh
      
       HOST                                        load[5min]    permissions    usage[G]  access
      ------------------------------------------------------------------------------------------------------
       buran.aei.mpg.de                                  1.20             OK         0.3  public
      
      2008/03/11 16:14:42 DEBUG>  globus_url_copy gsiftp://buran.aei.mpg.de/etc/hosts file:///home/robert/GEO600-devel/log/test.gsiftp
      						
    • The first section searches PATH for gsissh and globusrun-ws. By default gsissh is used to check the status of the storage resource. If you specified the option GT4 in your output configuration file, globusrun-ws is used instead. If you receive an ERROR regarding permissions, the directory you specified either does not exist or you do not have write access to it.
    • The last section checks the status of the gridftp service by transferring a well-known file from the output storage location to your local host.
    • Testing Remote Storage Resources
    • Remote Storage Resources are used by GEO600 to store archives of the working directory of a job after the application has exited. The archives are created locally on the execution host at the end of the job run and are staged out by Globus to the storage resource. To test the storage resources you can execute:
    • robert@buran:~/GEO600-devel/main/scripts> perl check.pl -storage-host
      2008/03/11 16:38:09 DEBUG> -- checking for external programs ------------------------------------
      2008/03/11 16:38:09 DEBUG> checking for globusrun-ws    = /usr/local/globus-4.0.5/bin/globusrun-ws
      2008/03/11 16:38:09 DEBUG> checking for gsissh          = /usr/local/globus-4.0.5/bin/gsissh
      2008/03/11 16:38:09 DEBUG> -- checking for external programs ------------------------------------
      2008/03/11 16:38:09 DEBUG> checking for globusrun-ws    = /usr/local/globus-4.0.5/bin/globusrun-ws
      2008/03/11 16:38:09 DEBUG> checking for gsissh          = /usr/local/globus-4.0.5/bin/gsissh
      
       HOST                                        load[5min]    permissions    usage[G]  access
      ------------------------------------------------------------------------------------------------------
       a01.hlrb2.lrz-muenchen.de                       245.82             OK         0.0  private
       arminius-grid.uni-paderborn.de                    0.09             OK         5.5  private
       astrodata09.gac-grid.org                          2.00             OK         6.1  public
       damiana.aei.mpg.de                                0.65             OK         0.1  private
       dgrid-globus.rz.rwth-aachen.de                    0.01             OK        17.6  private
       gate01.aglt2.org                                 10.16             OK         0.0  private
       gcwn60.d-grid.uni-hannover.de                     0.18             OK        80.4  private
       gramd1.d-grid.uni-hannover.de                     0.00             OK        80.4  private
       grid3.aset.psu.edu                                0.00             NO         0.0  private
       gridgk01.racf.bnl.gov                             0.00             NO         0.0  private
       gt4-fzk.gridka.de                                 0.07             OK        95.3  private
       hector.zih.tu-dresden.de                         18.03             OK         2.1  private
       hydra.ari.uni-heidelberg.de                       3.86             OK         0.0  private
       iwrgt4.fzk.de                                     0.52             OK        10.6  private
       juggle-glob.fz-juelich.de                         0.04             OK        46.3  private
       juggle-inter.fz-juelich.de                        0.02             OK        46.3  private
       lx32ia1.lrz-muenchen.de                           0.00             NO         0.0  private
       lx64ia2.lrz-muenchen.de                           0.00             NO         0.0  private
       mardschana.zib.de                                 0.21             OK        17.5  private
       medigrid-srv.gwdg.de                              0.12             OK        25.4  private
       nest.phys.uwm.edu                                 2.09             OK        14.8  private
       osg.rcac.purdue.edu                               2.83             OK         0.0  private
       othello.zih.tu-dresden.de                        17.99             OK         3.4  private
       srvgrid01.offis.uni-oldenburg.de                  0.00             OK        30.6  private
       udo-gt01.grid.uni-dortmund.de                     1.42             OK        29.2  private
      						
    • Depending on the number of storage resources listed in storage.conf the status check will take some time. If permissions read NO for one or more resources, it is recommended to check these resources by hand.
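    • With many storage resources it can help to filter the status table for the rows that need attention. The sketch below parses the table layout shown above (five whitespace-separated columns) and lists the hosts whose permissions read NO. It assumes the output of check.pl has been captured as text; the names HostStatus, parse_status_table and hosts_without_permissions are hypothetical.

```python
from typing import NamedTuple


class HostStatus(NamedTuple):
    host: str
    load_5min: float
    permissions: str  # "OK" or "NO"
    usage: float      # value of the usage[G] column
    access: str       # "public" or "private"


def parse_status_table(text):
    """Extract the host rows from the status table printed by check.pl."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        # Skip the header, the separator, DEBUG lines and blank lines.
        if len(fields) != 5 or fields[2] not in ("OK", "NO"):
            continue
        try:
            rows.append(HostStatus(fields[0], float(fields[1]), fields[2],
                                   float(fields[3]), fields[4]))
        except ValueError:
            continue  # numeric columns did not parse; not a host row
    return rows


def hosts_without_permissions(text):
    """Return the hosts whose permissions column reads NO."""
    return [row.host for row in parse_status_table(text)
            if row.permissions == "NO"]


if __name__ == "__main__":
    sample = """\
 HOST                                        load[5min]    permissions    usage[G]  access
 gate01.aglt2.org                                 10.16             OK         0.0  private
 grid3.aset.psu.edu                                0.00             NO         0.0  private
"""
    print(hosts_without_permissions(sample))
```

    • Each host reported this way can then be checked individually.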
    • To perform the status check on a single resource only, you can use the host flag to specify the resource at the command line:
    • robert@buran:~/GEO600-devel/main/scripts> perl check.pl -storage-host -host=astrodata09.gac-grid.org
      2008/03/13 14:25:31 DEBUG> -- checking for external programs ------------------------------------
      2008/03/13 14:25:31 DEBUG> checking for globusrun-ws    = /usr/local/globus-4.0.5/bin/globusrun-ws
      2008/03/13 14:25:31 DEBUG> checking for gsissh          = /usr/local/globus-4.0.5/bin/gsissh
      2008/03/13 14:25:31 DEBUG> -- checking for external programs ------------------------------------
      2008/03/13 14:25:31 DEBUG> checking for globusrun-ws    = /usr/local/globus-4.0.5/bin/globusrun-ws
      2008/03/13 14:25:31 DEBUG> checking for gsissh          = /usr/local/globus-4.0.5/bin/gsissh
      
       HOST                                        load[5min]    permissions    usage[G]  access
      ------------------------------------------------------------------------------------------------------
       astrodata09.gac-grid.org                          2.07             OK         6.1  public
      						
    • Testing the Execution Host
    • The execution host needs a fully operational installation of GEO600 and must be able to open outbound connections to einstein.phys.uwm.edu:80 and buran.aei.mpg.de:24999. Whether these requirements are met can be tested using check.pl:
    • robert@buran:~/GEO600-devel/main/scripts> perl check.pl -execution-host
      2008/03/13 15:30:04 DEBUG> load of module DBI succesfull
      2008/03/13 15:30:04 DEBUG> connect to DBI:mysql:database=eah;host=buran.aei.mpg.de;port=24999 ...
      2008/03/13 15:30:04 DEBUG> connected!
      2008/03/13 15:30:04 DEBUG> disconnect from mysql eah robert@buran.aei.mpg.de:24999
      
      2008/03/13 15:30:05 INFO>  download http://einstein.phys.uwm.edu/
      2008/03/13 15:30:05 DEBUG> remove index.html
      						
    • First, check.pl tries to load the Perl DBI module, which is responsible for interfacing with the MySQL database. An ERROR reported at this point indicates that the build of DBI from source during the deployment phase failed. It is also possible that the Perl version used during deployment differs from the Perl version used in the call above.
    • In the next step check.pl tries to establish a connection to the database, followed by an immediate disconnect. An ERROR upon establishing a connection to the MySQL database at buran.aei.mpg.de:24999 indicates that the outbound connection has been blocked. To verify that the local site is indeed blocking your attempt to connect, you can also use telnet buran.aei.mpg.de 24999. If telnet fails as well, ask the site administrator to open access to buran.aei.mpg.de:24999. In rare cases the database itself is blocking your connection:
    • DBI connect('database=eah;host=buran.aei.mpg.de;port=24999','robert',...) failed: 
      Host 'gcwn07.d-grid.uni-hannover.de' is blocked because of many connection errors; 
      unblock with 'mysqladmin flush-hosts' 
      						
    • In this case the database administrator can unblock the host by executing the following as root@buran.aei.mpg.de:
    • mysqladmin -p flush-hosts
      						
    • Finally, check.pl attempts to download index.html from http://einstein.phys.uwm.edu. This test confirms whether LIGO data packets can be downloaded from the web server. An ERROR indicates that outbound HTTP connections are blocked by the local firewall, and it is recommended to contact the local resource administrator for details.
    • If outbound HTTP connections can only be established through a web proxy, you can configure BOINC to use the proxy by defining appropriate settings in grid-run.conf:
    • run sigrid.zimt.uni-siegen.de {
      	GEO600_HOME      = GEO600-devel
      	FT               = PBS
      	FT_FORK          = YES
      	TIMEOUT          = 0.10:00:00
      	JOBS_RUNNING_MAX = 1
      	JOBS_QUEUE_MAX   = 1
      	PRESTAGE         = GLOBUS
      	ENV              = ( HTTP_PROXY = 192.168.5.25:3128 )
      }
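    • Before changing grid-run.conf it can be useful to confirm by hand that the download works through the proxy but not without it. The sketch below repeats the download test with an optional proxy given as host:port, the same form as the HTTP_PROXY value above. The function name can_download is hypothetical and the code is independent of check.pl and BOINC.

```python
import http.client
import urllib.request


def can_download(url, proxy=None, timeout=10.0):
    """Return True if an HTTP GET of `url` succeeds, optionally via a proxy.

    `proxy` is a "host:port" string, for example "192.168.5.25:3128".
    """
    # An empty mapping disables any proxy picked up from the environment,
    # so the direct and the proxied case are tested separately.
    proxies = {"http": f"http://{proxy}"} if proxy else {}
    opener = urllib.request.build_opener(urllib.request.ProxyHandler(proxies))
    try:
        with opener.open(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except (OSError, http.client.HTTPException):
        # URLError, HTTPError and socket timeouts are all OSError subclasses.
        return False
```

    • If can_download("http://einstein.phys.uwm.edu/") fails on the execution host while the same call with proxy="192.168.5.25:3128" succeeds, the ENV setting shown above is the likely fix. The proxy address is the one from the example and must be replaced with the one valid at your site.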
      						
    • Testing Job Submission
    • Here we assume that you are logged into the submission host. The purpose of this section is to teach you how to determine the factory type of a grid resource. This factory type will then be used to test a simple job submission to a grid resource without using GEO600.
    • A typical Globus installation provides a legacy interface and a web service interface for job submissions. The legacy interface is part of Globus Toolkit 2.x and above and is called GRAM. The web service interface is called GRAMWS and is part of Globus Toolkit 4.x and above. GEO600 relies on GRAMWS for job submissions due to its simplicity and extended file staging capabilities in comparison with GRAM.
    • Both GRAM and GRAMWS can execute a job directly using Fork or route it into a local queuing system such as PBS, SGE, CCS, LSF or Condor. Which service is used by default depends on the configuration of Globus on the grid resource. In general the grid user has to specify the so-called factory type during job submission.
    • The factory types supported by a grid resource are usually only documented and cannot be queried directly. For convenience one can use check.pl to detect the supported factory types on a grid resource:
    • robert@buran:~/GEO600-devel/main/scripts> perl check.pl -detect-FT -host=buran.aei.mpg.de
      2008/03/13 17:05:45 DEBUG> Fork         = OK
      2008/03/13 17:05:45 DEBUG> PBS          = NO
      2008/03/13 17:05:45 DEBUG> SGE          = NO
      2008/03/13 17:05:46 DEBUG> CCS          = NO
      2008/03/13 17:05:46 DEBUG> LSF          = NO
      2008/03/13 17:05:46 DEBUG> Condor       = NO
      						
    • The grid workstation above supports only Fork, whereas the grid cluster below supports PBS in addition to Fork:
    • robert@buran:~/GEO600-devel/main/scripts> perl check.pl -detect-FT -host=udo-gt01.grid.uni-dortmund.de
      2008/03/20 11:31:44 DEBUG> Fork         = OK
      2008/03/20 11:31:44 DEBUG> PBS          = OK
      2008/03/20 11:31:45 DEBUG> SGE          = NO
      2008/03/20 11:31:45 DEBUG> CCS          = NO
      2008/03/20 11:31:45 DEBUG> LSF          = NO
      2008/03/20 11:31:46 DEBUG> Condor       = NO
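    • Once the factory types are known, a simple test job can be submitted without GEO600. The sketch below extracts the supported factory types from captured check.pl -detect-FT output and builds a globusrun-ws command line for one of them. The invocation (-submit, -F for the factory contact, -Ft for the factory type, -c for the command) is an assumption based on the Globus Toolkit 4 client and not taken from this document; depending on the site, the contact may have to be a full factory URL instead of a bare host name. The function names and the test executable /bin/hostname are hypothetical.

```python
import re

# Matches lines such as "2008/03/20 11:31:44 DEBUG> PBS          = OK".
FT_LINE = re.compile(r"DEBUG>\s+(\S+)\s+=\s+(OK|NO)\s*$")


def supported_factory_types(output):
    """Return the factory types reported as OK by check.pl -detect-FT."""
    supported = []
    for line in output.splitlines():
        match = FT_LINE.search(line)
        if match and match.group(2) == "OK":
            supported.append(match.group(1))
    return supported


def submit_command(host, factory_type, executable="/bin/hostname"):
    """Build a globusrun-ws test submission (assumed invocation, see above)."""
    return ["globusrun-ws", "-submit",
            "-F", host,
            "-Ft", factory_type,
            "-c", executable]


if __name__ == "__main__":
    sample = """\
2008/03/20 11:31:44 DEBUG> Fork         = OK
2008/03/20 11:31:44 DEBUG> PBS          = OK
2008/03/20 11:31:45 DEBUG> SGE          = NO
"""
    for factory_type in supported_factory_types(sample):
        print(" ".join(submit_command("udo-gt01.grid.uni-dortmund.de",
                                      factory_type)))
```

    • The printed command lines can be run on the submission host with a valid grid proxy. A job that completes for a given factory type indicates that this type can be used as the FT setting for the resource.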
      						
    • Frequently Asked Questions
    • Robert Engel, Max-Planck Institut for Gravitational Physics