BitTest

Look for stuck channels or bits.
BitTest monitors specified channels for bits that remain in one state and for channels that repeat a value twice or more in succession. The channels to be monitored may produce either integer or floating point data, but the bit checking facility functions only for the integer channels. BitTest accumulates statistics on all requested channels for a fixed number of frames, until a specified time or until an external interrupt is received. Any detected problems are reported at the end of the period and the accumulation cycle is restarted.
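The per-channel accumulation described above can be illustrated with a minimal sketch. It is not the BitTest implementation: the function name `accumulate`, the 16-bit default word width, and the convention of counting a repetition as a value that immediately follows an identical value are all assumptions made for the example.

```python
def accumulate(samples, width=16):
    """Accumulate stuck-bit and repetition statistics for one integer channel.

    `width` is an assumed word size in bits; the source does not state one.
    """
    full = (1 << width) - 1
    always_set = full      # bits that were 1 in every word seen so far
    always_zero = full     # bits that were 0 in every word seen so far
    max_repeat = 0         # longest run of immediate repeats of a value
    run = 0
    last = None
    n_words = 0
    for sample in samples:
        word = sample & full
        always_set &= word
        always_zero &= ~word & full
        run = run + 1 if word == last else 0
        max_repeat = max(max_repeat, run)
        last = word
        n_words += 1
    if n_words == 0:
        # No data in this period: report no stuck bits rather than all bits.
        always_set = always_zero = 0
    return {
        "n_words": n_words,
        "always_set": always_set,
        "always_zero": always_zero,
        "max_repeat": max_repeat,
    }
```

For example, `accumulate([0x8001, 0x8003, 0x8003, 0x8005])` reports bits 0 and 15 as always set and one immediate repetition.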

The BitTest error reports are produced in two sections with each section written to a separate file. The first section lists the channels found to be missing or to have data errors. The second section gives detailed statistics on each channel.

Triggers may be generated for each erroneous channel found. One trigger is generated for each flagged channel in each statistics accumulation period. The generation of all triggers may be enabled or disabled with a command line argument.

The BitTest Configuration File

The BitTest configuration file specifies the channels to be monitored and parameters affecting the generation of triggers for each channel. Each line of the configuration file specifies the parameters for a single channel and contains the following fields:

<channel-name> <bit-range> <max-repetition>

The meaning of each field is as follows:

<channel-name> Name of the channel to be monitored.
<bit-range> Mask of bits to be monitored.
<max-repetition> Maximum number of times a value may be repeated.

The bits in the mask need not be adjacent. If the bit mask is zero, an automatic error detection algorithm is used instead. This algorithm assumes that any legitimately stuck bits form a group of adjacent high-order bits. A trigger is therefore generated for any bit that is stuck in one state, is not the most significant bit, and is not adjacent to a higher-order stuck bit. This gives the desired result when the channel measures a signal that varies smoothly over a given range, a condition satisfied by most LIGO signals.
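The automatic rule can be written as a short bit-mask expression. The sketch below is only one reading of the rule as stated in the text. It treats a bit as stuck if it is always set or always zero, and it assumes a 16-bit word; the function name `auto_flagged_bits` and both assumptions are illustrative, not taken from BitTest.

```python
def auto_flagged_bits(always_set, always_zero, width=16):
    """Return the mask of stuck bits that the automatic rule would flag.

    A stuck bit is flagged when it is not the most significant bit and the
    next higher-order bit is not stuck. `width` is an assumed word size.
    """
    full = (1 << width) - 1
    msb = 1 << (width - 1)
    stuck = (always_set | always_zero) & full
    # Bit k of (stuck >> 1) is set when bit k+1 of `stuck` is set.
    has_stuck_neighbour_above = stuck >> 1
    return stuck & ~has_stuck_neighbour_above & ~msb & full
```

With this reading, a channel whose top four bits never change (`always_zero = 0xF000`) is not flagged, while an isolated stuck bit such as `0x0004` is.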

Repetition count triggers may be disabled by setting the maximum repetition count to zero. The configuration file is reread each time a SIGUSR1 signal is received. This means that the monitor process need not be restarted in order to change the configuration. Instead, the configuration file can be modified, and the process reconfigured with a "kill -USR1 <pid>" command.
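As an illustration of the three-field line format, here is a minimal parser sketch. The source does not say how the mask is written, so the sketch assumes whitespace-separated fields and accepts decimal or `0x`-prefixed hexadecimal numbers. The names `ChannelConfig` and `parse_config` and the channel names in the usage note are hypothetical.

```python
from typing import NamedTuple, Optional


class ChannelConfig(NamedTuple):
    name: str
    mask: Optional[int]            # None: use automatic stuck-bit detection
    max_repetition: Optional[int]  # None: repetition triggers disabled


def parse_config(text):
    """Parse configuration text, one channel per non-empty line."""
    configs = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        name, mask_field, rep_field = line.split()
        mask = int(mask_field, 0)      # assumed: decimal or 0x... hex
        repetition = int(rep_field, 0)
        # A zero mask selects automatic detection; a zero count disables
        # repetition triggers, as described in the text above.
        configs.append(ChannelConfig(name, mask or None, repetition or None))
    return configs
```

For instance, `parse_config("X1:EXAMPLE-CHAN_A 0x00ff 10\nX1:EXAMPLE-CHAN_B 0 0\n")` yields one channel with an explicit mask and one that relies on automatic detection with repetition triggers disabled.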

Running BitTest

The syntax of the BitTest command is as follows:

BitTest [-partition <pname>] [-infile <file>] [-cfile <config>] \
        [-ofile <out-file>] [-reset <nsec>] [-synch hh[:mm[:ss]]] \
        [-debug <dbg-level>] [+trig[ger]] [-toc]
     

Where the arguments have the following meaning:

<pname> Shared memory partition name with data to be read
<file> Input frame file(s) (exclusive of <pname>)
<dbg-level> Debug level
<config> Configuration file name.
<nsec> Accumulation time in seconds
hh:mm:ss Time (in current UTC day) to generate first report
<out-file> Root output file name (defaults to "BitTest.junk")

The partition name is mutually exclusive with the input frame file name. If both are specified, data are read from the specified shared memory partition. The debug level defaults to 0 (no debug messages). Any other value for <dbg-level> will cause debugging messages to be printed to cout/cerr. Reports are produced at hh:mm:ss and every <nsec> seconds after that. If <nsec> is not specified, BitTest will continue to accumulate statistics until it catches either a SIGTERM or SIGUSR1 signal. If hh:mm:ss is not specified, BitTest will produce a report after <nsec> frames have been received.
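The report schedule (first report at hh:mm:ss in the current UTC day, then every <nsec> seconds) can be sketched as follows. The function names are hypothetical, and the sketch does not try to model what BitTest does when the requested time has already passed, since the text does not say.

```python
from datetime import datetime, timedelta, timezone


def parse_synch(text):
    """Convert 'hh', 'hh:mm' or 'hh:mm:ss' into an offset from UTC midnight."""
    parts = [int(p) for p in text.split(":")]
    if not 1 <= len(parts) <= 3:
        raise ValueError("expected hh[:mm[:ss]]")
    hh, mm, ss = parts + [0] * (3 - len(parts))
    return timedelta(hours=hh, minutes=mm, seconds=ss)


def report_times(now, synch, nsec, count):
    """List the first `count` report times for the UTC day containing `now`."""
    midnight = now.astimezone(timezone.utc).replace(
        hour=0, minute=0, second=0, microsecond=0
    )
    first = midnight + parse_synch(synch)
    return [first + timedelta(seconds=k * nsec) for k in range(count)]
```

With a synchronization time of `12` and an accumulation time of 1200 seconds, the first three reports fall at 12:00:00, 12:20:00 and 12:40:00 UTC.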

Modifying the online configuration

The BitTest configuration file can be modified while BitTest is running. This is accomplished by editing the current BitTest configuration file, usually ~ops/pars/BitTest.conf. Once the file has been modified, BitTest can be made to read in the new configuration by signaling the running process with SIGUSR1. When the signal is caught by the process, BitTest will write out the status files with whatever statistics have already been collected, read the configuration file and restart processing. The SIGUSR1 signal is delivered with the following command:

kill -USR1 <pid>

where <pid> is the ID of the BitTest process. The process ID(s) can be found with, for example:

ps -eopid,comm | grep BitTest
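The reconfiguration sequence described above (catch SIGUSR1, write out the statistics collected so far, reread the configuration, restart processing) follows a common pattern: the signal handler only sets a flag and the main loop does the work. The sketch below shows that pattern in Python as an illustration. It is not BitTest code, and `ReloadFlag`, `service_reload` and the two callbacks are hypothetical names.

```python
import signal


class ReloadFlag:
    """Records that a SIGUSR1 arrived; the main loop acts on it later."""

    def __init__(self):
        self.pending = False

    def on_sigusr1(self, signum, frame):
        # Keep the handler trivial: just remember the request.
        self.pending = True

    def install(self):
        # Must be called from the main thread on a POSIX system.
        signal.signal(signal.SIGUSR1, self.on_sigusr1)


def service_reload(flag, write_reports, read_config):
    """If a reload was requested, flush reports and return the new config.

    Returns None when no reload is pending.
    """
    if not flag.pending:
        return None
    flag.pending = False
    write_reports()        # write status files with statistics so far
    return read_config()   # reread the configuration file
```

A monitor loop would call `install()` once at start-up and `service_reload(...)` once per processed frame, resetting its accumulators whenever a new configuration is returned.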

BitTest Output

Triggers

BitTest generates a trigger for each flagged channel at the end of each statistics accumulation period (nominally 20 minutes). The trigger has a trigger ID of BitTest and a sub-ID of the channel name. The trigger user data contains the following double-precision floating-point fields:

  1. Number of words read
  2. Maximum repetition count
  3. Number of readout errors
  4. Mask of bits always set
  5. Mask of bits always zero
  6. Number of overflows/underflows

Triggers will not be produced unless the "+trig[ger]" option is specified on the command line.
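To make the six-field layout concrete, the sketch below packs the fields into six consecutive doubles, in the order listed above, and unpacks them again. The little-endian byte order, the key names and the function names are assumptions for illustration; the actual trigger encoding is not specified here. Bit masks of up to 53 bits survive conversion to a double exactly.

```python
import struct

# Field order follows the numbered list above; the key names are hypothetical.
FIELDS = (
    "n_words",
    "max_repeat",
    "n_read_errors",
    "always_set",
    "always_zero",
    "n_overflows",
)


def pack_trigger_data(stats):
    """Pack the six statistics as little-endian doubles (assumed byte order)."""
    return struct.pack("<6d", *(float(stats[key]) for key in FIELDS))


def unpack_trigger_data(payload):
    """Inverse of pack_trigger_data; values come back as floats."""
    return dict(zip(FIELDS, struct.unpack("<6d", payload)))
```

A reader of the trigger data would convert the two mask fields back to integers, for example with `int(fields["always_set"])`, before printing them in hexadecimal.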

Alarms

BitTest generates an alarm for each channel with a bit error or data overflow error at the end of each statistics accumulation period. Bit errors are indicated by a Bit_is_Stuck alarm and data overflows are indicated by an Overflow alarm. The alarms' short descriptions (displayed by holding your pointer over the severity ball) give the channel names and, for stuck bit errors, the hex masks of the stuck bits.

In general, BitTest alarms do not indicate serious problems. Nevertheless, channels with consistent alarms should be investigated when time permits.

Reports and Other Output

The BitTest reports are divided into two sections. The first section is a list of all channels that were found to have errors. This list is stored in the file named "<out-file>.Errors", where <out-file> is the root file name specified on the command line. The error list is further divided into categories containing channels with the following errors:

The second section is a table with statistics and status information for all configured channels. The table is written to a file named "<out-file>.Statistics" and contains the following information for each channel:

Author:
J. Zweizig
Version:
2.0; Modified April 11, 2001
