Data Analysis and Computing
Overview
The computational load of Advanced LIGO data analysis is greater than
that of initial LIGO because the detector is sensitive over a broader frequency range.
The features of initial LIGO and Advanced LIGO sensitivities that impact
astrophysical data analysis are summarized in Table 1. The frequency
at optimum sensitivity is fmin=130 Hz in initial LIGO and roughly
at this same frequency (dependent upon the signal tuning) for Advanced
LIGO. However, the Advanced LIGO optimum sensitivity will be roughly a
factor 10 better. The enhanced frequency range for Advanced LIGO means
that sources whose characteristic frequency of emission varies with time
will be observable in the detection band for longer periods. Combined,
these enhancements provide both greater range and longer in-band dwell times.
These improvements imply that the rate of detectable events with Advanced
LIGO will be orders of magnitude greater than initial LIGO. Projected event
rate increases, estimated through scaling laws and anticipated signal
signatures, are discussed in Section 2, Reference Design Baseline Definition.
Table 1 Key parameters of the Advanced LIGO reference design that affect the data analysis system
| Parameter | AdvLIGO Reference Design | Initial LIGO implementation | Comment |
| Effective Seismic Cutoff Frequency | fsei ~ 20 Hz | fsei ~ 40 Hz | Point at which h[fsei] = 10 h[fmin] |
| Frequency at Optimum Sensitivity | fmin ~ 100-130 Hz | fmin ~ 130 Hz | Minimum of h[f] for Advanced LIGO is broader |
| h[fmin], Hz^(-1/2) | 2-3 x 10^-24 (tuning dependent) | 3 x 10^-23 | |
| Data sample word length [bytes] for key channels | 4 | 2 | Determined by increased dynamic range |
| Maximum sample rate, s/s | 16384 8192 | 16384 8192 | Upper cutoff fshot is well below fNyquist for both initial LIGO and Advanced LIGO |
The impact of exploiting the increased source detection ability on data analysis
strategies and the initial LIGO Data Analysis System depends on the source type being
considered and will be discussed by source type below. Most presently envisioned search
and analysis strategies involve spectral-domain analysis and optimal filtering using
template filter banks calculated either from physics principles or parametric representations
of phenomenological models. The primary channel that is useful for astrophysics is the
instrumental output that is proportional to strain. All the other thousands of channels
in initial LIGO and Advanced LIGO are used to validate instrumental behavior. It is also
expected that relatively few channels (<10) will prove useful in producing improved estimates
of GW strain. This would be done by removing instrumental cross-channel couplings, etc. either
with linear regression techniques in the time domain (Kalman filtering) or in the spectral
domain (cross-spectrum correlation). We assume here that signal conditioning will not be a
driver for LIGO Data Analysis System (LDAS) upgrades. This is certainly the case for the initial
LIGO LDAS and there is no reason to expect this to change.
Functional Requirements
The most significant new development in distributed computing that has occurred during the
commissioning and operation of LIGO I has been the emergence of the concept of the Computational
Grid. LIGO Laboratory and the LSC are active participants in several NSF-sponsored initiatives,
with a goal to adopt grid computing methods for the analysis of LIGO data.
The construction of Advanced LIGO offers an opportunity to begin with a new design for the
Advanced LIGO Data Analysis and Computing subsystem that takes full advantage of the grid
paradigm at the time when Advanced LIGO construction starts. This proposal addresses the LIGO
Laboratory Tier 1 components of LIGO data analysis and computing. At appropriate times in the
future, the Laboratory and the LSC will respond to opportunities for funding that will be needed
in order to also enhance the Tier 2 facilities at the collaboration universities. Such enhancements
will include an increase in the number of Tier 2 university centers serving the LIGO data analysis
community.
Computational Upgrades
For the classes of sources considered (transient "bursts", compact object inspirals, stochastic
backgrounds, and continuous-wave sources), the continuous-wave and binary inspirals place the
greatest demands on the computational requirements. Optimal searches for periodic sources with
unknown EM counterparts (the so-called blind all-sky search) represent computational challenges
that require O[10^15 or more FLOPS] and will likely be beyond the capacity of the
collaboration to analyze using LIGO Tier 1 and Tier 2 resources. Alternative techniques have
been developed that lend themselves to a distributed grid-based deployment. Research in this
area has been ongoing during initial LIGO and will continue. The Tier 1 center upgrade will
not be specifically targeted to this class of search, since it is one that will need to be
addressed on a much larger scale within the national Grid infrastructure.
Advanced LIGO will search for compact object binary inspiral events using the same general
technique that will be employed in initial LIGO: a massive filter bank processing in parallel
the same data stream using optimal filtering techniques in the frequency domain. The extension
to lower frequencies of observation allowed by Advanced LIGO means that the duration of observation
of the inspiral is significantly longer, leading to a concomitant increase in the computing power
required. Counterbalancing this trend, however, are emergent theoretical improvements in
techniques applying hierarchical divide-and-conquer methods to the search algorithms. Improvements
in search efficiency as high as 100X should be possible by optimal implementation of these
techniques. While not yet demonstrated with actual data, it is reasonable to expect that
algorithmic improvements will become available by the time of Advanced LIGO turn-on.
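The frequency-domain optimal filtering described above can be illustrated with a minimal sketch. It is a toy under stated assumptions and not LDAS code: the noise is taken to be white (so no weighting by the noise spectrum is applied), the template is a hypothetical linear chirp rather than an inspiral waveform, and the names `fft`, `matched_filter`, `toy_chirp` and `demo` are invented for this example. A small radix-2 FFT is included so that the block needs only the Python standard library.

```python
import cmath
import math
import random

def fft(x, inverse=False):
    """Recursive radix-2 FFT (unnormalised); len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2], inverse)
    odd = fft(x[1::2], inverse)
    sign = 1 if inverse else -1
    out = [0j] * n
    for k in range(n // 2):
        tw = cmath.exp(sign * 2j * math.pi * k / n) * odd[k]
        out[k] = even[k] + tw
        out[k + n // 2] = even[k] - tw
    return out

def matched_filter(data, template):
    """Circular cross-correlation of data with template via the frequency domain.

    Assumes white noise; a real search would also divide by the noise spectrum.
    """
    n = len(data)
    if n != len(template) or n & (n - 1):
        raise ValueError("inputs must have equal power-of-two length")
    d = fft([complex(v) for v in data])
    t = fft([complex(v) for v in template])
    prod = [dk * tk.conjugate() for dk, tk in zip(d, t)]
    corr = fft(prod, inverse=True)
    return [c.real / n for c in corr]

def toy_chirp(length, f0=0.05, f1=0.25):
    """Hypothetical linear chirp; frequencies are in cycles per sample."""
    k = (f1 - f0) / length
    return [math.sin(2 * math.pi * (f0 * j + 0.5 * k * j * j)) for j in range(length)]

def demo(n=1024, offset=300, sigma=0.1, seed=1):
    """Inject a chirp into white noise and return the lag of the filter peak."""
    rng = random.Random(seed)
    chirp = toy_chirp(64)
    template = chirp + [0.0] * (n - len(chirp))  # zero-pad template to data length
    data = [rng.gauss(0.0, sigma) for _ in range(n)]
    for j, v in enumerate(chirp):
        data[offset + j] += v
    out = matched_filter(data, template)
    return max(range(n), key=lambda m: out[m])
```

Calling `demo()` returns the lag at which the filter output peaks, which should coincide with the injection offset. A filter bank repeats the multiply-and-inverse-transform step once per template while reusing the single forward transform of the data, which is why the template count and the FFT length dominate the cost.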
The number of distinct templates required in a search depends on many factors, but is dominated
by the low-frequency cutoff of the instrument sensitivity (since compact binaries spend more
orbital cycles at low frequencies) and the low-mass cutoff of the desired astrophysical search
space (since low-mass systems inspiral more slowly, and hence spend more cycles in the LIGO band).
Approximate scaling laws can be used, but in practice the precise number of templates depends on
the specifics of the LIGO noise curve and the template-placement algorithm.
Table 2 provides a comparison between relative computational costs for inspiral searches
down to 1 M☉/1 M☉ binary systems between initial LIGO and Advanced LIGO. The length of the chirp
sets the scale of fast-Fourier transforms (FFTs) that are required for optimal filtering.
FFT computational cost scales as ~N log2(N). On the other hand, the greater duration of the
chirp provides more time to perform the longer calculation. Together a ~7X increase in signal
duration corresponds to a ~2X increase in computational cost. If one were to go to lower mass
systems, the computational costs will scale as (Mmin)^(-8/3). However, current stellar evolution
models predict that the minimum mass of a neutron star remnant is around 1 M☉. Extending the
template bank below this limit may be of interest in order to cover all plausible sources,
with a margin to allow for discoveries not predicted by current theories.
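To give a feel for the mass scaling quoted above, the following sketch evaluates the relative cost of lowering the minimum component mass, taking the power law with exponent -8/3 at face value. The function name `relative_cost` and the choice of a 1 solar mass reference are assumptions made for illustration; as noted earlier, the real template count also depends on the noise curve and the placement algorithm.

```python
def relative_cost(m_min, m_ref=1.0):
    """Cost relative to a search with low-mass cutoff m_ref (solar masses),
    assuming cost scales as M_min to the power -8/3."""
    return (m_min / m_ref) ** (-8.0 / 3.0)

# Illustrative cutoffs only; these values are not taken from the proposal.
for m in (1.0, 0.5, 0.2):
    print(f"M_min = {m:.1f} M_sun -> {relative_cost(m):.1f}x the reference cost")
```

Halving the cutoff from 1 to 0.5 solar masses raises the cost by a factor of 2^(8/3), a little over 6, which shows how quickly a safety margin below the expected minimum neutron star mass becomes expensive.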
When one or both of the binary components are spinning black holes, spin-orbit couplings
can significantly modulate the waveform. Exact theoretical templates for these waveforms
do not yet exist, but would involve several additional search parameters, increasing the
size of the template bank significantly. Buonanno, Chen, and Vallisneri have proposed
adopting instead a bank of approximate templates that uses heuristic waveform parameters
(not explicitly tied to the astrophysical properties of the system) to achieve reasonable
overlaps with various competing theoretical models. A two-parameter template family would
be only slightly larger (perhaps by a factor of 2) than the spinless parameter space, and
would have an effective fitting factor (overlap) of better than 90% with almost all proposed
double black hole binary signals. However, it would match black hole/neutron star signals
only at about the 80% level (i.e. 20% loss in signal-to-noise, or about 50% reduction in
event rate). Increasing the fitting factor to above 90% would require adding a third
parameter to the template family, at a significant increase (10X - 100X) in computational
cost compared to non-spinning systems.
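The link between fitting factor and event rate quoted above can be checked with a short calculation. The sketch assumes that sources are distributed uniformly in volume, so that the detectable rate scales as the cube of the range, and that the range scales linearly with the recovered signal-to-noise. The function name `event_rate_fraction` is hypothetical.

```python
def event_rate_fraction(fitting_factor):
    """Fraction of the ideal event rate retained for a given fitting factor,
    assuming rate scales as range cubed and range scales with recovered SNR."""
    return fitting_factor ** 3

for ff in (0.8, 0.9):
    loss = 1.0 - event_rate_fraction(ff)
    print(f"fitting factor {ff:.0%}: about {loss:.0%} of events lost")
```

Under these assumptions an 80% fitting factor retains 0.8^3 = 0.512 of the events, consistent with the roughly 50% reduction stated in the text, while 90% retains about 73%.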
At the same time, however, there is much room to improve computational methods to increase
signal-to-noise for fixed computational cost. An 80% fitting factor would be enough for the
first stage of a hierarchical search, which would go on to apply a restricted set of more
accurate templates to candidate events in order to achieve a near-optimal signal-to-noise
ratio. As a rough estimate, we assess a computational cost based on a flat search of a
template bank twice as large as is required for the spinless case, or ~200,000 templates.
Each observatory (Hanford, Livingston) has an on-site Linux cluster. The Hanford subsystem
of LDAS handles data from two interferometers and is designed to be twice as capable in
terms of CPU FLOPS as the one at Livingston (some components do not scale and are
essentially identical at both sites). The quantities appearing in Table 2 correspond
to the Hanford site operating with two interferometers.
Table 3 lists the main features of the parallel cluster at Hanford.
Table 2 Initial LIGO and Advanced LIGO analysis system requirements for compact object binary inspiral detection using Wiener filtering techniques. M = 1 M☉ provides a reference to indicate how quantities change with Mmin. Quantities were calculated using a spreadsheet model of the data flow for the inspiral detection analysis pipeline, and assume a 20 Hz start frequency for observation.
| Parameter | Advanced LIGO (LHO, 2 IFOs), 1 M☉/1 M☉ | Initial LIGO (LHO, 2 IFOs), 1 M☉/1 M☉ |
| Maximum template length, seconds | 280 | 44 |
| Maximum template length, bytes | 128 MB | 16 MB |
| Calculation of templates, FLOPS | ~4 GFLOPS | ~2 GFLOPS |
| Storage of templates, bytes | 32 TB | 2 TB |
| Wiener filtering analysis, FLOPS | 4970 GFLOPS | 440 GFLOPS |
Table 3 Initial LIGO and Advanced LIGO analysis system specifications for compact object binary inspiral detection using Wiener filtering techniques.
| Parameter | Advanced LIGO | Initial LIGO |
| Beowulf cluster size (number of nodes at LHO) | 256 | 96 |
| Memory per CPU, MB | 1024 | 512 |
| Disk per node, GB | 60 | 20 |
| GHz per node | >3 | 2.1 |
| Total computational power, GHz | >768 | 200 |
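The last row of Table 3 appears to be the product of the node count and the clock rate per node. The sketch below reproduces that arithmetic; reading "total computational power" as an aggregate clock rate is an interpretation of the table, and the function name `total_clock_ghz` is hypothetical.

```python
def total_clock_ghz(nodes, ghz_per_node):
    """Aggregate clock rate of a cluster: node count times GHz per node."""
    return nodes * ghz_per_node

# Advanced LIGO: the table gives ">3 GHz", so 3.0 yields a lower bound.
advanced = total_clock_ghz(256, 3.0)
# Initial LIGO: 96 nodes at 2.1 GHz, which the table rounds to 200.
initial = total_clock_ghz(96, 2.1)

print(f"Advanced LIGO: > {advanced:.0f} GHz")
print(f"Initial LIGO: about {initial:.0f} GHz")
print(f"Ratio: about {advanced / initial:.1f}x")
```

On this reading the Hanford cluster grows by a factor of a little under four in aggregate clock rate.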
The off-site computing facilities at Caltech support network analysis for follow-up
analyses requiring data from all three interferometers. In addition, the computational
facility will support the Tier 1 functions of data storage and retrieval. The
parallel Beowulf cluster at Caltech will also be upgraded to provide expanded search
and analysis capacity; it is estimated to require on the order of 512 nodes. The smaller
computational facility at MIT will be scaled up similarly.
Data Archival/Storage Upgrades
The Advanced LIGO acquisition system will generate a ~3X greater volume of data that
needs to be accommodated by the archive and on-line mass storage systems (as explained
in "Data Acquisition, Diagnostics, Network & Supervisory Control (DAQ)" section of the
proposal). At present it is not clear to what degree the additional data
associated with monitoring instrumental performance need to be accessed
by the collaboration for science and detector characterization functions. However,
experience to date with LIGO I has shown that any data that are acquired are required
to be archived indefinitely. We will use this same data model as a conservative estimate
for Advanced LIGO requirements. In this model, all data are acquired and stored for
several weeks on-line in a disk cache at the observatories. Then the data are staged
to tape media. Two copies of tapes are produced. One copy is held on-site for ~30 days.
The other copy is sent to Caltech where data reduction takes place in the form of keeping
only those channels that are required for data analysis on Reduced Data Sets (RDSs). The
target in initial LIGO will be a 10X reduction in raw data volume for the RDSs. We expect
~3X to come from lossless compression (both in hardware within the tape drives and
algorithmically in filters). Another ~3X will come from re-sampling and reduction in
the number of channels. The net result is a need to upgrade the permanent archive;
Advanced LIGO will require a ~1PB/yr archive capacity.
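The storage figures above can be put in perspective with some simple arithmetic. The sketch combines the two ~3X reduction factors and converts a ~1 PB/yr archive growth into an average sustained write rate. Decimal units (1 PB = 10^15 bytes) and a 365.25-day year are assumptions, and the function names are hypothetical.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600  # assumed length of a year in seconds

def combined_reduction(*factors):
    """Overall reduction obtained by applying independent reduction factors."""
    result = 1.0
    for f in factors:
        result *= f
    return result

def sustained_rate_mb_per_s(petabytes_per_year):
    """Average write rate in MB/s implied by an archive growth in PB/yr
    (decimal units assumed: 1 PB = 1e15 bytes, 1 MB = 1e6 bytes)."""
    return petabytes_per_year * 1e15 / SECONDS_PER_YEAR / 1e6

print(f"Compression x channel reduction: {combined_reduction(3, 3):.0f}X")
print(f"1 PB/yr is about {sustained_rate_mb_per_s(1.0):.1f} MB/s sustained")
```

Two factors of ~3X multiply to ~9X, which is consistent with the 10X target given that both factors are approximate. A ~1 PB/yr archive corresponds to roughly 32 MB/s written continuously.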
Handling Greater DAQ Data Rates - Frame Data Archive Growth
Data from the interferometer and PEM subsystems will be accommodated for periods of
3 weeks on spinning media. The corresponding volume of data that must be
accommodated is ~10 TB. The on-site disk cache for Advanced LIGO will require expansion
to 20 TB. This volume represents ~100% margin for additional growth, which is comparable
to the initial LIGO design.
Handling Greater Event Rates - Metadatabase Growth
The LIGO metadatabase serves to provide logging of diagnostic triggers that come from
real-time monitoring of the interferometer and PEM channels, and to provide for logging
of frame data and candidate astrophysical events. Depending on the levels of compression
that are ultimately achieved on the raw framed data, metadata generated from frames
(trends, histories, etc.) will grow directly as the volume of frames. If this is assumed
to grow by ~3X, then Advanced LIGO will require an increase of 3X in storage and serving
capacity for frame summary metadata at the Caltech server.
Wide Area and Local Area Network Upgrades
The increased volume of data generated can be expected to generate a concomitant need
to provide increased internet connectivity between the observatories and Caltech and
in general to the larger LSC community. At present, LIGO Laboratory has not
been able to obtain an OC3 connection to the internet at the observatories because the
cost cannot be absorbed by the Laboratory's operations budget. By the time of
Advanced LIGO, the Laboratory will require an upgrade to at least OC3 to provide
adequate bandwidth between the observatories and Caltech.
Software Upgrades
The data analysis software will need to be grid-enabled as part of the upgrade
in support of Advanced LIGO. In some cases, certain interfaces may need to be expanded
to accommodate the greater level of distributed computing being foreseen. It is
expected that the research results provided by other NSF-funded grid computing projects
in which LIGO Laboratory and the LSC are actively participating will provide guidance
on how this evolution will take place.
Another large impact on LDAS software design will be in the area of database management
systems to handle the greater quantity of data and a growing community of users. Advanced
LIGO will require a greater database size, more powerful and more numerous servers, and
a fully federated implementation of the database system.
Concept/Options
The implementation of Advanced LIGO LDAS is an expansion of initial LIGO LDAS. This
is largely possible because of the highly modular, API-specific, object-oriented paradigm
that initial LIGO is implementing. The desire to enable a greater degree of integration
into a grid computing paradigm than has been possible for LIGO I will determine the
evolution of LDAS in support of Advanced LIGO.
Additional PC clusters will be added to or replace existing clusters. LAN network
infrastructure in place for initial LIGO will be capable of expansion to accommodate
4X bandwidths by combinations of multiple connections (e.g., an increased number of
network fabrics) and higher bandwidth (OC12 or OC48). The RAID disk systems planned
for initial LIGO will be expanded or replaced with improved versions of similar systems
(later generation, larger disk volumes, etc.). These disk systems will support growth
of both metadatabases and framed databases. Data servers will be upgraded to the enterprise
class servers available at the time. Multiple servers may be clustered to provide greater
throughput where this is required.
Tape archive robotic systems will be upgraded or replaced. The growth of the local
short-term archives at the observatories will be possible using the LIGO I SAM-QFS or
a similar software environment on the archive server at the sites. Such products are
licensed based on data volumes, so as the archives at the observatories grow, the net
operational costs will be proportionately increased. The Caltech archive shall be expanded
to accommodate the greater volume of Advanced LIGO data.
WAN access to LIGO data will be provided from each observatory and Caltech at OC3 or
greater bandwidth.
R&D Status/Development Issues
Most of the improvements in hardware performance that are discussed and identified
above should become naturally available through the advance in technology that comes
from market forces. LIGO will continue to meet its needs using commercial or commodity
components.
Software evolution towards a grid-based paradigm will occur through continued participation
by the Laboratory and the LSC in NSF-funded grid computing initiatives.
WBS Definition
This element includes all incremental upgrades to data analysis systems and
computational infrastructure needed to support the analysis of data from Advanced
LIGO. It includes neither software nor computing nor network hardware supported
normally by the LIGO Laboratory operations program (WBS 2.0). It does include the
LIGO Data Analysis System (LDAS) and the End-to-End Model (E2E) infrastructure
development.
Design Requirements
Conceptual Design
Detail Estimate Sheets
Baseline Plan