Draft Requirements for S3 Data Reduction
(to be modified by input from Kent, Albert, Sam, LSC etc)

* Goal is to have RDS script process(es) which run uninterrupted throughout science run, constantly reducing new raw data as it appears in the archive.
  - Must be robust to downtime in LDAS and/or frameAPI
  - Must generate status alarms for operators, emails to sysops in case of failures
  - Must be able to run without tuning or supervision for long periods
  - Must be able to re-try individual jobs which fail (say, 3 retries)

* Data reduction should be significantly faster than real-time (a factor of 2 or better would be nice), preferably in a single thread
  - Some of this may be achieved in LDAS-0.8.0 through better optimization and inlining
  - Can be achieved already using multiple parallel createRDS as was done for S2 data, but may interfere with online data analysis

* Reduced data should be "binned" into appropriate directories on target filesystem(s) automatically.
  - Scheme used in S2 was /ide??/S2/rds/H-7338/ (around 6000 files/dir)
  - Mapping of epoch to filesystem must be retained or rebuilt between restarts of the RDS script
  - In S2, binning was done using a second script after data reduction. This should be merged with RDS script
  - As one filesystem fills up, next one in list is chosen, BUT keep all files from same 100,000 second epoch together
  - Good handling of full filesystems is required

* Current GUI is very rudimentary, only has start/stop buttons. GUI needs to have ability to change and save all adjustable parameters using dialog boxes. Would be useful to display status and progress (assuming GUI is retained for S3)

* LDR information needs to be published as new reduced data appears
  - Hourly? daily? something else?
  - I would prefer that LDR population *not* be done by RDS script, since some users of the script may not be populating LDR, and some users of LDR may not be reducing data. (Is ldas the LDR username also?)
  - Should be very simple to write an "LDR daemon" which periodically requests disk cache contents and publishes new data

* Have "MDC" for data reduction
  - Establish rate(s) of data reduction
  - Probably should be done for all three sites (LHO, LLO, CIT)
  - Include simulated data analysis loads (e.g. Isaac's tests)
  - At CIT, most realistic trial would require data being transferred from tape into SAM-QFS - very labour intensive!! (Maybe raw data transfer will be simpler in S3?)
  - An open question is whether LHO data will be re-generated at CIT or transferred over internet

Possible team members and tasks:

Philip Charlton
  - coordination
  - coding & testing for bulk of RDS script

Scott Koranda
  - resource for LDR information
  - LDR publishing daemon?

Ed Maros (major involvement, but part of normal activities)
Phil Ehrens (probably less involvement)
  - performance improvement of RDS
  - coding for changes needed to LDAS, frameAPI, createRDS command, if any

Mary Lei (if she can be spared)
  - GUI implementation, or at least advice to Philip on how to do it. (I think GUI will be simple enough that I can write it with some advice from the more experienced Tcl programmers, esp. Mary and Peter Shawhan)

Person (perhaps Stuart Anderson or Dan Kozak) to help with MDC by e.g. adding data to a filesystem from tape, deliberately causing failures etc.
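
Illustrative sketch of the binning requirement (not part of the original draft): the rules listed under "Reduced data should be binned" could be combined roughly as below. This is a minimal sketch in Python, not the RDS script itself. It assumes that a directory name such as H-7338 is the site letter followed by the GPS start time divided by 100,000, and that the epoch-to-filesystem mapping is kept in a small JSON file so it survives restarts. The names EpochBinner, epoch_dir_name, map_path, min_free_bytes and free_bytes are all hypothetical, and the filesystem mount points are supplied by the caller.

```python
import json
import os
import shutil

EPOCH_SECONDS = 100_000  # epoch length stated in the requirements


def epoch_dir_name(site, gps_start):
    """Directory name for a frame file; the 'site-epoch' rule is an assumption."""
    return f"{site}-{gps_start // EPOCH_SECONDS}"


class EpochBinner:
    """Hypothetical helper: assigns each epoch to one filesystem and remembers it."""

    def __init__(self, filesystems, map_path, min_free_bytes, free_bytes=None):
        self.filesystems = list(filesystems)   # ordered list of target mount points
        self.map_path = map_path               # JSON file holding the epoch mapping
        self.min_free_bytes = min_free_bytes   # reserve needed to start a new epoch
        # free_bytes can be replaced for testing; default asks the operating system
        self.free_bytes = free_bytes or (lambda path: shutil.disk_usage(path).free)
        self.epoch_map = self._load()

    def _load(self):
        # Retain the mapping between restarts by reading it back from disk
        if os.path.exists(self.map_path):
            with open(self.map_path) as fh:
                return json.load(fh)
        return {}

    def _save(self):
        # Write to a temporary file first so a crash cannot leave a half-written map
        tmp = self.map_path + ".tmp"
        with open(tmp, "w") as fh:
            json.dump(self.epoch_map, fh, indent=2, sort_keys=True)
        os.replace(tmp, self.map_path)

    def _first_with_space(self):
        # As one filesystem fills up, the next one in the list is chosen
        for fs in self.filesystems:
            if self.free_bytes(fs) >= self.min_free_bytes:
                return fs
        raise RuntimeError("no filesystem has enough free space for a new epoch")

    def target_dir(self, site, gps_start):
        """Return (and create) the directory where this frame file belongs."""
        name = epoch_dir_name(site, gps_start)
        fs = self.epoch_map.get(name)
        if fs is None:
            # New epoch: pick a filesystem once, then keep the whole epoch there
            fs = self._first_with_space()
            self.epoch_map[name] = fs
            self._save()
        path = os.path.join(fs, name)
        os.makedirs(path, exist_ok=True)
        return path
```

Under the assumed naming rule, a frame starting at GPS time 733812345 at site H is placed in a directory named H-7338. An epoch stays on the filesystem where it started even if free space later drops below min_free_bytes, so that value would need to be large enough to hold a full epoch of reduced data. The RuntimeError raised when every filesystem is full is the point where the script could raise an operator alarm.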