tablepgm

Table conversion and manipulation utility
Introduction

Tablepgm is a utility program that merges, manipulates and reformats tables. The program will read in flat ascii or ligo-lightweight (xml) and will write out text, xml or html. From the command line you can specify one or more tables to be concatenated, an output format and one or more commands to manipulate the table before writing it out.

Running tablepgm

The tablepgm command has the following syntax:

      tablepgm [-html | -xml] [--help] [-c <command>] [-i <command-file>] \
               [-t <table-name>] [-o <out-file>] [-b] [-h] [-s] [--debug n] \
	       <table1> [... <tableN>]
     

Where table1 ... tableN are files that are merged to form a single table to be manipulated. The input tables may be any supported format or a mixture of supported formats. The first table read defines the columns to be read in and all subsequent files must contain at least the set of columns defined by the first file. After any manipulation commands have been executed, the table is written to the standard output stream, or the file defined by the '-o' argument if it is specified. The output format is inferred from the file name extension unless it is overridden by the '-xml' or '-html' arguments.

After constructing a table from the files listed on the command line, tablepgm executes any commands specified by the '-c' or '-i' arguments. The -c command string is a list of commands separated by semicolons. the file specified by th -i argment contains one or more commands, one command per line. If both arguments are entered, the commands specified by the '-c' argument are executed followed by those commands in the file specified by the '-i' argument. If '-s' was specified, a trailing row is added containing the sum of all rows in the column, '-h' suppresses printing the table header and '-b' suppresses printing the table body.

Table Manipulation Commands

The following table manipulation commands are recognized by the program:


      addcolumn  [-before <col-title>] [-eval <expr>] [-title <new-title>]
     

Add a column with title <new-title> positioned before column <col-title> and optionally filled by evaluating <expr> for each exiting row. For example, in a table containing columns "start_time" and "duration", you can add an "end_time" column with:

      addcolumn -before duration -eval "start_time + duration" -title end_time
     


      addrows  [-table <table-file>]
     

Copy all rows from the table contained in <table-file> to the end of the current table. Columns are copied by name.


      histogram	[-eval <expr>] [-title <histo-title>] [-weight <wt-expr>] \
		[-rows <bool>] [-nbin <#-bins>] [-xlog] [-xmin <x-min>] \
		[-xmax <x-max>] [-ylog] [-ymin <y-min>] [-ymax <y-max>] \
		[-hold] [-file <out>]
     

Histogram the number of rows versus <expr>. Weighting may be requested by specifying a constant or row-expression for -weight. The -nbin argument specifies the number of bins in the histogram, and the -xmin and -xmax arguments specify the range. If -hold is specified, the canvas is retained untile the next plot is made or until tablepgm terminates. The histogram plot is written to the file specified by <out> or to postscript file named "tablepgm.ps" if no file is specified. Note that -hold option is implemented in dol-0.4.3/gds-2.11.3


      merge [-join <join-string>] <column-a> <column-b> [<column-c> ...]
     

Join strings from each cell in columns <column-b>, <column-c>, etc. to the end of the string in the corresponding cell of <column-a> and remove columns <column-b>, etc. A single space is inserted between the strings from different columns unless an alternate string is specified with a -join clause. A column may be merged with itself, in which case, the removal step is omitted.


      mergetable -file <table-file> -key <key-names> -columns <column-list> \
                 [-table <tablename>] [-oper <op-str>] [-ordered]
     

Combine the contents of the specified table with that of the current table. The table named <tablename> is read from file <table-file>. Rows will be combined if the content of the key columns are equal. The key column names are specified either as a single name or as a comma separated pair of names. Columns specified as a comma separated list in <column-list> are combined. Columns may be combined by concatination (<op-str>='|'), addition (<op-str>='+') or replacement (<op-str>='=').


      remove  [-column <col-title>] [-rows <expr>]
     

Remove all columns with the matching name or all rows for which <expr> evaluates to true. For example:

      remove -rows "start_time + duration < 729273000"
     

would delete all rows where the content of the start_time column plus the content of the duration column add up to a number less than 729273000.


      replace  -column <col-title> [-rows <expr>] -with <expr-val>
     

Replace specified columns with the evaluated <expr-val> in rows for which <expr> evaluates to true. For example:

      replace -column start_time -with "start_time - 0.5*duration"
     

would replace the start_time value in all rows with a recalculated start time.


      setcolumn  <col-name> [-title <new-title>] [-extern <ext-format>]
		 [-format <print-fmt>] [-justify <just>]
     

Modify the metadata of column <col-name>. The column title can be replaced with <new-title> or the external format (i.e. the type field in an xml table) can be set to <ext-format>. The -format and -justify keywords are accepted but ignored.


      sort  [-down | -up] -key <col-title1> [-key <col-titleN>]
     

Reorder rows to leave each specified column <col-titleX> in ascending (or descending) order. The columns may be sorted in descending (ascending) order by specifying -down (-up). Sorting takes place in the order specified so that the result will be order according to the last key.


      write  [-xml | -html] [-file <file-name>] [-noheader] [-summary] [-nobody]
     

Write the current table to the standard output or to a specified file. The output format is inferred from the file name extension unless a file format is specified with -html or -xml. If the extension is not recognized, a text file is produced. Text files include a single line header with the column names unless -noheader is specified followed by the full text of the table body unless -nobody is specified. The sum of numeric columns will be printed at the end of the column if -summary is requested.


Note that xyplot command is implemented in dol-0.4.3/gds-2.11.3
      xyplot	[-xval <x-expr>] [-xlog] [-xmin <x-min>] [-xmax <x-max>] \
		[-yval <y-expr>] [-ylog] [-ymin <y-min>] [-ymax <y-max>] \
		[-rows <expr>] [-title <histo-title>] [-color <n>] \
		[-hold] [-file <o-file>]
     

Plot points at {<x-expr>, <y-expr>} from rows specified by <expr>. the -xmin, -xmax, -ymin and -ymax arguments specify the plot range. If they are unspecified, the range is set by the minimum and 1.1 time the maximum of the plotted points. If -hold is specified, the canvas is retained for the next plot command or when tablepgm terminates. Otherwise, the plots is stored in the file specified by <o-file> in the format indicated by the file extension. If <o-file> is not specified, the plot is stored as postscript in "tablepgm.ps".


Expressions

An expression is a combination of symbols (i.e. alphanumeric column names), numeric constants, strings (in quotes), function calls (alphanumeric function name followed by a parenthesized list of arguments), arithmetic operators (*, /, +, -, %) and logical operators (<, >, <=, >=, ==, !=, || and &&). All table data and constants are stored as strings. Arithmetic operations are evaluated by first converting the string to a double precision number and then performing the operation. If the string doesn't represent a number, the results are unpredictable. Logical operations are evaluated by first determining the type of the operands. If both operands can be converted to a double precision number, the logical comparison is made on the two converted numbers. Otherwise the two strings are compared.

Predefined Symbols

The expression evaluator understands the following predefined symbols:

Symbol Value
_next The number of the next row in the table
_nrow The number of rows in the table
_prev The number of the previous row in the table
_row The current row number
false Boolean false ["false"]
true Boolean true ["true"]

Functions

Expressions may contain function calls with the syntax func(<expr>). In addition to the pre-defined functions, each column name is a pseudo-function such that name(i) returns the contents of the ith row of column name. The functions are:

Function Value returned
int(val) The integer part of val
type(val) The type of val {"string", "numeric" or "bool"}
<column>(i) Value of the ith row of column <column>.


Examples

Starting with xml files trig-7300.xml, ..., trig-7309.xml, merge the files and select out playground time epochs:

      tablepgm -t gds_triggers -c "remove -rows (start_time-729273613)%6370>=600" \
              -o playground-730x.xml trig-730*.xml
     

Read a table from an ascii file, remove rows that have invalid numbers in the second column, sort it by the second column, and write it out as html.

      tablepgm -c "remove -rows type(B)!=type(1); sort -key B" -o mytable.html mytable.txt
     


Notes and Warnings:

  1. There may be problems with reingesting tables produced by tablepgm into LDAS because table and column names preserve the table/column name but not the group names. This can be fixed relatively easily if need be.
  2. Column names for tables from flat ascii files are assigned names A, B, C, ... Columns may be renamed and their format set with the setcolumn command. It would be nice if a standard flat ascii table format were adopted with means to specify column names, formats, etc.
  3. The expression evaluator recognizes the usual arithmetic operators and logical comparisons. All data are stored as strings so its possible that some precision may be lost. There are hooks for additional functionality e.g. function calls, so these can be added if someone shows some interest.

Author:
J. Zweizig
Version:
1.6; last modified October 12, 2007

alphabetic index hierarchy of classes


Please send questions and comments to zweizig_j@ligo.caltech.edu


generated by doc++