gbget - Basic data extraction and manipulation tool

Author

       Written by Giulio Bottazzi

Description

       Print slices of tabular data from files and apply transformations. Data are read from text files with
       fields separated by spaces (use option -F to specify a different separator). Inside a data file,
       data-blocks are separated by two empty lines. Files can be compressed with zlib (.gz).
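
       The block layout described above can be illustrated with a short Python sketch. It is not gbget's
       own code: the helper name split_blocks is hypothetical, and skipping stray single empty lines
       inside a block is an assumption of this sketch.

```python
import re

def split_blocks(text, sep=None):
    """Split text into data-blocks, each a list of rows of string fields.

    Blocks are separated by two empty lines, i.e. three or more
    consecutive newlines. sep=None splits fields on runs of whitespace.
    """
    blocks = []
    for chunk in re.split(r"\n{3,}", text.strip("\n")):
        # Assumption: a single empty line inside a block is simply skipped.
        rows = [line.split(sep) for line in chunk.split("\n") if line.strip()]
        if rows:
            blocks.append(rows)
    return blocks

sample = "1 2\n3 4\n\n\n5 6\n7 8\n"
blocks = split_blocks(sample)
print(len(blocks))  # number of data-blocks found
print(blocks[1])    # the second block, addressed as [2] in a gbget argument
```

       Passing sep="," to the sketch plays the role that option -F plays for gbget.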

       filename
              is the input file. If not specified, it defaults to stdin or to the last specified filename, if any.

       index  stands for a data-block index.

       C,R    stands for the column and row specs, each given as "min:max:skip" to select from "min" to "max"
              every "skip" steps. If negative, min and max are counted from the end. By default all data are
              printed ("1:-1:1"). If min>max the count is reversed and skip must be negative (-1 by default).
              Different specs are separated by a semicolon ';' and considered sequentially.

       trans  is a list of transformations applied to the selected data: 'd' take the diff of subsequent columns;
              'D' remove all rows with at least one Not-A-Number (NAN) entry; 'f' flatten the output piling all
              columns; 'l' take the log of all entries; 'P' print all entries collected as a data-block; 't'
              transpose the matrix of data; 'z' subtract from the entries in each column their mean; 'Z' replace
              the entries in each column with their z-score; 'w' divide the entries in each column by their mean.

              '<..;..>' functions separated by semicolons in angle brackets can be used for generic data
              transformation; each function is computed for each row of data. Variable names are 'x' followed by
              the number of the column and optionally by 'l' and the number of lags. For instance, 'x2+x3l1'
              means the sum of the entries in the 2nd column plus the entries in the 3rd column in the previous
              row. 'x0' stands for the row number and 'x' is equal to 'x1'.

              '<@..;..>' if the function specification starts with '@', the functions are computed recursively
              along the columns. In this case the number after the 'x' is the relative column, counted starting
              from the one considered at each step.

              '{...}' a function in curly brackets can be used to select data: only rows that return a
              non-negative value are retained.
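
       As an illustration of the "min:max:skip" rules given for C,R above, the following Python sketch
       resolves a single spec into 1-based indices. It is not taken from gbget: the function name
       resolve_spec is hypothetical, and both the clamping of out-of-range indices and the reading of a
       bare number as a single item (suggested by the 'file(2,-10:-1)' example) are assumptions.

```python
def resolve_spec(spec, n):
    """Return the 1-based indices selected by one 'min:max:skip' spec
    over n items (columns or rows)."""
    parts = spec.split(":") if spec else []
    if len(parts) == 1:
        # Assumption: a bare number selects that single item.
        parts = [parts[0], parts[0]]
    lo = int(parts[0]) if parts and parts[0] else 1
    hi = int(parts[1]) if len(parts) > 1 and parts[1] else -1
    # Negative bounds count from the end: -1 is the last item.
    if lo < 0:
        lo += n + 1
    if hi < 0:
        hi += n + 1
    if len(parts) > 2 and parts[2]:
        step = int(parts[2])
    else:
        step = 1 if lo <= hi else -1  # reversed count defaults to -1
    if step == 0:
        raise ValueError("skip must be non-zero")
    if lo > hi and step > 0:
        raise ValueError("skip must be negative when min > max")
    stop = hi + 1 if step > 0 else hi - 1
    # Assumption: indices falling outside 1..n are silently dropped.
    return [i for i in range(lo, stop, step) if 1 <= i <= n]

print(resolve_spec("", 5))        # default spec, equivalent to "1:-1:1"
print(resolve_spec("-10:-1", 12)) # last ten items of twelve
print(resolve_spec("5:1", 5))     # reversed count
```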

Examples

       gbget 'file(1:3)ld'
              select the first three columns in 'file', take the log and the difference of successive columns;

       gbget 'file(2,-10:-1)<x^2>'
              select the last ten elements of the second column of 'file' and print their squares

       gbget '[2]()' '[1]()' < ...
              select the second and first data block from the standard input.

       gbget 'file(1:3)<x1*x2-x3>'
              select the first three columns in 'file' and in each row multiply the first two entries and
              subtract the third.

       gbget 'file()<@x1+x2>'
              print the sum of two subsequent columns

       gbget 'file(1:3){x2-2}'
              select the first three columns in 'file' for the rows whose second field is not lower than 2
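
       The row-wise functions and the row filter used in the examples above can be mimicked in Python as
       follows. This is only a sketch of the documented behaviour, not gbget's implementation: the helper
       names x, apply_function and select_rows are hypothetical, and returning NaN when a lag reaches
       before the first row is an assumption.

```python
import math

def x(rows, i, col, lag=0):
    """Entry of column `col` (1-based) in row i, taken `lag` rows back."""
    j = i - lag
    if j < 0:
        return math.nan  # assumption: no earlier row available yields NaN
    return rows[j][col - 1]

def apply_function(rows, func):
    """Compute func once per row, as '<...>' does."""
    return [func(rows, i) for i in range(len(rows))]

def select_rows(rows, cond):
    """Keep only rows for which cond is non-negative, as '{...}' does."""
    return [row for i, row in enumerate(rows) if cond(rows, i) >= 0]

rows = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 1.0, 9.0]]

# '<x1*x2-x3>': multiply the first two entries and subtract the third
products = apply_function(rows, lambda r, i: x(r, i, 1) * x(r, i, 2) - x(r, i, 3))

# '<x2+x3l1>': 2nd column plus the 3rd column of the previous row
lagged = apply_function(rows, lambda r, i: x(r, i, 2) + x(r, i, 3, lag=1))

# '{x2-2}': keep rows whose second field is not lower than 2
kept = select_rows(rows, lambda r, i: x(r, i, 2) - 2)

print(products)
print(lagged)
print(kept)
```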

Name

       gbget - Basic data extraction and manipulation tool

Options

       -F     set the input field separators (default ' \t')

       -o     set the output format (default '%12.6e')

       -e     set the output format for empty fields  (default '%13s')

       -s     set the output separation string  (default ' ')

       -t     define global transformations applied before each output (default '')

       -v     verbose mode

Reporting Bugs

Synopsis

       gbget [options] 'filename[index](C,R)trans'

See Also