run_abundance.py - helper script to estimate the abundance at a given taxonomic level

Description

       usage: run_abundance.py [-h] [-v] [-A N] [-P N] [-F N] [--distance DISTANCE]
              [-M DIAMETER] [-S DECOMP] [-p DIR] [-rt] [-o OUTPUT]
              [-d OUTPUT_DIR] [-c CONFIG] [-t TREE] [-r RAXML] [-a ALIGN] [-f FRAG]
              [-m MOLECULE] [--ignore-overlap] [-x N] [-cp CHCK_FILE] [-cpi N]
              [-seed N] [-bt N] [-at N] [-pt N] [-g N] [-b N] [-no_trim] [-bin N]
              [-D] [-C N] [-G GENES]

       This script runs the SEPP algorithm on an input tree, alignment, fragment file, and RAxML info file.

   optional arguments:

       -h, --help
              show this help message and exit

       -v, --version
              show program's version number and exit

   DECOMPOSITION OPTIONS:
              These options determine the alignment decomposition size and taxon insertion size. If neither
              is given, then the default is to align/place at 10% of total taxa. The alignment decomposition
              size must be less than the taxon insertion size.

       -A N, --alignmentSize N
              max  alignment  subset size of N [default: 10% of the total number of taxa or the placement subset
              size if given]

       -P N, --placementSize N
              max placement subset size of N [default: 10% of the total number of taxa or the alignment length
              (whichever is bigger)]

       -F N, --fragmentChunkSize N
              maximum fragment chunk size of N. Helps control memory usage. [default: 20000]

       --distance DISTANCE
              minimum p-distance before stopping the decomposition [default: 1]

       -M DIAMETER, --diameter DIAMETER
              maximum tree diameter before stopping the decomposition [default: None]

       -S DECOMP, --decomp_strategy DECOMP
              decomposition strategy [default: using tree branch length]
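
              The defaults described for -A and -P can be sketched as follows. This is a minimal
              illustration of the documented rules, not the script's own code. The function name
              resolve_subset_sizes is hypothetical, rounding 10% down is an assumption, "alignment
              length" in the -P default is read here as the -A value, and equal sizes are accepted
              because the documented defaults make both sizes 10% of the taxa.

```python
from typing import Optional, Tuple


def resolve_subset_sizes(total_taxa: int,
                         alignment_size: Optional[int] = None,
                         placement_size: Optional[int] = None) -> Tuple[int, int]:
    """Return (alignment_size, placement_size) after applying the documented defaults."""
    # 10% of the taxa, at least 1. Rounding down is an assumption of this sketch.
    ten_percent = max(1, total_taxa // 10)

    if alignment_size is None and placement_size is None:
        # Neither -A nor -P given: both default to 10% of the total taxa.
        alignment_size = placement_size = ten_percent
    elif alignment_size is None:
        # Only -P given: -A defaults to the placement subset size.
        alignment_size = placement_size
    elif placement_size is None:
        # Only -A given: -P defaults to the bigger of 10% of taxa and the -A value.
        placement_size = max(ten_percent, alignment_size)

    # The alignment subset must not exceed the placement subset.
    if alignment_size > placement_size:
        raise ValueError("alignment size must not exceed placement size")
    return alignment_size, placement_size
```

              For example, with 1000 taxa and only -A 20 given, this sketch yields an alignment
              subset size of 20 and a placement subset size of 100.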

   OUTPUT OPTIONS:
              These options control output.

       -p DIR, --tempdir DIR
              Temporary files will be written to DIR. Full path required. [default: /tmp/sepp]

       -rt, --remtemp
              Remove tempfile directory. [default: disabled]

       -o OUTPUT, --output OUTPUT
              output files with prefix OUTPUT. [default: output]

       -d OUTPUT_DIR, --outdir OUTPUT_DIR
              output to OUTPUT_DIR directory. Full path required. [default: .]

   INPUT OPTIONS:
              These options control input. Running SEPP requires a backbone tree (in newick format), a
              RAxML_info file (the file generated by RAxML during estimation of the backbone tree; pplacer
              uses this info file to set model parameters), a backbone alignment file (in fasta format),
              and a fasta file containing the fragments. The input sequences are assumed to be DNA unless
              specified otherwise.

       -c CONFIG, --config CONFIG
              A config file containing options used to run SEPP. Options provided as command line arguments
              override config file values for those options. [default: None]

       -t TREE, --tree TREE
              Input tree file (newick format) [default: None]

       -r RAXML, --raxml RAXML
              RAxML_info file including model parameters, generated by RAxML. [default: None]

       -a ALIGN, --alignment ALIGN
              Aligned fasta file [default: None]

       -f FRAG, --fragment FRAG
              fragment file [default: None]

       -m MOLECULE, --molecule MOLECULE
              Molecule type of sequences. Can be amino, dna, or rna [default: dna]

       --ignore-overlap
              When a query sequence has the same name as a backbone sequence, ignore  the  query  sequences  and
              keep the backbone sequence [default: False]
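
              The input options above can be assembled into a command line as sketched below. This is
              an illustration only: the helper name build_command and all file names are hypothetical,
              and the flags are taken from this page. The sketch builds the argument list and does not
              run the script.

```python
import shlex
from typing import List

# Molecule types listed for -m on this page.
VALID_MOLECULES = ("amino", "dna", "rna")


def build_command(tree: str, raxml_info: str, alignment: str, fragments: str,
                  molecule: str = "dna", ignore_overlap: bool = False) -> List[str]:
    """Build an argument list for run_abundance.py from the four required inputs."""
    if molecule not in VALID_MOLECULES:
        raise ValueError(f"molecule must be one of {VALID_MOLECULES}")
    cmd = [
        "run_abundance.py",
        "-t", tree,        # backbone tree, newick format
        "-r", raxml_info,  # RAxML_info file with model parameters
        "-a", alignment,   # backbone alignment, fasta format
        "-f", fragments,   # fragment file, fasta format
        "-m", molecule,
    ]
    if ignore_overlap:
        cmd.append("--ignore-overlap")
    return cmd


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    args = build_command("backbone.tre", "RAxML_info.backbone",
                         "backbone.fasta", "reads.fasta")
    print(shlex.join(args))
```

              Passing the result as a list to subprocess.run avoids shell quoting problems with paths
              that contain spaces.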

   OTHER OPTIONS:
              These options control how SEPP is run.

       -x N, --cpu N
              Use N cpus [default: number of cpus available on the machine]

       -cp CHCK_FILE, --checkpoint CHCK_FILE
              checkpoint file [default: no checkpointing]

       -cpi N, --interval N
              Interval (in seconds) between checkpoint writes. Has an effect only when -cp is provided.
              [default: 3600]

       -seed N, --randomseed N
              random seed number. [default: 297834]

   TIPP OPTIONS:
              These options are specific to TIPP.

       -bt N, --blastThreshold N
              Minimum query coverage for a blast hit to map a read to a marker. This should be a number
              greater than 0 [default: 50]

       -at N, --alignmentThreshold N
              Enough alignment subsets are selected to reach a cumulative probability of N. This should be
              a number between 0 and 1 [default: 0.95]

       -pt N, --placementThreshold N
              Enough placements are selected to reach a cumulative probability of N. This should be a
              number between 0 and 1 [default: 0.95]

       -g N, --gene N
              Classify on only the specified gene.

       -b N, --blast_file N
              Blast file with fragments already binned.

       -no_trim, --do_not_trim_after_blast
              Trim query sequence if it extends outside marker (BLAST only).

       -bin N, --bin_using N
              Use blast or hmmer for binning [default: blast]

       -D, --dist
              Treat fragments as distribution

       -C N, --cutoff N
              Placement  probability  requirement  to  count  toward  the  distribution. This should be a number
              between 0 and 1 [default: 0.0]

       -G GENES, --genes GENES
              Use markers or cogs genes [default: markers-v3]
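
              The cumulative probability thresholds used by -at and -pt can be illustrated with the
              sketch below. It is not TIPP's implementation: the function name select_until_cumulative
              is hypothetical, and the sketch assumes candidates are ranked by descending probability
              and that the probabilities sum to at most 1.

```python
from typing import List, Tuple


def select_until_cumulative(candidates: List[Tuple[str, float]],
                            threshold: float = 0.95) -> List[Tuple[str, float]]:
    """Keep the most probable candidates until their total probability reaches threshold."""
    if not 0 <= threshold <= 1:
        raise ValueError("threshold must be between 0 and 1")

    # Rank from most to least probable (an assumption of this sketch).
    ranked = sorted(candidates, key=lambda item: item[1], reverse=True)

    selected = []
    total = 0.0
    for name, probability in ranked:
        if total >= threshold:
            break  # enough probability mass has been collected
        selected.append((name, probability))
        total += probability
    return selected


if __name__ == "__main__":
    # Hypothetical placement probabilities, for illustration only.
    placements = [("edge_12", 0.70), ("edge_7", 0.20), ("edge_3", 0.06), ("edge_9", 0.04)]
    for name, probability in select_until_cumulative(placements, 0.95):
        print(name, probability)
```

              A lower threshold keeps fewer candidates; a threshold of 1 keeps candidates until the
              whole probability mass is covered.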

Name

       run_abundance.py - helper script to estimate the abundance at a given taxonomic level
