usage: run_abundance.py [-h] [-v] [-A N] [-P N] [-F N] [--distance DISTANCE]
[-M DIAMETER] [-S DECOMP] [-p DIR] [-rt] [-o OUTPUT]
[-d OUTPUT_DIR] [-c CONFIG] [-t TREE] [-r RAXML] [-a ALIGN] [-f FRAG] [-m MOLECULE]
[--ignore-overlap] [-x N] [-cp CHCK_FILE] [-cpi N] [-seed N] [-bt N] [-at N] [-pt N] [-g N] [-b N]
[-no_trim] [-bin N] [-D] [-C N] [-G GENES]
This script runs the SEPP algorithm on an input tree, alignment, fragment file, and RAxML info file.
optional arguments:
-h, --help
show this help message and exit
-v, --version
show program's version number and exit
DECOMPOSITION OPTIONS:
These options determine the alignment decomposition size and taxon insertion size. If neither is
given, the default is to align/place at 10% of total taxa. The alignment decomposition size
must be less than the taxon insertion size.
-A N, --alignmentSize N
max alignment subset size of N [default: 10% of the total number of taxa or the placement subset
size if given]
-P N, --placementSize N
max placement subset size of N [default: 10% of the total number of taxa or the alignment length
(whichever bigger)]
-F N, --fragmentChunkSize N
maximum fragment chunk size of N. Helps control memory usage. [default: 20000]
--distance DISTANCE
minimum p-distance before stopping the decomposition [default: 1]
-M DIAMETER, --diameter DIAMETER
maximum tree diameter before stopping the decomposition [default: None]
-S DECOMP, --decomp_strategy DECOMP
decomposition strategy [default: using tree branch length]
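The defaults described for -A and -P can be sketched as follows. This is a minimal illustration, not SEPP's own code: the function name `default_subset_sizes` is hypothetical, "10% of the total number of taxa" is assumed to round down, and the size check is assumed to reject only an alignment subset larger than the placement subset (the defaults themselves are equal).

```python
def default_subset_sizes(total_taxa, alignment_size=None, placement_size=None):
    """Fill in missing -A / -P values (hypothetical helper, not part of SEPP)."""
    ten_percent = max(1, total_taxa // 10)  # assumption: round down, at least 1

    if alignment_size is None and placement_size is None:
        # Neither given: align and place at 10% of total taxa.
        alignment_size = placement_size = ten_percent
    elif alignment_size is None:
        # -A defaults to the placement subset size when -P is given.
        alignment_size = placement_size
    elif placement_size is None:
        # -P defaults to 10% of taxa or the alignment size, whichever is bigger.
        placement_size = max(ten_percent, alignment_size)

    # Assumption: equal sizes are accepted, a larger alignment subset is not.
    if alignment_size > placement_size:
        raise ValueError("alignment subset size exceeds placement subset size")
    return alignment_size, placement_size


print(default_subset_sizes(1000))
print(default_subset_sizes(1000, placement_size=50))
```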
OUTPUT OPTIONS:
These options control output.
-p DIR, --tempdir DIR
Temporary files will be written to DIR. Full path required. [default: /tmp/sepp]
-rt, --remtemp
Remove tempfile directory. [default: disabled]
-o OUTPUT, --output OUTPUT
output files with prefix OUTPUT. [default: output]
-d OUTPUT_DIR, --outdir OUTPUT_DIR
output to OUTPUT_DIR directory. Full path required. [default: .]
INPUT OPTIONS:
These options control input. Running SEPP requires the following: a backbone tree (in newick
format); a RAxML_info file (the file generated by RAxML during estimation of the backbone
tree; pplacer uses this info file to set model parameters); a backbone alignment file (in fasta
format); and a fasta file containing the fragments. The input sequences are assumed to be DNA
unless specified otherwise.
-c CONFIG, --config CONFIG
A config file containing options used to run SEPP. Options provided as command line arguments
override config file values for those options. [default: None]
-t TREE, --tree TREE
Input tree file (newick format) [default: None]
-r RAXML, --raxml RAXML
RAxML_info file including model parameters, generated by RAxML. [default: None]
-a ALIGN, --alignment ALIGN
Aligned fasta file [default: None]
-f FRAG, --fragment FRAG
fragment file [default: None]
-m MOLECULE, --molecule MOLECULE
Molecule type of sequences. Can be amino, dna, or rna [default: dna]
--ignore-overlap
When a query sequence has the same name as a backbone sequence, ignore the query sequence and
keep the backbone sequence [default: False]
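The precedence rule stated for -c (command line arguments take priority over config file values) can be illustrated with a small sketch. The function `merge_options` and the option dictionaries below are hypothetical; the actual config file format and parsing are not shown here, and unset command line options are assumed to be represented as None.

```python
def merge_options(config_values, cli_values):
    """Combine option sources so that command line values win (hypothetical helper)."""
    merged = dict(config_values)  # start from the config file values
    for name, value in cli_values.items():
        # Assumption: an option not given on the command line is None.
        if value is not None:
            merged[name] = value
    return merged


# Illustrative values only: -A given on the command line, -m left unset.
config_values = {"alignmentSize": 100, "molecule": "dna"}
cli_values = {"alignmentSize": 50, "molecule": None}
merged = merge_options(config_values, cli_values)
print(merged)
```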
OTHER OPTIONS:
These options control how SEPP is run.
-x N, --cpu N
Use N cpus [default: number of cpus available on the machine]
-cp CHCK_FILE, --checkpoint CHCK_FILE
checkpoint file [default: no checkpointing]
-cpi N, --interval N
Interval (in seconds) between checkpoint writes. Has an effect only when -cp is provided.
[default: 3600]
-seed N, --randomseed N
random seed number. [default: 297834]
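A command line using the input options and the -x and -seed options above could be assembled from Python as in the sketch below. The helper `build_command` and all file names are hypothetical placeholders; only flags listed in this help text are used, and the command is printed rather than executed.

```python
import shlex


def build_command(tree, raxml_info, alignment, fragments,
                  molecule="dna", cpus=None, seed=None):
    """Build an argument list for run_abundance.py (hypothetical helper)."""
    cmd = [
        "run_abundance.py",
        "-t", tree,          # backbone tree (newick)
        "-r", raxml_info,    # RAxML_info file
        "-a", alignment,     # backbone alignment (fasta)
        "-f", fragments,     # fragment file (fasta)
        "-m", molecule,      # amino, dna, or rna
    ]
    if cpus is not None:
        cmd += ["-x", str(cpus)]
    if seed is not None:
        cmd += ["-seed", str(seed)]
    return cmd


# Placeholder file names, chosen only for illustration.
example = build_command("backbone.tree", "RAxML_info.backbone",
                        "backbone.fasta", "reads.fasta", cpus=4, seed=297834)
print(shlex.join(example))
```

Keeping the arguments in a list (rather than one string) avoids quoting problems if the list is later passed to `subprocess.run`.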
TIPP OPTIONS:
These options configure settings specific to TIPP.
-bt N, --blastThreshold N
Minimum query coverage for a BLAST hit to map a read to a marker. This should be a number
greater than 0 [default: 50]
-at N, --alignmentThreshold N
Enough alignment subsets are selected to reach a cumulative probability of N. This should be a
number between 0 and 1 [default: 0.95]
-pt N, --placementThreshold N
Enough placements are selected to reach a cumulative probability of N. This should be a number
between 0 and 1 [default: 0.95]
-g N, --gene N
Classify on only the specified gene.
-b N, --blast_file N
Blast file with fragments already binned.
-no_trim, --do_not_trim_after_blast
Do not trim the query sequence if it extends outside the marker (BLAST only).
-bin N, --bin_using N
Use blast or hmmer for binning [default: blast]
-D, --dist
Treat fragments as a distribution
-C N, --cutoff N
Placement probability requirement to count toward the distribution. This should be a number
between 0 and 1 [default: 0.0]
-G GENES, --genes GENES
Use markers or cogs genes [default: markers-v3]
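The cumulative-probability selection described for -at and -pt can be sketched as below. This is an illustration of the general idea under stated assumptions, not TIPP's implementation: the function `select_until_threshold` is hypothetical, candidates are assumed to be taken in order of decreasing probability, and selection is assumed to stop as soon as the running total reaches N.

```python
def select_until_threshold(probabilities, threshold=0.95):
    """Pick the most probable candidates until their total reaches threshold.

    `probabilities` maps a candidate name to its probability (hypothetical input).
    """
    selected = []
    cumulative = 0.0
    # Assumption: candidates are considered from most to least probable.
    ranked = sorted(probabilities.items(), key=lambda item: item[1], reverse=True)
    for name, prob in ranked:
        if cumulative >= threshold:
            break
        selected.append(name)
        cumulative += prob
    return selected


# Made-up probabilities for four candidate placements.
placements = {"edge_a": 0.6, "edge_b": 0.3, "edge_c": 0.06, "edge_d": 0.04}
chosen = select_until_threshold(placements, threshold=0.95)
print(chosen)
```

With a threshold of 0.95, the lowest-probability candidate is left out because the first three already exceed the threshold.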