|--Parsnp 1.5.6--| For detailed documentation please see --> http://harvest.readthedocs.org/en/latest
usage: parsnp [-h] [-c] -d SEQUENCES [SEQUENCES ...] [-r REFERENCE]
[-g GENBANK [GENBANK ...]] [-o OUTPUT_DIR] [-q QUERY]
[-U MAX_MUMI_DISTR_DIST | -mmd MAX_MUMI_DISTANCE] [-F] [-M]
[--use-ani] [--min-ani MIN_ANI] [--use-mash] [--max-mash-dist MAX_MASH_DIST]
[-a MIN_ANCHOR_LENGTH] [-m MUM_LENGTH] [-C MAX_CLUSTER_D] [-z MIN_CLUSTER_SIZE]
[-D MAX_DIAG_DIFF] [-n {mafft,muscle,fsa,prank}] [-u] [--use-fasttree] [--vcf]
[-p THREADS] [-P MAX_PARTITION_SIZE] [-v] [-x] [-i INIFILE] [-e] [-V]

Parsnp quick start for three example scenarios:
1) With reference & genbank file:
   python Parsnp.py -g <reference_genbank_file1 reference_genbank_file2 ...> -d <seq_file1 seq_file2 ...> -p <threads>
2) With reference but without genbank file:
   python Parsnp.py -r <reference_genome> -d <seq_file1 seq_file2 ...> -p <threads>
3) Autorecruit reference to a draft assembly:
   python Parsnp.py -q <draft_assembly> -d <seq_file1 seq_file2 ...> -p <threads>
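
The three quick-start scenarios differ only in which reference option is passed (-g, -r, or -q). The Python sketch below assembles the matching argument list for each one. It is an illustration, not part of Parsnp: the helper name build_parsnp_command, the file names, and the executable name "parsnp" (taken from the usage line) are assumptions, and the one-option-at-a-time check reflects only the three scenarios above, not a documented restriction of the tool.

```python
import shlex


def build_parsnp_command(sequences, threads, reference=None, genbank=None,
                         query=None, executable="parsnp"):
    """Return an argv list for one of the three quick-start scenarios.

    Hypothetical helper: it only arranges flags shown in the help text
    (-g, -r, -q, -d, -p) and does not run Parsnp.
    """
    chosen = [opt for opt in (reference, genbank, query) if opt]
    if len(chosen) != 1:
        # Restriction of this sketch only, mirroring the three scenarios.
        raise ValueError("pass exactly one of reference, genbank, or query")

    cmd = [executable]
    if genbank:
        cmd += ["-g", *genbank]      # scenario 1: reference GenBank file(s)
    elif reference:
        cmd += ["-r", reference]     # scenario 2: reference genome
    else:
        cmd += ["-q", query]         # scenario 3: draft assembly as query
    cmd += ["-d", *sequences, "-p", str(threads)]
    return cmd


# Placeholder file names, used only for illustration.
cmd = build_parsnp_command(["a.fna", "b.fna"], threads=4, reference="ref.fna")
print(shlex.join(cmd))  # prints: parsnp -r ref.fna -d a.fna b.fna -p 4
```

With Parsnp installed, the returned list could be handed to subprocess.run(cmd, check=True). Building a list rather than a shell string avoids quoting problems with file names that contain spaces.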

optional arguments:
-h, --help
show this help message and exit

Input/Output:
-c, --curated
(c)urated genome directory, use all genomes in dir and ignore MUMi?
-d SEQUENCES [SEQUENCES ...], --sequences SEQUENCES [SEQUENCES ...]
A list of files containing genomes/contigs/scaffolds
-r REFERENCE, --reference REFERENCE
(r)eference genome (set to ! to pick random one from sequence dir)
-g GENBANK [GENBANK ...], --genbank GENBANK [GENBANK ...]
A list of Genbank file(s) (gbk)
-o OUTPUT_DIR, --output-dir OUTPUT_DIR
-q QUERY, --query QUERY
Specify (assembled) query genome to use, in addition to genomes found in genome dir

MUMi:
-U MAX_MUMI_DISTR_DIST, --max-mumi-distr-dist MAX_MUMI_DISTR_DIST, --MUMi MAX_MUMI_DISTR_DIST
Max MUMi distance value for MUMi distribution
-mmd MAX_MUMI_DISTANCE, --max-mumi-distance MAX_MUMI_DISTANCE
Max MUMi distance (default: autocutoff based on distribution of MUMi values)
-F, --fastmum
Fast MUMi calculation
-M, --mumi_only, --onlymumi
Calculate MUMi and exit; overrides all other choices
--use-ani
Use ani for genome recruitment
--min-ani MIN_ANI
Min ANI value to allow for genome recruitment.
--use-mash
Use mash for genome recruitment
--max-mash-dist MAX_MASH_DIST
Max mash distance.

MUM search:
-a MIN_ANCHOR_LENGTH, --min-anchor-length MIN_ANCHOR_LENGTH, --anchorlength MIN_ANCHOR_LENGTH
Min (a)NCHOR length (default = 1.1*(Log(S)))
-m MUM_LENGTH, --mum-length MUM_LENGTH, --mumlength MUM_LENGTH
Mum length
-C MAX_CLUSTER_D, --max-cluster-d MAX_CLUSTER_D, --clusterD MAX_CLUSTER_D
Maximal cluster D value
-z MIN_CLUSTER_SIZE, --min-cluster-size MIN_CLUSTER_SIZE, --minclustersize MIN_CLUSTER_SIZE
Minimum cluster size

LCB alignment:
-D MAX_DIAG_DIFF, --max-diagonal-difference MAX_DIAG_DIFF, --DiagonalDiff MAX_DIAG_DIFF
Maximal diagonal difference. Either percentage (e.g. 0.2) or bp (e.g. 100bp)
-n {mafft,muscle,fsa,prank}, --alignment-program {mafft,muscle,fsa,prank}, --alignmentprog {mafft,muscle,fsa,prank}
Alignment program to use
-u, --unaligned
Output unaligned regions

LCB Extensions:
--extend-lcbs
Extend the boundaries of LCBs with an ungapped alignment
--match-score MATCH_SCORE
Value of match score for extension
--mismatch-penalty MISMATCH_PENALTY
Value of mismatch score for extension (should be negative)
--gap-penalty GAP_PENALTY
Value of gap penalty for extension (should be negative)

Misc:
--skip-phylogeny
Do not generate phylogeny from core SNPs
--validate-input
Use Biopython to validate input files
--use-fasttree
Use FastTree instead of RAxML
--vcf
Generate VCF file.
-p THREADS, --threads THREADS
Number of threads to use
-P MAX_PARTITION_SIZE, --max-partition-size MAX_PARTITION_SIZE
Max partition size (limits memory usage)
-v, --verbose
Verbose output
-x, --xtrafast
-i INIFILE, --inifile INIFILE, --ini-file INIFILE
-e, --extend
-V, --version
show program's version number and exit