
parsnp - rapid core genome multi-alignment

Author

        This manpage was written by Nilesh Patra for the Debian distribution and
        can be used for any other usage of the program.

parsnp 1.5.4                                       April 2022                                          PARSNP(1)

Description

       |--Parsnp 1.5.6--| For detailed documentation please see --> http://harvest.readthedocs.org/en/latest

       usage: parsnp [-h] [-c] -d SEQUENCES [SEQUENCES ...] [-r REFERENCE]
                     [-g GENBANK [GENBANK ...]] [-o OUTPUT_DIR] [-q QUERY]
                     [-U MAX_MUMI_DISTR_DIST | -mmd MAX_MUMI_DISTANCE] [-F] [-M]
                     [--use-ani] [--min-ani MIN_ANI] [--use-mash]
                     [--max-mash-dist MAX_MASH_DIST] [-a MIN_ANCHOR_LENGTH]
                     [-m MUM_LENGTH] [-C MAX_CLUSTER_D] [-z MIN_CLUSTER_SIZE]
                     [-D MAX_DIAG_DIFF] [-n {mafft,muscle,fsa,prank}] [-u]
                     [--use-fasttree] [--vcf] [-p THREADS]
                     [-P MAX_PARTITION_SIZE] [-v] [-x] [-i INIFILE] [-e] [-V]

       Parsnp quick start for three example scenarios:

       1) With reference & genbank file:
              python Parsnp.py -g <reference_genbank_file1 reference_genbank_file2 ...> -d <seq_file1 seq_file2 ...> -p <threads>

       2) With reference but without genbank file:
              python Parsnp.py -r <reference_genome> -d <seq_file1 seq_file2 ...> -p <threads>

       3) Autorecruit reference to a draft assembly:
              python Parsnp.py -q <draft_assembly> -d <seq_file1 seq_file2 ...> -p <threads>
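
       The three quick-start scenarios differ only in which of -g, -r or -q is supplied. The following
       Python sketch assembles the corresponding argument list; it is an illustration, not part of
       parsnp. The helper name parsnp_command, the executable name "parsnp" (taken from the usage line)
       and all file names are assumptions, and limiting the choice to exactly one of the three options
       is a simplification made for this example only.

```python
def parsnp_command(sequences, threads, genbank=None, reference=None,
                   query=None, program="parsnp"):
    """Build an argv list for one of the three quick-start scenarios.

    Hypothetical helper: it only assembles arguments and never runs parsnp.
    """
    chosen = [opt for opt in (genbank, reference, query) if opt]
    if len(chosen) != 1:
        # Simplification for this sketch: one scenario per call.
        raise ValueError("give exactly one of genbank, reference, query")
    if not sequences:
        raise ValueError("-d needs at least one sequence file")

    argv = [program]
    if genbank:
        argv += ["-g", *genbank]      # scenario 1: reference with genbank file(s)
    elif reference:
        argv += ["-r", reference]     # scenario 2: reference without genbank file
    else:
        argv += ["-q", query]         # scenario 3: autorecruit for a draft assembly
    argv += ["-d", *sequences, "-p", str(threads)]
    return argv


if __name__ == "__main__":
    # File names below are placeholders.
    print(" ".join(parsnp_command(["a.fna", "b.fna"], 4, reference="ref.fna")))
```

       The resulting list can be passed to subprocess.run() once parsnp is installed; returning a list
       rather than a string avoids shell quoting problems with unusual file names.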

   optional arguments:
       -h, --help
              show this help message and exit

   Input/Output:
       -c, --curated
              (c)urated genome directory; use all genomes in the directory and ignore MUMi

       -d SEQUENCES [SEQUENCES ...], --sequences SEQUENCES [SEQUENCES ...]
              A list of files containing genomes/contigs/scaffolds

       -r REFERENCE, --reference REFERENCE
              (r)eference genome (set to ! to pick random one from sequence dir)

       -g GENBANK [GENBANK ...], --genbank GENBANK [GENBANK ...]
              A list of Genbank file(s) (gbk)

       -o OUTPUT_DIR, --output-dir OUTPUT_DIR

       -q QUERY, --query QUERY
              Specify (assembled) query genome to use, in addition to genomes found in genome dir

   MUMi:
       -U MAX_MUMI_DISTR_DIST, --max-mumi-distr-dist MAX_MUMI_DISTR_DIST, --MUMi MAX_MUMI_DISTR_DIST
              Max MUMi distance value for MUMi distribution

       -mmd MAX_MUMI_DISTANCE, --max-mumi-distance MAX_MUMI_DISTANCE
              Max MUMi distance (default: autocutoff based on distribution of MUMi values)

       -F, --fastmum
              Fast MUMi calculation

       -M, --mumi_only, --onlymumi
              Calculate MUMi and exit; overrides all other choices

       --use-ani
              Use ani for genome recruitment

       --min-ani MIN_ANI
              Min ANI value to allow for genome recruitment.

       --use-mash
              Use mash for genome recruitment

       --max-mash-dist MAX_MASH_DIST
              Max mash distance.
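
       The usage line groups -U and -mmd as alternatives ([-U ... | -mmd ...]), so only one of them
       can be given per run. A minimal Python sketch of assembling the recruitment options under that
       constraint follows. The helper name recruitment_args and the example values are assumptions
       made for illustration; the helper does not reproduce parsnp's own argument handling.

```python
def recruitment_args(max_mumi_distr_dist=None, max_mumi_distance=None,
                     use_ani=False, min_ani=None,
                     use_mash=False, max_mash_dist=None):
    """Assemble genome-recruitment options as an argv fragment.

    Hypothetical helper: it only mirrors the option names listed above.
    """
    if max_mumi_distr_dist is not None and max_mumi_distance is not None:
        # The usage line shows these two options as mutually exclusive.
        raise ValueError("-U and -mmd cannot be combined")

    args = []
    if max_mumi_distr_dist is not None:
        args += ["-U", str(max_mumi_distr_dist)]
    if max_mumi_distance is not None:
        args += ["-mmd", str(max_mumi_distance)]
    if use_ani:
        args.append("--use-ani")
    if min_ani is not None:
        args += ["--min-ani", str(min_ani)]
    if use_mash:
        args.append("--use-mash")
    if max_mash_dist is not None:
        args += ["--max-mash-dist", str(max_mash_dist)]
    return args


if __name__ == "__main__":
    # Example values are placeholders, not recommended settings.
    print(recruitment_args(use_ani=True, min_ani=95))
```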

   MUM search:
       -a MIN_ANCHOR_LENGTH, --min-anchor-length MIN_ANCHOR_LENGTH, --anchorlength MIN_ANCHOR_LENGTH
              Min (a)NCHOR length (default = 1.1*(Log(S)))

       -m MUM_LENGTH, --mum-length MUM_LENGTH, --mumlength MUM_LENGTH
              Mum length

       -C MAX_CLUSTER_D, --max-cluster-d MAX_CLUSTER_D, --clusterD MAX_CLUSTER_D
              Maximal cluster D value

       -z MIN_CLUSTER_SIZE, --min-cluster-size MIN_CLUSTER_SIZE, --minclustersize MIN_CLUSTER_SIZE
              Minimum cluster size

   LCB alignment:
       -D MAX_DIAG_DIFF, --max-diagonal-difference MAX_DIAG_DIFF, --DiagonalDiff MAX_DIAG_DIFF
              Maximal diagonal difference. Either percentage (e.g. 0.2) or bp (e.g. 100bp)

       -n {mafft,muscle,fsa,prank}, --alignment-program {mafft,muscle,fsa,prank}, --alignmentprog {mafft,muscle,fsa,prank}
              Alignment program to use

       -u, --unaligned
              Output unaligned regions
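
       The -D option accepts two forms: a percentage such as 0.2, or a length in base pairs such as
       100bp. The Python sketch below tells the two forms apart before a value is passed on the
       command line. The function name classify_diag_diff is hypothetical, and the check is only a
       pre-flight illustration; it is an assumption that parsnp distinguishes the forms by the "bp"
       suffix in the same way.

```python
def classify_diag_diff(value):
    """Classify a -D value as a percentage (e.g. 0.2) or base pairs (e.g. 100bp).

    Hypothetical pre-check; raises ValueError for anything else.
    """
    text = value.strip()
    if text.lower().endswith("bp"):
        # Strip the two-character "bp" suffix and require a whole number.
        return ("bp", int(text[:-2]))
    return ("percentage", float(text))


if __name__ == "__main__":
    for example in ("0.2", "100bp"):
        print(example, classify_diag_diff(example))
```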

   LCB Extensions:
       --extend-lcbs
              Extend the boundaries of LCBs with an ungapped alignment

       --match-score MATCH_SCORE
              Value of match score for extension

       --mismatch-penalty MISMATCH_PENALTY
              Value of mismatch score for extension (should be negative)

       --gap-penalty GAP_PENALTY
              Value of gap penalty for extension (should be negative)

   Misc:
       --skip-phylogeny
              Do not generate phylogeny from core SNPs

       --validate-input
              Use Biopython to validate input files

       --use-fasttree
              Use FastTree instead of RAxML

       --vcf  Generate VCF file.

       -p THREADS, --threads THREADS
              Number of threads to use

       -P MAX_PARTITION_SIZE, --max-partition-size MAX_PARTITION_SIZE
              Max partition size (limits memory usage)

       -v, --verbose
              Verbose output

       -x, --xtrafast

       -i INIFILE, --inifile INIFILE, --ini-file INIFILE

       -e, --extend

       -V, --version
              show program's version number and exit

Name

       parsnp - rapid core genome multi-alignment

See Also