lambda - the Local Aligner for Massive Biological DatA

Description

       Lambda is a local aligner optimized for many query sequences and searches in protein space. It is
       compatible with BLAST, but much faster than BLAST and many other comparable tools.

       Detailed information is available in the wiki: <https://github.com/seqan/lambda/wiki>

Name

       lambda - the Local Aligner for Massive Biological DatA

Options

       -h, --help
              Display the help message.

       -hh, --full-help
              Display the help message with advanced options.

       --version-check BOOL
              Turn this option off to disable version update notifications of the application.  One  of  1,  ON,
              TRUE, T, YES, 0, OFF, FALSE, F, and NO. Default: 1.

       --version
              Display version information.

       --copyright
              Display long copyright information.

       -v, --verbosity INTEGER
              Display  more/less diagnostic output during operation: 0 [only errors]; 1 [default]; 2 [+run-time,
              options and statistics]. In range [0..2]. Default: 1.

   Input Options:
       -q, --query INPUT_FILE
              Query sequences. Valid filetypes are: .sam[.*], .raw[.*], .gbk[.*], .frn[.*],  .fq[.*],  .fna[.*],
              .ffn[.*],  .fastq[.*],  .fasta[.*],  .faa[.*], .fa[.*], .embl[.*], and .bam, where * is any of the
              following extensions: gz, bz2, and bgzf for transparent (de)compression.

       -d, --database INPUT_FILE
              Path to original database sequences (a precomputed index with .sa or .fm needs to  exist!).  Valid
              filetypes  are:  .sam[.*],  .raw[.*], .gbk[.*], .frn[.*], .fq[.*], .fna[.*], .ffn[.*], .fastq[.*],
              .fasta[.*], .faa[.*], .fa[.*], .embl[.*], and .bam, where * is any of  the  following  extensions:
              gz, bz2, and bgzf for transparent (de)compression.

       -di, --db-index-type STRING
              Format of the database index. One of sa and fm. Default: fm.

   Output Options:
       -o, --output OUTPUT_FILE
              File to hold reports on hits (.m* are blastall -m* formats; .m8 is tab-separated, .m9 is
              tab-separated with comments, .m0 is pairwise format). Valid filetypes are: .sam[.*], .m9[.*],
              .m8[.*], .m0[.*], and .bam, where * is any of the following extensions: gz, bz2, and bgzf for
              transparent (de)compression. Default: output.m8.

       -oc, --output-columns STRING
              Print specified column combination and/or order (.m8 and .m9 outputs only); call -oc help for more
              details. Default: std.

       -id, --percent-identity INTEGER
              Output only matches above this threshold  (checked  before  e-value  check).  In  range  [0..100].
              Default: 0.

       -e, --e-value DOUBLE
              Output only matches that score below this threshold. In range [0..inf]. Default: 0.1.

       -nm, --num-matches INTEGER
              Print at most this number of matches per query. In range [1..inf]. Default: 500.

       --sam-with-refheader STRING
              BAM  files require all subject names to be written to the header. For SAM this is not required, so
              Lambda does not automatically do it to save space (especially  for  protein  database  this  is  a
              lot!). If you still want them with SAM, e.g. for better BAM compatibility, use this option. One of
              on and off. Default: off.

       --sam-bam-seq STRING
              Write matching DNA subsequence into SAM/BAM file (BLASTN). For BLASTX and TBLASTX the matching
              protein sequence is "untranslated" and positions retransformed to the original sequence. For
              BLASTP and TBLASTN there is no DNA sequence so a "*" is written to the SEQ column. The matching
              protein sequence can be written as an optional tag, see --sam-bam-tags. If set to uniq then the
              sequence is omitted if and only if it is identical to the previous match's subsequence. One of
              always, uniq, and never. Default: uniq.

       --sam-bam-tags STRING
              Write the specified optional columns to the SAM/BAM file. Call --sam-bam-tags help for more
              details. Default: AS NM ZE ZI ZF.

       --sam-bam-clip STRING
              Whether  to  hard-clip  or soft-clip the regions beyond the local match. Soft-clipping retains the
              full sequence in the output file, but obviously uses more space. One of hard  and  soft.  Default:
              hard.
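
       As an illustration of the .m8 output and of the -id, -e and -nm thresholds described above, the
       following Python sketch reads tab-separated hits and re-applies the three filters after the fact. It
       assumes that the default std column set matches the twelve standard blastall -m8 columns (query,
       subject, percent identity, alignment length, mismatches, gap opens, query start/end, subject
       start/end, e-value, bit score). The names Hit, parse_m8 and filter_hits are hypothetical and not part
       of Lambda; treating the thresholds as inclusive and ranking hits by e-value are also assumptions.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass
class Hit:
    query: str
    subject: str
    percent_identity: float
    evalue: float
    bit_score: float


def parse_m8(lines: Iterable[str]) -> List[Hit]:
    """Parse tab-separated hits; '#' comment lines (as in .m9) are skipped."""
    hits = []
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue
        cols = line.split("\t")
        # Assumed layout: the 12 standard blastall -m8 columns.
        hits.append(Hit(cols[0], cols[1], float(cols[2]),
                        float(cols[10]), float(cols[11])))
    return hits


def filter_hits(hits: Iterable[Hit], min_identity: float = 0.0,
                max_evalue: float = 0.1, num_matches: int = 500) -> List[Hit]:
    """Identity check first, then e-value, then cap the hits per query."""
    kept: Dict[str, List[Hit]] = {}
    for hit in hits:
        if hit.percent_identity < min_identity:
            continue
        if hit.evalue > max_evalue:
            continue
        kept.setdefault(hit.query, []).append(hit)
    result = []
    for query_hits in kept.values():
        query_hits.sort(key=lambda h: h.evalue)  # assumption: best e-value first
        result.extend(query_hits[:num_matches])
    return result


example = [
    "q1\ts1\t95.0\t100\t5\t0\t1\t100\t1\t100\t1e-50\t200",
    "q1\ts2\t40.0\t100\t60\t0\t1\t100\t1\t100\t1e-5\t50",
    "q1\ts3\t90.0\t100\t10\t0\t1\t100\t1\t100\t5.0\t20",
]
for hit in filter_hits(parse_m8(example), min_identity=50):
    print(hit.query, hit.subject, hit.evalue)
```

       In this made-up example only the first hit survives: the second fails the identity threshold and the
       third fails the default e-value threshold of 0.1.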

   General Options:
       -t, --threads INTEGER
              Number of threads to run concurrently.

       -qi, --query-index-type STRING
              Controls double-indexing. One of radix and none. Default: none.

   Alphabets and Translation:
       -p, --program STRING
              Blast Operation Mode. One of blastn, blastp, blastx, tblastn, and tblastx. Default: blastx.

       -g, --genetic-code INTEGER
              The translation table to use for nucleotide -> amino acid translation (not for BlastN, BlastP).
              See https://www.ncbi.nlm.nih.gov/Taxonomy/Utils/wprintgc.cgi?mode=c for ids (default is
              generic). Six frames are generated. Default: 1.

       -ar, --alphabet-reduction STRING
              Alphabet  Reduction  for  seeding  phase  (ignored for BLASTN). One of none and murphy10. Default:
              murphy10.
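
       To make the "six frames" mentioned under --genetic-code concrete, here is a minimal Python sketch of
       six-frame translation with translation table 1 (the standard code). It is an illustration only, not
       Lambda's implementation: the function names are hypothetical, only unambiguous A/C/G/T input is
       handled, codons containing other letters become X, and incomplete trailing codons are dropped.

```python
from itertools import product
from typing import List

BASES = "TCAG"
# Amino acids of NCBI translation table 1 (standard code), codons in TCAG order.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Complement every base and reverse the strand."""
    return dna.translate(COMPLEMENT)[::-1]


def translate(dna: str) -> str:
    """Translate complete codons; codons with unknown letters become 'X'."""
    return "".join(CODON_TABLE.get(dna[i:i + 3], "X")
                   for i in range(0, len(dna) - 2, 3))


def six_frames(dna: str) -> List[str]:
    """Three forward frames followed by three reverse-complement frames."""
    dna = dna.upper()
    reverse = reverse_complement(dna)
    forward_frames = [translate(dna[offset:]) for offset in range(3)]
    reverse_frames = [translate(reverse[offset:]) for offset in range(3)]
    return forward_frames + reverse_frames


print(six_frames("ATGGCC"))
```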

   Seeding/Filtration:
       -sl, --seed-length INTEGER
              Length of the seeds (default = 14 for BLASTN). Default: 10.

       -so, --seed-offset INTEGER
              Offset for seeding (if unset = seed-length, non-overlapping; default = 5 for BLASTN). Default: 10.

       -sd, --seed-delta INTEGER
              Maximum seed distance. Default: 1.
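
       The interplay of --seed-length and --seed-offset can be pictured with a short Python sketch that
       lists where seeds would start on a sequence. This is a simplified model under stated assumptions:
       seed_starts is a hypothetical helper, seeds that would run past the end of the sequence are simply
       not generated, and --seed-delta is not modelled at all.

```python
from typing import List, Optional


def seed_starts(sequence_length: int, seed_length: int = 10,
                seed_offset: Optional[int] = None) -> List[int]:
    """Start positions of seeds placed every seed_offset characters."""
    if seed_offset is None:
        seed_offset = seed_length  # unset offset: non-overlapping seeds
    if seed_length < 1 or seed_offset < 1:
        raise ValueError("seed length and offset must be positive")
    # Only seeds that fit completely into the sequence are listed (assumption).
    return list(range(0, sequence_length - seed_length + 1, seed_offset))


print(seed_starts(35))                 # offset equals seed length
print(seed_starts(35, seed_offset=5))  # overlapping seeds, as with -so 5
```

       With an offset of 5 twice as many seeds cover the same sequence, which is why a smaller offset
       trades speed for sensitivity.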

   Miscellaneous Heuristics:
       -ps, --pre-scoring INTEGER
              Evaluate score of a region NUM times the size of the seed before extension (0 -> no pre-scoring,
              1 -> evaluate seed, n -> area around seed, as well; default = 1 if no reduction is used). In
              range [1..inf]. Default: 2.

       -pt, --pre-scoring-threshold DOUBLE
              Minimum average score per position in pre-scoring region. Default: 2.

       -pd, --filter-putative-duplicates STRING
              Filter hits that will likely duplicate a match already found. One of on and off. Default: on.

       -pa, --filter-putative-abundant STRING
              If the maximum number of matches per query has already been found, stop searching if the
              remaining realm looks infeasible. One of on and off. Default: on.

   Scoring:
       -sc, --scoring-scheme INTEGER
              Use '45' for Blosum45; '62' for Blosum62 (default); '80' for Blosum80 [ignored for BlastN].
              Default: 62.

       -ge, --score-gap INTEGER
              Score per gap character (default = -2 for BLASTN). Default: -1.

       -go, --score-gap-open INTEGER
              Additional cost for opening gap (default = -5 for BLASTN). Default: -11.

       -ma, --score-match INTEGER
              Match score [only BLASTN]. Default: 2.

       -mi, --score-mismatch INTEGER
              Mismatch score [only BLASTN]. Default: -3.
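
       The gap options describe an affine gap model: --score-gap is charged per gap character and
       --score-gap-open is charged once more per gap. The Python sketch below works through the arithmetic.
       It is a reading of the option descriptions rather than Lambda's code; gap_score and blastn_score are
       hypothetical helpers.

```python
from typing import Iterable


def gap_score(gap_length: int, score_gap: int = -1,
              score_gap_open: int = -11) -> int:
    """Affine score of one gap: opening cost once, plus a cost per character."""
    if gap_length < 0:
        raise ValueError("gap length must not be negative")
    if gap_length == 0:
        return 0
    return score_gap_open + gap_length * score_gap


def blastn_score(matches: int, mismatches: int, gap_lengths: Iterable[int],
                 score_match: int = 2, score_mismatch: int = -3,
                 score_gap: int = -2, score_gap_open: int = -5) -> int:
    """Score of a nucleotide alignment using the BLASTN defaults listed above."""
    total = matches * score_match + mismatches * score_mismatch
    for length in gap_lengths:
        total += gap_score(length, score_gap, score_gap_open)
    return total


print(gap_score(1))                # protein defaults: -11 + 1 * -1
print(gap_score(3))                # protein defaults: -11 + 3 * -1
print(blastn_score(50, 2, [3]))    # 100 - 6 + (-5 + 3 * -2)
```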

   Extension:
       -x, --x-drop INTEGER
              Stop banded extension if the score drops x below the maximum seen (-1 means no xdrop). In range
              [-1..inf]. Default: 30.

       -b, --band INTEGER
              Size of the DP-band used in extension (-3 means log2 of query length; -2 means sqrt of query
              length; -1 means full dp; n means band of size 2n+1). In range [-3..inf]. Default: -3.
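
       The two extension options can be illustrated with a small Python sketch. resolve_band maps the
       special --band values to a band parameter n, and xdrop_extend shows the x-drop stopping rule on a
       plain list of per-position scores. Both are hypothetical helpers and deliberately simplified: that
       the log2 and sqrt values are rounded down and then used as n is an assumption, the extension shown is
       ungapped rather than banded, and stopping only when the drop strictly exceeds x is also an
       assumption.

```python
import math
from typing import Optional, Sequence, Tuple


def resolve_band(band: int, query_length: int) -> Optional[int]:
    """Return the band parameter n, or None for full dynamic programming."""
    if band < -3 or query_length < 1:
        raise ValueError("band must be >= -3 and the query must not be empty")
    if band == -3:
        return int(math.log2(query_length))  # rounding down is an assumption
    if band == -2:
        return math.isqrt(query_length)      # rounding down is an assumption
    if band == -1:
        return None
    return band


def band_size(n: int) -> int:
    """A band parameter n corresponds to a band of size 2n+1."""
    return 2 * n + 1


def xdrop_extend(scores: Sequence[int], x_drop: int = 30) -> Tuple[int, int]:
    """Return (best score, length at best score) of an ungapped extension."""
    best_score = current = 0
    best_length = 0
    for length, score in enumerate(scores, start=1):
        current += score
        if current > best_score:
            best_score, best_length = current, length
        elif x_drop >= 0 and best_score - current > x_drop:
            break  # score fell more than x below the maximum seen
    return best_score, best_length


print(resolve_band(-3, 1024), band_size(resolve_band(-3, 1024)))
print(xdrop_extend([5, 5, -4, -4, -4, 20], x_drop=10))
print(xdrop_extend([5, 5, -4, -4, -4, 20], x_drop=-1))
```

       In the last two calls the same scores give a shorter extension with an x-drop of 10 than with x-drop
       disabled, because the run of negative scores triggers the stop before the final high-scoring
       position is reached.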

Synopsis

       lambda [OPTIONS] -q QUERY.fasta -d DATABASE.fasta [-o output.m8]

Tuning

       Tuning  the seeding parameters and (de)activating alphabet reduction has a strong influence on both speed
       and sensitivity. We recommend the following alternative profiles for protein searches:

       fast (high similarity):       -ar none -sl 7 -sd 0

       sensitive (lower similarity): -so 5

       For further information see the wiki: <https://github.com/seqan/lambda/wiki>
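
       For scripted runs, the profiles above can be kept in one place and appended to the basic call from
       the synopsis. The Python sketch below only assembles the argument list; it does not start Lambda.
       PROFILES and lambda_command are hypothetical names, and the file names are placeholders.

```python
import shlex
from typing import Dict, List

# Extra arguments per profile, taken from the recommendations above.
PROFILES: Dict[str, List[str]] = {
    "default": [],
    "fast": ["-ar", "none", "-sl", "7", "-sd", "0"],
    "sensitive": ["-so", "5"],
}


def lambda_command(query: str, database: str, output: str = "output.m8",
                   profile: str = "default") -> List[str]:
    """Build the argument list for a lambda call with a tuning profile."""
    if profile not in PROFILES:
        raise ValueError("unknown profile: " + profile)
    return ["lambda", "-q", query, "-d", database, "-o", output] + PROFILES[profile]


command = lambda_command("QUERY.fasta", "DATABASE.fasta", profile="fast")
print(" ".join(shlex.quote(part) for part in command))
```

       The resulting list could be handed to subprocess.run, provided that lambda is installed and the
       database index already exists.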

See Also