razers3 - Faster, fully sensitive read mapping

Description

       RazerS 3 is a versatile, fully sensitive read mapper based on k-mer counting and seeding filters. It
       supports single and paired-end mapping, shared-memory parallelism, and optimally parametrizes the  filter
       based   on   a   user-defined  minimal  sensitivity.  See  http://www.seqan.de/projects/razers  for  more
       information.

       Input to RazerS 3 is a reference genome file and either one file  with  single-end  reads  or  two  files
       containing left or right mates of paired-end reads. Use - to read single-end reads from stdin.

       (c) Copyright 2009-2014 by David Weese.

Examples

       razers3 -i 96 -tc 12 -o mapped.razers hg18.fa reads.fq
              Map single-end reads with 4% error rate using 12 threads.

       razers3 -i 95 --no-gaps -o mapped.razers hg18.fa reads.fq.gz
              Map single-end gzipped reads with 5% error rate and no indels.

       razers3 -i 94 -rr 95 -tc 12 -ll 280 -le 80 -o mapped.razers hg18.fa reads_1.fq reads_2.fq
              Map paired-end reads with up to 6% errors, 95% sensitivity, 12 threads, and  only  output  aligned
              pairs with an outer distance of 200-360bp.
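
       The invocations above share one pattern, which can be sketched in Python. This is a minimal sketch, not
       part of RazerS 3: the helper names razers3_argv and outer_distance_range and their parameters are
       hypothetical, and only the flags (-i, -tc, -ll, -le, -o) come from this page. It assumes, as the examples
       suggest, that the percent identity is 100 minus the tolerated error rate and that accepted pairs lie
       within library length plus or minus the tolerance.

```python
def razers3_argv(genome, reads, output, max_error_pct=5, threads=1,
                 library_length=None, library_error=None):
    """Assemble a razers3 argument list (hypothetical helper)."""
    if len(reads) not in (1, 2):
        raise ValueError("expected one (single-end) or two (paired-end) read files")
    # Percent identity threshold = 100 - tolerated error rate.
    argv = ["razers3", "-i", str(100 - max_error_pct), "-tc", str(threads)]
    # Library options only make sense for paired-end input.
    if len(reads) == 2 and library_length is not None:
        argv += ["-ll", str(library_length)]
        if library_error is not None:
            argv += ["-le", str(library_error)]
    argv += ["-o", output, genome, *reads]
    return argv


def outer_distance_range(library_length, library_error):
    """Accepted outer distances, assuming length +/- tolerance."""
    return (library_length - library_error, library_length + library_error)


argv = razers3_argv("hg18.fa", ["reads.fq"], "mapped.razers",
                    max_error_pct=4, threads=12)
print(" ".join(argv))
# prints: razers3 -i 96 -tc 12 -o mapped.razers hg18.fa reads.fq
```

       To actually run the mapper, the list could be passed to subprocess.run(argv, check=True), assuming
       razers3 is installed and on PATH.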

razers3 3.5.8 [tarball]                                                                               RAZERS3(1)

Formats, Naming, Sorting, And Coordinate Schemes

       RazerS  3 supports various output formats. The output format is detected automatically from the file name
       suffix.

       .razers
              Razer format

       .fa, .fasta
              Enhanced Fasta format

       .eland Eland format

       .gff   GFF format

       .sam   SAM format

       .bam   BAM format

       .afg   Amos AFG format
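
       The suffix table above can be expressed as a simple lookup. This is a minimal sketch, not RazerS 3
       code: the function name guess_output_format is hypothetical, matching is assumed to be exact and
       case-sensitive, and what razers3 itself does with an unknown suffix is not described on this page (the
       sketch simply returns None).

```python
import os

# Suffix-to-format table transcribed from the list above.
OUTPUT_FORMATS = {
    ".razers": "Razer",
    ".fa": "Enhanced Fasta",
    ".fasta": "Enhanced Fasta",
    ".eland": "Eland",
    ".gff": "GFF",
    ".sam": "SAM",
    ".bam": "BAM",
    ".afg": "Amos AFG",
}


def guess_output_format(filename):
    """Return the format name for an output file name, or None if unknown."""
    suffix = os.path.splitext(filename)[1]  # e.g. ".sam" for "mapped.sam"
    return OUTPUT_FORMATS.get(suffix)


print(guess_output_format("mapped.sam"))     # prints: SAM
print(guess_output_format("mapped.razers"))  # prints: Razer
```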

              By default, reads and contigs are referred to by their Fasta ids given in the input files. With
              the -gn and -rn options this behaviour can be changed:

       0      Use Fasta id.

       1      Enumerate beginning with 1.

       2      Use the read sequence (only for short reads!).

       3      Use the Fasta id, do NOT append /L or /R for mate pairs.

              The way matches are sorted in the output file can be changed with the -so option for the following
              formats: razers, fasta, sam, and afg. Primary and secondary sort keys are:

       0      1. read number, 2. genome position

       1      1. genome position, 2. read number
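
       The two sort orders differ only in which key is primary. A minimal Python sketch of that idea follows;
       the Match record and sort_matches function are hypothetical, and the genome position is simplified to a
       single integer (a real match would also carry a contig identifier).

```python
from typing import NamedTuple


class Match(NamedTuple):
    """Hypothetical, simplified match record."""
    read_number: int
    genome_position: int


def sort_matches(matches, sort_order=0):
    """Sort matches the way the -so values above describe."""
    if sort_order == 0:
        # 1. read number, 2. genome position
        key = lambda m: (m.read_number, m.genome_position)
    elif sort_order == 1:
        # 1. genome position, 2. read number
        key = lambda m: (m.genome_position, m.read_number)
    else:
        raise ValueError("sort_order must be 0 or 1")
    return sorted(matches, key=key)


matches = [Match(2, 100), Match(1, 500), Match(1, 100)]
print(sort_matches(matches, 0))  # by read number first
print(sort_matches(matches, 1))  # by genome position first
```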

              The  coordinate  space used for begin and end positions can be changed with the -pf option for the
              razer and fasta formats:

       0      Gap space. Gaps between characters are counted from 0.

       1      Position space. Characters are counted from 1.
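
       The relation between the two coordinate spaces can be sketched as follows. This is an illustration
       under a stated assumption, not RazerS 3 code: gap space is taken as a 0-based half-open interval
       (begin gap, end gap), and position space as a 1-based interval whose end names the last aligned
       character. This page does not state the inclusiveness of the end position explicitly, and the function
       names are hypothetical.

```python
def gap_to_position(begin, end):
    """Convert a 0-based gap-space interval to 1-based position space.

    Assumes the position-space end is inclusive (an assumption, see above).
    """
    # Gap i lies before character i+1, so the first character is begin + 1.
    # The end gap lies after the last character, whose 1-based index is end.
    return (begin + 1, end)


def position_to_gap(begin, end):
    """Inverse of gap_to_position under the same assumption."""
    return (begin - 1, end)


# A 36-character match starting at the first character of the contig:
print(gap_to_position(0, 36))  # prints: (1, 36)
print(position_to_gap(1, 36))  # prints: (0, 36)
```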

Name

       razers3 - Faster, fully sensitive read mapping

Options

       -h, --help
              Display the help message.

       --version
              Display version information.

   Main Options:
       -i, --percent-identity DOUBLE
              Percent identity threshold. In range [50..100]. Default: 95.

       -rr, --recognition-rate DOUBLE
              Percent recognition rate. In range [80..100]. Default: 100.

       -ng, --no-gaps
              Allow only mismatches, no indels. Default: allow both.

       -f, --forward
              Map reads only to forward strands.

       -r, --reverse
              Map reads only to reverse strands.

       -m, --max-hits INTEGER
              Output only <NUM> of the best hits. In range [1..inf]. Default: 100.

       --unique
              Output only unique best matches (-m 1 -dr 0 -pa).

       -tr, --trim-reads INTEGER
              Trim reads to given length. Default: off. In range [14..inf].

       -o, --output OUTPUT_FILE
              Mapping result filename (use - to dump to stdout in razers format). Default: <READSFILE>.razers.
              Valid filetypes are: .sam, .razers, .gff, .fasta, .fa, .eland, .bam, and .afg.

       -v, --verbose
              Verbose mode.

       -vv, --vverbose
              Very verbose mode.

   Paired-end Options:
       -ll, --library-length INTEGER
              Paired-end library length. In range [1..inf]. Default: 220.

       -le, --library-error INTEGER
              Paired-end library length tolerance. In range [0..inf]. Default: 50.

   Output Format Options:
       -a, --alignment
              Dump the alignment for each match (only razer or fasta format).

       -pa, --purge-ambiguous
              Purge reads with more than <max-hits> best matches.

       -dr, --distance-range INTEGER
              Only consider matches with at most NUM more errors compared to the best. Default: output all.

       -gn, --genome-naming INTEGER
              Select how genomes are named (see the Naming section). In range [0..1]. Default: 0.

       -rn, --read-naming INTEGER
              Select how reads are named (see the Naming section). In range [0..3]. Default: 0.

       --full-readid
              Use the whole read id (don't clip after whitespace).

       -so, --sort-order INTEGER
              Select how matches are sorted (see the Sorting section). In range [0..1]. Default: 0.

       -pf, --position-format INTEGER
              Select begin/end position numbering (see the Coordinate section). In range [0..1]. Default: 0.

       -ds, --dont-shrink-alignments
              Disable alignment shrinking in SAM.  This is required for generating a gold mapping for Rabema.

   Filtration Options:
       -fl, --filter STRING
              Select k-mer filter. One of pigeonhole and swift. Default: pigeonhole.

       -mr, --mutation-rate DOUBLE
              Set the percent mutation rate (pigeonhole). In range [0..20]. Default: 5.

       -ol, --overlap-length INTEGER
              Manually set the overlap length of adjacent k-mers (pigeonhole). In range [0..inf].

       -pd, --param-dir STRING
              Read user-computed parameter files in the directory <DIR> (swift).

       -t, --threshold INTEGER
              Manually set minimum k-mer count threshold (swift). In range [1..inf].

       -tl, --taboo-length INTEGER
              Set taboo length (swift). In range [1..inf]. Default: 1.

       -s, --shape STRING
              Manually set k-mer shape.

       -oc, --overabundance-cut INTEGER
              Set k-mer overabundance cut ratio. In range [0..1]. Default: 1.

       -rl, --repeat-length INTEGER
              Skip simple-repeats of length <NUM>. In range [1..inf]. Default: 1000.

       -lf, --load-factor DOUBLE
              Set the load factor for the open addressing k-mer index. In range [1..inf]. Default: 1.6.

   Verification Options:
       -mN, --match-N
              N matches all other characters. Default: N matches nothing.

       -ed, --error-distr STRING
              Write error distribution to FILE.

       -mf, --mismatch-file STRING
              Write mismatch patterns to FILE.

   Misc Options:
       -cm, --compact-mult DOUBLE
              Multiply  compaction  threshold  by  this  value after reaching and compacting. In range [0..inf].
              Default: 2.2.

       -ncf, --no-compact-frac DOUBLE
              Don't compact if in this last fraction of genome. In range [0..1]. Default: 0.05.

   Parallelism Options:
       -tc, --thread-count INTEGER
              Set the number of threads to use (0 to force sequential mode). In range [0..inf]. Default: 1.

       -pws, --parallel-window-size INTEGER
              Collect candidates in windows of this length. In range [1..inf]. Default: 500000.

       -pvs, --parallel-verification-size INTEGER
              Verify candidates in packages of this size. In range [1..inf]. Default: 100.

       -pvmpc, --parallel-verification-max-package-count INTEGER
              Largest number of packages to create for verification per thread-1. In  range  [1..inf].  Default:
              100.

       -amms, --available-matches-memory-size INTEGER
              Bytes of main memory available for storing matches. In range [-1..inf]. Default: 0.

       -mhst, --match-histo-start-threshold INTEGER
              When to start histogram. In range [1..inf]. Default: 5.

Required Arguments

       ARGUMENT 0 INPUT_FILE
              A  reference  genome  file.  Valid filetypes are: .sam[.*], .raw[.*], .gbk[.*], .frn[.*], .fq[.*],
              .fna[.*], .ffn[.*], .fastq[.*], .fasta[.*], .faa[.*], .fa[.*], .embl[.*], and .bam, where * is any
              of the following extensions: gz, bz2, and bgzf for transparent (de)compression.

       READS List of INPUT_FILE's
              Either one (single-end) or two (paired-end) read files. Valid filetypes are:  .sam[.*],  .raw[.*],
              .gbk[.*],  .frn[.*],  .fq[.*],  .fna[.*],  .ffn[.*],  .fastq[.*],  .fasta[.*],  .faa[.*], .fa[.*],
              .embl[.*], and .bam, where * is any of the following extensions: gz, bz2, and bgzf for transparent
              (de)compression.
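
       The file type rule above (a base suffix, optionally followed by one compression suffix, with .bam
       accepted only uncompressed) can be sketched as a small check. This is an illustration, not RazerS 3
       code: the function name is_valid_input is hypothetical, matching is assumed to be case-sensitive, and
       the special name - for stdin is not handled.

```python
# Base suffixes that may carry an optional compression suffix, per the list above.
BASE_SUFFIXES = (".sam", ".raw", ".gbk", ".frn", ".fq", ".fna", ".ffn",
                 ".fastq", ".fasta", ".faa", ".fa", ".embl")
COMPRESSION_SUFFIXES = (".gz", ".bz2", ".bgzf")


def is_valid_input(filename):
    """Return True if the file name matches one of the listed input types."""
    # .bam is listed without the optional [.*] compression suffix.
    if filename.endswith(".bam"):
        return True
    name = filename
    # Strip at most one compression suffix, then check the base suffix.
    for comp in COMPRESSION_SUFFIXES:
        if name.endswith(comp):
            name = name[:-len(comp)]
            break
    return name.endswith(BASE_SUFFIXES)


print(is_valid_input("reads.fq.gz"))  # prints: True
print(is_valid_input("reads.txt"))    # prints: False
```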

Synopsis

       razers3 [OPTIONS] <GENOMEFILE> <READSFILE>
       razers3 [OPTIONS] <GENOMEFILE> <PE-READSFILE1> <PE-READSFILE2>

See Also