razers - Fast Read Mapping with Sensitivity Control
Description
RazerS is a versatile, fully sensitive read mapper based on a k-mer counting filter. It supports single-end
and paired-end mapping, and optimally parametrizes the filter based on a user-defined minimal sensitivity.
See http://www.seqan.de/projects/razers for more information.
Input to RazerS is a reference genome file and either one file with single-end reads or two files
containing the left and right mates of paired-end reads. Use - to read single-end reads from stdin.
(c) Copyright 2009 by David Weese.
Examples
razers example/genome.fa example/reads.fa -id -a -mN -v
Map single-end reads with 4% error rate, indels, and output the alignments. Ns are considered to
match everything.
razers example/genome.fa example/reads.fa example/reads2.fa -id -mN
Map paired-end reads with up to 4% errors, indels, and output concordantly mapped pairs within
default library size. Ns are considered to match everything.
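The "default library size" in the paired-end example refers to the -ll and -le defaults listed under Options (220 and 50). The following minimal sketch assumes that a pair counts as concordant when its span lies within the library length plus or minus the tolerance. This is an interpretation of the option descriptions, not something this page states explicitly, and the function names are hypothetical.

```python
def library_bounds(library_length=220, library_error=50):
    """Return the (min, max) pair span accepted under the assumed rule."""
    return library_length - library_error, library_length + library_error


def is_concordant(span, library_length=220, library_error=50):
    """Check whether a pair span falls inside the assumed library window."""
    low, high = library_bounds(library_length, library_error)
    return low <= span <= high


print(library_bounds())    # (170, 270)
print(is_concordant(250))  # True
print(is_concordant(300))  # False
```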
razers 1.5.8 RAZERS(1)
Formats, Naming, Sorting, And Coordinate Schemes
RazerS supports various output formats. The output format is detected automatically from the file name
suffix.
.razers      Razer format
.fa, .fasta  Enhanced Fasta format
.eland       Eland format
.gff         GFF format
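The suffix-based detection can be pictured as a simple table lookup. The sketch below only mirrors the suffix list above. It is not RazerS code, the function name is hypothetical, and returning None for an unknown suffix is an assumption, since this page does not say how such file names are handled.

```python
# Suffix-to-format table copied from the list above.
OUTPUT_FORMATS = {
    ".razers": "Razer format",
    ".fa": "Enhanced Fasta format",
    ".fasta": "Enhanced Fasta format",
    ".eland": "Eland format",
    ".gff": "GFF format",
}


def detect_output_format(filename):
    """Return the format name for a file name, or None if the suffix is unknown."""
    for suffix, name in OUTPUT_FORMATS.items():
        if filename.endswith(suffix):
            return name
    return None


print(detect_output_format("reads.fa.razers"))  # Razer format
print(detect_output_format("matches.gff"))      # GFF format
```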
By default, reads and contigs are referred to by their Fasta ids given in the input files. With the
-gn and -rn options this behaviour can be changed:
0 Use Fasta id.
1 Enumerate beginning with 1.
2 Use the read sequence (only for short reads!).
The way matches are sorted in the output file can be changed with the -so option for the following
formats: razer, fasta, sam, and amos. Primary and secondary sort keys are:
0 1. read number, 2. genome position
1 1. genome position, 2. read number
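The two sort orders differ only in which key comes first. As an illustration, with matches modeled as hypothetical (read number, genome position) tuples rather than real RazerS records:

```python
# Each match is a hypothetical (read_number, genome_position) tuple.
matches = [(2, 150), (1, 900), (2, 40), (1, 10)]

SORT_KEYS = {
    0: lambda m: (m[0], m[1]),  # -so 0: read number, then genome position
    1: lambda m: (m[1], m[0]),  # -so 1: genome position, then read number
}


def sort_matches(matches, sort_order=0):
    """Return the matches ordered by the primary and secondary keys of -so."""
    return sorted(matches, key=SORT_KEYS[sort_order])


print(sort_matches(matches, 0))  # [(1, 10), (1, 900), (2, 40), (2, 150)]
print(sort_matches(matches, 1))  # [(1, 10), (2, 40), (2, 150), (1, 900)]
```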
The coordinate space used for begin and end positions can be changed with the -pf option for the
razer and fasta formats:
0 Gap space. Gaps between characters are counted from 0.
1 Position space. Characters are counted from 1.
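To make the difference between the two coordinate spaces concrete, the sketch below converts a match interval between them. It assumes that the end coordinate in position space refers to the last matched character (inclusive), which is the usual reading but is not stated on this page. The helper names are hypothetical.

```python
def gap_to_position(begin, end):
    """Gap space (gaps counted from 0) to position space (characters from 1)."""
    return begin + 1, end


def position_to_gap(begin, end):
    """Position space (characters from 1) back to gap space (gaps from 0)."""
    return begin - 1, end


sequence = "ACGTACGT"
begin, end = 2, 5                   # gap-space interval
print(sequence[begin:end])          # GTA
print(gap_to_position(begin, end))  # (3, 5): characters 3 to 5 are G, T, A
```

Gap-space intervals coincide with Python slice indices, which is why the slice above needs no adjustment.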
Name
razers - Fast Read Mapping with Sensitivity Control
Options
-h, --help
Display the help message.
--version
Display version information.
Main Options:
-f, --forward
Map reads only to forward strands.
-r, --reverse
Map reads only to reverse strands.
-i, --percent-identity DOUBLE
Percent identity threshold. In range [50..100]. Default: 92.
-rr, --recognition-rate DOUBLE
Percent recognition rate. In range [80..100]. Default: 99.
-pd, --param-dir STRING
Read user-computed parameter files in the directory <DIR>.
-id, --indels
Allow indels. Default: mismatches only.
-ll, --library-length INTEGER
Paired-end library length. In range [1..inf]. Default: 220.
-le, --library-error INTEGER
Paired-end library length tolerance. In range [0..inf]. Default: 50.
-m, --max-hits INTEGER
Output only <NUM> of the best hits. In range [1..inf]. Default: 100.
--unique
Output only unique best matches (-m 1 -dr 0 -pa).
-tr, --trim-reads INTEGER
Trim reads to given length. In range [14..inf]. Default: off.
-o, --output OUTPUT_FILE
Change output filename (use - to dump to stdout in razers format). Default: <READSFILE>.razers.
Valid filetypes are: .razers, .gff, .fasta, .fa, and .eland.
-v, --verbose
Verbose mode.
-vv, --vverbose
Very verbose mode.
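As a worked example of the -i option, a percent identity threshold can be translated into a maximum number of errors per read. The sketch assumes that the error budget is the read length times (100 - identity)/100, rounded down. The rounding rule is an assumption and the function name is hypothetical.

```python
import math


def max_errors(read_length, percent_identity=92.0):
    """Error budget for one read under the assumed round-down rule."""
    return math.floor(read_length * (100.0 - percent_identity) / 100.0)


print(max_errors(36))        # 2  (36 * 0.08 = 2.88)
print(max_errors(100))       # 8
print(max_errors(50, 96.0))  # 2
```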
Output Format Options:
-a, --alignment
Dump the alignment for each match (only razer or fasta format).
-pa, --purge-ambiguous
Purge reads with more than <max-hits> best matches.
-dr, --distance-range INTEGER
Only consider matches with at most <NUM> more errors compared to the best. Default: output all.
-gn, --genome-naming INTEGER
Select how genomes are named (see the Naming section). In range [0..1]. Default: 0.
-rn, --read-naming INTEGER
Select how reads are named (see the Naming section). In range [0..2]. Default: 0.
-so, --sort-order INTEGER
Select how matches are sorted (see the Sorting section). In range [0..1]. Default: 0.
-pf, --position-format INTEGER
Select begin/end position numbering (see the Coordinate section). In range [0..1]. Default: 0.
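The interplay of -m, -dr, and -pa (and thus of --unique, which is shorthand for -m 1 -dr 0 -pa) can be sketched per read as follows. The matches of one read are modeled as a plain list of error counts. The order in which the three filters are applied is an assumption made for illustration, and the function is hypothetical rather than part of RazerS.

```python
def select_matches(error_counts, max_hits=100, distance_range=None,
                   purge_ambiguous=False):
    """Filter one read's matches, given as a list of error counts."""
    if not error_counts:
        return []
    ranked = sorted(error_counts)
    best = ranked[0]
    # -pa: drop the read if it has more than max_hits equally good best matches.
    if purge_ambiguous and ranked.count(best) > max_hits:
        return []
    # -dr: keep matches with at most distance_range more errors than the best.
    if distance_range is not None:
        ranked = [e for e in ranked if e <= best + distance_range]
    # -m: report at most max_hits of the best matches.
    return ranked[:max_hits]


print(select_matches([2, 0, 1]))                    # [0, 1, 2]
print(select_matches([0, 1, 3], distance_range=1))  # [0, 1]
print(select_matches([0, 1, 2], 1, 0, True))        # [0]
print(select_matches([1, 1, 3], 1, 0, True))        # []
```

The last two calls use the --unique settings: a read with a single best match keeps it, while a read with two equally good best matches is purged.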
Filtration Options:
-s, --shape STRING
Manually set k-mer shape. Default: 11111111111.
-t, --threshold INTEGER
Manually set minimum k-mer count threshold. In range [1..inf].
-oc, --overabundance-cut INTEGER
Set k-mer overabundance cut ratio. In range [0..1].
-rl, --repeat-length INTEGER
Skip simple-repeats of length <NUM>. In range [1..inf]. Default: 1000.
-tl, --taboo-length INTEGER
Set taboo length. In range [1..inf]. Default: 1.
-lm, --low-memory
Decrease memory usage at the expense of runtime.
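For intuition about the -s and -t options, the sketch below extracts k-mers according to a shape string and computes the classical q-gram lemma bound for contiguous shapes. It assumes that a 0 in the shape marks a position that is ignored, as in SeqAn gapped shapes. RazerS derives its actual threshold from sensitivity parameters, so the lemma is shown only as a reference point, and both function names are hypothetical.

```python
def shaped_kmers(sequence, shape="11111111111"):
    """Yield k-mers of `sequence`, keeping only positions where shape has '1'."""
    span = len(shape)
    keep = [i for i, c in enumerate(shape) if c == "1"]
    for start in range(len(sequence) - span + 1):
        yield "".join(sequence[start + i] for i in keep)


def qgram_lemma_threshold(read_length, q, errors):
    """Minimum number of shared contiguous q-grams (the value may be <= 0)."""
    return read_length - q + 1 - q * errors


print(list(shaped_kmers("ACGTAC", "1101")))  # ['ACT', 'CGA', 'GTC']
print(qgram_lemma_threshold(36, 11, 2))      # 4
```

A bound of zero or less means that counting k-mers of that length cannot guarantee finding every match with that many errors, which is one reason to choose a shorter or gapped shape.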
Verification Options:
-mN, --match-N
N matches all other characters. Default: N matches nothing.
-ed, --error-distr STRING
Write error distribution to FILE.
-mcl, --min-clipped-len INTEGER
Set minimal read length for read clipping. In range [0..inf]. Default: 0.
-qih, --quality-in-header
Quality string in fasta header.
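The effect of -mN on verification can be illustrated with a mismatch count over two equal-length sequences. The sketch reads "N matches nothing" literally, so that without -mN an N mismatches even another N. That reading is an assumption, and the helper names are hypothetical.

```python
def chars_match(a, b, match_n=False):
    """Compare two characters, treating N according to the -mN setting."""
    if a == "N" or b == "N":
        return match_n
    return a == b


def mismatches(read, window, match_n=False):
    """Count mismatching positions between a read and a genome window."""
    return sum(not chars_match(a, b, match_n) for a, b in zip(read, window))


print(mismatches("ACNT", "ACGT"))                # 1
print(mismatches("ACNT", "ACGT", match_n=True))  # 0
```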
Required Arguments
ARGUMENT0 INPUT_FILE
A reference genome file. Valid filetypes are: .sam[.*], .raw[.*], .gbk[.*], .frn[.*], .fq[.*],
.fna[.*], .ffn[.*], .fastq[.*], .fasta[.*], .faa[.*], .fa[.*], .embl[.*], and .bam, where * is any
of the following extensions: gz, bz2, and bgzf for transparent (de)compression.
READS List of INPUT_FILE's
Either one (single-end) or two (paired-end) read files. Valid filetypes are: .sam[.*], .raw[.*],
.gbk[.*], .frn[.*], .fq[.*], .fna[.*], .ffn[.*], .fastq[.*], .fasta[.*], .faa[.*], .fa[.*],
.embl[.*], and .bam, where * is any of the following extensions: gz, bz2, and bgzf for transparent
(de)compression.
Synopsis
razers [OPTIONS] <GENOMEFILE> <READSFILE>
razers [OPTIONS] <GENOMEFILE> <MP-READSFILE1> <MP-READSFILE2>
