
sortmerna - tool for filtering, mapping and OTU-picking NGS reads

Description

       SortMeRNA  is  a  biological sequence analysis tool for filtering, mapping and OTU-picking NGS reads. The
       core algorithm is based on approximate seeds and allows for fast and  sensitive  analyses  of  nucleotide
       sequences.  The main application of SortMeRNA is filtering rRNA from metatranscriptomic data.  Additional
       applications include OTU-picking and taxonomy assignation available through QIIME v1.9+ (http://qiime.org
       - v1.9.0-rc1).

       SortMeRNA takes as input a file of reads (fasta or fastq  format)  and  one  or  multiple  rRNA  database
       file(s), and sorts apart rRNA and rejected reads into two files specified by the user. Optionally, it can
       provide  high  quality  local  alignments  of  rRNA reads against the rRNA database. SortMeRNA works with
       Illumina, 454, Ion Torrent and PacBio data, and can produce SAM and BLAST-like alignments.

Name

       sortmerna - tool for filtering, mapping and OTU-picking NGS reads

Options

   MANDATORY OPTIONS
       --ref STRING,STRING
              FASTA reference file, index file
              Example:
              --ref /path/to/file1.fasta,/path/to/index1
              If passing multiple reference sequence files, separate them by ':'
              Example:
              --ref /path/f1.fasta,/path/index1:/path/f2.fasta,path/index2

       --reads STRING
              FASTA/FASTQ reads file

       --aligned STRING
              aligned reads filepath + base file name (appropriate extension will be added)
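
The ':'-separated list of 'fasta,index' pairs passed to --ref is easy to get wrong by hand. The following is a minimal sketch of assembling the mandatory options programmatically. The helper names (build_ref_arg, build_command) and all file paths are hypothetical and chosen only for illustration; the sketch prints the command line and does not run sortmerna.

```python
import shlex


def build_ref_arg(pairs):
    """Join (fasta, index) pairs as 'fasta,index' entries separated by ':'."""
    return ":".join(f"{fasta},{index}" for fasta, index in pairs)


def build_command(pairs, reads, aligned_base):
    """Return an argv list containing only the mandatory options."""
    return [
        "sortmerna",
        "--ref", build_ref_arg(pairs),
        "--reads", reads,
        "--aligned", aligned_base,
    ]


if __name__ == "__main__":
    # All paths below are placeholders, not files shipped with SortMeRNA.
    argv = build_command(
        [("/db/rrna_a.fasta", "/db/rrna_a_idx"),
         ("/db/rrna_b.fasta", "/db/rrna_b_idx")],
        "reads.fastq",
        "out/aligned",
    )
    print(shlex.join(argv))
```

Passing the list to a process launcher as separate arguments (rather than as one shell string) avoids quoting problems when paths contain spaces.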

   COMMON OPTIONS
       --other STRING
              rejected reads filepath + base file name (appropriate extension will be added)

       --fastx BOOL
              output FASTA/FASTQ file (default: off, for aligned and/or rejected reads)

       --sam BOOL
              output SAM alignment (default: off, for aligned reads only)

       --SQ BOOL
              add SQ tags to the SAM file (default: off)

       --blast INT
              output alignments in various Blast-like formats
              0 - pairwise
              1 - tabular (Blast -m 8 format)
              2 - tabular + column for CIGAR
              3 - tabular + columns for CIGAR and query coverage

       --log BOOL
              output overall statistics (default: off)

       --num_alignments INT
              report first INT alignments per read reaching E-value (default: -1, --num_alignments  0  signifies
              all alignments will be output)

       or (default)

       --best INT
              report INT best alignments per read reaching E-value (default: 1) by searching --min_lis INT
              candidate alignments (--best 0 signifies all candidate alignments will be searched)

       --min_lis INT
              search all alignments having the first INT longest LIS (default: 2). LIS stands for Longest
              Increasing Subsequence; it is computed using seeds' positions to expand hits into longer matches
              prior to Smith-Waterman alignment.
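
To make the LIS idea concrete, here is a textbook sketch of computing a longest strictly increasing subsequence over a list of seed-hit positions. It illustrates the concept only and is not SortMeRNA's implementation; the function name and the input positions are made up.

```python
def longest_increasing_subsequence(positions):
    """Return one longest strictly increasing subsequence of `positions`.

    Simple O(n^2) dynamic programme: length[i] is the length of the best
    subsequence ending at index i, prev[i] links back to its predecessor.
    """
    n = len(positions)
    if n == 0:
        return []
    length = [1] * n
    prev = [-1] * n
    for i in range(n):
        for j in range(i):
            if positions[j] < positions[i] and length[j] + 1 > length[i]:
                length[i] = length[j] + 1
                prev[i] = j
    # Walk back from the end of the longest chain.
    end = max(range(n), key=lambda i: length[i])
    chain = []
    while end != -1:
        chain.append(positions[end])
        end = prev[end]
    return chain[::-1]


if __name__ == "__main__":
    # Hypothetical reference positions of seed hits, in read order.
    print(longest_increasing_subsequence([10, 40, 20, 30, 5, 50]))
```

Seed hits whose reference positions increase in the same order as their read positions are mutually consistent, which is why a long increasing chain is a good candidate region for full alignment.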

       --print_all_reads
              output null alignment strings for non-aligned reads (default: off) to  SAM  and/or  BLAST  tabular
              files

       --paired_in BOOL
              both  paired-end  reads  go  in  --aligned fasta/q file (default: off, interleaved reads only, see
              Section 4.2.4 of User Manual)

       --paired_out BOOL
              both paired-end reads go in --other fasta/q  file  (default:  off,  interleaved  reads  only,  see
              Section 4.2.4 of User Manual)

       --match INT
              SW score (positive integer) for a match (default: 2)

       --mismatch INT
              SW penalty (negative integer) for a mismatch (default: -3)

       --gap_open INT
              SW penalty (positive integer) for introducing a gap (default: 5)

       --gap_ext INT
              SW penalty (positive integer) for extending a gap (default: 2)

       -N INT
              SW penalty for ambiguous letters (N's) (default: scored as --mismatch)
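
The effect of the scoring options can be illustrated by scoring an already gapped alignment with the defaults listed above. This is a sketch under stated assumptions, not SortMeRNA's scoring code: it assumes the first position of a gap costs --gap_open and each further position costs --gap_ext, and that the -N value is added as a signed score. The function name is hypothetical.

```python
def score_alignment(query, ref, match=2, mismatch=-3,
                    gap_open=5, gap_ext=2, n_penalty=None):
    """Score two equal-length aligned strings in which '-' marks a gap.

    Assumptions (not confirmed by the man page): a gap of length k costs
    gap_open + (k - 1) * gap_ext, and a column containing 'N' scores
    n_penalty, which defaults to the mismatch score.
    """
    if len(query) != len(ref):
        raise ValueError("aligned strings must have equal length")
    if n_penalty is None:
        n_penalty = mismatch
    score = 0
    gap_row = None  # which sequence the currently open gap is in, if any
    for q, r in zip(query, ref):
        if q == "-" or r == "-":
            row = "query" if q == "-" else "ref"
            score -= gap_ext if gap_row == row else gap_open
            gap_row = row
        else:
            gap_row = None
            if q == "N" or r == "N":
                score += n_penalty
            elif q == r:
                score += match
            else:
                score += mismatch
    return score


if __name__ == "__main__":
    print(score_alignment("AC--GT", "ACTTGT"))
```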

       -F BOOL
              search only the forward strand (default: off)

       -R BOOL
              search only the reverse-complementary strand (default: off)

       -a INT
              number of threads to use (default: 1)

       -e DOUBLE
              E-value threshold (default: 1)

       -m INT
              INT Mbytes for loading the reads into memory (default: 1024, maximum -m INT is 5872)

       -v BOOL
              verbose (default: off)

   OTU PICKING OPTIONS
       --id DOUBLE
              %id similarity threshold (the alignment must still pass the E-value threshold, default: 0.97)

       --coverage DOUBLE
              %query coverage threshold (the alignment must still pass the E-value threshold, default: 0.97)

       --de_novo_otu BOOL
              FASTA/FASTQ file for reads matching database < %id
              (set using --id) and < %cov (set using --coverage)
              (alignment must still pass the E-value threshold, default: off)
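
How the two thresholds combine can be sketched as below. The exact definitions used by SortMeRNA are not given here, so this sketch assumes identity is matches divided by alignment length, coverage is aligned query bases divided by query length, and both comparisons are inclusive. The function name and numbers are illustrative only.

```python
def passes_otu_thresholds(matches, aln_length, aligned_query_len, query_len,
                          min_id=0.97, min_cov=0.97):
    """Return True if an alignment meets both the --id and --coverage cutoffs.

    Assumed definitions (hypothetical, for illustration):
      identity = matches / aln_length
      coverage = aligned_query_len / query_len
    """
    identity = matches / aln_length
    coverage = aligned_query_len / query_len
    return identity >= min_id and coverage >= min_cov


if __name__ == "__main__":
    # 98 matching columns in a 100-column alignment covering the whole read.
    print(passes_otu_thresholds(98, 100, 100, 100))
    # Same identity, but only 80 of 100 read bases are aligned.
    print(passes_otu_thresholds(98, 100, 80, 100))
```

Reads that pass the E-value threshold but fail these cutoffs are the ones --de_novo_otu is described as collecting.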

       --otu_map BOOL
              output OTU map (input to QIIME's make_otu_table.py, default: off)

   ADVANCED OPTIONS
       see SortMeRNA user manual for more details

       --passes INT
              three  intervals  at  which  to  place  the  seed  on  the  read  (L  is  the  seed  length set in
              indexdb_rna(1), default: L,L/2,3)
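
One way to read the default L,L/2,3 is as three step sizes used in successive passes over the read. The sketch below computes those defaults for a given seed length and lists the seed start offsets for one step size. Both the integer division for L/2 and the interpretation of an interval as a step between seed start positions are assumptions; see the user manual for the actual behaviour. The function names are hypothetical.

```python
def default_passes(seed_len):
    """Return the assumed default intervals (L, L/2, 3) for seed length L."""
    return (seed_len, seed_len // 2, 3)


def seed_starts(read_len, seed_len, step):
    """List start offsets of seeds of length seed_len placed every `step` bases.

    Only seeds that fit entirely inside the read are returned.
    """
    return list(range(0, read_len - seed_len + 1, step))


if __name__ == "__main__":
    seed_len = 18  # example value only
    for step in default_passes(seed_len):
        print(step, seed_starts(30, seed_len, step))
```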

       --edges INT
              number (or percent if INT followed by % sign) of nucleotides to add to each edge of the read prior
              to SW local alignment (default: 4)

       --num_seeds INT
              number of seeds matched before searching for candidate LIS (default: 2)

       --full_search BOOL
              search for all 0-error and 1-error seed matches in the index rather than stopping after finding a
              0-error match (<1% gain in sensitivity with up to four-fold decrease in speed, default: off)

       --pid BOOL
              add pid to output file names (default: off)

       -h BOOL
              help

       --version BOOL
              SortMeRNA version number

sortmerna 2.0                                      August 2015                                      SORTMERNA(1)

Synopsis

sortmerna --ref db.fasta,db.idx --reads file.fa --aligned base_name_output [OPTIONS]

See Also