Raptor-prepare - A fast and space-efficient pre-filter for querying very large collections of nucleotide sequences.
Description
Computes minimisers for use with raptor layout and raptor build.
Can continue where it left off after a crash or in multiple runs.
Examples
raptor prepare --input bins.list --output some_directory --kmer 20 --window 24
raptor prepare --input bins.list --output some_directory --kmer-count-cutoff 2
raptor prepare --input bins.list --output some_directory --use-filesize-dependent-cutoff
Legal
Raptor-prepare Copyright: BSD 3-Clause License
Author: Enrico Seiler
Contact: enrico.seiler@fu-berlin.de
SeqAn Copyright: 2006-2023 Knut Reinert, FU-Berlin; released under the 3-clause BSDL.
In your academic works please cite: Raptor: A fast and space-efficient pre-filter for querying very large
collections of nucleotide sequences; Enrico Seiler, Svenja Mehringer, Mitra Darvish, Etienne Turc, and
Knut Reinert; iScience 2021 24 (7): 102782. doi: https://doi.org/10.1016/j.isci.2021.102782
For full copyright and/or warranty information see --copyright.
raptor-prepare 3.0.1 (commit unavailable) Unavailable RAPTOR-PREPARE(1)
Name
Raptor-prepare - A fast and space-efficient pre-filter for querying very large collections of nucleotide
sequences.
Options
General options
--input (std::filesystem::path)
File containing file names. The file must contain at least one file path per line, with multiple
paths separated by whitespace. Each line in the file corresponds to one bin. Valid
extensions for the paths in the file are [minimiser] when using preprocessed input from
raptor prepare, and [embl,fasta,fa,fna,ffn,faa,frn,fas,fastq,fq,genbank,gb,gbk,sam], possibly
followed by [bz2,gz,bgzf]. The input file must exist and read permissions must be granted.
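The input format described above can be sketched as follows. This is a minimal illustration, not part of Raptor: the bin file names are hypothetical placeholders, and a single space is used as the separator within a bin.

```python
from pathlib import Path
import tempfile

# Hypothetical bins: each inner list holds the files that belong to one bin.
bins = [
    ["bin1.fasta"],
    ["bin2_part1.fa.gz", "bin2_part2.fa.gz"],  # two files, same bin
    ["bin3.fastq"],
]


def write_bin_list(bins, path):
    """Write one line per bin; paths within a bin are separated by a space."""
    lines = [" ".join(files) for files in bins]
    Path(path).write_text("\n".join(lines) + "\n")
    return lines


# Write the list into a temporary directory so the sketch has no side effects.
list_file = Path(tempfile.mkdtemp()) / "bins.list"
written = write_bin_list(bins, list_file)
```

The resulting file could then be passed to --input, provided the listed sequence files actually exist.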
--output (std::filesystem::path)
A valid path for the output directory.
Will create a minimiser.list inside the output directory. This file contains a list of generated
minimiser files, in the same order as the input.
When you manually delete a .in_progress file, also delete the corresponding .header and .minimiser file!
Created output files for each file:
*.header: Contains the shape, window size, cutoff and minimiser count.
*.minimiser: Contains binary minimiser values, one minimiser per line.
*.in_progress: Temporary file to track process. Deleted after finishing computation.
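The advice about .in_progress files can be sketched as a dry-run helper that lists which files belong to unfinished computations. This is an illustration under one assumption: the three files of a bin share the same base name and differ only in their final extension. The function name is hypothetical and nothing is deleted.

```python
from pathlib import Path


def stale_outputs(output_dir):
    """List files of unfinished computations (dry run, nothing is deleted).

    Assumption: <name>.in_progress, <name>.header and <name>.minimiser
    share the same base name.
    """
    stale = []
    for marker in sorted(Path(output_dir).rglob("*.in_progress")):
        for suffix in (".in_progress", ".header", ".minimiser"):
            candidate = marker.with_suffix(suffix)
            if candidate.exists():
                stale.append(candidate)
    return stale
```

Reviewing the returned paths before removing them (for example with Path.unlink) keeps the three file types consistent, as the note above requires.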
--threads (unsigned 8 bit integer)
The number of threads to use. Default: 1. Value must be a positive integer.
--quiet
Do not print time and memory usage.
k-mer options
--kmer (unsigned 8 bit integer)
The k-mer size. Default: 20. Value must be in range [1,32].
--window (unsigned 32 bit integer)
The window size. Default: k-mer size. Value must be a positive integer.
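How the k-mer size and window size interact can be illustrated with a simplified minimiser computation. This is a sketch, not Raptor's implementation: it assumes a plain 2-bit encoding of the forward strand only and picks the smallest encoded k-mer per window, whereas a real tool may use canonical k-mers and a different ordering. The function names are hypothetical.

```python
def kmer_value(kmer):
    """Encode a k-mer over ACGT as an integer using 2 bits per base."""
    code = {"A": 0, "C": 1, "G": 2, "T": 3}
    value = 0
    for base in kmer:
        value = (value << 2) | code[base]
    return value


def minimisers(sequence, k, w):
    """Return the smallest k-mer value of every window of w bases.

    A minimiser shared by consecutive windows (same position) is reported once.
    """
    if not 1 <= k <= w:
        raise ValueError("need 1 <= k <= w")
    values = [kmer_value(sequence[i:i + k]) for i in range(len(sequence) - k + 1)]
    per_window = w - k + 1  # number of k-mers inside one window
    result = []
    last_pos = None
    for start in range(len(values) - per_window + 1):
        window = values[start:start + per_window]
        pos = start + window.index(min(window))  # leftmost minimum
        if pos != last_pos:
            result.append(values[pos])
            last_pos = pos
    return result


print(minimisers("ACGTAC", k=2, w=4))  # [1, 6, 1]
print(minimisers("ACGTAC", k=2, w=2))  # [1, 6, 11, 12, 1]
```

With the window size equal to the k-mer size (the default), each window contains exactly one k-mer, so every k-mer is reported. A larger window reduces the number of stored values.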
--shape (std::string)
The shape to use for k-mers. Mutually exclusive with --kmer. Parsed from right to left. Default:
11111111111111111111 (a k-mer of size 20). Value must match the pattern '[01]+'.
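The effect of a gapped shape can be sketched as a mask over the bases of a k-mer. This is an illustration only: it assumes that a 1 keeps the base at that position and a 0 skips it, and that the shape has the same length as the text it is applied to. The function name is hypothetical and does not reflect Raptor's internal representation.

```python
import re


def apply_shape(kmer, shape):
    """Keep only the bases at positions where the shape has a '1'."""
    if not re.fullmatch(r"[01]+", shape):
        raise ValueError("shape must match the pattern [01]+")
    if len(kmer) != len(shape):
        raise ValueError("k-mer and shape must have the same length")
    return "".join(base for base, bit in zip(kmer, shape) if bit == "1")


print(apply_shape("ACGT", "1101"))  # ACT
print(apply_shape("ACGT", "1111"))  # ACGT
```

A shape consisting only of ones leaves the k-mer unchanged, which matches the default shape being described as a k-mer of size 20.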
Processing options
--kmer-count-cutoff (unsigned 8 bit integer)
Only store k-mers with at least (>=) x occurrences. Mutually exclusive with
--use-filesize-dependent-cutoff. Default: 1. Value must be in range [1,254].
--use-filesize-dependent-cutoff
Apply cutoffs from Mantis (Pandey et al., 2018). Mutually exclusive with --kmer-count-cutoff.
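The fixed count cutoff can be sketched as a simple frequency filter. This is an illustration under the assumption that occurrences are counted within one input and that values below the cutoff are dropped. The function name is hypothetical, and the file-size-dependent cutoffs are not modelled here.

```python
from collections import Counter


def apply_count_cutoff(values, cutoff):
    """Keep the distinct values that occur at least `cutoff` times."""
    if not 1 <= cutoff <= 254:
        raise ValueError("cutoff must be in range [1,254]")
    counts = Counter(values)
    return sorted(value for value, n in counts.items() if n >= cutoff)


observed = [1, 6, 1, 12, 6, 1]
print(apply_count_cutoff(observed, 1))  # [1, 6, 12]
print(apply_count_cutoff(observed, 2))  # [1, 6]
```

A cutoff of 1 (the default) keeps every observed value, while higher cutoffs discard rare values.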
Common options
-h, --help
Prints the help page.
-hh, --advanced-help
Prints the help page including advanced options.
--version
Prints the version information.
--copyright
Prints the copyright/license information.
--export-help (std::string)
Export the help page information. Value must be one of [html, man, ctd, cwl].
Synopsis
raptor prepare --input <file> --output <directory> [--threads <number>] [--quiet]
[--kmer <number>|--shape <01-pattern>] [--window <number>]
[--kmer-count-cutoff <number>|--use-filesize-dependent-cutoff]
Url
https://github.com/seqan/raptor
Version
Last update: Unavailable
Raptor-prepare version: 3.0.1 (commit unavailable)
Sharg version: 1.1.1
SeqAn version: 3.4.0-rc.3
