samtools-ampliconclip - clip reads using a BED file
Description
Clips the ends of read alignments if they intersect with regions defined in a BED file. Although this
tool was originally written to clip read alignment positions that correspond to amplicon primer
locations, it can also be used in other contexts.
BED file entries used are chrom, chromStart, chromEnd and, optionally, strand. Standard BED file format
must be used, so if strand is needed then the name and score fields must also be present (even though
ampliconclip does not read them). There is a default tolerance of 5 bases when matching chromStart and
chromEnd to alignments.
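As an illustration of the six-column layout described above, the following sketch writes a small BED file with placeholder name and score fields so that strand lands in column 6. The contig name, coordinates, primer names and the file name primers.bed are all hypothetical.

```shell
# Hypothetical primer regions on a hypothetical contig "chr1"; substitute your own.
# Columns: chrom, chromStart, chromEnd, name, score, strand (tab separated).
# name and score are placeholders; they only need to be present so that
# strand occupies the sixth column.
printf 'chr1\t30\t54\tprimer_1_LEFT\t0\t+\n'    >  primers.bed
printf 'chr1\t385\t410\tprimer_1_RIGHT\t0\t-\n' >> primers.bed

# Sanity check: every line should have exactly six tab-separated fields.
awk -F'\t' 'NF != 6 { bad = 1 } END { exit bad }' primers.bed \
    && echo "primers.bed has six columns on every line"
```

If strand is not needed, the first three columns alone should be sufficient.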
By default, reads are soft clipped and clipping is only done from the 5' end.
A few caveats apply. Although input ordering is not significant, clipping can change the leftmost
mapping position (POS), so coordinate sorted files will need re-sorting. In such cases the sort order
in the header is set to unknown. Clipping also leaves the template length (TLEN) incorrect, which can
be corrected with samtools fixmate. Any MD and NM aux tags will also be incorrect and can be fixed
with samtools calmd. By default MD and NM tags are removed, although if the output is in CRAM format
these tags will be automatically regenerated.
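The clean-up steps mentioned above could be chained roughly as follows. This is a sketch under assumptions rather than a prescribed workflow: the file names (in.bam, primers.bed, ref.fa, clipped.fixed.bam) and the function name fix_after_clip are hypothetical, and the name sort before fixmate is included because samtools fixmate expects name-sorted or name-collated input.

```shell
# Sketch only; all file names and the function name are hypothetical.
# Stages: clip -> name sort (mates adjacent for fixmate) -> fixmate (TLEN and
# mate fields) -> coordinate sort -> calmd (regenerate MD and NM from the reference).
fix_after_clip() {
    in_bam=$1; bed=$2; ref=$3; out_bam=$4
    samtools ampliconclip -b "$bed" "$in_bam" |
        samtools sort -n - |
        samtools fixmate - - |
        samtools sort - |
        samtools calmd -b - "$ref" > "$out_bam"
}

# Only run when samtools and the example inputs are actually present.
if command -v samtools >/dev/null 2>&1 &&
   [ -f in.bam ] && [ -f primers.bed ] && [ -f ref.fa ]; then
    fix_after_clip in.bam primers.bed ref.fa clipped.fixed.bam
else
    echo "skipping: samtools or the example input files are not available"
fi
```

The calmd step is only needed when MD and NM tags are wanted in non-CRAM output, since they are removed by default.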
Name
samtools-ampliconclip - clip reads using a BED file
Options
-b FILE
BED file of regions (e.g. amplicon primers) to be removed.
-o FILE
Output file name (defaults to stdout).
-f FILE
File to write stats to (defaults to stderr).
-u Output uncompressed SAM, BAM or CRAM.
--soft-clip
Soft clip reads (default).
--hard-clip
Hard clip reads.
--both-ends
Clip at both the 5' and the 3' ends where regions match. When using this option the --strand
option is ignored.
--strand Use strand entry from the BED file to clip on the matching forward or reverse alignment.
--clipped Only output clipped reads. Filter all others.
--fail Mark unclipped reads as QC fail.
--filter-len INT
Filter out reads of INT size or shorter. In this case soft clips are not counted toward read
length. An INT of 0 will filter out reads with no matching bases.
--fail-len INT
As --filter-len but mark as QC fail rather than filter out.
--unmap-len INT
As --filter-len but mark as unmapped. Default is 0 (reads with no matching bases). -1 will disable.
--no-excluded
Filter out any reads that are marked as QCFAIL or are unmapped. This works on the state of
the reads before clipping takes place.
--rejects-file FILE
Write any filtered reads out to a file.
--primer-counts FILE
File to write with read counts per BED entry (bedgraph format).
--original Add an OA tag with the original data for clipped reads.
--keep-tag In clipped reads, keep the possibly invalid NM and MD tags. By default these tags are
deleted.
--tolerance INT
The amount of latitude given in matching regions to alignments. Default 5 bases.
--no-PG Do not add a PG line to the header.
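To show how several of the options above might be combined, here is a hedged sketch that hard clips at both ends, filters short reads into a rejects file, and writes per-primer counts and stats. The function name clip_both_ends, the DRY_RUN switch, the file names and the values 30 and 3 are illustrative assumptions only.

```shell
# Sketch only. Every flag comes from the option list above; the values 30 for
# --filter-len and 3 for --tolerance are arbitrary illustrations.
clip_both_ends() {
    # $1 = BED file, $2 = input alignments, $3 = output file (hypothetical names)
    # Set DRY_RUN to any non-empty value to print the command instead of running it.
    ${DRY_RUN:+echo} samtools ampliconclip \
        --hard-clip --both-ends \
        --filter-len 30 \
        --tolerance 3 \
        --rejects-file rejects.file \
        --primer-counts primer_counts.bedgraph \
        -f stat.file \
        -o "$3" \
        -b "$1" "$2"
}

# Print the command that would be run, without needing samtools installed.
DRY_RUN=1 clip_both_ends bed.file in.file out.file
```

Because --both-ends is given, --strand is deliberately left out: it would be ignored.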
See Also
samtools(1), samtools-sort(1), samtools-fixmate(1), samtools-calmd(1)
Samtools website: <http://www.htslib.org/>
samtools-1.21 12 September 2024 samtools-ampliconclip(1)
Synopsis
samtools ampliconclip [-o out.file] [-f stat.file] [--soft-clip] [--hard-clip] [--both-ends] [--strand]
[--clipped] [--fail] [--filter-len INT] [--fail-len INT] [--unmap-len INT] [--no-excluded]
[--rejects-file rejects.file] [--original] [--keep-tag] [--tolerance INT] [--no-PG] [-u] -b bed.file in.file