pychopper
pychopper package
Subpackages
pychopper.phmm_data package
Module contents
pychopper.primer_data package
Module contents
pychopper.scripts package
Submodules
pychopper.scripts.pychopper module
pychopper.scripts.pychopper.main()
Parse command line arguments.
Module contents
pychopper.tests package
Submodules
pychopper.tests.test_detector module
class pychopper.tests.test_detector.TestDetector(methodName='runTest')
Bases: TestCase
Create an instance of the class that will use the named test method when executed. Raises a
ValueError if the instance does not have a method with the specified name.
testPairAlign()
testScoreCutoff()
pychopper.tests.test_regression_simple module
class pychopper.tests.test_regression_simple.TestIntegration(methodName='runTest')
Bases: TestCase
Create an instance of the class that will use the named test method when executed. Raises a
ValueError if the instance does not have a method with the specified name.
testIntegration()
Integration test.
testIntegration_umi()
Integration test.
Module contents
Submodules
pychopper.alignment_hits module
pychopper.alignment_hits.process_hits(hits, max_score)
Process alignment hits by removing overlaps
pychopper.chopper module
pychopper.chopper.analyse_hits(hits, config)
Segment reads based on alignment hits using dynamic programming. The algorithm is based on the
rule that each primer alignment hit can be used only once. Hence if a segment is included, the
next one has to be excluded.
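The rule above (adjacent candidate segments share a primer hit, so they cannot both be kept) resembles the classic problem of choosing non-adjacent items with maximum total weight. The following is a minimal sketch of that general technique, not pychopper's implementation: `pick_segments` and the per-segment `lengths` scores are hypothetical, and the real analyse_hits may score and validate segments differently.

```python
def pick_segments(lengths):
    """Choose non-adjacent candidate segments maximising total score.

    lengths[i] is a hypothetical score for the candidate segment lying
    between hit i and hit i + 1 (assumption: longer is better).
    Returns (best_total, list of chosen candidate indices).
    """
    n = len(lengths)
    best = [0] * (n + 1)      # best[i]: best total over the first i candidates
    take = [False] * (n + 1)  # take[i]: whether candidate i - 1 is used in best[i]
    for i in range(1, n + 1):
        skip_score = best[i - 1]
        take_score = lengths[i - 1] + (best[i - 2] if i >= 2 else 0)
        if take_score > skip_score:
            best[i], take[i] = take_score, True
        else:
            best[i] = skip_score
    # Trace back: taking a candidate forces its left neighbour to be skipped.
    chosen, i = [], n
    while i > 0:
        if take[i]:
            chosen.append(i - 1)
            i -= 2
        else:
            i -= 1
    return best[n], chosen[::-1]


total, chosen = pick_segments([500, 100, 400])
```

With three candidates scored 500, 100 and 400, the sketch keeps the first and third and drops the middle one, because the middle candidate shares a hit with each neighbour.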
pychopper.chopper.chopper_edlib(reads, primers, config, max_ed, cutoff, pool, min_batch)
Segment using the edlib/parasail backend.
pychopper.chopper.chopper_phmm(reads, phmm_file, config, cutoff, threads, pool, min_batch)
Segment using the profile HMM backend.
pychopper.chopper.segments_to_reads(read, segments, keep_primers, bam_tags, detect_umis)
Convert segments to output reads with annotation.
pychopper.common_structures module
class pychopper.common_structures.Hit(Ref, RefStart, RefEnd, Query, QueryStart, QueryEnd, Score)
Bases: tuple
Create new instance of Hit(Ref, RefStart, RefEnd, Query, QueryStart, QueryEnd, Score)
Query
Alias for field number 3
QueryEnd
Alias for field number 5
QueryStart
Alias for field number 4
Ref
Alias for field number 0
RefEnd
Alias for field number 2
RefStart
Alias for field number 1
Score
Alias for field number 6
class pychopper.common_structures.Segment(Left, Start, End, Right, Strand, Len)
Bases: tuple
Create new instance of Segment(Left, Start, End, Right, Strand, Len)
End
Alias for field number 2
Left
Alias for field number 0
Len
Alias for field number 5
Right
Alias for field number 3
Start
Alias for field number 1
Strand
Alias for field number 4
class pychopper.common_structures.Seq(Id, Name, Seq, Qual, Umi)
Bases: tuple
Create new instance of Seq(Id, Name, Seq, Qual, Umi)
Id
Alias for field number 0
Name
Alias for field number 1
Qual
Alias for field number 3
Seq
Alias for field number 2
Umi
Alias for field number 4
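The "Bases: tuple" and "Alias for field number N" entries are what Sphinx emits for named tuples. As an illustration only, equivalent structures can be declared with collections.namedtuple using the field order documented above. The definitions below are stand-ins rather than imports from pychopper, and every field value is invented for the example.

```python
from collections import namedtuple

# Stand-in definitions mirroring the documented field order.
Hit = namedtuple("Hit", "Ref RefStart RefEnd Query QueryStart QueryEnd Score")
Segment = namedtuple("Segment", "Left Start End Right Strand Len")
Seq = namedtuple("Seq", "Id Name Seq Qual Umi")

# Invented values, for illustration only.
hit = Hit(Ref="read1", RefStart=0, RefEnd=22, Query="primerA",
          QueryStart=0, QueryEnd=22, Score=40.0)

# Fields can be read by name or by the position listed in the reference.
query_by_name = hit.Query
query_by_index = hit[3]

# Named tuples are immutable; _replace returns a modified copy.
shifted = hit._replace(RefStart=5)
```

Positional access matches the "field number" shown for each attribute, so `hit[3]` and `hit.Query` return the same value.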
pychopper.edlib_backend module
pychopper.edlib_backend.find_locations(reads, all_primers, max_ed, pool, min_batch)
Find alignment hits of all primers in all reads using the edlib/parasail backend
pychopper.edlib_backend.find_umi_single(params)
Find UMI in a single read using the edlib/parasail backend
pychopper.hmmer_backend module
pychopper.hmmer_backend.find_locations(reads, phmm_file, E, pool, min_batch)
Find alignment hits of all primers in all reads using the pHMM/nhmmscan backend
pychopper.parasail_backend module
pychopper.parasail_backend.first_cigar(cigar)
Extract details of the first operation in a cigar string.
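The reference does not say which details are returned. As a hedged sketch of the general idea, the first operation of a SAM-style CIGAR string can be read with a regular expression. `first_cigar_sketch` and its (operation, length) return shape are assumptions made for this example, not pychopper's actual behaviour.

```python
import re

# A CIGAR string is a run of <length><operation> pairs, e.g. "5S20M2I".
_FIRST_OP = re.compile(r"^(\d+)([MIDNSHP=X])")


def first_cigar_sketch(cigar):
    """Return (operation, length) of the first CIGAR operation, or None."""
    match = _FIRST_OP.match(cigar)
    if match is None:
        return None
    return match.group(2), int(match.group(1))


first_op = first_cigar_sketch("5S20M2I")
```

A leading soft clip such as the `5S` here tells a caller how far the aligned region starts from the beginning of the query.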
pychopper.parasail_backend.pair_align(reference, query, query_name, subs_mat, params)
Perform pairwise local alignment using parasail-python
pychopper.parasail_backend.process_alignment(aln, query, query_name, aln_params)
Process an alignment, extracting score, start and end.
pychopper.parasail_backend.refine_locations(read, all_primers, locations, aln_params={'gap_extend': 1, 'gap_open': 1, 'match': 1, 'mismatch': -2}, subs_mat=<parasail.bindings_v2.Matrix object>)
Refine alignment edges based on local alignment
pychopper.report module
class pychopper.report.Report(pdf)
Bases: object
Class for plotting utilities on the top of matplotlib. Plots are saved in the specified file
through the PDF backend.
Parameters
• self -- object.
• pdf -- Output pdf.
Returns
The report object.
Return type
Report
close()
Close PDF backend. Do not forget to call this at the end of your script or your output will
be damaged!
Parameters
• self -- object
Returns
None
Return type
object
plot_arrays(data_map, title='', xlab='', ylab='', marker='.', legend_loc='best', legend=True, vlines=None, vlcolor='green', vlwitdh=0.5)
Plot multiple pairs of data arrays.
Parameters
• self -- object.
• data_map -- A dictionary with labels as keys and tuples of data arrays (x, y) as
values.
• title -- Figure title.
• xlab -- X axis label.
• ylab -- Y axis label.
• marker -- Marker passed to the plot function.
• legend_loc -- Location of legend.
• legend -- Plot legend if True
• vlines -- Dictionary with labels and positions of vertical lines to draw.
• vlcolor -- Color of vertical lines drawn.
• vlwidth -- Width of vertical lines drawn.
Returns
None
Return type
object
plot_bars_simple(data_map, title='', xlab='', ylab='', alpha=0.6, xticks_rotation=0, auto_limit=False)
Plot simple bar chart from input dictionary.
Parameters
• self -- object.
• data_map -- A dictionary with labels as keys and data as values.
• title -- Figure title.
• xlab -- X axis label.
• ylab -- Y axis label.
• alpha -- Alpha value.
• xticks_rotation -- Rotation value for x tick labels.
• auto_limit -- Set y axis limits automatically.
Returns
None
Return type
object
plot_histograms(data_map, title='', xlab='', ylab='', bins=50, alpha=0.7, legend_loc='best', legend=True, vlines=None)
Plot histograms of multiple data arrays.
Parameters
• self -- object.
• data_map -- A dictionary with labels as keys and data arrays as values.
• title -- Figure title.
• xlab -- X axis label.
• ylab -- Y axis label.
• bins -- Number of bins.
• alpha -- Transparency value for histograms.
• legend_loc -- Location of legend.
• legend -- Plot legend if True.
• vlines -- Dictionary with labels and positions of vertical lines to draw.
Returns
None
Return type
object
save_close()
Utility method to save and close figure.
pychopper.seq_utils module
Utilities manipulating biological sequences and formats. Extensions to biopython functionality.
pychopper.seq_utils.base_complement(k)
Return complement of base.
Performs the substitutions: A<=>T, C<=>G, X=>X for both upper and lower case. The return value is
identical to the argument for all other values.
Parameters
• k -- A base.
Returns
Complement of base.
Return type
str
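The substitution rule described above maps naturally onto a dictionary lookup that falls back to the input. This is a minimal sketch of that rule as documented; `base_complement_sketch` is a hypothetical name and the library's own code may be written differently.

```python
# Mapping for the documented substitutions, in both cases.
_COMPLEMENT = {
    "A": "T", "T": "A", "C": "G", "G": "C", "X": "X",
    "a": "t", "t": "a", "c": "g", "g": "c", "x": "x",
}


def base_complement_sketch(k):
    """Return the complement of base k; unknown values pass through unchanged."""
    return _COMPLEMENT.get(k, k)


complemented = [base_complement_sketch(b) for b in "ACgtXN"]
```

Using `dict.get` with the key as its own default gives the "identical to the argument for all other values" behaviour without a separate branch.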
pychopper.seq_utils.errs_tab(n)
Generate list of error rates for qualities less than or equal to n.
pychopper.seq_utils.get_primers(primers)
Load primers from fasta file
pychopper.seq_utils.get_runid(desc)
Parse out runid from sequence description.
pychopper.seq_utils.mean_qual(quals, qround=False, tab=[1.0,0.7943282347242815,0.6309573444801932,0.5011872336272722,0.3981071705534972,0.31622776601683794,0.251188643150958,0.19952623149688797,0.15848931924611134,0.12589254117941673,0.1,0.07943282347242814,0.06309573444801933,0.05011872336272722,0.039810717055349734,0.03162277660168379,0.025118864315095794,0.0199526231496888,0.015848931924611134,0.012589254117941675,0.01,0.007943282347242814,0.00630957344480193,0.005011872336272725,0.003981071705534973,0.0031622776601683794,0.0025118864315095794,0.001995262314968879,0.001584893192461114,0.0012589254117941675,0.001,0.0007943282347242813,0.000630957344480193,0.0005011872336272725,0.00039810717055349735,0.00031622776601683794,0.00025118864315095795,0.00019952623149688788,0.00015848931924611142,0.00012589254117941674,0.0001,7.943282347242822e-05,6.309573444801929e-05,5.011872336272725e-05,3.9810717055349695e-05,3.1622776601683795e-05,2.5118864315095822e-05,1.9952623149688786e-05,1.584893192461114e-05,1.2589254117941661e-05,1e-05,7.943282347242822e-06,6.30957344480193e-06,5.011872336272725e-06,3.981071705534969e-06,3.162277660168379e-06,2.5118864315095823e-06,1.9952623149688787e-06,1.584893192461114e-06,1.2589254117941661e-06,1e-06,7.943282347242822e-07,6.30957344480193e-07,5.011872336272725e-07,3.981071705534969e-07,3.162277660168379e-07,2.5118864315095823e-07,1.9952623149688787e-07,1.584893192461114e-07,1.2589254117941662e-07,1e-07,7.943282347242822e-08,6.30957344480193e-08,5.011872336272725e-08,3.981071705534969e-08,3.162277660168379e-08,2.511886431509582e-08,1.9952623149688786e-08,1.5848931924611143e-08,1.2589254117941661e-08,1e-08,7.943282347242822e-09,6.309573444801943e-09,5.011872336272715e-09,3.981071705534969e-09,3.1622776601683795e-09,2.511886431509582e-09,1.9952623149688828e-09,1.584893192461111e-09,1.2589254117941663e-09,1e-09,7.943282347242822e-10,6.309573444801942e-10,5.011872336272714e-10,3.9810717055349694e-10,3.1622776601683795e-10,2.511886431509582e-10,1.9952623149688828e-10,1.584893192461111e-10,1.2589254117941662e-10,1e-10,7.943282347242822e-11,6.309573444801942e-11,5.011872336272715e-11,3.9810717055349695e-11,3.1622776601683794e-11,2.5118864315095823e-11,1.9952623149688828e-11,1.5848931924611107e-11,1.2589254117941662e-11,1e-11,7.943282347242821e-12,6.309573444801943e-12,5.011872336272715e-12,3.9810717055349695e-12,3.1622776601683794e-12,2.5118864315095823e-12,1.9952623149688827e-12,1.584893192461111e-12,1.258925411794166e-12,1e-12,7.943282347242822e-13,6.309573444801942e-13,5.011872336272715e-13,3.981071705534969e-13,3.162277660168379e-13,2.511886431509582e-13,1.9952623149688827e-13,1.584893192461111e-13])
Calculate average basecall quality of a read. Receives the ASCII quality scores of a read and
returns the average quality for that read. Phred scores are first converted to error
probabilities, the error probabilities are averaged, and the average is converted back to the
Phred scale.
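The averaging described here is done on error probabilities rather than on the Phred values themselves, which gives more weight to low-quality bases. The sketch below illustrates that calculation under stated assumptions: the input is a FASTQ quality string with Phred+33 encoding, and `errs_tab_sketch` and `mean_qual_sketch` are hypothetical names. The lookup table follows the pattern of the default `tab` values listed above, where entry q is 10^(-q/10).

```python
import math


def errs_tab_sketch(n):
    """Error probability for each Phred quality 0..n, i.e. 10 ** (-q / 10)."""
    return [10 ** (q / -10) for q in range(n + 1)]


def mean_qual_sketch(qual_string, offset=33):
    """Mean Phred quality of a read, averaged in probability space.

    Assumes qual_string is a Phred+33 encoded FASTQ quality string.
    Returns None for an empty string.
    """
    if not qual_string:
        return None
    tab = errs_tab_sketch(128)
    mean_err = sum(tab[ord(c) - offset] for c in qual_string) / len(qual_string)
    return -10 * math.log10(mean_err)


uniform = mean_qual_sketch("5555")  # four bases, each Q20
mixed = mean_qual_sketch("+5")      # one Q10 base and one Q20 base
```

For the mixed read, the result is about 12.6 rather than the arithmetic mean of 15, because the Q10 base contributes ten times more error probability than the Q20 base.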
pychopper.seq_utils.random(size=None)
Return random floats in the half-open interval [0.0, 1.0). Alias for random_sample to ease
forward-porting to the new random API.
pychopper.seq_utils.readfq(fastq, sample=None, min_qual=0, rfq_sup={})
Read fastx files.
This is a generator function that yields sequtils.Seq objects. Optionally filter by a minimum
mean quality (min_qual). Optionally subsample the fastx file using sample, a fraction between
0.0 and 1.0.
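To make the generator pattern concrete, here is a simplified, hypothetical reader for four-line FASTQ records that applies the two optional filters described above. It is a sketch, not pychopper's readfq: the `Record` tuple, the `seed` argument, the Phred+33 assumption and the exact filtering order are all choices made for this example.

```python
import io
import math
import random
from collections import namedtuple

# Hypothetical record type, simpler than the documented Seq tuple.
Record = namedtuple("Record", "Name Seq Qual")


def mean_phred(qual, offset=33):
    """Mean quality computed from averaged error probabilities (Phred+33 assumed)."""
    errs = [10 ** (-(ord(c) - offset) / 10) for c in qual]
    return -10 * math.log10(sum(errs) / len(errs))


def read_fastq_sketch(handle, sample=None, min_qual=0, seed=None):
    """Yield Records from a four-line-per-record FASTQ handle.

    sample: keep each record with this probability (0.0 to 1.0), or None to keep all.
    min_qual: drop records whose mean quality is below this value.
    """
    rng = random.Random(seed)
    while True:
        header = handle.readline()
        if not header:
            return                      # end of file
        if not header.strip():
            continue                    # tolerate blank lines between records
        seq = handle.readline().rstrip("\n")
        handle.readline()               # '+' separator line
        qual = handle.readline().rstrip("\n")
        if sample is not None and rng.random() > sample:
            continue
        if min_qual > 0 and qual and mean_phred(qual) < min_qual:
            continue
        yield Record(header[1:].strip(), seq, qual)


data = "@r1\nACGT\n+\nIIII\n@r2\nACGT\n+\n!!!!\n"
all_reads = list(read_fastq_sketch(io.StringIO(data)))
good_reads = list(read_fastq_sketch(io.StringIO(data), min_qual=10))
```

Because the function is a generator, records are filtered as they are read and the whole file never has to be held in memory.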
pychopper.seq_utils.record_size(read, in_format='fastq')
Calculate record size.
pychopper.seq_utils.revcomp_seq(seq)
Reverse complement sequence record
pychopper.seq_utils.reverse_complement(seq)
Reverse complement sequence.
Parameters
• seq -- Input sequence string.
Returns
reverse-complemented string.
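One common way to reverse-complement a plain sequence string in Python is a translation table followed by a reversing slice. The sketch below shows that technique. `reverse_complement_sketch` is a hypothetical name, and leaving characters outside A, C, G and T unchanged is an assumption of this example.

```python
# Translation table for the four standard bases in both cases.
_COMP_TABLE = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement_sketch(seq):
    """Complement each base, then reverse the string."""
    return seq.translate(_COMP_TABLE)[::-1]


rc = reverse_complement_sketch("AACGn")
```

Applying the function twice returns the original string, which is a convenient sanity check when testing.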
pychopper.seq_utils.writefq(r, fh)
Write read to fastq file.
pychopper.utils module
pychopper.utils.batch(iterable, size)
pychopper.utils.check_command(cmd)
pychopper.utils.check_min_hmmer_version(major, minor)
pychopper.utils.count_fastq_records(fname, size=128000000, opener=<built-in function open>)
pychopper.utils.hit2bed(hit, read)
pychopper.utils.parse_config_string(s)
Module contents