The rest of the documentation details each of the object methods. Internal methods are usually prefixed
with an underscore (_).
new
Usage : $hsp = Bio::Search::HSP::PsiBlastHSP->new( %named_params );
: Bio::Search::HSP::PsiBlastHSP.pm objects are constructed
: automatically by Bio::SearchIO::BlastHitFactory.pm,
: so there is no need for direct instantiation.
Purpose : Constructs a new PsiBlastHSP object and initializes key variables
: for the HSP.
Returns : A Bio::Search::HSP::PsiBlastHSP object
Argument : Named parameters:
: Parameter keys are case-insensitive.
: -RAW_DATA => array ref containing raw BLAST report data
: for a single HSP. This includes all lines
: of the HSP alignment from a traditional BLAST
: or PSI-BLAST (non-XML) report.
: -RANK => integer (1..n).
: -PROGRAM => string ('TBLASTN', 'BLASTP', etc.).
: -QUERY_NAME => string, id of query sequence
: -HIT_NAME => string, id of hit sequence
:
Comments : Having the raw data allows this object to do lazy parsing of
: the raw HSP data (i.e., not parsed until needed).
:
: Note that there is a fair amount of basic parsing that is
: currently performed in this module that would be more appropriate
: to do within a separate factory object.
: This parsing code will likely be relocated and more initialization
: parameters will be added to new().
:
See Also : Bio::SeqFeature::SimilarityPair::new(), Bio::SeqFeature::Similarity::new()
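The lazy parsing mentioned in the Comments above can be illustrated with a minimal sketch. The LazyHSP package, its fields, and the parse_count counter are hypothetical stand-ins written for this example; they are not the PsiBlastHSP implementation. The sketch stores the raw lines at construction time and parses them only when a value is first requested.

```perl
use strict;
use warnings;

# Hypothetical stand-in class; not the PsiBlastHSP implementation.
package LazyHSP;

sub new {
    my ($class, %args) = @_;
    # Accept parameter keys case-insensitively.
    my %p = map { lc($_) => $args{$_} } keys %args;
    my $self = {
        raw_data    => $p{'-raw_data'},
        rank        => $p{'-rank'},
        parsed      => 0,
        parse_count => 0,    # only here to show that parsing runs once
    };
    return bless $self, $class;
}

# Parse the raw lines only on the first request.
sub _parse {
    my $self = shift;
    return if $self->{parsed};
    $self->{parse_count}++;
    my $query = '';
    for my $line (@{ $self->{raw_data} }) {
        $query .= $1 if $line =~ /^Query:?\s+\d+\s+(\S+)\s+\d+/;
    }
    $self->{query_string} = $query;
    $self->{parsed}       = 1;
    return;
}

sub query_string {
    my $self = shift;
    $self->_parse;
    return $self->{query_string};
}

package main;

my $hsp = LazyHSP->new(
    -RAW_DATA => [ 'Query: 1 MKV-LA 5', 'Sbjct: 3 MKVALA 8' ],
    -rank     => 1,
);
```

Nothing is parsed until query_string() is called, and repeated calls reuse the stored result.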
algorithm
Title : algorithm
Usage : $alg = $hsp->algorithm();
Function: Gets the algorithm specification that was used to obtain the HSP.
For BLAST, the algorithm denotes what type of sequence was aligned
against what (BLASTN: dna-dna, BLASTP: prt-prt, BLASTX: translated
dna-prt, TBLASTN: prt-translated dna, TBLASTX: translated
dna-translated dna).
Returns : a scalar string
Args : none
signif()
Usage : $hsp_obj->signif()
Purpose : Get the P-value or Expect value for the HSP.
Returns : Float (0.001 or 1.3e-43)
: Returns P-value if it is defined, otherwise, Expect value.
Argument : n/a
Throws : n/a
Comments : Provided for consistency with BlastHit::signif()
: Support for returning the significance data in different
: formats (e.g., exponent only), is not provided for HSP objects.
: This is only available for the BlastHit or Blast object.
See Also : "p", "expect", Bio::Search::Hit::BlastHit::signif()
evalue
Usage : $hsp_obj->evalue()
Purpose : Get the Expect value for the HSP.
Returns : Float (0.001 or 1.3e-43)
Argument : n/a
Throws : n/a
Comments : Support for returning the expectation data in different
: formats (e.g., exponent only), is not provided for HSP objects.
: This is only available for the BlastHit or Blast object.
See Also : "p"
p
Usage : $hsp_obj->p()
Purpose : Get the P-value for the HSP.
Returns : Float (0.001 or 1.3e-43) or undef if not defined.
Argument : n/a
Throws : n/a
Comments : P-value is not defined with NCBI Blast2 reports.
: Support for returning the P-value data in different
: formats (e.g., exponent only) is not provided for HSP objects.
: This is only available for the BlastHit or Blast object.
See Also : "expect"
length
Usage : $hsp->length( [seq_type] )
Purpose : Get the length of the aligned portion of the query or sbjct.
Example : $hsp->length('query')
Returns : integer
Argument : seq_type: 'query' | 'hit' or 'sbjct' | 'total' (default = 'total')
('sbjct' is synonymous with 'hit')
Throws : n/a
Comments : 'total' length is the full length of the alignment
: as reported in the denominators in the alignment section:
: "Identical = 34/120 Positives = 67/120".
See Also : "gaps"
gaps
Usage : $hsp->gaps( [seq_type] )
Purpose : Get the number of gap characters in the query, sbjct, or total alignment.
: Also can return query gap chars and sbjct gap chars as a two-element list
: when in array context.
Example : $total_gaps = $hsp->gaps();
: ($qgaps, $sgaps) = $hsp->gaps();
: $qgaps = $hsp->gaps('query');
Returns : scalar context: integer
: array context without args: (int, int) = ('queryGaps', 'sbjctGaps')
Argument : seq_type: 'query' or 'hit' or 'sbjct' or 'total'
: ('sbjct' is synonymous with 'hit')
: (default = 'total', scalar context)
: Array context can be "induced" by providing an argument of 'list' or 'array'.
Throws : n/a
See Also : "length", "matches"
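The context-dependent return described for gaps() can be sketched in plain Perl. The count_gaps() helper below is hypothetical and works on two aligned strings passed in directly; it only models the documented behavior (total in scalar context, a two-element list in array context or when 'list' or 'array' is given) and assumes '-' is the gap character.

```perl
use strict;
use warnings;

# Hypothetical helper; counts '-' gap characters in two aligned strings.
sub count_gaps {
    my ($query_aln, $sbjct_aln, $seq_type) = @_;
    my $qgaps = ($query_aln =~ tr/-//);
    my $sgaps = ($sbjct_aln =~ tr/-//);

    # List result: array context without a seq_type, or 'list'/'array'.
    my $want_list = defined $seq_type
        ? $seq_type =~ /^(?:list|array)$/
        : wantarray;
    return ($qgaps, $sgaps) if $want_list;

    $seq_type = 'total' unless defined $seq_type;
    return $qgaps if $seq_type eq 'query';
    return $sgaps if $seq_type eq 'hit' or $seq_type eq 'sbjct';
    return $qgaps + $sgaps;
}

my $q = 'MKV--LAGT';
my $s = 'MKVAALA-T';
my $total           = count_gaps($q, $s);    # scalar context: total
my ($qgaps, $sgaps) = count_gaps($q, $s);    # array context: per sequence
```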
frac_identical
Usage : $hsp_object->frac_identical( [seq_type] );
Purpose : Get the fraction of identical positions within the given HSP.
Example : $frac_iden = $hsp_object->frac_identical('query');
Returns : Float (2-decimal precision, e.g., 0.75).
Argument : seq_type: 'query' or 'hit' or 'sbjct' or 'total'
: ('sbjct' is synonymous with 'hit')
: default = 'total' (but see comments below).
Throws : n/a
Comments : Different versions of Blast report different values for the total
: length of the alignment. This is the number reported in the
: denominators in the stats section:
: "Identical = 34/120 Positives = 67/120".
: NCBI-BLAST uses the total length of the alignment (with gaps)
: WU-BLAST uses the length of the query sequence (without gaps).
: Therefore, when called without an argument or an argument of 'total',
: this method will report different values depending on the
: version of BLAST used.
:
: To get the fraction identical among only the aligned residues,
: ignoring the gaps, call this method with an argument of 'query'
: or 'sbjct' ('sbjct' is synonymous with 'hit').
See Also : "frac_conserved", "num_identical", "matches"
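The effect of the denominator choice described in the Comments above can be shown with a small calculation. The frac_identical_sketch() helper is hypothetical; it assumes an NCBI-style total (alignment length with gaps) and takes the identity count and the aligned query string as plain arguments.

```perl
use strict;
use warnings;

# Hypothetical helper: fraction identical under a chosen denominator.
sub frac_identical_sketch {
    my ($num_identical, $query_aln, $seq_type) = @_;
    $seq_type = 'total' unless defined $seq_type;
    my $aln_len = length $query_aln;         # alignment columns, gaps included
    my $gaps    = ($query_aln =~ tr/-//);    # gap characters in the query
    my $denom   = $seq_type eq 'query' ? $aln_len - $gaps : $aln_len;
    return sprintf('%.2f', $num_identical / $denom);    # 2-decimal precision
}

# 6 identities over 10 alignment columns, 2 of which are query gaps.
my $frac_total = frac_identical_sketch(6, 'MKV--LAGTW');
my $frac_query = frac_identical_sketch(6, 'MKV--LAGTW', 'query');
```

With the same six identities, the 'total' form divides by 10 columns while the 'query' form divides by the 8 aligned query residues, so the two values differ.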
frac_conserved
Usage : $hsp_object->frac_conserved( [seq_type] );
Purpose : Get the fraction of conserved positions within the given HSP.
: (Note: 'conservative' positions are called 'positives' in the
: Blast report.)
Example : $frac_cons = $hsp_object->frac_conserved('query');
Returns : Float (2-decimal precision, e.g., 0.75).
Argument : seq_type: 'query' or 'hit' or 'sbjct' or 'total'
: ('sbjct' is synonymous with 'hit')
: default = 'total' (but see comments below).
Throws : n/a
Comments : Different versions of Blast report different values for the total
: length of the alignment. This is the number reported in the
: denominators in the stats section:
: "Identical = 34/120 Positives = 67/120".
: NCBI-BLAST uses the total length of the alignment (with gaps)
: WU-BLAST uses the length of the query sequence (without gaps).
: Therefore, when called without an argument or an argument of 'total',
: this method will report different values depending on the
: version of BLAST used.
:
: To get the fraction conserved among only the aligned residues,
: ignoring the gaps, call this method with an argument of 'query'
: or 'sbjct'.
See Also : "frac_identical", "num_conserved", "matches"
query_string
Title : query_string
Usage : my $qseq = $hsp->query_string;
Function: Retrieves the query sequence of this HSP as a string
Returns : string
Args : none
hit_string
Title : hit_string
Usage : my $hseq = $hsp->hit_string;
Function: Retrieves the hit sequence of this HSP as a string
Returns : string
Args : none
homology_string
Title : homology_string
Usage : my $homo_string = $hsp->homology_string;
Function: Retrieves the homology sequence for this HSP as a string.
: The homology sequence is the string of symbols in between the
: query and hit sequences in the alignment indicating the degree
: of conservation (e.g., identical, similar, not similar).
Returns : string
Args : none
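As an illustration of how the homology string encodes conservation, the sketch below counts identical and conserved columns. It assumes the protein-style convention in which a letter marks an identical position, '+' a similar one, and a space a mismatch; nucleotide reports use other symbols. The count_matches() helper is hypothetical and is not part of this module.

```perl
use strict;
use warnings;

# Hypothetical helper: count identical and conserved columns in a
# protein-style homology line (letter = identical, '+' = similar).
sub count_matches {
    my ($homology) = @_;
    my $identical = ($homology =~ tr/A-Za-z//);
    my $similar   = ($homology =~ tr/+//);
    # Conserved positions ("positives") include the identical ones.
    return ($identical, $identical + $similar);
}

my ($id, $cons) = count_matches('MK+ L +T');
```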
expect
See Bio::Search::HSP::HSPI::expect()
rank
Usage : $hsp->rank( [string] );
Purpose : Get the rank of the HSP within a given Blast hit.
Example : $rank = $hsp->rank;
Returns : Integer (1..n) corresponding to the order in which the HSP
appears in the BLAST report.
to_string
Title : to_string
Usage : print $hsp->to_string;
Function: Returns a string representation for the Blast HSP.
Primarily intended for debugging purposes.
Example : see usage
Returns : A string of the form:
[PsiBlastHSP] <rank>
e.g.:
[PsiBlastHSP] 1
Args : None
_set_data
Usage : called automatically during object construction.
Purpose : Parses the raw HSP section from a flat BLAST report and
sets the query sequence, sbjct sequence, and the "match" data
: which consists of the symbols between the query and sbjct lines
: in the alignment.
Argument : Array (all lines for a single, complete HSP, from a raw,
flat (i.e., non-XML) BLAST report)
Throws : Propagates any exceptions from the methods called ("See Also")
See Also : "_set_seq", "_set_score_stats", "_set_match_stats"
_set_score_stats
Usage : called automatically by _set_data()
Purpose : Sets various score statistics obtained from the HSP listing.
Argument : String with any of the following formats:
: blast2: Score = 30.1 bits (66), Expect = 9.2
: blast2: Score = 158.2 bits (544), Expect(2) = e-110
: blast1: Score = 410 (144.3 bits), Expect = 1.7e-40, P = 1.7e-40
: blast1: Score = 55 (19.4 bits), Expect = 5.3, Sum P(3) = 0.99
Throws : Exception if the stats cannot be parsed, probably due to a change
: in the Blast report format.
See Also : "_set_data"
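The four score-line formats listed above can be handled with a few regular expressions. The parse_score_stats() function and its hash keys are hypothetical, written only to illustrate the formats; the module's own parsing may differ. Note the special case of an Expect value such as "e-110", which needs a leading 1 to be a valid number.

```perl
use strict;
use warnings;

# Hypothetical parser for the score line formats listed above.
sub parse_score_stats {
    my ($line) = @_;
    my %s;
    if ($line =~ /Score\s*=\s*([\d.]+)\s+bits\s+\((\d+)\)/) {
        @s{qw(bits score)} = ($1, $2);    # blast2: "30.1 bits (66)"
    }
    elsif ($line =~ /Score\s*=\s*(\d+)\s+\(([\d.]+)\s+bits\)/) {
        @s{qw(score bits)} = ($1, $2);    # blast1: "410 (144.3 bits)"
    }
    else {
        die "Cannot parse score stats: $line\n";
    }
    if ($line =~ /Expect(?:\((\d+)\))?\s*=\s*([\d.eE+-]+)/) {
        $s{n}      = $1 if defined $1;
        $s{expect} = $2;
        # "e-110" lacks a mantissa; prepend 1 so it reads as 1e-110.
        $s{expect} = "1$s{expect}" if $s{expect} =~ /^e/i;
    }
    if ($line =~ /\bP(?:\((\d+)\))?\s*=\s*([\d.eE+-]+)/) {
        $s{n} = $1 if defined $1;
        $s{p} = $2;
    }
    return \%s;
}

my $blast2 = parse_score_stats('Score = 158.2 bits (544), Expect(2) = e-110');
my $blast1 = parse_score_stats('Score = 55 (19.4 bits), Expect = 5.3, Sum P(3) = 0.99');
```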
_set_match_stats
Usage : Private method; called automatically by _set_data()
Purpose : Sets various matching statistics obtained from the HSP listing.
Argument : blast2: Identities = 23/74 (31%), Positives = 29/74 (39%), Gaps = 17/74 (22%)
: blast2: Identities = 57/98 (58%), Positives = 74/98 (75%)
: blast1: Identities = 87/204 (42%), Positives = 126/204 (61%)
: blast1: Identities = 87/204 (42%), Positives = 126/204 (61%), Frame = -3
: WU-blast: Identities = 310/553 (56%), Positives = 310/553 (56%), Strand = Minus / Plus
Throws : Exception if the stats cannot be parsed, probably due to a change
: in the Blast report format.
Comments : The "Gaps = " data in the HSP header has a different meaning depending
: on the type of Blast: for BLASTP, this number is the total number of
: gaps in query+sbjct; for TBLASTN, it is the number of gaps in the
: query sequence only. Thus, it is safer to collect the data
: separately by examining the actual sequence strings as is done
: in _set_seq().
See Also : "_set_data", "_set_seq"
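A comparable sketch for the match-statistics line is shown below. The parse_match_stats() function and its hash keys are hypothetical and cover only the formats listed in the Argument section above; optional fields (Gaps, Frame, Strand) are set only when present.

```perl
use strict;
use warnings;

# Hypothetical parser for the match statistics formats listed above.
sub parse_match_stats {
    my ($line) = @_;
    my %m;
    $line =~ m{Identities\s*=\s*(\d+)/(\d+)}
        or die "Cannot parse match stats: $line\n";
    @m{qw(identical total)} = ($1, $2);
    $m{conserved} = $1 if $line =~ m{Positives\s*=\s*(\d+)/\d+};
    $m{gaps}      = $1 if $line =~ m{Gaps\s*=\s*(\d+)/\d+};
    $m{frame}     = $1 if $line =~ /Frame\s*=\s*([+-]?\d)/;
    @m{qw(query_strand hit_strand)} = ($1, $2)
        if $line =~ m{Strand\s*=\s*(Plus|Minus)\s*/\s*(Plus|Minus)};
    return \%m;
}

my $gapped = parse_match_stats(
    'Identities = 23/74 (31%), Positives = 29/74 (39%), Gaps = 17/74 (22%)');
my $wu = parse_match_stats(
    'Identities = 310/553 (56%), Positives = 310/553 (56%), Strand = Minus / Plus');
```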
_set_seq_data
Usage : called automatically when sequence data is requested.
Purpose : Sets the HSP sequence data for both query and sbjct sequences.
: Includes: start, stop, length, gaps, and raw sequence.
Argument : n/a
Throws : Propagates any exception thrown by _set_match_seq()
Comments : Uses raw data stored by _set_data() during object construction.
: These data are not always needed, so it is conditionally
: executed only upon demand by methods such as gaps(), _set_residues(),
: etc. _set_seq() does the dirty work.
See Also : "_set_seq"
_set_seq
Usage : called automatically by _set_seq_data()
: $hsp_obj->_set_seq($seq_type, @data);
Purpose : Sets sequence information for both the query and sbjct sequences.
: Directly counts the number of gaps in each sequence (if gapped Blast).
Argument : $seq_type = 'query' or 'sbjct'
: @data = all seq lines with the form:
: Query: 61 SPHNVKDRKEQNGSINNAISPTATANTSGSQQINIDSALRDRSSNVAAQPSLSDASSGSN 120
Throws : Exception if data strings cannot be parsed, probably due to a change
: in the Blast report format.
Comments : Uses first argument to determine which data members to set
: making this method sensitive to data member name changes.
: Behavior is dependent on the type of BLAST analysis (TBLASTN, BLASTP, etc).
Warning : Sequence endpoints are normalized so that start < end. This affects HSPs
: for TBLASTN/X hits on the minus strand. Normalization facilitates use
: of range information by methods such as matches().
See Also : "_set_seq_data", "matches", "range", "start", "end"
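The sequence-line format and the endpoint normalization described in the Warning above can be sketched as follows. The parse_seq_line() function is hypothetical; it handles a single "Query:" or "Sbjct:" line, counts '-' gap characters directly, and swaps the endpoints when the reported start is greater than the end.

```perl
use strict;
use warnings;

# Hypothetical parser for one "Query:" or "Sbjct:" alignment line.
sub parse_seq_line {
    my ($line) = @_;
    $line =~ /^(?:Query|Sbjct):?\s+(\d+)\s+(\S+)\s+(\d+)\s*$/
        or die "Cannot parse sequence line: $line\n";
    my ($start, $seq, $end) = ($1, $2, $3);
    my $gaps = ($seq =~ tr/-//);    # count gap characters directly
    # Normalize so that start < end (minus-strand hits report start > end).
    ($start, $end) = ($end, $start) if $start > $end;
    return { start => $start, end => $end, seq => $seq, gaps => $gaps };
}

my $query = parse_seq_line(
    'Query: 61 SPHNVKDRKEQNGSINNAISPTATANTSGSQQINIDSALRDRSSNVAAQPSLSDASSGSN 120');
my $minus = parse_seq_line('Sbjct: 300 MKV-LA 286');
```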
_set_residues
Usage : called automatically when residue data is requested.
Purpose : Sets the residue numbers representing the identical and
: conserved positions. These data are obtained by analyzing the
: symbols between query and sbjct lines of the alignments.
Argument : n/a
Throws : Propagates any exception thrown by _set_seq_data() and _set_match_seq().
Comments : These data are not always needed, so it is conditionally
: executed only upon demand by methods such as seq_inds().
: Behavior is dependent on the type of BLAST analysis (TBLASTN, BLASTP, etc).
See Also : "_set_seq_data", "_set_match_seq", "seq_inds"
_set_match_seq
Usage : $hsp_obj->_set_match_seq()
Purpose : Set the 'match' sequence for the current HSP (symbols in between
: the query and sbjct lines.)
Returns : Array reference holding the match sequence lines.
Argument : n/a
Throws : Exception if the _matchList field is not set.
Comments : The match information is not always necessary. This method
: allows it to be conditionally prepared.
: Called by _set_residues() and seq_str().
See Also : "_set_residues", "seq_str"
n
Usage : $hsp_obj->n()
Purpose : Get the N value (num HSPs on which P/Expect is based).
: This value is not defined with NCBI Blast2 with gapping.
Returns : Integer or null string if not defined.
Argument : n/a
Throws : n/a
Comments : The 'N' value is listed in parentheses with P/Expect value:
: e.g., P(3) = 1.2e-30 ---> (N = 3).
: Not defined in NCBI Blast2 with gaps.
: This typically is equal to the number of HSPs but not always.
: To obtain the number of HSPs, use Bio::Search::Hit::BlastHit::num_hsps().
See Also : Bio::SeqFeature::SimilarityPair::score()
matches
Usage : $hsp->matches([seq_type], [start], [stop]);
Purpose : Get the total number of identical and conservative matches
: in the query or sbjct sequence for the given HSP. Optionally can
: report data within a defined interval along the seq.
: (Note: 'conservative' matches are called 'positives' in the
: Blast report.)
Example : ($id,$cons) = $hsp_object->matches('hit');
: ($id,$cons) = $hsp_object->matches('query',300,400);
Returns : 2-element array of integers
Argument : (1) seq_type = 'query' or 'hit' or 'sbjct' (default = query)
: ('sbjct' is synonymous with 'hit')
: (2) start = Starting coordinate (optional)
: (3) stop = Ending coordinate (optional)
Throws : Exception if the supplied coordinates are out of range.
Comments : Relies on seq_str('match') to get the string of alignment symbols
: between the query and sbjct lines which are used for determining
: the number of identical and conservative matches.
See Also : "length", "gaps", "seq_str", Bio::Search::Hit::BlastHit::_adjust_contigs()
num_identical
Usage : $hsp_object->num_identical();
Purpose : Get the number of identical positions within the given HSP.
Example : $num_iden = $hsp_object->num_identical();
Returns : integer
Argument : n/a
Throws : n/a
See Also : "num_conserved", "frac_identical"
num_conserved
Usage : $hsp_object->num_conserved();
Purpose : Get the number of conserved positions within the given HSP.
Example : $num_cons = $hsp_object->num_conserved();
Returns : integer
Argument : n/a
Throws : n/a
See Also : "num_identical", "frac_conserved"
range
Usage : $hsp->range( [seq_type] );
Purpose : Gets the (start, end) coordinates for the query or sbjct sequence
: in the HSP alignment.
Example : ($query_beg, $query_end) = $hsp->range('query');
: ($hit_beg, $hit_end) = $hsp->range('hit');
Returns : Two-element array of integers
Argument : seq_type = string, 'query' or 'hit' or 'sbjct' (default = 'query')
: ('sbjct' is synonymous with 'hit')
Throws : n/a
See Also : "start", "end"
start
Usage : $hsp->start( [seq_type] );
Purpose : Gets the start coordinate for the query, sbjct, or both sequences
: in the HSP alignment.
: NOTE: Start will always be less than end.
: To determine strand, use $hsp->strand()
Example : $query_beg = $hsp->start('query');
: $hit_beg = $hsp->start('hit');
: ($query_beg, $hit_beg) = $hsp->start();
Returns : scalar context: integer
: array context without args: list of two integers
Argument : In scalar context: seq_type = 'query' or 'hit' or 'sbjct' (default= 'query')
: ('sbjct' is synonymous with 'hit')
: Array context can be "induced" by providing an argument of 'list' or 'array'.
Throws : n/a
See Also : "end", "range"
end
Usage : $hsp->end( [seq_type] );
Purpose : Gets the end coordinate for the query, sbjct, or both sequences
: in the HSP alignment.
: NOTE: Start will always be less than end.
: To determine strand, use $hsp->strand()
Example : $query_end = $hsp->end('query');
: $hit_end = $hsp->end('hit');
: ($query_end, $hit_end) = $hsp->end();
Returns : scalar context: integer
: array context without args: list of two integers
Argument : In scalar context: seq_type = 'query' or 'hit' or 'sbjct' (default= 'query')
: ('sbjct' is synonymous with 'hit')
: Array context can be "induced" by providing an argument of 'list' or 'array'.
Throws : n/a
See Also : "start", "range", "strand"
strand
Usage : $hsp_object->strand( [seq_type] )
Purpose : Get the strand of the query or sbjct sequence.
Example : print $hsp->strand('query');
: ($query_strand, $hit_strand) = $hsp->strand();
Returns : -1, 0, or 1
: -1 = Minus strand, +1 = Plus strand
: Returns 0 if strand is not defined, which occurs
: for BLASTP reports, and the query of TBLASTN
: as well as the hit of BLASTX reports.
: In scalar context without arguments, returns queryStrand value.
: In array context without arguments, returns a two-element list
: of strings (queryStrand, sbjctStrand).
: Array context can be "induced" by providing an argument of 'list' or 'array'.
Argument : seq_type: 'query' or 'hit' or 'sbjct' or undef
: ('sbjct' is synonymous with 'hit')
Throws : n/a
See Also : "_set_seq", "_set_match_stats"
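The rule for when a strand is defined follows from which sequences are nucleotide in each program. The sketch below encodes that rule in a lookup table; the %has_strand table and the strand_value() helper are hypothetical, and the 'Plus'/'Minus' labels are assumed to come from the report.

```perl
use strict;
use warnings;

# Which sequences are nucleotide (and so have a strand) for each program.
my %has_strand = (
    BLASTN  => { query => 1, hit => 1 },
    BLASTP  => { query => 0, hit => 0 },
    BLASTX  => { query => 1, hit => 0 },
    TBLASTN => { query => 0, hit => 1 },
    TBLASTX => { query => 1, hit => 1 },
);

# Hypothetical helper: map a 'Plus'/'Minus' label to -1, 0, or 1.
sub strand_value {
    my ($program, $seq_type, $label) = @_;
    $seq_type = 'hit' if $seq_type eq 'sbjct';    # synonyms
    return 0 unless $has_strand{ uc $program }{$seq_type};
    return 0 unless defined $label;
    return $label =~ /^minus$/i ? -1 : 1;
}
```

For example, strand_value('TBLASTN', 'query', 'Plus') yields 0 because the TBLASTN query is a protein, while strand_value('TBLASTN', 'sbjct', 'Minus') yields -1.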
seq
Usage : $hsp->seq( [seq_type] );
Purpose : Get the query or sbjct sequence as a Bio::Seq.pm object.
Example : $seqObj = $hsp->seq('query');
Returns : Object reference for a Bio::Seq.pm object.
Argument : seq_type = 'query' or 'hit' or 'sbjct' (default = 'query').
: ('sbjct' is synonymous with 'hit')
Throws : Propagates any exception that occurs during construction
: of the Bio::Seq.pm object.
Comments : The sequence is returned in an array of strings corresponding
: to the strings in the original format of the Blast alignment.
: (i.e., same spacing).
See Also : "seq_str", "seq_inds", Bio::Seq
seq_str
Usage : $hsp->seq_str( seq_type );
Purpose : Get the full query, sbjct, or 'match' sequence as a string.
: The 'match' sequence is the string of symbols in between the
: query and sbjct sequences.
Example : $str = $hsp->seq_str('query');
Returns : String
Argument : seq_type = 'query' or 'hit' or 'sbjct' or 'match'
: ('sbjct' is synonymous with 'hit')
Throws : Exception if the argument does not match an accepted seq_type.
Comments : Calls _set_seq_data() to set the 'match' sequence if it has
: not been set already.
See Also : "seq", "seq_inds", "_set_match_seq"
seq_inds
Usage : $hsp->seq_inds( seq_type, class, collapse );
Purpose : Get a list of residue positions (indices) for all identical
: or conserved residues in the query or sbjct sequence.
Example : @s_ind = $hsp->seq_inds('query', 'identical');
: @h_ind = $hsp->seq_inds('hit', 'conserved');
: @h_ind = $hsp->seq_inds('hit', 'conserved', 1);
Returns : List of integers
: May include ranges if collapse is true.
Argument : seq_type = 'query' or 'hit' or 'sbjct' (default = query)
: ('sbjct' is synonymous with 'hit')
: class = 'identical' or 'conserved' (default = identical)
: (can be shortened to 'id' or 'cons')
: (actually, anything not 'id' will evaluate to 'conserved').
: collapse = boolean, if true, consecutive positions are merged
: using a range notation, e.g., "1 2 3 4 5 7 9 10 11"
: collapses to "1-5 7 9-11". This is useful for
: consolidating long lists. Default = no collapse.
Throws : n/a.
Comments : Calls _set_residues() to set the 'match' sequence if it has
: not been set already.
See Also : "seq", "_set_residues", Bio::Search::BlastUtils::collapse_nums(),
Bio::Search::Hit::BlastHit::seq_inds()
get_aln
Usage : $hsp->get_aln()
Purpose : Get a Bio::SimpleAlign object constructed from the query + sbjct
: sequences of the present HSP object.
Example : $aln_obj = $hsp->get_aln();
Returns : Object reference for a Bio::SimpleAlign.pm object.
Argument : n/a.
Throws : Propagates any exception occurring during the construction of
: the Bio::SimpleAlign object.
Comments : Requires Bio::SimpleAlign.
: The Bio::SimpleAlign object is constructed from the query + sbjct
: sequence objects obtained by calling seq().
: Gap residues are included (see $GAP_SYMBOL).
See Also : "seq", Bio::SimpleAlign
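The range notation that seq_inds() produces when collapse is true ("1 2 3 4 5 7 9 10 11" becomes "1-5 7 9-11") can be sketched in a few lines. The collapse_positions() function below is a hypothetical illustration written for this example; it is not Bio::Search::BlastUtils::collapse_nums(), and it assumes the input positions are unique integers.

```perl
use strict;
use warnings;

# Hypothetical helper: merge consecutive integers into "first-last" ranges.
sub collapse_positions {
    my @nums = sort { $a <=> $b } @_;
    my @out;
    while (@nums) {
        my $first = shift @nums;
        my $last  = $first;
        # Extend the current run while the next number is consecutive.
        $last = shift @nums while @nums && $nums[0] == $last + 1;
        push @out, $first == $last ? $first : "$first-$last";
    }
    return join ' ', @out;
}

my $collapsed = collapse_positions(1, 2, 3, 4, 5, 7, 9, 10, 11);
```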