The rest of the documentation details each of the object methods. Internal methods are usually prefixed
with an underscore (_).
new
Title : new
Usage : my $obj = Bio::Search::Hit::GenericHit->new();
Function: Builds a new Bio::Search::Hit::GenericHit object
Returns : Bio::Search::Hit::GenericHit
Args : -name => Name of Hit (required)
-description => Description (optional)
-accession => Accession number (optional)
-ncbi_gi => NCBI GI UID (optional)
-length => Length of the Hit (optional)
-score => Raw Score for the Hit (optional)
-bits => Bit Score for the Hit (optional)
-significance => Significance value for the Hit (optional)
-algorithm => Algorithm used (BLASTP, FASTX, etc...)
-hsps => Array ref of HSPs for this Hit.
-found_again => boolean, true if hit appears in a
"previously found" section of a PSI-Blast report.
-hsp_factory => Bio::Factory::ObjectFactoryI able to create HSPI
objects.
add_hsp
Title : add_hsp
Usage : $hit->add_hsp($hsp)
Function: Add an HSP to the collection of HSPs for a Hit
Returns : number of HSPs in the Hit
Args : Bio::Search::HSP::HSPI object, OR hash ref containing data suitable
for creating an HSPI object (&hsp_factory must be set to get it back)
hsp_factory
Title : hsp_factory
Usage : $hit->hsp_factory($hsp_factory)
Function: Get/set the factory used to build HSPI objects if necessary.
Returns : Bio::Factory::ObjectFactoryI
Args : Bio::Factory::ObjectFactoryI
Bio::Search::Hit::HitI methods
Implementation of Bio::Search::Hit::HitI methods
name
Title : name
Usage : $hit_name = $hit->name();
Function: returns the name of the Hit sequence
Returns : a scalar string
Args : [optional] scalar string to set the name
accession
Title : accession
Usage : $acc = $hit->accession();
Function: Retrieve the accession (if available) for the hit
Returns : a scalar string (empty string if not set)
Args : none
description
Title : description
Usage : $desc = $hit->description();
Function: Retrieve the description for the hit
Returns : a scalar string
Args : [optional] scalar string to set the description
length
Title : length
Usage : my $len = $hit->length
Function: Returns the length of the hit
Returns : integer
Args : [optional] integer to set the length
algorithm
Title : algorithm
Usage : $alg = $hit->algorithm();
Function: Gets the algorithm specification that was used to obtain the hit
For BLAST, the algorithm denotes what type of sequence was aligned
against what (BLASTN: dna-dna, BLASTP prt-prt, BLASTX translated
dna-prt, TBLASTN prt-translated dna, TBLASTX translated
dna-translated dna).
Returns : a scalar string
Args : [optional] scalar string to set the algorithm
raw_score
Title : raw_score
Usage : $score = $hit->raw_score();
Function: Gets the "raw score" generated by the algorithm. What
this score is exactly will vary from algorithm to algorithm,
returning undef if unavailable.
Returns : a scalar value
Args : [optional] scalar value to set the raw score
score
Equivalent to raw_score()
significance
Title : significance
Usage : $significance = $hit->significance();
Function: Used to obtain the E or P value of a hit, i.e. the probability that
this particular hit was obtained purely by random chance. If the
information is not available (and cannot be calculated from other
information sources), returns undef.
Returns : a scalar value or undef if unavailable
Args : [optional] scalar value to set the significance
bits
Usage : $hit_object->bits();
Purpose : Gets the bit score of the best HSP for the current hit.
Example : $bits = $hit_object->bits();
Returns : Integer or undef if bit score is not set
Argument : n/a
Comments : For BLAST1, the non-bit score is listed in the summary line.
See Also : score()
next_hsp
Title : next_hsp
Usage : while( $hsp = $obj->next_hsp()) { ... }
Function : Returns the next available High Scoring Pair
Example :
Returns : Bio::Search::HSP::HSPI object or null if finished
Args : none
hsps
Usage : $hit_object->hsps();
Purpose : Get a list containing all HSP objects.
: Get the number of HSPs for the current hit.
Example : @hsps = $hit_object->hsps();
: $num = $hit_object->hsps(); # alternatively, use num_hsps()
Returns : Array context : list of Bio::Search::HSP::BlastHSP.pm objects.
: Scalar context: integer (number of HSPs).
: (Equivalent to num_hsps()).
Argument : n/a. Relies on wantarray
Throws : Exception if the HSPs have not been collected.
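The context-dependent return value described above relies on Perl's wantarray. A minimal sketch of the idea, using a hypothetical stand-alone sub (items_or_count) and made-up data rather than the actual hsps() implementation:

```perl
use strict;
use warnings;

# Hypothetical stand-in for a context-sensitive accessor; not BioPerl code.
my @items = ('hsp1', 'hsp2', 'hsp3');

sub items_or_count {
    # wantarray is true in list context and false (but defined) in scalar context.
    return wantarray ? @items : scalar(@items);
}

my @all = items_or_count();    # list context: the items themselves
my $num = items_or_count();    # scalar context: the number of items

print "items: @all\n";         # prints "items: hsp1 hsp2 hsp3"
print "count: $num\n";         # prints "count: 3"
```

When the context is not obvious from the call site, num_hsps() states the intent more clearly than a scalar-context call to hsps().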
See Also : hsp(), num_hsps()
num_hsps
Usage : $hit_object->num_hsps();
Purpose : Get the number of HSPs for the present hit.
Example : $nhsps = $hit_object->num_hsps();
Returns : Integer or '-' if HSPs have not been collected
Argument : n/a
See Also : hsps()
rewind
Title : rewind
Usage : $hit->rewind;
Function: Resets the HSP iterator to the beginning. This is possible
because this is an in-memory implementation.
Returns : none
Args : none
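The interplay of next_hsp() and rewind() can be pictured with a small in-memory iterator. The sketch below uses a hypothetical TinyHit class and string placeholders for HSPs; it illustrates the pattern only and is not how GenericHit is implemented:

```perl
use strict;
use warnings;

# Hypothetical in-memory iterator; TinyHit is illustrative, not GenericHit.
package TinyHit;

sub new {
    my ($class, @hsps) = @_;
    return bless { hsps => [@hsps], pos => 0 }, $class;
}

# Return the next element, or undef once the list is exhausted.
sub next_hsp {
    my $self = shift;
    return undef if $self->{pos} >= @{ $self->{hsps} };
    return $self->{hsps}[ $self->{pos}++ ];
}

# Reset the cursor so iteration starts again from the first element.
sub rewind {
    my $self = shift;
    $self->{pos} = 0;
    return;
}

package main;

my $hit = TinyHit->new('hsp1', 'hsp2');

my @first_pass;
while ( defined( my $hsp = $hit->next_hsp ) ) { push @first_pass, $hsp }

$hit->rewind;

my @second_pass;
while ( defined( my $hsp = $hit->next_hsp ) ) { push @second_pass, $hsp }

print "first:  @first_pass\n";     # prints "first:  hsp1 hsp2"
print "second: @second_pass\n";    # prints "second: hsp1 hsp2"
```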
ambiguous_aln
Usage : $ambig_code = $hit_object->ambiguous_aln();
Purpose : Sets/Gets ambiguity code data member.
Example : (see usage)
Returns : String = 'q', 's', 'qs', '-'
: 'q' = query sequence contains overlapping sub-sequences
: while sbjct does not.
: 's' = sbjct sequence contains overlapping sub-sequences
: while query does not.
: 'qs' = query and sbjct sequences contain overlapping sub-sequences
: relative to each other.
: '-' = query and sbjct sequences do not contain multiple domains
: relative to each other OR both contain the same distribution
: of similar domains.
Argument : n/a
Throws : n/a
Comment : Note: "sbjct" is synonymous with "hit"
overlap
See documentation in Bio::Search::Hit::HitI::overlap()
n
Usage : $hit_object->n();
Purpose : Gets the N number for the current hit.
: This is the number of HSPs in the set which was ascribed
: the lowest P-value (listed on the description line).
: This number is not the same as the total number of HSPs.
: To get the total number of HSPs, use num_hsps().
Example : $n = $hit_object->n();
Returns : Integer
Argument : n/a
Throws : Exception if HSPs have not been set (BLAST2 reports).
Comments : Note that the N parameter is not reported in gapped BLAST2.
: Calling n() on such reports will result in a call to num_hsps().
: The num_hsps() method will count the actual number of
: HSPs in the alignment listing, which may exceed N in
: some cases.
See Also : num_hsps()
p
Usage : $hit_object->p( [format] );
Purpose : Get the P-value for the best HSP of the given BLAST hit.
: (Note that P-values are not provided with NCBI Blast2 reports).
Example : $p = $sbjct->p;
: $p = $sbjct->p('exp'); # get exponent only.
: ($num, $exp) = $sbjct->p('parts'); # split sci notation into parts
Returns : Float or scientific notation number (the raw P-value, DEFAULT).
: Integer if format == 'exp' (the magnitude of the base 10 exponent).
: 2-element list (float, int) if format == 'parts' and P-value
: is in scientific notation (See Comments).
Argument : format: string of 'raw' | 'exp' | 'parts'
: 'raw' returns value given in report. Default. (1.2e-34)
: 'exp' returns exponent value only (34)
: 'parts' returns the decimal and exponent as a
: 2-element list (1.2, -34) (See Comments).
Throws : Warns if no P-value is defined. Uses expect instead.
Comments : Using the 'parts' argument is not recommended since it will not
: work as expected if the P-value is not in scientific notation.
: That is, floats are not converted into sci notation before
: splitting into parts.
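The caveat about 'parts' can be illustrated with a small sketch. The helper below (split_sci, a hypothetical name) splits a value only when it is written in scientific notation; it is not the BioPerl code, and what p('parts') actually returns for a plain float is not specified here:

```perl
use strict;
use warnings;

# Hypothetical helper illustrating the 'parts' caveat; not the BioPerl code.
sub split_sci {
    my ($value) = @_;
    # Only strings written as <mantissa>e<exponent> can be split.
    if ( $value =~ /^([\d.]+)e([-+]?\d+)$/i ) {
        return ( $1, $2 + 0 );
    }
    return;    # empty list: nothing to split
}

my ( $num, $exp ) = split_sci('1.2e-34');
print "mantissa=$num exponent=$exp\n";    # prints "mantissa=1.2 exponent=-34"

my @parts = split_sci('0.0012');          # plain float, not sci notation
print "parts found: ", scalar(@parts), "\n";    # prints "parts found: 0"
```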
See Also : expect(), significance(), Bio::Search::SearchUtils::get_exponent()
hsp
Usage : $hit_object->hsp( [string] );
Purpose : Get a single HSPI object for the present HitI object.
Example : $hspObj = $hit_object->hsp; # same as 'best'
: $hspObj = $hit_object->hsp('best');
: $hspObj = $hit_object->hsp('worst');
Returns : Object reference for a Bio::Search::HSP::BlastHSP.pm object.
Argument : String (or no argument).
: No argument (default) = highest scoring HSP (same as 'best').
: 'best' or 'first' = highest scoring HSP.
: 'worst' or 'last' = lowest scoring HSP.
Throws : Exception if the HSPs have not been collected.
: Exception if an unrecognized argument is used.
See Also : hsps(), num_hsps()
logical_length
Usage : $hit_object->logical_length( [seq_type] );
: (mostly intended for internal use).
Purpose : Get the logical length of the hit sequence.
: This is necessary since the number of identical/conserved residues
: can be in terms of peptide sequence space, yet the query and/or hit
: sequence are in nucleotide space.
Example : $len = $hit_object->logical_length();
Returns : Integer
Argument : seq_type = 'query' or 'hit' or 'sbjct' (default = 'query')
('sbjct' is synonymous with 'hit')
Throws : n/a
Comments :
: In the case of BLAST flavors:
: For TBLASTN reports, the length of the aligned portion of the
: nucleotide hit sequence is divided by 3; for BLASTX reports,
: the length of the aligned portion of the nucleotide query
: sequence is divided by 3. For TBLASTX reports, the length of
: both hit and query sequence are converted.
:
: This is important for functions like frac_aligned_query()
: which need to operate in amino acid coordinate space when dealing
: with [T]BLAST[NX] type reports.
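The conversion described in these comments amounts to dividing nucleotide lengths by 3 for the translated side of the search. A minimal sketch, assuming a hypothetical helper logical_len(); the mapping of algorithms to translated sequences follows the comments above, while truncating with int() is an assumption of the sketch:

```perl
use strict;
use warnings;

# Hypothetical helper; not the BioPerl routine. Truncation with int() is
# an assumption made for this illustration.
sub logical_len {
    my ( $algorithm, $seq_type, $length ) = @_;
    # Which sequence is in nucleotide space for each translated search.
    my %translated = (
        TBLASTN => { hit   => 1 },
        BLASTX  => { query => 1 },
        TBLASTX => { query => 1, hit => 1 },
    );
    $seq_type = 'hit' if $seq_type eq 'sbjct';    # 'sbjct' is a synonym for 'hit'
    my $is_nt = $translated{ uc $algorithm }{$seq_type};
    return $is_nt ? int( $length / 3 ) : $length;
}

print logical_len( 'TBLASTN', 'hit',   300 ), "\n";    # prints 100
print logical_len( 'TBLASTN', 'query', 300 ), "\n";    # prints 300
print logical_len( 'BLASTX',  'query', 299 ), "\n";    # prints 99
print logical_len( 'BLASTP',  'sbjct', 120 ), "\n";    # prints 120
```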
See Also : length(), frac_aligned_query(), frac_aligned_hit()
length_aln
Usage : $hit_object->length_aln( [seq_type] );
Purpose : Get the total length of the aligned region for query or sbjct seq.
: This number will include all HSPs
Example : $len = $hit_object->length_aln(); # default = query
: $lenAln = $hit_object->length_aln('query');
Returns : Integer
Argument : seq_type = 'query' or 'hit' or 'sbjct' (Default = 'query')
('sbjct' is synonymous with 'hit')
Throws : Exception if the argument is not recognized.
Comments : This method will report the logical length of the alignment,
: meaning that for TBLAST[NX] reports, the length is reported
: using amino acid coordinate space (i.e., nucleotides / 3).
:
: This method requires that all HSPs be tiled. If they have not
: already been tiled, they will be tiled first automatically.
: If you don't want the tiled data, iterate through each HSP
: calling length() on each (use hsps() to get all HSPs).
See Also : length(), frac_aligned_query(), frac_aligned_hit(), gaps(),
Bio::Search::SearchUtils::tile_hsps(), Bio::Search::HSP::BlastHSP::length()
gaps
Usage : $hit_object->gaps( [seq_type] );
Purpose : Get the number of gaps in the aligned query, hit, or both sequences.
: Data is summed across all HSPs.
Example : $qgaps = $hit_object->gaps('query');
: $hgaps = $hit_object->gaps('hit');
: $tgaps = $hit_object->gaps(); # default = total (query + hit)
Returns : scalar context: integer
: array context without args: two-element list of integers
: (queryGaps, hitGaps)
: Array context can be forced by providing an argument of 'list' or 'array'.
:
: CAUTION: Calling this method within printf or sprintf imposes list context,
: so this function may not give you what you expect. For example:
: printf "Total gaps: %d", $hit->gaps();
: actually receives a two-element list, so what gets printed
: is the number of gaps in the query, not the total.
:
Argument : seq_type: 'query' | 'hit' or 'sbjct' | 'total' | 'list' (default = 'total')
('sbjct' is synonymous with 'hit')
Throws : n/a
Comments : If you need data for each HSP, use hsps() and then iterate
: through each HSP object.
: This method requires that all HSPs be tiled. If they have not
: already been tiled, they will be tiled first automatically.
: Not relying on wantarray since that will fail in situations
: such as printf "%d", $hit->gaps() in which you might expect to
: be printing the total gaps, but evaluates to array context.
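The printf pitfall can be reproduced without BioPerl. The sketch below uses a hypothetical stand-in, gaps_like(), that mimics the return behaviour described under Returns; the gap counts (3 in the query, 5 in the hit) are made up:

```perl
use strict;
use warnings;

# Hypothetical stand-in mimicking the documented return behaviour of gaps();
# the counts are invented for illustration.
sub gaps_like {
    my ( $query_gaps, $hit_gaps ) = ( 3, 5 );
    return wantarray ? ( $query_gaps, $hit_gaps ) : $query_gaps + $hit_gaps;
}

# sprintf (like printf) supplies list context, so %d consumes only the first
# element. Warnings are silenced here because newer Perls complain about the
# unused second argument.
my $wrong = do { no warnings; sprintf "%d", gaps_like() };

# Forcing scalar context yields the total.
my $right = sprintf "%d", scalar gaps_like();

print "list context:   $wrong\n";    # prints "list context:   3"
print "scalar context: $right\n";    # prints "scalar context: 8"
```

With the real method, wrapping the call in scalar() or passing the documented 'total' argument explicitly should avoid the surprise.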
See Also : length_aln()
matches
See documentation in Bio::Search::Hit::HitI::matches()
start
Usage : $sbjct->start( [seq_type] );
Purpose : Gets the start coordinate for the query, sbjct, or both sequences
: in the BlastHit object. If there is more than one HSP, the lowest start
: value of all HSPs is returned.
Example : $qbeg = $sbjct->start('query');
: $sbeg = $sbjct->start('hit');
: ($qbeg, $sbeg) = $sbjct->start();
Returns : scalar context: integer
: array context without args: list of two integers (queryStart, sbjctStart)
: Array context can be "induced" by providing an argument of 'list' or 'array'.
Argument : In scalar context: seq_type = 'query' or 'hit' or 'sbjct' (default = 'query')
('sbjct' is synonymous with 'hit')
Throws : n/a
See Also : end(), range(), strand(),
Bio::Search::HSP::BlastHSP::start
end
Usage : $sbjct->end( [seq_type] );
Purpose : Gets the end coordinate for the query, sbjct, or both sequences
: in the BlastHit object. If there is more than one HSP, the largest end
: value of all HSPs is returned.
Example : $qend = $sbjct->end('query');
: $send = $sbjct->end('hit');
: ($qend, $send) = $sbjct->end();
Returns : scalar context: integer
: array context without args: list of two integers
: (queryEnd, sbjctEnd)
: Array context can be "induced" by providing an argument
: of 'list' or 'array'.
Argument : In scalar context: seq_type = 'query' or 'sbjct'
: (case insensitive). If not supplied, 'query' is used.
Throws : n/a
See Also : start(), range(), strand()
range
Usage : $sbjct->range( [seq_type] );
Purpose : Gets the (start, end) coordinates for the query or sbjct sequence
: in the HSP alignment.
Example : ($qbeg, $qend) = $sbjct->range('query');
: ($sbeg, $send) = $sbjct->range('hit');
Returns : Two-element array of integers
Argument : seq_type = string, 'query' or 'hit' or 'sbjct' (default = 'query')
('sbjct' is synonymous with 'hit')
Throws : n/a
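start() and end() are documented as returning the lowest start and the largest end over all HSPs. Assuming range() combines the two in the same way (an assumption of this sketch, not something stated above), the computation over made-up HSP coordinates looks like this:

```perl
use strict;
use warnings;
use List::Util qw(min max);

# Made-up [start, end] coordinates for three HSPs of one hit.
my @hsp_ranges = ( [ 120, 180 ], [ 15, 90 ], [ 200, 260 ] );

my $start = min( map { $_->[0] } @hsp_ranges );    # lowest start of all HSPs
my $end   = max( map { $_->[1] } @hsp_ranges );    # largest end of all HSPs

print "range: $start-$end\n";    # prints "range: 15-260"
```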
See Also : start(), end()
frac_identical
Usage : $hit_object->frac_identical( [seq_type] );
Purpose : Get the overall fraction of identical positions across all HSPs.
: The number refers to only the aligned regions and does not
: account for unaligned regions in between the HSPs, if any.
Example : $frac_iden = $hit_object->frac_identical('query');
Returns : Float (2-decimal precision, e.g., 0.75).
Argument : seq_type: 'query' | 'hit' or 'sbjct' | 'total'
: default = 'query' (but see comments below).
: ('sbjct' is synonymous with 'hit')
Throws : n/a
Comments :
: To compute the fraction identical, the logical length of the
: aligned portion of the sequence is used, meaning that
: in the case of BLAST flavors, for TBLASTN reports, the length of
: the aligned portion of the
: nucleotide hit sequence is divided by 3; for BLASTX reports,
: the length of the aligned portion of the nucleotide query
: sequence is divided by 3. For TBLASTX reports, the length of
: both hit and query sequence are converted.
: This is necessary since the number of identical residues is
: in terms of peptide sequence space.
:
: Different versions of Blast report different values for the total
: length of the alignment. This is the number reported in the
: denominators in the stats section:
: "Identical = 34/120 Positives = 67/120".
: NCBI BLAST uses the total length of the alignment (with gaps)
: WU-BLAST uses the length of the query sequence (without gaps).
:
: Therefore, when called with an argument of 'total',
: this method will report different values depending on the
: version of BLAST used. Total does NOT take into account HSP
: tiling, so it should not be used.
:
: To get the fraction identical among only the aligned residues,
: ignoring the gaps, call this method without an argument or
: with an argument of 'query' or 'hit'.
:
: If you need data for each HSP, use hsps() and then iterate
: through the HSP objects.
: This method requires that all HSPs be tiled. If they have not
: already been tiled, they will be tiled first automatically.
See Also : frac_conserved(), frac_aligned_query(), matches(), Bio::Search::SearchUtils::tile_hsps()
frac_conserved
Usage : $hit_object->frac_conserved( [seq_type] );
Purpose : Get the overall fraction of conserved positions across all HSPs.
: The number refers to only the aligned regions and does not
: account for unaligned regions in between the HSPs, if any.
Example : $frac_cons = $hit_object->frac_conserved('hit');
Returns : Float (2-decimal precision, e.g., 0.75).
Argument : seq_type: 'query' | 'hit' or 'sbjct' | 'total'
: default = 'query' (but see comments below).
: ('sbjct' is synonymous with 'hit')
Throws : n/a
Comments :
: To compute the fraction conserved, the logical length of the
: aligned portion of the sequence is used, meaning that
: in the case of BLAST flavors, for TBLASTN reports, the length of
: the aligned portion of the
: nucleotide hit sequence is divided by 3; for BLASTX reports,
: the length of the aligned portion of the nucleotide query
: sequence is divided by 3. For TBLASTX reports, the length of
: both hit and query sequence are converted.
: This is necessary since the number of conserved residues is
: in terms of peptide sequence space.
:
: Different versions of Blast report different values for the total
: length of the alignment. This is the number reported in the
: denominators in the stats section:
: "Identical = 34/120 Positives = 67/120".
: NCBI BLAST uses the total length of the alignment (with gaps)
: WU-BLAST uses the length of the query sequence (without gaps).
:
: Therefore, when called with an argument of 'total',
: this method will report different values depending on the
: version of BLAST used. Total does NOT take into account HSP
: tiling, so it should not be used.
:
: To get the fraction conserved among only the aligned residues,
: ignoring the gaps, call this method without an argument or
: with an argument of 'query' or 'hit'.
:
: If you need data for each HSP, use hsps() and then iterate
: through the HSP objects.
: This method requires that all HSPs be tiled. If they have not
: already been tiled, they will be tiled first automatically.
See Also : frac_identical(), matches(), Bio::Search::SearchUtils::tile_hsps()
frac_aligned_query
Usage : $hit_object->frac_aligned_query();
Purpose : Get the fraction of the query sequence which has been aligned
: across all HSPs (not including intervals between non-overlapping
: HSPs).
Example : $frac_alnq = $hit_object->frac_aligned_query();
Returns : Float (2-decimal precision, e.g., 0.75),
: or undef if query length is unknown to avoid division by 0.
Argument : n/a
Throws : n/a
Comments : If you need data for each HSP, use hsps() and then iterate
: through the HSP objects.
: This method requires that all HSPs be tiled. If they have not
: already been tiled, they will be tiled first automatically.
See Also : frac_aligned_hit(), logical_length(), length_aln(), Bio::Search::SearchUtils::tile_hsps()
frac_aligned_hit
Usage : $hit_object->frac_aligned_hit();
Purpose : Get the fraction of the hit (sbjct) sequence which has been aligned
: across all HSPs (not including intervals between non-overlapping
: HSPs).
Example : $frac_alnh = $hit_object->frac_aligned_hit();
Returns : Float (2-decimal precision, e.g., 0.75),
: or undef if hit length is unknown to avoid division by 0.
Argument : n/a
Throws : n/a
Comments : If you need data for each HSP, use hsps() and then iterate
: through the HSP objects.
: This method requires that all HSPs be tiled. If they have not
: already been tiled, they will be tiled first automatically.
See Also : frac_aligned_query(), matches(), logical_length(), length_aln(),
Bio::Search::SearchUtils::tile_hsps()
frac_aligned_sbjct
Same as frac_aligned_hit()
num_unaligned_sbjct
Same as num_unaligned_hit()
num_unaligned_hit
Usage : $hit_object->num_unaligned_hit();
Purpose : Get the number of the unaligned residues in the hit sequence.
: Sums across all HSPs.
Example : $num_unaln = $hit_object->num_unaligned_hit();
Returns : Integer
Argument : n/a
Throws : n/a
Comments : See notes regarding logical lengths in the comments for frac_aligned_hit().
: They apply here as well.
: If you need data for each HSP, use hsps() and then iterate
: through the HSP objects.
: This method requires that all HSPs be tiled. If they have not
: already been tiled, they will be tiled first automatically.
See Also : num_unaligned_query(), Bio::Search::SearchUtils::tile_hsps(), frac_aligned_hit()
num_unaligned_query
Usage : $hit_object->num_unaligned_query();
Purpose : Get the number of the unaligned residues in the query sequence.
: Sums across all HSPs.
Example : $num_unaln = $hit_object->num_unaligned_query();
Returns : Integer
Argument : n/a
Throws : n/a
Comments : See notes regarding logical lengths in the comments for frac_aligned_query().
: They apply here as well.
: If you need data for each HSP, use hsps() and then iterate
: through the HSP objects.
: This method requires that all HSPs be tiled. If they have not
: already been tiled, they will be tiled first automatically.
See Also : num_unaligned_hit(), frac_aligned_query(), Bio::Search::SearchUtils::tile_hsps()
seq_inds
Usage : $hit->seq_inds( seq_type, class, collapse );
Purpose : Get a list of residue positions (indices) across all HSPs
: for identical or conserved residues in the query or sbjct sequence.
Example : @s_ind = $hit->seq_inds('query', 'identical');
: @h_ind = $hit->seq_inds('hit', 'conserved');
: @h_ind = $hit->seq_inds('hit', 'conserved', 1);
Returns : Array of integers
: May include ranges if collapse is non-zero.
Argument : [0] seq_type = 'query' or 'hit' or 'sbjct' (default = 'query')
: ('sbjct' is synonymous with 'hit')
: [1] class = 'identical' or 'conserved' (default = 'identical')
: (can be shortened to 'id' or 'cons')
: (actually, anything not 'id' will evaluate to 'conserved').
: [2] collapse = boolean, if non-zero, consecutive positions are merged
: using a range notation, e.g., "1 2 3 4 5 7 9 10 11"
: collapses to "1-5 7 9-11". This is useful for
: consolidating long lists. Default = no collapse.
Throws : n/a.
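The range notation produced when collapse is non-zero can be sketched in a few lines. collapse_ranges() below is a hypothetical helper, not the BioPerl routine, and it assumes a sorted list of distinct integers:

```perl
use strict;
use warnings;

# Hypothetical helper showing the range notation; not the BioPerl routine.
# Expects a sorted list of distinct integers.
sub collapse_ranges {
    my @nums = @_;
    my @out;
    while (@nums) {
        my $first = shift @nums;
        my $last  = $first;
        # Extend the run while the next number is consecutive.
        $last = shift @nums while @nums && $nums[0] == $last + 1;
        push @out, $first == $last ? $first : "$first-$last";
    }
    return @out;
}

my @collapsed = collapse_ranges( 1, 2, 3, 4, 5, 7, 9, 10, 11 );
print "@collapsed\n";    # prints "1-5 7 9-11"
```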
See Also : Bio::Search::HSP::BlastHSP::seq_inds()
strand
See documentation in Bio::Search::Hit::HitI::strand()
frame
See documentation in Bio::Search::Hit::HitI::frame()
rank
Title : rank
Usage : $obj->rank($newval)
Function: Get/Set the rank of this Hit in the Query search list
i.e. this is the Nth hit for a specific query
Returns : value of rank
Args : newvalue (optional)
locus
Title : locus
Usage : $locus = $hit->locus();
Function: Retrieve the locus (if available) for the hit
Returns : a scalar string (empty string if not set)
Args : none
each_accession_number
Title : each_accession_number
Usage : @each_accession_number = $hit->each_accession_number();
Function: Get each accession number listed in the description of the hit.
If there are no alternatives, then only the primary accession will
be given
Returns : list of all accession numbers in the description
Args : none
tiled_hsps
See documentation in Bio::Search::SearchUtils::tile_hsps()
query_length
Title : query_length
Usage : $obj->query_length($newval)
Function: Get/Set the query_length
Returns : value of query_length (a scalar)
Args : on set, new value (a scalar or undef, optional)
ncbi_gi
Title : ncbi_gi
Usage : $acc = $hit->ncbi_gi();
Function: Retrieve the NCBI Unique ID (aka the GI #),
if available, for the hit
Returns : a scalar string (empty string if not set)
Args : none
Note : As of Sept. 2016 NCBI records will no longer have a
GI; this attribute will remain in place for older
records
sort_hsps
Title : sort_hsps
Usage : $result->sort_hsps(\&sort_function)
Function : Sorts the available HSP objects by a user-supplied function. Defaults to sort
by descending score.
Returns : n/a
Args : A coderef for the sort function. See the documentation on the Perl sort()
function for guidelines on writing sort functions.
Note : To access the special variables $a and $b used by the Perl sort() function
the user function must access Bio::Search::Hit::HitI namespace.
For example, use :
$hit->sort_hsps( sub{$Bio::Search::Result::HitI::a->length <=>
$Bio::Search::Result::HitI::b->length});
NOT $hit->sort_hsps($a->length <=> $b->length);
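The reason for the qualified names is that Perl's sort() sets $a and $b as package variables of the package in which sort() is called, not of the package in which the comparator was written. A minimal sketch with a hypothetical Sorter package (not part of BioPerl); which package name applies to sort_hsps() is as given in the Note above:

```perl
use strict;
use warnings;

# Hypothetical package standing in for the module that calls sort();
# it is not part of BioPerl.
package Sorter;

sub sort_with {
    my ( $cmp, @items ) = @_;
    # sort() sets $a and $b in the package it is called from (Sorter).
    return sort $cmp @items;
}

package main;

# The comparator is compiled in package main, so it must name Sorter's
# $a and $b explicitly; a bare $a here would refer to $main::a.
my @sorted = Sorter::sort_with( sub { $Sorter::a <=> $Sorter::b }, 3, 1, 2 );

print "@sorted\n";    # prints "1 2 3"
```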
iteration
Usage : $hit->iteration( $iteration_num );
Purpose : Gets the iteration number in which the Hit was found.
Example : $iteration_num = $sbjct->iteration();
Returns : Integer greater than or equal to 1
Non-PSI-BLAST reports may report iteration as 1, but this number
is only meaningful for PSI-BLAST reports.
Argument : iteration_num (optional, used when setting only)
Throws : none
See Also : found_again()
found_again
Title : found_again
Usage : $hit->found_again;
$hit->found_again(1);
Purpose : Gets a boolean indicator whether or not the hit has
been found in a previous iteration.
This is only applicable to PSI-BLAST reports.
This method indicates if the hit was reported in the
"Sequences used in model and found again" section of the
PSI-BLAST report or if it was reported in the
"Sequences not found previously or not previously below threshold"
section of the PSI-BLAST report. Only for hits in iteration > 1.
Example : if( $hit->found_again()) { ... };
Returns : Boolean, true (1) if the hit has been found in a
previous PSI-BLAST iteration.
Returns false (0 or undef) for hits that have not occurred in a
previous PSI-BLAST iteration.
Argument : Boolean (1 or 0). Only used for setting.
Throws : none
See Also : iteration()
perl v5.32.1 2021-08-15 Bio::Search::Hit::GenericHit(3pm)