Bio::Tools::EUtilities - NCBI eutil XML parsers.
Bio::Tools::EUtilities Methods
cache_response
Title : cache_response
Usage : $parser->cache_response(1)
Function : sets flag to cache response object (off by default)
Returns : value eval'ing to TRUE or FALSE
Args : value eval'ing to TRUE or FALSE
Note : must be set prior to any parsing run
response
Title : response
Usage : my $response = $parser->response;
Function : Get/Set HTTP::Response object
Returns : HTTP::Response
Args : HTTP::Response
Note : to prevent object from destruction set cache_response() to TRUE
parameter_base
Title : parameter_base
Usage : my $params = $parser->parameter_base;
Function : Get/Set Bio::ParameterBaseI object (should be Bio::Tools::EUtilities::EUtilParameters)
Returns : Bio::Tools::EUtilities::EUtilParameters || undef
Args : (optional) Bio::Tools::EUtilities::EUtilParameters
Note : If this object is present, it may be used as a last resort for
some data values if parsed XML does not contain said values (for
instance, database, term, IDs, etc).
data_parsed
Title : data_parsed
Usage : if ($parser->data_parsed) {...}
Function : returns TRUE if data has been parsed
Returns : value eval'ing to TRUE or FALSE
Args : none (set within parser)
Note : mainly internal method (set in case user wants to check
whether parser is exhausted).
is_lazy
Title : is_lazy
Usage : if ($parser->is_lazy) {...}
Function : returns TRUE if parser is set to lazy parsing mode
(only affects elink/esummary)
Returns : Boolean
Args : none
Note : Permanently set in constructor. Still highly experimental.
Don't stare directly at happy fun ball...
parse_data
Title : parse_data
Usage : $parser->parse_data
Function : direct call to parse data; normally implicitly called
Returns : none
Args : none
to_string
Title : to_string
Usage : $foo->to_string()
Function : converts current object to string
Returns : none
Args : (optional) simple data for text formatting
Note : Implemented in plugins
print_all
Title : print_all
Usage : $info->print_all();
$info->print_all(-fh => $fh, -cb => $coderef);
Function : prints (dumps) all data in parser. Unless a coderef is supplied,
this just dumps the parser-specific to_string method to either a
file/fh or STDOUT
Returns : none
Args : [optional]
-file : file to print to
-fh : filehandle to print to (cannot be used concurrently with file)
-cb : coderef to use in place of default print method. This is
passed in the parser object
-wrap : number of columns to wrap default text output to (def = 80)
Notes : only applicable for einfo. If -file or -fh are not defined,
prints to STDOUT
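As a hedged illustration of the -cb argument: the Args above say the coderef is passed the parser object, so a callback can pull whatever it needs from the parser. The sketch below assumes an einfo parser; the name summarize_parser is hypothetical, and it uses only the first argument because nothing else about the call is documented here.

```perl
use strict;
use warnings;

# Hypothetical callback for print_all(-cb => ...). It expects the parser
# object as its first argument and prints a one-line summary built from
# get_database and get_description (both shown in the Synopsis).
sub summarize_parser {
    my ($parser) = @_;
    my $db   = $parser->get_database    // 'unknown database';
    my $desc = $parser->get_description // 'no description';
    my $line = "$db: $desc\n";
    print $line;
    return $line;    # returned as well, so the result is easy to inspect
}
```

With a parser built as in the Synopsis, this would be wired in as $parser->print_all(-cb => \&summarize_parser);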
Bio::Tools::EUtilities::EUtilDataI Methods
eutil
Title : eutil
Usage : $eutil = $foo->eutil;
Function : Get/Set eutil
Returns : string
Args : string (eutil)
Throws : on invalid eutil
datatype
Title : datatype
Usage : $type = $foo->datatype;
Function : Get/Set data object type
Returns : string
Args : string
Copyright
This software is copyright (c) 2006-2013 by Chris Fields.
This software is available under the same terms as the perl 5 programming language system itself.
perl v5.40.0 2025-01-27 Bio::Tools::EUtilities(3pm)
Description
Parses NCBI eutils XML output for retrieving IDs and other information. Part of the BioPerl EUtilities
system.
This is a general parser for eutils XML; data from efetch is NOT parsed (this requires separate
format-dependent parsers). All other XML for eutils is parsed. These modules can be used independently of
Bio::DB::EUtilities and Bio::Tools::EUtilities::EUtilParameters; if used in this way, only data present
in the XML will be parsed out (other bits are normally retrieved from a passed-in
Bio::Tools::EUtilities::EUtilParameters instance used while querying the database).
Feedback
Mailing Lists
User feedback is an integral part of the evolution of this and other Bioperl modules. Send your comments
and suggestions preferably to the Bioperl mailing list. Your participation is much appreciated.
bioperl-l@bioperl.org - General discussion
https://bioperl.org/Support.html - About the mailing lists
Support
Please direct usage questions or support issues to the mailing list, bioperl-l@bioperl.org, rather than to
the module maintainer directly. Many experienced and responsive experts will be able to look at the problem
and quickly address it. Please include a thorough description of the problem with code and data examples
if at all possible.
Reporting Bugs
Report bugs to the Bioperl bug tracking system to help us keep track of the bugs and their resolution.
Bug reports can be submitted via the web:
https://github.com/bioperl/bio-eutilities/issues
Methods Useful For Multiple Eutils
get_ids
Title : get_ids
Usage : my @ids = $parser->get_ids
Function : returns array of requested IDs (see Notes for more specifics)
Returns : array
Args : [conditional] not required except when running elink queries against
multiple databases. In case of the latter, the database name is
optional but recommended when retrieving IDs as the ID list will
be globbed together. In such cases, if a db name isn't provided a
warning is issued as a reminder.
Notes : esearch : returned ID list
elink : returned ID list (see Args above for caveats)
all others : from parameter_base->id or undef
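The caveat in Args can be sketched as follows: when an elink query spans several databases, asking for IDs one database at a time keeps the lists separate. This is a minimal sketch, not part of the module; ids_by_database is a hypothetical helper, and it assumes get_ids accepts the database name as its single argument, as the Args text describes.

```perl
use strict;
use warnings;

# Hypothetical helper: group IDs by database name so results from
# several databases are not lumped into a single list.
sub ids_by_database {
    my ($parser) = @_;
    my %ids_for;
    for my $db ($parser->get_databases) {
        next unless defined $db;    # get_databases may yield undef for some eutils
        $ids_for{$db} = [ $parser->get_ids($db) ];
    }
    return \%ids_for;
}
```

A caller could then read, for example, @{ ids_by_database($parser)->{protein} } without triggering the missing-database warning.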
get_database
Title : get_database
Usage : my $db = $info->get_database;
Function : returns single database name (eutil-compatible). This is the
queried database. For most eutils this is straightforward. For
elinks (which have 'db' and 'dbfrom') this is db/dbto; for egquery
it is the first db in the list (you probably want get_databases
instead)
Returns : string
Args : none
Notes : egquery : first db in the query (you probably want get_databases)
einfo : the queried database
espell : the queried database
all others : from parameter_base->db or undef
get_db (alias for get_database)
get_databases
Title : get_databases
Usage : my @dbs = $parser->get_databases
Function : returns list of databases
Returns : array of strings
Args : none
Notes : This is guaranteed to return a list of databases. For a single
database use the convenience method get_db/get_database
egquery : list of all databases in the query
einfo : the queried database, or the available databases
espell : the queried database
elink : collected from each LinkSet
all others : from parameter_base->db or undef
get_dbs (alias for get_databases)
next_History
Title : next_History
Usage : while (my $hist=$parser->next_History) {...}
Function : returns next HistoryI (if present).
Returns : Bio::Tools::EUtilities::HistoryI (History or LinkSet)
Args : none
Note : esearch, epost, and elink are all capable of returning data
indicating that search results (in the form of UIDs) are stored on the
remote server. Access to this data is wrapped up in a simple interface
(HistoryI), which is implemented in two classes:
Bio::DB::EUtilities::History (the simplest) and
Bio::DB::EUtilities::LinkSet. In general, calls to epost and esearch
will only return a single HistoryI object (formerly known as a
Cookie), but calls to elink can generate many depending on the
number of IDs, the correspondence, etc. Hence this iterator, which
allows one to retrieve said data one piece at a time.
next_cookie (alias for next_History)
get_Histories
Title : get_Histories
Usage : my @hists = $parser->get_Histories
Function : returns list of HistoryI objects.
Returns : list of Bio::Tools::EUtilities::HistoryI (History or LinkSet)
Args : none
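The iterator pattern described for next_History can be sketched like this. The helper name collect_histories is hypothetical, and the get_webenv and get_query_key accessors on the returned objects are an assumption: they are not described in this page, so check the HistoryI documentation before relying on them.

```perl
use strict;
use warnings;

# Hypothetical helper: drain the next_History iterator and record the
# server-side handle for each stored result set.
sub collect_histories {
    my ($parser) = @_;
    my @found;
    while (my $hist = $parser->next_History) {
        push @found, {
            webenv    => $hist->get_webenv,       # assumed HistoryI accessor
            query_key => $hist->get_query_key,    # assumed HistoryI accessor
        };
    }
    return @found;
}
```

For epost and esearch this would normally yield one entry; for elink it may yield several, which is the case the iterator exists for.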
Name
Bio::Tools::EUtilities - NCBI eutil XML parsers.
Synopsis
# from file or fh
my $parser = Bio::Tools::EUtilities->new(
-eutil => 'einfo',
-file => 'output.xml'
);
# or HTTP::Response object...
my $parser = Bio::Tools::EUtilities->new(
-eutil => 'esearch',
-response => $response
);
# esearch, esummary, elink
@ids = $parser->get_ids(); # returns array or array ref of IDs
# egquery, espell
$term = $parser->get_term(); # returns the query term
# elink, einfo
$db = $parser->get_database(); # returns database
# Query-related methods (esearch, egquery, espell data)
# eutil data centered on use of search terms
my $ct = $parser->get_count; # uses optional database for egquery count
my $translation = $parser->get_query_translation; # esearch
my $corrected = $parser->get_corrected_query; # espell
while (my $gquery = $parser->next_GlobalQuery) {
# iterates through egquery data
}
# Info-related methods (einfo data)
# database-related information
my $desc = $parser->get_description;
my $update = $parser->get_last_update;
my $nm = $parser->get_menu_name;
my $records = $parser->get_record_count;
while (my $field = $parser->next_FieldInfo) {
    # ...
}
while (my $link = $parser->next_LinkInfo) {
    # ...
}
# History methods (epost data, some data returned from elink)
# data which enables one to retrieve and query against user-stored
# information on the NCBI server
while (my $cookie = $parser->next_History) {
# ...
}
my @hists = $parser->get_Histories;
# Bio::Tools::EUtilities::Summary (esummary data)
# information on a specific database record
# retrieve nested docsum data
while (my $docsum = $parser->next_DocSum) {
    print "ID:",$docsum->get_ids,"\n";
    while (my $item = $docsum->next_Item) {
        # do stuff here...
        while (my $listitem = $item->next_ListItem) {
            # do stuff here...
            while (my $structure = $listitem->next_Structure) {
                # do stuff here...
            }
        }
    }
}
# retrieve flattened item list per DocSum
while (my $docsum = $parser->next_DocSum) {
    my @items = $docsum->get_all_DocSum_Items;
}
Todo
This module is largely complete. However there are a few holes which will eventually be filled in.
TranslationSets from esearch are not currently parsed, for instance.
Constructor Methods
new
Title : new
Usage : my $parser = Bio::Tools::EUtilities->new(-file => 'my.xml',
-eutil => 'esearch');
Function : create Bio::Tools::EUtilities instance
Returns : new Bio::Tools::EUtilities instance
Args : -file/-fh - File or filehandle
-eutil - eutil parser to use (supports all but efetch)
-response - HTTP::Response object (optional)
Version
version 1.77
