GetLastError
$error = $sph->GetLastError;
Get last error message (string)
GetLastWarning
$warning = $sph->GetLastWarning;
Get last warning message (string)
IsConnectError
Check connection error flag (to differentiate between network connection errors and bad responses).
Returns true value on connection error.
SetEncoders
$sph->SetEncoders(\&encode_function, \&decode_function)
COMPATIBILITY NOTE: SetEncoders() was introduced in version 0.17. Prior to that, all strings were
considered to be sequences of bytes which may have led to issues with multi-byte characters. If you were
previously encoding/decoding strings external to Sphinx::Search, you will need to disable
encoding/decoding by setting Sphinx::Search to use raw values as explained below (or modify your code and
let Sphinx::Search do the recoding).
Set the string encoder/decoder functions for transferring strings between perl and Sphinx. The encoder
should take the perl internal representation and convert to the bytestream that searchd expects, and the
decoder should take the bytestream returned by searchd and convert to perl format.
The searchd format will depend on the 'charset_type' index setting in the Sphinx configuration file.
The coders default to encode_utf8 and decode_utf8 respectively, which are compatible with the 'utf8'
charset_type.
If either the encoder or decoder functions are left undefined in the call to SetEncoders, they return to
their default values.
If you wish to send raw values (no encoding/decoding), supply a function that simply returns its
argument, e.g.
$sph->SetEncoders( sub { shift }, sub { shift });
Returns $sph.
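As an illustration of a non-default coder pair, the sketch below builds an encoder and decoder for single-byte ISO-8859-1 data with the core Encode module. The choice of ISO-8859-1 and the variable names are assumptions made for the example. The SetEncoders call is shown only as a comment, since it needs a live Sphinx::Search object.

```perl
use strict;
use warnings;
use Encode qw(encode decode);

# Assumed setup: the index holds single-byte ISO-8859-1 text.
# Each coder copies its argument so the caller's string is never modified.
my $encoder = sub { my ($chars) = @_; return encode('iso-8859-1', $chars) };
my $decoder = sub { my ($bytes) = @_; return decode('iso-8859-1', $bytes) };

# With a Sphinx::Search object in $sph, the pair would be installed as:
#   $sph->SetEncoders($encoder, $decoder);

# Round trip: perl characters -> bytes for searchd -> perl characters.
my $bytes = $encoder->("caf\x{e9}");
my $chars = $decoder->($bytes);
print length($bytes), " bytes on the wire\n";
```

Whatever encoding is chosen, the two functions should be inverses of each other, so that a string sent to searchd and read back is unchanged.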
SetServer
$sph->SetServer($host, $port);
$sph->SetServer($path, $port);
In the first form, sets the host (string) and port (integer) details for the searchd server using a
network (INET) socket (default is localhost:9312).
In the second form, where $path is a local filesystem path (optionally prefixed by 'unix://'), sets the
client to access the searchd server via a local (UNIX domain) socket at the specified path.
Returns $sph.
SetConnectTimeout
$sph->SetConnectTimeout($timeout)
Set server connection timeout (in seconds).
Returns $sph.
SetConnectRetries
$sph->SetConnectRetries($retries)
Set the number of server connection retries (used when a connection attempt fails).
Returns $sph.
SetLimits
$sph->SetLimits($offset, $limit);
$sph->SetLimits($offset, $limit, $max);
Set match offset/limits, and optionally the max number of matches to return.
Returns $sph.
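A common use of the offset and limit is paging through results. The helper below is a hypothetical sketch (page_to_limits is not part of Sphinx::Search) that maps a 1-based page number to the pair of values SetLimits expects.

```perl
use strict;
use warnings;

# Hypothetical helper: map a 1-based page number to ($offset, $limit).
sub page_to_limits {
    my ($page, $per_page) = @_;
    $page = 1 if $page < 1;    # treat out-of-range pages as the first page
    return (($page - 1) * $per_page, $per_page);
}

my ($offset, $limit) = page_to_limits(3, 20);
print "offset=$offset limit=$limit\n";

# With a Sphinx::Search object in $sph:
#   $sph->SetLimits($offset, $limit);
```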
SetMaxQueryTime
$sph->SetMaxQueryTime($millisec);
Set maximum query time, in milliseconds, per index.
The value may not be negative; 0 means "do not limit".
Returns $sph.
SetMatchMode
** DEPRECATED **
$sph->SetMatchMode($mode);
Set match mode, which may be one of:
• SPH_MATCH_ALL
Match all words
• SPH_MATCH_ANY
Match any words
• SPH_MATCH_PHRASE
Exact phrase match
• SPH_MATCH_BOOLEAN
Boolean match, using AND (&), OR (|), NOT (!,-) and parenthetic grouping.
• SPH_MATCH_EXTENDED
Extended match, which includes the Boolean syntax plus field, phrase and proximity operators.
Returns $sph.
SetRankingMode
$sph->SetRankingMode(SPH_RANK_BM25, $rank_exp);
Set ranking mode, which may be one of:
• SPH_RANK_PROXIMITY_BM25
Default mode, phrase proximity major factor and BM25 minor one
• SPH_RANK_BM25
Statistical mode, BM25 ranking only (faster but worse quality)
• SPH_RANK_NONE
No ranking, all matches get a weight of 1
• SPH_RANK_WORDCOUNT
Simple word-count weighting, rank is a weighted sum of per-field keyword occurrence counts
• SPH_RANK_MATCHANY
Returns rank as it was computed in SPH_MATCH_ANY mode earlier, and is internally used to emulate
SPH_MATCH_ANY queries.
• SPH_RANK_FIELDMASK
Returns a 32-bit mask with N-th bit corresponding to N-th fulltext field, numbering from 0. The bit
will only be set when the respective field has any keyword occurrences satisfying the query.
• SPH_RANK_SPH04
SPH_RANK_SPH04 is generally based on the default SPH_RANK_PROXIMITY_BM25 ranker, but additionally
boosts the matches when they occur in the very beginning or the very end of a text field.
• SPH_RANK_EXPR
Allows the ranking formula to be specified at run time. It exposes a number of internal text factors
and lets you define how the final weight should be computed from those factors. $rank_exp should be
set to the ranking expression string, e.g. to emulate SPH_RANK_PROXIMITY_BM25, use
"sum(lcs*user_weight)*1000+bm25".
Returns $sph.
SetSortMode
$sph->SetSortMode(SPH_SORT_RELEVANCE);
$sph->SetSortMode($mode, $sortby);
Set sort mode, which may be any of:
SPH_SORT_RELEVANCE - sort by relevance
SPH_SORT_ATTR_DESC, SPH_SORT_ATTR_ASC
Sort by attribute descending/ascending. $sortby specifies the sorting attribute.
SPH_SORT_TIME_SEGMENTS
Sort by time segments (last hour/day/week/month) in descending order, and then by relevance in
descending order. $sortby specifies the time attribute.
SPH_SORT_EXTENDED
Sort by SQL-like syntax. $sortby is the sorting specification.
SPH_SORT_EXPR
Sort by an arithmetic expression. $sortby is the expression.
Returns $sph.
SetWeights
** DEPRECATED **
$sph->SetWeights([ 1, 2, 3, 4]);
This method is deprecated. Use SetFieldWeights instead.
Set per-field (integer) weights. The ordering of the weights corresponds to the ordering of fields as
indexed.
Returns $sph.
SetFieldWeights
$sph->SetFieldWeights(\%weights);
Set per-field (integer) weights by field name. The weights hash provides field name to weight mappings.
Takes precedence over SetWeights.
Unknown names will be silently ignored. Missing fields will be given a weight of 1.
Returns $sph.
SetIndexWeights
$sph->SetIndexWeights(\%weights);
Set per-index (integer) weights. The weights hash is a mapping of index name to integer weight.
Returns $sph.
SetIDRange
$sph->SetIDRange($min, $max);
Set the document ID range. Only those records where the document ID is between $min and $max (including
$min and $max) will be matched.
Returns $sph.
SetFilter
$sph->SetFilter($attr, \@values);
$sph->SetFilter($attr, \@values, $exclude);
Sets the results to be filtered on the given attribute. Only results which have attributes matching the
given values will be returned. (Attribute values must be integers).
This may be called multiple times with different attributes to select on multiple attributes.
If 'exclude' is set, excludes results that match the filter.
Returns $sph.
SetFilterString
$sph->SetFilterString($attr, $value)
$sph->SetFilterString($attr, $value, $exclude)
Adds a new string value filter. Only those documents where the $attr column value matches the string
given in $value will be matched (or rejected, if $exclude is true).
SetFilterRange
$sph->SetFilterRange($attr, $min, $max);
$sph->SetFilterRange($attr, $min, $max, $exclude);
Sets the results to be filtered on a range of values for the given attribute. Only those records where
$attr column value is between $min and $max (including $min and $max) will be returned.
If 'exclude' is set, excludes results that fall within the given range.
Returns $sph.
SetFilterFloatRange
$sph->SetFilterFloatRange($attr, $min, $max, $exclude);
Same as SetFilterRange, but allows floating point values.
Returns $sph.
SetGeoAnchor
$sph->SetGeoAnchor($attrlat, $attrlong, $lat, $long);
Set up the anchor point for geosphere distance calculations in filters and sorting. Distance will be
computed with respect to this point.
$attrlat is the name of latitude attribute
$attrlong is the name of longitude attribute
$lat is anchor point latitude, in radians
$long is anchor point longitude, in radians
Returns $sph.
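Because the anchor point is given in radians, coordinates held in degrees need converting first. The sketch below shows one way to do that in plain Perl. The deg2rad helper, the sample coordinates and the attribute names in the commented call are all assumptions made for illustration.

```perl
use strict;
use warnings;

use constant PI => 4 * atan2(1, 1);

# Convert degrees to radians, the unit SetGeoAnchor expects.
sub deg2rad { my ($deg) = @_; return $deg * PI / 180 }

my ($lat_deg, $long_deg) = (51.5074, -0.1278);    # illustrative coordinates
my ($lat, $long) = (deg2rad($lat_deg), deg2rad($long_deg));
printf "lat=%.6f long=%.6f (radians)\n", $lat, $long;

# With a Sphinx::Search object in $sph and hypothetical attribute names:
#   $sph->SetGeoAnchor('lat_attr', 'long_attr', $lat, $long);
```

The latitude and longitude attributes stored in the index would need to be in radians as well for the computed distances to be meaningful.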
SetGroupBy
$sph->SetGroupBy($attr, $func);
$sph->SetGroupBy($attr, $func, $groupsort);
Sets attribute and function of results grouping.
In grouping mode, all matches are assigned to different groups based on grouping function value. Each
group keeps track of the total match count, and the best match (in this group) according to current
sorting function. The final result set contains one best match per group, with grouping function value
and matches count attached.
$attr is any valid attribute. Use ResetGroupBy to disable grouping.
$func is one of:
• SPH_GROUPBY_DAY
Group by day (assumes timestamp type attribute of form YYYYMMDD)
• SPH_GROUPBY_WEEK
Group by week (assumes timestamp type attribute of form YYYYNNN)
• SPH_GROUPBY_MONTH
Group by month (assumes timestamp type attribute of form YYYYMM)
• SPH_GROUPBY_YEAR
Group by year (assumes timestamp type attribute of form YYYY)
• SPH_GROUPBY_ATTR
Group by attribute value
• SPH_GROUPBY_ATTRPAIR
Group by two attributes, being the given attribute and the attribute that immediately follows it in
the sequence of indexed attributes. The specified attribute may therefore not be the last of the
indexed attributes.
Groups in the set of results can be sorted by any SQL-like sorting clause, including both document
attributes and the following special internal Sphinx attributes:
@id - document ID;
@weight, @rank, @relevance - match weight;
@group - group by function value;
@count - number of matches in group.
The default mode is to sort by groupby value in descending order, i.e. by "@group desc".
In the results set, "total_found" contains the total amount of matching groups over the whole index.
WARNING: grouping is done in fixed memory and thus its results are only approximate; so there might be
more groups reported in total_found than actually present. @count might also be underestimated.
For example, if sorting by relevance and grouping by a "published" attribute with SPH_GROUPBY_DAY
function, then the result set will contain only the most relevant match for each day when there were any
matches published, with day number and per-day match count attached, and sorted by day number in
descending order (i.e. recent days first).
SetGroupDistinct
$sph->SetGroupDistinct($attr);
Set count-distinct attribute for group-by queries
SetRetries
$sph->SetRetries($count, $delay);
Set distributed retries count and delay
SetOverride
** DEPRECATED **
$sph->SetOverride($attrname, $attrtype, $values);
Set attribute values override. There can be only one override per attribute.
$values must be a hash that maps document IDs to attribute values
SetSelect
$sph->SetSelect($select)
Set select list (attributes or expressions). SQL-like syntax.
SetQueryFlag
$sph->SetQueryFlag($flag_name, $flag_value);
SetOuterSelect
$sph->SetOuterSelect($orderby, $offset, $limit)
ResetFilters
$sph->ResetFilters;
Clear all filters.
ResetGroupBy
$sph->ResetGroupBy;
Clear all group-by settings (for multi-queries)
ResetOverrides
Clear all attribute value overrides (for multi-queries)
ResetQueryFlag
Clear all query flags.
ResetOuterSelect
Clear all outer select settings.
Query
$results = $sph->Query($query, $index);
Connect to searchd server and run given search query.
query is query string
index is index name to query, default is "*" which means to query all indexes. Use a space or comma
separated list to search multiple indexes.
Returns undef on failure
Returns hash which has the following keys on success:
matches
Array containing hashes with found documents ( "doc", "weight", "group", "stamp" )
total
Total amount of matches retrieved (up to SPH_MAX_MATCHES, see sphinx.h)
total_found
Total amount of matching documents in index
time
Search time
words
Hash which maps query terms (stemmed!) to ( "docs", "hits" ) hash
Returns the results hash on success, undef on error.
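The keys listed above can be walked as in the following sketch. It uses a hand-built stand-in for the value returned by Query, with made-up numbers, and assumes the result is handled as a hash reference. The summarise helper is hypothetical and only reads the documented keys.

```perl
use strict;
use warnings;

# Hand-built stand-in for the hash returned by Query(); values are made up.
my $results = {
    matches     => [ { doc => 12, weight => 2 }, { doc => 7, weight => 1 } ],
    total       => 2,
    total_found => 2,
    time        => '0.001',
    words       => { sphinx => { docs => 2, hits => 3 } },
};

# Summarise the result set using only the documented keys.
sub summarise {
    my ($res) = @_;
    my @lines = ("$res->{total} of $res->{total_found} matches");
    push @lines, "doc=$_->{doc} weight=$_->{weight}" for @{ $res->{matches} };
    for my $word (sort keys %{ $res->{words} }) {
        my $stat = $res->{words}{$word};
        push @lines, "$word: docs=$stat->{docs} hits=$stat->{hits}";
    }
    return @lines;
}

print "$_\n" for summarise($results);
```

In real code the hash would come from $sph->Query($query, $index), and an undef return should be checked with GetLastError before any keys are read.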
AddQuery
$sph->AddQuery($query, $index);
Add a query to a batch request.
Batch queries enable searchd to perform internal optimizations, if possible; and reduce network
connection overheads in all cases.
For instance, running exactly the same query with different groupby settings will enable searchd to
perform expensive full-text search and ranking operation only once, but compute multiple groupby results
from its output.
Parameters are exactly the same as in Query() call.
Returns corresponding index to the results array returned by RunQueries() call.
RunQueries
$sph->RunQueries
Run batch of queries, as added by AddQuery.
Returns undef on network IO failure.
Returns an array of result sets on success.
Each result set in the returned array is a hash which contains the same keys as the hash returned by
Query, plus:
• error
Errors, if any, for this query.
• warning
Any warnings associated with the query.
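Since each result set carries its own error and warning, a batch is usually checked query by query. The sketch below does this over a hand-built stand-in for the value returned by RunQueries. The data is made up, the batch_problems helper is hypothetical, and the result is assumed to be handled as an array reference.

```perl
use strict;
use warnings;

# Stand-in for the array of result sets from RunQueries(); values are made up.
my $batch = [
    { total => 3, error => '',              warning => ''           },
    { total => 0, error => 'unknown index', warning => ''           },
    { total => 5, error => '',              warning => 'slow query' },
];

# Collect one message per query that reported an error or a warning.
# The position in the array is the index that AddQuery returned.
sub batch_problems {
    my ($sets) = @_;
    my @problems;
    for my $i (0 .. $#$sets) {
        my $set = $sets->[$i];
        push @problems, "query $i failed: $set->{error}"   if $set->{error};
        push @problems, "query $i warned: $set->{warning}" if $set->{warning};
    }
    return @problems;
}

print "$_\n" for batch_problems($batch);
```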
SphinxQL
my $results = $sph->SphinxQL($sphinxql_query);
This is an alternative implementation of the SphinxQL API to the DBI option. Frankly, it was an
experiment, and the DBI driver proved to have much better performance. Whilst this may be useful to some,
in general if you are considering using this method then you should probably look at connecting directly
via DBI instead.
Results are returned in a hash containing an array of 'columns' and 'rows' and possibly a warning count. If
a server-side error occurs, the hash contains the 'error' field. If a communication error occurs, the
return value will be undefined. In either error case, GetLastError will return the error.
BuildExcerpts
$excerpts = $sph->BuildExcerpts($docs, $index, $words, $opts)
Generate document excerpts for the specified documents.
docs
An array reference of strings which represent the document contents
index
A string specifying the index whose settings will be used for stemming, lexing and case folding
words
A string which contains the words to highlight
opts
A hash which contains additional optional highlighting parameters:
before_match - a string to insert before a set of matching words, default is "<b>"
after_match - a string to insert after a set of matching words, default is "</b>"
chunk_separator - a string to insert between excerpts chunks, default is " ... "
limit - max excerpt size in symbols (codepoints), default is 256
limit_passages - Limits the maximum number of passages that can be included into the snippet.
Integer, default is 0 (no limit).
limit_words - Limits the maximum number of keywords that can be included into the snippet. Integer,
default is 0 (no limit).
around - how many words to highlight around each match, default is 5
exact_phrase - whether to highlight exact phrase matches only, default is false
single_passage - whether to extract single best passage only, default is false
use_boundaries
weight_order - Whether to sort the extracted passages in order of relevance (decreasing weight), or
in order of appearance in the document (increasing position). Boolean, default is false.
query_mode - Whether to handle $words as a query in extended syntax, or as a bag of words (default
behavior). For instance, in query mode ("one two" | "three four") will only highlight and include
those occurrences "one two" or "three four" when the two words from each pair are adjacent to each
other. In default mode, any single occurrence of "one", "two", "three", or "four" would be
highlighted. Boolean, default is false.
force_all_words - Ignores the snippet length limit until it includes all the keywords. Boolean,
default is false.
start_passage_id - Specifies the starting value of %PASSAGE_ID% macro (that gets detected and
expanded in before_match, after_match strings). Integer, default is 1.
load_files - Whether to handle $docs as data to extract snippets from (default behavior), or to treat
it as file names, and load data from specified files on the server side. Boolean, default is false.
html_strip_mode - HTML stripping mode setting. Defaults to "index", which means that index settings
will be used. The other values are "none" and "strip", that forcibly skip or apply stripping
regardless of index settings; and "retain", that retains HTML markup and protects it from
highlighting. The "retain" mode can only be used when highlighting full documents and thus requires
that no snippet size limits are set. String, allowed values are "none", "strip", "index", and
"retain".
allow_empty - Allows empty string to be returned as highlighting result when a snippet could not be
generated (no keywords match, or no passages fit the limit). By default, the beginning of original
text would be returned instead of an empty string. Boolean, default is false.
passage_boundary
emit_zones
load_files_scattered
Returns undef on failure.
Returns an array ref of string excerpts on success.
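The arguments described above might be assembled as in this sketch. The documents, the marker strings and the index name in the commented call are illustrative assumptions. Only option names listed above are used, and any option left out keeps its default.

```perl
use strict;
use warnings;

# Documents to excerpt and the words to highlight (illustrative data).
my $docs  = [ 'Sphinx is a full-text search engine.', 'Perl client for searchd.' ];
my $words = 'search engine';

# Options drawn from the list above; unspecified options keep their defaults.
my %opts = (
    before_match    => '<em>',
    after_match     => '</em>',
    chunk_separator => ' ... ',
    limit           => 120,
    around          => 3,
);

# With a Sphinx::Search object in $sph and a hypothetical index name:
#   my $excerpts = $sph->BuildExcerpts($docs, 'my_index', $words, \%opts);
#   print "$_\n" for @{ $excerpts || [] };
print join(', ', sort keys %opts), "\n";
```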
BuildKeywords
$results = $sph->BuildKeywords($query, $index, $hits)
Generate keyword list for a given query.
Returns undef on failure.
Returns an array of hashes on success, where each hash describes a word in the query with the following
keys:
• tokenized
Tokenised term from query
• normalized
Normalised term from query
• docs
Number of docs in which word was found (if $hits is true)
• hits
Number of occurrences of word (if $hits is true)
EscapeString
$escaped = $sph->EscapeString('abcde!@#$%')
Inserts backslash before all non-word characters in the given string.
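The described behaviour can be reproduced with Perl's built-in quotemeta, as in the sketch below. This is a stand-in for illustration and is not claimed to be how EscapeString is implemented internally.

```perl
use strict;
use warnings;

# quotemeta backslash-escapes every ASCII character outside [A-Za-z0-9_].
my $escaped = quotemeta('abcde!@#$%');
print "$escaped\n";    # prints abcde\!\@\#\$\%
```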
UpdateAttributes
$sph->UpdateAttributes($index, \@attrs, \%values);
$sph->UpdateAttributes($index, \@attrs, \%values, $mva);
$sph->UpdateAttributes($index, \@attrs, \%values, $mva, $ignorenonexistent);
Update specified attributes on specified documents
index
Name of the index to be updated
attrs
Array of attribute name strings
values
A hash with key as document id, value as an array of new attribute values
mva
If set, indicates that MVA attributes are being updated
ignorenonexistent
If set, the update will silently ignore any warnings about trying to update a column that does not
exist in the current index schema.
Returns number of actually updated documents (0 or more) on success
Returns undef on failure
Usage example:
$sph->UpdateAttributes("test1", [ qw/group_id/ ], { 1 => [ 456 ] });
Open
$sph->Open()
Opens a persistent connection for subsequent queries.
To reduce the network connection overhead of making Sphinx queries, you can call $sph->Open(), then run
any number of queries, and call $sph->Close() when finished.
Returns 1 on success, 0 on failure.
Close
$sph->Close()
Closes a persistent connection.
Returns 1 on success, 0 on failure.
Status
$status = $sph->Status()
$status = $sph->Status($session)
Queries searchd status, and returns a hash of status variable name and value pairs.
Returns undef on failure.
FlushAttributes