
KinoSearch1::Analysis::PolyAnalyzer - multiple analyzers in series

Constructor

new()
           my $analyzer = KinoSearch1::Analysis::PolyAnalyzer->new(
               language   => 'en',
           );

       Construct a PolyAnalyzer object.  If the parameter "analyzers" is specified, it will override "language"
       and no attempt will be made to generate a default set of Analyzers.

       •   language - Must be an ISO code from the list of supported languages.

       •   analyzers - Must be an arrayref.  Each element in the array must inherit from
           KinoSearch1::Analysis::Analyzer.  The order of the analyzers matters.  Don't put a Stemmer before a
           Tokenizer (can't stem whole documents or paragraphs -- just individual words), or a Stopalizer after
           a Stemmer (stemmed words, e.g. "themselv", will not appear in a stoplist).  In general, the sequence
           should be: normalize, tokenize, stopalize, stem.
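
       The ordering rule can be illustrated with a toy pipeline.  This is only a sketch: the four subs below are
       hypothetical stand-ins written for this example, not KinoSearch1 classes, and the "stemmer" is a crude
       suffix stripper rather than a real stemming algorithm.  It shows why a stopword that has already been
       stemmed slips past the stoplist.

```perl
use strict;
use warnings;

# Toy stoplist; a real Stopalizer would use a per-language list.
my %stoplist = map { $_ => 1 } qw( the of themselves );

# Hypothetical stand-ins for the four kinds of analyzer.
sub normalize { return lc $_[0] }                 # text in, text out
sub tokenize  { return [ $_[0] =~ /(\w+)/g ] }    # text in, arrayref of words out
sub stopalize { return [ grep { !$stoplist{$_} } @{ $_[0] } ] }
sub stem {
    # Crude suffix stripping, for illustration only.
    return [ map { my $w = $_; $w =~ s/(?:es|s)\z//; $w } @{ $_[0] } ];
}

# Feed the output of each stage into the next one.
sub run_pipeline {
    my ( $text, @stages ) = @_;
    my $data = $text;
    $data = $_->($data) for @stages;
    return $data;
}

my $text = 'The Dogs Of War Themselves';

# Recommended order: normalize, tokenize, stopalize, stem.
my $good = run_pipeline( $text, \&normalize, \&tokenize, \&stopalize, \&stem );

# Stopalizer placed after the stemmer: "themselv" is not in the stoplist.
my $bad = run_pipeline( $text, \&normalize, \&tokenize, \&stem, \&stopalize );

print "@$good\n";    # dog war
print "@$bad\n";     # dog war themselv
```

       In the second run "themselves" has become "themselv" by the time the stoplist is consulted, so it
       survives filtering.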

Description

       A PolyAnalyzer is a series of Analyzers -- objects which inherit from KinoSearch1::Analysis::Analyzer --
       each of which will be called upon to "analyze" text in turn.  You can either provide the Analyzers
       yourself, or you can specify a supported language, in which case a PolyAnalyzer consisting of an
       LCNormalizer, a Tokenizer, and a Stemmer will be generated for you.
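
       As a sketch of what that default chain might look like when written out by hand, the sub below passes an
       explicit "analyzers" list.  The name build_explicit_analyzer is hypothetical, and the constructor
       arguments used for LCNormalizer, Tokenizer, and Stemmer are assumptions that should be checked against
       the documentation of those classes.

```perl
use strict;
use warnings;

# Hypothetical helper: build by hand the chain described above
# (LCNormalizer, then Tokenizer, then Stemmer).
sub build_explicit_analyzer {
    my ($language) = @_;

    # Loaded lazily so this file still compiles where KinoSearch1 is absent.
    require KinoSearch1::Analysis::PolyAnalyzer;
    require KinoSearch1::Analysis::LCNormalizer;
    require KinoSearch1::Analysis::Tokenizer;
    require KinoSearch1::Analysis::Stemmer;

    return KinoSearch1::Analysis::PolyAnalyzer->new(
        analyzers => [
            # Assumed: both constructors work with no arguments.
            KinoSearch1::Analysis::LCNormalizer->new,
            KinoSearch1::Analysis::Tokenizer->new,
            # Assumed: Stemmer takes the same ISO code as PolyAnalyzer.
            KinoSearch1::Analysis::Stemmer->new( language => $language ),
        ],
    );
}
```

       Writing the list out this way is mainly useful as a starting point for swapping in a custom Tokenizer or
       adding a Stopalizer between the Tokenizer and the Stemmer.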

       Supported languages:

           en => English,
           da => Danish,
           de => German,
           es => Spanish,
           fi => Finnish,
           fr => French,
           it => Italian,
           nl => Dutch,
           no => Norwegian,
           pt => Portuguese,
           ru => Russian,
           sv => Swedish,
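
       Since "language" must be one of the codes above, a caller may want to validate user input before
       constructing the analyzer.  A minimal sketch follows; check_language and %supported_language are
       hypothetical names that are not part of KinoSearch1, and lowercasing the input is a convenience of this
       helper only.

```perl
use strict;
use warnings;

# The ISO codes listed above, copied into a lookup table.
my %supported_language = (
    en => 'English',    da => 'Danish',  de => 'German',
    es => 'Spanish',    fi => 'Finnish', fr => 'French',
    it => 'Italian',    nl => 'Dutch',   no => 'Norwegian',
    pt => 'Portuguese', ru => 'Russian', sv => 'Swedish',
);

# Hypothetical helper: return the lowercased code, or die if unsupported.
sub check_language {
    my ($code) = @_;
    my $lc = lc( defined $code ? $code : '' );
    die "Unsupported language code: '$lc'\n"
        unless exists $supported_language{$lc};
    return $lc;
}

print check_language('EN'), "\n";    # en
```

       The returned value can then be passed as the "language" parameter to new().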

License, Disclaimer, Bugs, Etc.

       See KinoSearch1 version 1.01.

perl v5.40.0                                       2024-10-20              KinoSearch1::A...s::PolyAnalyzer(3pm)

Name

       KinoSearch1::Analysis::PolyAnalyzer - multiple analyzers in series

Synopsis

           my $analyzer = KinoSearch1::Analysis::PolyAnalyzer->new(
               language  => 'es',
           );

           # or...
           my $analyzer = KinoSearch1::Analysis::PolyAnalyzer->new(
               analyzers => [
                   $lc_normalizer,
                   $custom_tokenizer,
                   $snowball_stemmer,
               ],
           );

See Also