
apertium-tagger — part-of-speech tagger and trainer for Apertium

Bugs

       Many... lurking in the dark and waiting for you!

Apertium                                        February 22, 2021                             APERTIUM-TAGGER(1)

Description

       apertium-tagger either trains the Apertium part-of-speech tagger or tags text with it, depending on the
       options it is called with.  It reads from standard input only when the -g (--tagger) option is used.

Files

       These are the kinds of files used with each option:

       dictionary
               Full expanded dictionary file

       corpus  Training text corpus file

       tagger_spec
               Tagger specification file, in XML format

       serialized_tagger
               Tagger data file, built in the training and used while tagging

       tagged_corpus
               Hand-tagged text corpus

       untagged_corpus
               Untagged text corpus: the morphological analysis of the hand-tagged corpus.  The two are used
               jointly with the -s option

       input   Input file, stdin by default

       output  Output file, stdout by default

Models

       -u, --unigram=MODEL
               use unigram algorithm MODEL from <https://coltekin.net/cagri/papers/trmorph-tools.pdf>

       -w, --sliding-window
               use the Light Sliding Window algorithm

       -x, --perceptron
               use the averaged perceptron algorithm

Modes

       -g, --tagger
               Tags input text by means of the Viterbi algorithm.

       -r n, --retrain n
               Retrains the model with n additional Baum-Welch iterations (unsupervised).  This option is
               incompatible with -u (--unigram).

       -s n, --supervised n
               Initializes parameters against a hand-tagged text (supervised) through the maximum likelihood
               estimate method, then performs n iterations of the Baum-Welch training algorithm (unsupervised).
               The corpus argument (CRP) can be omitted only when n = 0.

       -t n, --train n
               Initializes parameters through Kupiec's method (unsupervised), then performs n iterations of  the
               Baum-Welch training algorithm (unsupervised).
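
       To illustrate what tagging "by means of the Viterbi algorithm" involves, here is a minimal textbook
       sketch for a first-order hidden Markov model.  It is not Apertium's implementation: the function name
       viterbi, the two-tag model and every probability below are invented for illustration, and the real
       tagger works from its own serialized model data.

```python
from typing import Dict, List


def viterbi(observations: List[str],
            tags: List[str],
            start_p: Dict[str, float],
            trans_p: Dict[str, Dict[str, float]],
            emit_p: Dict[str, Dict[str, float]]) -> List[str]:
    """Return the most probable tag sequence for the observed words."""
    if not observations:
        return []
    # best[i][t]: probability of the best path that ends in tag t at position i
    best = [{t: start_p[t] * emit_p[t].get(observations[0], 0.0) for t in tags}]
    # back[i][t]: the previous tag on that best path
    back: List[Dict[str, str]] = [{}]
    for i in range(1, len(observations)):
        best.append({})
        back.append({})
        for t in tags:
            prev = max(tags, key=lambda p: best[i - 1][p] * trans_p[p][t])
            best[i][t] = (best[i - 1][prev] * trans_p[prev][t]
                          * emit_p[t].get(observations[i], 0.0))
            back[i][t] = prev
    # Start from the best final tag and follow the back-pointers.
    path = [max(tags, key=lambda t: best[-1][t])]
    for i in range(len(observations) - 1, 0, -1):
        path.append(back[i][path[-1]])
    path.reverse()
    return path


# Toy model (hypothetical numbers, not taken from any Apertium language pair).
TAGS = ["NOUN", "VERB"]
START_P = {"NOUN": 0.6, "VERB": 0.4}
TRANS_P = {"NOUN": {"NOUN": 0.3, "VERB": 0.7},
           "VERB": {"NOUN": 0.8, "VERB": 0.2}}
EMIT_P = {"NOUN": {"fish": 0.5, "swim": 0.1},
          "VERB": {"fish": 0.3, "swim": 0.6}}

print(viterbi(["fish", "swim"], TAGS, START_P, TRANS_P, EMIT_P))
```

       With this toy model the ambiguous word "fish" is resolved as a noun because the noun-to-verb
       transition makes the whole sequence more probable, which is the kind of decision the -g mode makes
       for each ambiguous word.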

Name

       apertium-tagger — part-of-speech tagger and trainer for Apertium

Options

       -d, --debug
               Print error (if any) or debug messages while operating.

       -e, --skip-on-error
               Used with -xs to ignore certain types of errors with the training corpus.

       -f, --first
               When used in conjunction with -g (--tagger), makes the tagger output all lexical forms of each
               word, with the chosen one in first place (after the lemma).

       -m, --mark
               Mark disambiguated words.

       -p, --show-superficial
               Prints the superficial form of the word alongside the lexical form in the output stream.

       -z, --null-flush
               Used in conjunction with -g (--tagger) to flush the output after getting each null character.

       --help  Display a help message.
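
       The -z option is mainly of interest to a program that keeps one tagger process running and needs to
       know where each response ends.  The following sketch shows only the caller's side of that framing.  It
       assumes that each request is terminated by a null character and that the tagger writes a null
       character at the end of the corresponding output; the helper names frame_request and read_response are
       hypothetical, and no tagger process is started here.

```python
import io
from typing import BinaryIO


def frame_request(text: str) -> bytes:
    """Encode one request and terminate it with a null character."""
    return text.encode("utf-8") + b"\0"


def read_response(stream: BinaryIO) -> str:
    """Read bytes up to (and excluding) the next null character or end of stream."""
    chunk = bytearray()
    while True:
        byte = stream.read(1)
        if not byte or byte == b"\0":
            break
        chunk += byte
    return chunk.decode("utf-8")


# Stand-in for the tagger's output pipe: two responses, each null-terminated.
fake_output = io.BytesIO(b"first response\0second response\0")
print(read_response(fake_output))
print(read_response(fake_output))
```

       In real use the stream would be the standard output of a process started with -g and -z, and each
       framed request would be written to its standard input.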

See Also

apertium(1), lt-comp(1), lt-expand(1), lt-proc(1)

Synopsis

       apertium-tagger [options] -g serialized_tagger [input [output]]
       apertium-tagger [options] -r iterations corpus serialized_tagger
       apertium-tagger [options] -s iterations dictionary corpus tagger_spec serialized_tagger tagged_corpus untagged_corpus
       apertium-tagger [options] -s 0 dictionary tagger_spec serialized_tagger tagged_corpus untagged_corpus
       apertium-tagger [options] -s 0 -u model serialized_tagger tagged_corpus
       apertium-tagger [options] -t iterations dictionary corpus tagger_spec serialized_tagger
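
       The tagging and unsupervised-training forms can be assembled from a script.  The sketch below is not
       part of Apertium: the helper names tag_command and train_command and all file names are hypothetical,
       and it only builds the argument list without running anything.  It assumes the iteration count is
       passed as a separate argument after the mode flag; check your installed version if in doubt.

```python
from typing import List, Optional


def tag_command(serialized_tagger: str,
                input_path: Optional[str] = None,
                output_path: Optional[str] = None,
                options: Optional[List[str]] = None) -> List[str]:
    """Build argv for: apertium-tagger [options] -g serialized_tagger [input [output]]"""
    if output_path is not None and input_path is None:
        raise ValueError("an output file can only follow an input file")
    argv = ["apertium-tagger", *(options or []), "-g", serialized_tagger]
    if input_path is not None:
        argv.append(input_path)
    if output_path is not None:
        argv.append(output_path)
    return argv


def train_command(iterations: int,
                  dictionary: str,
                  corpus: str,
                  tagger_spec: str,
                  serialized_tagger: str,
                  options: Optional[List[str]] = None) -> List[str]:
    """Build argv for: apertium-tagger [options] -t iterations dictionary corpus
    tagger_spec serialized_tagger"""
    if iterations < 0:
        raise ValueError("iterations must not be negative")
    return ["apertium-tagger", *(options or []), "-t", str(iterations),
            dictionary, corpus, tagger_spec, serialized_tagger]


# Hypothetical file names, for illustration only.
print(tag_command("lang.prob", options=["-f"]))
print(train_command(8, "lang.dic", "lang.crp", "lang.tsx", "lang.prob"))
```

       A list built this way can be handed to subprocess.run; when no input file is given, the text to tag is
       supplied on standard input.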
