
lt-trim — compiled dictionary trimmer for Apertium

Author

       Copyright © 2005, 2006 Universitat d'Alacant / Universidad de Alicante. This is free software. You may
       redistribute copies of it under the terms of the GNU General Public License:
       https://www.gnu.org/licenses/gpl.html.

Bugs

       Many... lurking in the dark and waiting for you!

Apertium                                        February 7, 2014                                      LT-TRIM(1)

Description

       lt-trim is the application responsible for trimming compiled dictionaries. The analyses (right-side when
       compiling lr) of analyser_binary are trimmed to the input side of bidix_binary (left-side when compiling
       lr, right-side when compiling rl), such that only analyses which would pass through
       ‘lt-proc(1) -b bidix_binary’ are kept.
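
       The effect of trimming can be pictured with a toy model. This is only a sketch: the real lt-trim works
       on compiled finite-state transducers, and the rule used here (an analysis is kept when some bidix
       input-side entry is a prefix of it) is an assumption made for illustration. The function name trim and
       the sample entries are hypothetical.

```python
# Toy model only: real lt-trim operates on compiled transducers, not lists.
# Assumption: an analysis survives when a bidix input-side entry is a prefix of it.

def trim(analyser, bidix_inputs):
    """Keep only (surface, analysis) pairs whose analysis the toy bidix accepts."""
    return [
        (surface, analysis)
        for surface, analysis in analyser
        if any(analysis.startswith(entry) for entry in bidix_inputs)
    ]

# Hypothetical analyser content: surface form paired with its analysis.
analyser = [
    ("houses", "house<n><pl>"),
    ("cats", "cat<n><pl>"),
    ("xylophones", "xylophone<n><pl>"),
]

# Hypothetical input side of the bidix; "xylophone" has no translation.
bidix_inputs = {"house<n>", "cat<n>"}

trimmed = trim(analyser, bidix_inputs)
print(trimmed)
```

       In this model “xylophones” loses its analysis because the bidix has no entry for it, which is the
       situation trimming is meant to catch before translation.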

       Compound tags (“<compound-only-L>”, “<compound-R>”), join elements (“<j/>” in XML, “+” in the stream)
       and the group element (“<g/>” in XML, “#” in the stream) should all be handled correctly; even
       combinations of “+” followed by “#” in the monodix are handled.

       Some  minor  caveats:  If you have the capitalised lemma “Foo” in the monodix, but “foo” in the bidix, an
       analysis “^Foo<tag>$” would pass through bidix when doing lt-proc(1) -b, but will  not  make  it  through
       trimming.   Make  sure your lemmas have the same capitalisation in the different dictionaries.  Also, you
       should not have literal ‘+’ or ‘#’ in your lemmas.  Since lt-comp(1) doesn't escape these, lt-trim cannot
       know that they are different from “<j/>” or “<g/>”, and you may get @-marked output this  way.   You  can
       analyse ‘+’ or ‘#’ by having the literal symbol in the “<l>” part and some other string (e.g., “plus”) in
       the “<r>”.
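
       Both caveats can be checked before trimming. The sketch below is not part of lt-trim; it assumes that
       monodix entries carry their lemma in an lm attribute on the <e> element, and that the lemma lists for
       the capitalisation check have already been extracted by some other means. The function names and the
       sample dictionary are hypothetical.

```python
import xml.etree.ElementTree as ET

def risky_lemmas(monodix_xml):
    """Return lm attribute values that contain a literal '+' or '#'."""
    root = ET.fromstring(monodix_xml)
    return [
        e.get("lm")
        for e in root.iter("e")
        if e.get("lm") and any(ch in e.get("lm") for ch in "+#")
    ]

def case_mismatches(monodix_lemmas, bidix_lemmas):
    """Return monodix lemmas that match a bidix lemma only when case is ignored."""
    bidix_exact = set(bidix_lemmas)
    bidix_folded = {lemma.casefold() for lemma in bidix_lemmas}
    return [
        lemma
        for lemma in monodix_lemmas
        if lemma not in bidix_exact and lemma.casefold() in bidix_folded
    ]

# Hypothetical monodix fragment (assumes the lm attribute convention).
sample = """<dictionary><section id="main" type="standard">
<e lm="Foo"><i>Foo</i><par n="x__n"/></e>
<e lm="C#"><i>C#</i><par n="x__n"/></e>
<e lm="bar"><i>bar</i><par n="x__n"/></e>
</section></dictionary>"""

print(risky_lemmas(sample))                             # ['C#']
print(case_mismatches(["Foo", "bar"], ["foo", "bar"]))  # ['Foo']
```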

       You  should  not  trim a generator unless you have a very simple translator pipeline, since the output of
       bidix seldom goes unchanged through transfer.

Files

       analyser_binary
               The untrimmed analyser dictionary (a finite state transducer).

       bidix_binary
               The dictionary to use as trimmer (a finite state transducer).

       trimmed_analyser_binary
               The trimmed analyser dictionary (a finite state transducer).

Name

       lt-trim — compiled dictionary trimmer for Apertium

Options

       -s, --match-section
               A section with this name (id@type) in the analyser will only be trimmed against a section with
               the same id in the bidix. (The default is to trim all sections of the analyser against all
               sections of the bidix.) Using this option can sometimes speed up trimming considerably. For
               example, if you have some complicated regular expressions, try putting them in a

                 <section id="regex" type="standard">

               in both .dix files and passing “regex@standard” to --match-section.

               This argument may be used multiple times to specify multiple sections that must match by name.
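
               When lt-trim is driven from a build script, the repeated option can be assembled
               programmatically. The following is a minimal sketch: the helper lt_trim_command and the file
               names are hypothetical, and it assumes the options are given before the three file arguments.

```python
def lt_trim_command(analyser, bidix, output, match_sections=()):
    """Build an lt-trim argument list, repeating --match-section once per section."""
    cmd = ["lt-trim"]
    for section in match_sections:
        cmd += ["--match-section", section]
    cmd += [analyser, bidix, output]
    return cmd

# Hypothetical file names; "regex@standard" is the section from the example above.
cmd = lt_trim_command(
    "analyser.bin", "bidix.bin", "trimmed.bin",
    match_sections=["regex@standard"],
)
print(" ".join(cmd))
# prints: lt-trim --match-section regex@standard analyser.bin bidix.bin trimmed.bin

# To run it (requires lt-trim on PATH and existing input files):
#   import subprocess
#   subprocess.run(cmd, check=True)
```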

See Also

apertium(1), apertium-tagger(1), lt-comp(1), lt-expand(1), lt-print(1), lt-proc(1)

Synopsis

       lt-trim analyser_binary bidix_binary trimmed_analyser_binary
