classifier_tester - for legacy tesseract engine.

Author

       The Tesseract OCR engine was written by Ray Smith and his research groups at Hewlett Packard (1985-1995)
       and Google (2006-2018).

                                                   01/19/2025                               CLASSIFIER_TESTER(1)

Copying

       Copyright (C) 2012 Google, Inc. Licensed under the Apache License, Version 2.0

Description

classifier_tester(1) runs Tesseract in a special mode. It takes a list of .tr files and tests a character
       classifier on data as formatted for training, but it doesn’t have to be the same as the training data.

In/Out Arguments

       a list of .tr files

Name

       classifier_tester - for *legacy tesseract* engine.

Options

       -l lang
           (Input) three character language code; default value eng.

       -classifier x
           (Input) One of "pruner", "full".

       -U unicharset
           (Input) The unicharset for the language.

       -F font_properties_file
           (Input) font properties file, each line is of the following form, where each field other than the
           font name is 0 or 1:

               *font_name* *italic* *bold* *fixed_pitch* *serif* *fraktur*

       -X xheights_file
           (Input) x heights file, each line is of the following form, where xheight is calculated as the pixel
           x height of a character drawn at 32pt on 300 dpi. [ That is, if base x height + ascenders +
           descenders = 133, how much is x height? ]

               *font_name* *xheight*

       -output_trainer trainer
           (Output, Optional) Filename for output trainer.

Synopsis

classifier_tester -U unicharset_file -F font_properties_file -X xheights_file -classifier x -lang lang
       [-output_trainer trainer] *.tr

classifier_tester - for legacy tesseract engine.

Contents

Author

Copying

Description

In/Out Arguments

Name

Options

See Also

Synopsis

See Also