classifier_tester - for *legacy tesseract* engine.
Copying
Copyright (C) 2012 Google, Inc. Licensed under the Apache License, Version 2.0
Description
classifier_tester(1) runs Tesseract in a special mode. It takes a list of .tr files and tests a character
classifier on them. The data must be formatted as for training, but it does not have to be the training data itself.
In/Out Arguments
a list of .tr files
Name
classifier_tester - for *legacy tesseract* engine.
Options
-lang lang
(Input) Three-character language code; default value eng.
-classifier x
(Input) The classifier to test; one of "pruner" or "full".
-U unicharset
(Input) The unicharset for the language.
-F font_properties_file
(Input) Font properties file. Each line has the following form, where every field other than the
font name is 0 or 1:
*font_name* *italic* *bold* *fixed_pitch* *serif* *fraktur*
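The line format above can be checked before a run with a short script. The following is a minimal sketch, not part of Tesseract: the FontProperties type, the parse_font_properties_line helper, and the font name timesitalic are all hypothetical and exist only for illustration.

```python
from typing import NamedTuple


class FontProperties(NamedTuple):
    """One entry of a font properties file (hypothetical helper type)."""
    name: str
    italic: bool
    bold: bool
    fixed_pitch: bool
    serif: bool
    fraktur: bool


def parse_font_properties_line(line: str) -> FontProperties:
    """Parse one 'font_name italic bold fixed_pitch serif fraktur' line."""
    # Split from the right so that only the last five fields are read as flags.
    parts = line.strip().rsplit(None, 5)
    if len(parts) != 6:
        raise ValueError(f"expected a font name and five flags: {line!r}")
    name, *flags = parts
    if any(flag not in ("0", "1") for flag in flags):
        raise ValueError(f"every flag must be 0 or 1: {line!r}")
    return FontProperties(name, *(flag == "1" for flag in flags))


# Hypothetical entry: an italic serif font.
print(parse_font_properties_line("timesitalic 1 0 0 1 0"))
```

Rejecting malformed lines early is a design choice of this sketch; it says nothing about how classifier_tester itself reacts to a bad line.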
-X xheights_file
(Input) x-heights file. Each line has the following form, where xheight is the x-height in pixels
of a character drawn at 32 pt at 300 dpi. [ That is, if base x-height + ascenders +
descenders = 133 pixels, how many of those pixels are the x-height? ]
*font_name* *xheight*
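The arithmetic behind the 133 figure can be sketched as follows. This assumes that 133 is the full 32 pt height at 300 dpi (32 * 300 / 72 is about 133.3 pixels) and that the x-height of a font is known as a fraction of that height. The fraction 0.5 and the font name examplefont are hypothetical; real values come from the font's own metrics.

```python
POINTS_PER_INCH = 72  # one typographic point is 1/72 inch


def em_height_pixels(point_size: float, dpi: float) -> float:
    """Pixel height of the full font size at the given resolution."""
    return point_size * dpi / POINTS_PER_INCH


def xheight_pixels(xheight_fraction: float,
                   point_size: float = 32, dpi: float = 300) -> int:
    """x-height in whole pixels, given x-height as a fraction of the font size."""
    return round(em_height_pixels(point_size, dpi) * xheight_fraction)


# Hypothetical font whose x-height is half of the font size.
name = "examplefont"
print(f"{name} {xheight_pixels(0.5)}")  # one line in the x-heights file format
```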
-output_trainer trainer
(Output, Optional) Filename for output trainer.
See Also
tesseract(1)
Synopsis
classifier_tester -U unicharset_file -F font_properties_file -X xheights_file -classifier x -lang lang
[-output_trainer trainer] *.tr
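As an illustration of the synopsis, the sketch below assembles the argument list in Python. The build_command helper and every file name in it are hypothetical; only the flags are taken from the synopsis above.

```python
import shlex


def build_command(unicharset, font_properties, xheights, classifier,
                  tr_files, lang="eng", output_trainer=None):
    """Assemble a classifier_tester argument list following the synopsis."""
    cmd = [
        "classifier_tester",
        "-U", unicharset,
        "-F", font_properties,
        "-X", xheights,
        "-classifier", classifier,
        "-lang", lang,
    ]
    if output_trainer is not None:
        # The output trainer is optional, so the flag is added only on request.
        cmd += ["-output_trainer", output_trainer]
    cmd += list(tr_files)  # the .tr files come last
    return cmd


# Hypothetical file names, for illustration only.
cmd = build_command("unicharset", "font_properties", "xheights", "pruner",
                    ["eng.examplefont.exp0.tr"])
print(shlex.join(cmd))
```

The resulting list can be passed to subprocess.run(cmd, check=True) on a system where classifier_tester is installed.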
