sylseg-sk - segments Slovak words into syllables

Author

       Jozef Ivanecky (dodo (at) kanoistika.sk)

Description

Syllabic segmentation is essential for some linguistic and speech recognition applications. Depending
on the language, either a rule-based or a statistical approach is used. For Slovak, the statistical
approach seems to be more suitable.

sylseg-sk implements one of the statistical approaches to syllabic segmentation. Each input word is
segmented into syllables. Several possible segmentations are generated and sorted by likelihood.
If no input file is specified, input is read from standard input. If an input file is given, the
output is also written to a file named after the input file with the extension ".syllables".

The input and output code page is ISO 8859-2. To use a different code page, combine sylseg-sk with a
code page converter and pipes. For example, to have input and output in UTF-8, use (for interactive use):

       filterm UTF8-iso2 iso2-UTF8 sylseg-sk

or (for batch processing):

       iconv -f UTF-8 -t ISO_8859-2 | sylseg-sk | iconv -f ISO_8859-2 -t UTF-8
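
For repeated batch use, the UTF-8 conversion around sylseg-sk can be wrapped in a small shell function.
The following is only a sketch: the function name with_latin2 is hypothetical and not part of sylseg-sk,
and it assumes that iconv is installed and that the wrapped command reads standard input and writes
standard output.

```shell
# Hypothetical helper (not shipped with sylseg-sk): run a command that
# expects ISO 8859-2 on data that arrives and leaves as UTF-8.
# Usage: with_latin2 COMMAND [ARGS...]
with_latin2() {
    # Convert UTF-8 input to ISO 8859-2, run the command given as
    # arguments, then convert its output back to UTF-8.
    iconv -f UTF-8 -t ISO_8859-2 | "$@" | iconv -f ISO_8859-2 -t UTF-8
}
```

With this function defined, a UTF-8 word list could be processed with, for example,
"with_latin2 sylseg-sk --best < words.txt", where words.txt is a placeholder file name.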

The quality of the syllabic segmentation depends on the statistics used. To improve it, you can train
better statistics with the sylseg-sk-training tool and replace the original statistics file,
/usr/share/sylseg_sk/sylseg-sk.stats, with the result.

The design of sylseg-sk is language independent. With retrained statistics it should, in theory,
work for any language.

Examples

       Use standard input and debug level 3:
              sylseg-sk --dl 3

       Process file aaa.txt and print just the best segmentation:
              sylseg-sk --best aaa.txt

Exit Status

       sylseg-sk returns zero if it successfully processes all the input words.
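
       A script can use this exit status to detect files that were not fully processed. The sketch below
       is illustrative only: the function segment_files and the variable SEGMENTER are hypothetical names,
       and the code assumes only that a nonzero status signals a failure.

```shell
# Hypothetical helper: run the segmenter on each file and report failures.
# SEGMENTER may override the command; it defaults to "sylseg-sk --best".
# Note: "failed" is a plain (global) shell variable, kept for POSIX sh.
segment_files() {
    failed=0
    for f in "$@"; do
        # The command string is left unquoted on purpose so that it is
        # split into the program name and its options.
        if ! ${SEGMENTER:-sylseg-sk --best} "$f"; then
            echo "segmentation failed for $f" >&2
            failed=$((failed + 1))
        fi
    done
    # Succeed only if every file was processed without error.
    [ "$failed" -eq 0 ]
}
```

       For example, "segment_files aaa.txt bbb.txt" would return zero only if both files were processed
       without error; bbb.txt is a placeholder file name.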

Name

       sylseg-sk - segments Slovak words into syllables

Options

       --best Print the best result only.

       --color
              Enable color output.

       --dl 1..5
              Set the debug level, which controls the amount of displayed information. Debug level 0 displays nothing.
              The maximum level 5 displays a full debugging report. The default debug level is 1.

       --help Display a short help text.

       --ofile <file_name>
              Also write the output to the given file.

Synopsis

sylseg-sk [--best] [--color] [--dl <debug_level>] [--help] [--ofile <file_name>] [<input_file>]