chasen ‐ Japanese Morphological Analysis System

Author

       This manual page was written by Takao KAWAMURA <kawamura@debian.or.jp>  and  modified  by  Hideki  Yamane
       <henrich@debian.or.jp> and Osamu Aoki <osamu@debian.org> for the Debian GNU/Linux system (but may be used
       by others).

       CHASEN(1)

Description

       chasen is a morphological analysis system. It segments and tokenizes Japanese text strings and can
       output additional information for each morpheme (pronunciation, semantic information, and others).

       It will print the result of such an operation to the standard output, so that it can be either written to
       a file or further processed.
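
       Because the result goes to standard output, it can be redirected or piped like the output of any
       other command. The following is a minimal sketch, not taken from the ChaSen documentation. It
       assumes that each output line describes one morpheme with tab-separated fields and that a line
       reading EOS ends each sentence. The helper name surface_forms and the file names input.txt and
       result.txt are placeholders.

```shell
# Assumption: one morpheme per line, tab-separated fields, and a line
# reading "EOS" at the end of each sentence.
# surface_forms is a hypothetical helper, not part of ChaSen: it keeps
# only the first field of every morpheme line.
surface_forms() {
    awk -F '\t' '$0 != "EOS" && NF > 0 { print $1 }'
}

# Runs only when chasen is installed and the placeholder input file exists.
if command -v chasen >/dev/null 2>&1 && [ -f input.txt ]; then
    chasen input.txt > result.txt        # write the analysis to a file
    chasen input.txt | surface_forms     # or process it further in a pipeline
fi
```

       The filter only reads standard input, so it can be tried on any tab-separated text before
       running it on real chasen output.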

Name

       chasen ‐ Japanese Morphological Analysis System

Options

       -s     Use partial sentence mode for analysis.

       -j     Use Japanese sentence mode for analysis.  KUTEN (and other punctuation marks) and empty lines
              are treated as sentence delimiters.

       -C     Use the command mode for analysis.

       -b     Show the best path. (default)

       -m     Show all morphemes where ambiguity is identified in the best path.

       -p     Show all paths expanding for all combinations of the ambiguity.

       -f     Show formatted morpheme data in columns. (default)

       -e     Show entire morpheme data.

       -c     Show coded morpheme data.

       -d     Show detailed morpheme data for use by Prolog.

       -v     Show detailed morpheme data for use by VisualMorphs.

       -O[c|s]
              Show morphemes as compound words (c) or as their segments (s).

       -Fformat
              Show morphemes using the given format string, for example "%m\t%y\t%M\t%U(%P-)\t%T \t%F \n".

       -Fh    Print help information for -F option.

       -ilang
              Specify the character encoding of the input file, where lang is one of: e: EUC-JP,
              s: Shift JIS, w: UTF-8, u: UTF-8, a: ISO-8859-1.

       -ofile
              Write the output to file.

       -wwidth
              Specify the cost width.

       -rrcfile
              Use rcfile as the chasenrc file.

       -R     Use the system default chasenrc file (/etc/chasenrc).

       -Llang
              Specify language.

       -lp    Print the list of parts of speech (hinshi).

       -lt    Print the list of conjugation types.

       -lf    Print the list of conjugation forms.

       -h     Print help.

       -V     Print ChaSen version number.
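
       The -F option makes the output easier to post-process, because the format string decides which
       fields appear and how they are separated. The following is a minimal sketch under stated
       assumptions: %m is taken to be the surface form and %M the base form (chasen -Fh prints the
       authoritative list), chasen is assumed to expand \t and \n in the format string itself, and an
       EOS line is assumed to end each sentence. The names base_form_counts and input.txt are
       hypothetical.

```shell
# Assumption: input lines look like "surface<TAB>base", as produced by a
# two-field format such as '%m\t%M\n', plus an "EOS" line per sentence.
# base_form_counts is a hypothetical helper: it counts how often each
# base form occurs and prints "count<TAB>base", most frequent first.
base_form_counts() {
    tab=$(printf '\t')
    awk -F '\t' '$0 != "EOS" && NF >= 2 { n[$2]++ }
                 END { for (b in n) print n[b] "\t" b }' |
        sort -t "$tab" -k1,1nr -k2,2
}

# Runs only when chasen is installed and the placeholder input file exists.
if command -v chasen >/dev/null 2>&1 && [ -f input.txt ]; then
    chasen -F '%m\t%M\n' input.txt | base_form_counts
fi
```

       Choosing a tab as the only separator keeps the awk side simple. If a field can itself contain
       a tab, a different separator would be needed.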

       This manual page was written for the Debian GNU/Linux distribution because the original program does  not
       have a manual page.

See Also

       The program is documented fully in /usr/share/doc/chasen/manual-j.tex or /usr/share/doc/chasen/manual-j.pdf.

Synopsis

       chasen [options] file
