
Str - Regular expressions and high-level string processing

Documentation

       Module Str
        : sig end

       Regular expressions and high-level string processing

   Regular expressions
       The  Str  library  provides  regular expressions on sequences of bytes.  It is, in general, unsuitable to
       match Unicode characters.

       type regexp

       The type of compiled regular expressions.

       val regexp : string -> regexp

       Compile a regular expression. The following constructs are recognized:

       - .  Matches any character except newline.

       - * (postfix) Matches the preceding expression zero, one or several times

       - + (postfix) Matches the preceding expression one or several times

       - ?  (postfix) Matches the preceding expression once or not at all

       - [..]  Character set. Ranges are denoted with - , as in  [a-z]  .   An  initial  ^  ,  as  in  [^0-9]  ,
       complements  the  set.   To  include  a  ] character in a set, make it the first character of the set. To
       include a - character in a set, make it the first or the last character of the set.

       - ^ Matches at beginning of line: either at the beginning of the matched string, or  just  after  a  '\n'
       character.

       - $ Matches at end of line: either at the end of the matched string, or just before a '\n' character.

       - \| (infix) Alternative between two expressions.

       - \(..\) Grouping and naming of the enclosed expression.

       - \1 The text matched by the first \(...\) expression (\2 for the second expression, and so on up to
       \9).

       - \b Matches word boundaries.

       - \ Quotes special characters.  The special characters are $^\.*+?[] .
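
       A few of the constructs listed above, as a minimal sketch. The names color, doubled and line_start
       are illustrative only, and the program is assumed to be linked against the str library.

```ocaml
(* Link the str library, e.g. ocamlfind ocamlopt -package str -linkpkg *)

(* Alternation inside a group: matches "red", "green" or "blue". *)
let color = Str.regexp {|\(red\|green\|blue\)|}
let starts_with_color = Str.string_match color "green apple" 0

(* Back-reference: \1 must repeat the character captured by the group. *)
let doubled = Str.regexp {|\(.\)\1|}
let doubled_at = Str.search_forward doubled "hello" 0

(* ^ also matches just after a newline, so this finds the second line. *)
let line_start = Str.search_forward (Str.regexp "^b") "a\nb" 0
```

       Here doubled_at is the position of "ll" in "hello", and line_start is the position just after the
       newline.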

       In regular expressions you will often use backslash characters;  it's  easier  to  use  a  quoted  string
       literal {|...|} to avoid having to escape backslashes.

       For example, the following expression:
       let r = Str.regexp {|hello \([A-Za-z]+\)|} in
       Str.replace_first r {|\1|} "hello world"
       returns the string "world" .

       If  you  want  a  regular  expression  that matches a literal backslash character, you need to double it:
       Str.regexp {|\\|} .

       You can use regular string literals "..."  too, however you will have to escape backslashes. The  example
       above can be rewritten with a regular string literal as:
       let r = Str.regexp "hello \\([A-Za-z]+\\)" in
       Str.replace_first r "\\1" "hello world"

       And the regular expression for matching a backslash becomes a quadruple backslash: Str.regexp "\\\\" .
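
       The two literal styles denote the same pattern string, which the following sketch checks. The
       variable names are illustrative.

```ocaml
(* Link the str library, e.g. ocamlfind ocamlopt -package str -linkpkg *)

(* The same pattern written as a quoted literal and as an escaped literal. *)
let quoted = {|hello \([A-Za-z]+\)|}
let escaped = "hello \\([A-Za-z]+\\)"

(* Replace the first match with the text captured by group 1. *)
let result = Str.replace_first (Str.regexp quoted) {|\1|} "hello world"

(* A regexp for one literal backslash, tested on a one-character string. *)
let backslash = Str.regexp {|\\|}
let found = Str.string_match backslash "\\" 0
```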

       val regexp_case_fold : string -> regexp

       Same  as  regexp  ,  but the compiled expression will match text in a case-insensitive way: uppercase and
       lowercase letters will be considered equivalent.
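
       A small comparison of the two compilers on the same pattern. This is a sketch with illustrative
       names.

```ocaml
(* Link the str library, e.g. ocamlfind ocamlopt -package str -linkpkg *)

let strict = Str.regexp "hello"
let relaxed = Str.regexp_case_fold "hello"

(* Only the case-folding regexp accepts mixed-case input. *)
let strict_hit = Str.string_match strict "HeLLo" 0
let relaxed_hit = Str.string_match relaxed "HeLLo" 0
```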

       val quote : string -> string

       Str.quote s returns a regexp string that matches exactly s and nothing else.

       val regexp_string : string -> regexp

       Str.regexp_string s returns a regular expression that matches exactly s and nothing else.

       val regexp_string_case_fold : string -> regexp

       Str.regexp_string_case_fold is similar to Str.regexp_string , but the regexp matches in a
       case-insensitive way.
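
       The sketch below contrasts compiling a string as a pattern with matching it literally. The sample
       string "a.b*c" and the variable names are assumptions made for illustration.

```ocaml
(* Link the str library, e.g. ocamlfind ocamlopt -package str -linkpkg *)

let literal = "a.b*c"

(* Compiled as a pattern, '.' and '*' are operators. *)
let as_pattern = Str.string_match (Str.regexp literal) "aXbbbc" 0

(* After quoting, only the exact text matches. *)
let quoted = Str.regexp (Str.quote literal)
let quoted_rejects = not (Str.string_match quoted "aXbbbc" 0)
let quoted_accepts = Str.string_match quoted literal 0

(* regexp_string does the quoting and the compilation in one step. *)
let direct = Str.string_match (Str.regexp_string literal) literal 0
let folded = Str.string_match (Str.regexp_string_case_fold "abc") "ABC" 0
```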

   String matching and searching

       val string_match : regexp -> string -> int -> bool

       string_match r s start tests whether a substring of s that starts at position start matches the regular
       expression r . The first character of a string has position 0 , as usual.

       val search_forward : regexp -> string -> int -> int

       search_forward r s start searches the string s for a substring matching the regular expression r . The
       search starts at position start and proceeds towards the end of the string. Return the position of the
       first character of the matched substring.

       Raises Not_found if no substring matches.

       val search_backward : regexp -> string -> int -> int

       search_backward r s last searches the string s for a substring matching the regular expression r . The
       search first considers substrings that start at position last and proceeds towards the beginning of
       the string. Return the position of the first character of the matched substring.

       Raises Not_found if no substring matches.
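
       A minimal sketch of the three functions on one sample string. The names are illustrative. Note
       that search_backward reports the first candidate start position, counting down from last, at which a
       match begins, so it does not extend the match to the left.

```ocaml
(* Link the str library, e.g. ocamlfind ocamlopt -package str -linkpkg *)

let digits = Str.regexp "[0-9]+"
let text = "ab 12 cd 345"

(* string_match only tests at the given position. *)
let at_start = Str.string_match digits text 0   (* 'a' is not a digit *)
let at_three = Str.string_match digits text 3   (* "12" starts here *)

(* search_forward scans to the right from the start position. *)
let first = Str.search_forward digits text 0

(* search_backward tries start positions last, last - 1, ... in turn. *)
let last = Str.search_backward digits text (String.length text - 1)

(* Both search functions raise Not_found when nothing matches. *)
let missing =
  match Str.search_forward digits "no digits" 0 with
  | _ -> false
  | exception Not_found -> true
```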

       val string_partial_match : regexp -> string -> int -> bool

       Similar to Str.string_match , but also returns true if the argument string is a prefix of a  string  that
       matches.  This includes the case of a true complete match.
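
       Partial matching is useful for validating input while it is still being typed. The sketch below
       assumes a hypothetical year-month pattern.

```ocaml
(* Link the str library, e.g. ocamlfind ocamlopt -package str -linkpkg *)

(* Hypothetical pattern: four digits, a dash, two digits. *)
let year_month = Str.regexp "[0-9][0-9][0-9][0-9]-[0-9][0-9]"

(* A complete match is also a partial match. *)
let complete = Str.string_partial_match year_month "2025-06" 0

(* "2025-" could still be extended into a match. *)
let prefix = Str.string_partial_match year_month "2025-" 0

(* "20x5" can never be extended into a match. *)
let hopeless = Str.string_partial_match year_month "20x5" 0

(* string_match, by contrast, rejects the incomplete input. *)
let strict = Str.string_match year_month "2025-" 0
```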

       val matched_string : string -> string

       matched_string s returns the substring of s that was matched by the last call to one of the following
       matching or searching functions:

       - Str.string_match

       - Str.search_forward

       - Str.search_backward

       - Str.string_partial_match

       - Str.global_substitute

       - Str.substitute_first

       provided that none of the following functions was called in between:

       - Str.global_replace

       - Str.replace_first

       - Str.split

       - Str.bounded_split

       - Str.split_delim

       - Str.bounded_split_delim

       - Str.full_split

       - Str.bounded_full_split

       Note: in the case of global_substitute and substitute_first , a call  to  matched_string  is  only  valid
       within the subst argument, not after global_substitute or substitute_first returns.

       The  user  must  make  sure  that  the  parameter s is the same string that was passed to the matching or
       searching function.
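
       Because the match data is global state, a cautious pattern is to read it immediately after the
       search, passing the same string. A sketch with illustrative names:

```ocaml
(* Link the str library, e.g. ocamlfind ocamlopt -package str -linkpkg *)

let word = Str.regexp "[a-z]+"

(* Return the first lowercase word of s, if any. *)
let first_word s =
  match Str.search_forward word s 0 with
  | exception Not_found -> None
  | _ ->
    (* Same string s as the search, and no other Str call in between. *)
    Some (Str.matched_string s)

let example = first_word "42 apples"
```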

       val match_beginning : unit -> int

       match_beginning () returns the position of the first character of the substring that was matched by the
       last call to a matching or searching function (see Str.matched_string for details).

       val match_end : unit -> int

       match_end () returns the position of the character following the last character of the substring that
       was matched by the last call to a matching or searching function (see Str.matched_string for details).
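
       The two offsets delimit the match, so they can be used to cut the text around it. split_pair below
       is a hypothetical helper, not part of Str.

```ocaml
(* Link the str library, e.g. ocamlfind ocamlopt -package str -linkpkg *)

(* An '=' sign with optional surrounding spaces. *)
let eq = Str.regexp " *= *"

(* Hypothetical helper: split "key = value" around the first '='. *)
let split_pair s =
  match Str.search_forward eq s 0 with
  | exception Not_found -> None
  | _ ->
    (* Read both offsets before any other Str call can overwrite them. *)
    let b = Str.match_beginning () in
    let e = Str.match_end () in
    Some (String.sub s 0 b, String.sub s e (String.length s - e))

let pair = split_pair "key = value"
```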

       val matched_group : int -> string -> string

       matched_group n s returns the substring of s that was matched by the n th group \(...\) of the regular
       expression that was matched by the last call to a matching or searching function (see Str.matched_string
       for details). When n is 0 , it returns the substring matched by the whole regular expression. The user
       must make sure that the parameter s is the same string that was passed to the matching or searching
       function.

       Raises Not_found if the n th group of the regular expression was not matched. This can happen with
       groups inside alternatives \| , options ? or repetitions * . For instance, the empty string will match
       \(a\)* , but matched_group 1 "" will raise Not_found because the first group itself was not matched.
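
       A sketch of group extraction, including the unmatched-group case described above. The key=value
       pattern and the parse helper are assumptions made for illustration.

```ocaml
(* Link the str library, e.g. ocamlfind ocamlopt -package str -linkpkg *)

let kv = Str.regexp {|\([a-z]+\)=\([0-9]+\)|}

(* Hypothetical helper: parse "name=123" into a pair. *)
let parse s =
  if Str.string_match kv s 0 then
    let key = Str.matched_group 1 s in
    let value = int_of_string (Str.matched_group 2 s) in
    Some (key, value)
  else None

let parsed = parse "width=80"

(* The empty string matches \(a\)*, but group 1 took no part in the match. *)
let optional = Str.regexp {|\(a\)*|}
let group_missing =
  Str.string_match optional "" 0
  && (match Str.matched_group 1 "" with
      | _ -> false
      | exception Not_found -> true)
```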

       val group_beginning : int -> int

       group_beginning n returns the position of the first character of the substring that was matched by the
       n th group of the regular expression that was matched by the last call to a matching or searching
       function (see Str.matched_string for details).

       Raises Not_found if the n th group of the regular expression was not matched.

       Raises Invalid_argument if there are fewer than n groups in the regular expression.

       val group_end : int -> int

       group_end n returns the position of the character following the last character of the substring that
       was matched by the n th group of the regular expression that was matched by the last call to a matching
       or searching function (see Str.matched_string for details).

       Raises Not_found if the n th group of the regular expression was not matched.

       Raises Invalid_argument if there are fewer than n groups in the regular expression.
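
       The group offsets work like match_beginning and match_end, but per group. A sketch with an
       illustrative user@host pattern:

```ocaml
(* Link the str library, e.g. ocamlfind ocamlopt -package str -linkpkg *)

let address = Str.regexp {|\([a-z]+\)@\([a-z]+\)|}
let s = "to: bob@example"

(* (start, end) offsets of groups 1 and 2; end is one past the last char. *)
let group_spans =
  match Str.search_forward address s 0 with
  | exception Not_found -> None
  | _ ->
    let user = (Str.group_beginning 1, Str.group_end 1) in
    let host = (Str.group_beginning 2, Str.group_end 2) in
    Some (user, host)
```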

   Replacement

       val global_replace : regexp -> string -> string -> string

       global_replace regexp templ s returns a string identical to s , except that all substrings of s that
       match regexp have been replaced by templ . The replacement template templ can contain \1 , \2 , etc;
       these sequences will be replaced by the text matched by the corresponding group in the regular
       expression. \0 stands for the text matched by the whole regular expression.

       val replace_first : regexp -> string -> string -> string

       Same  as  Str.global_replace  ,  except  that only the first substring matching the regular expression is
       replaced.
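
       A sketch of template-based replacement. The pattern and input are illustrative.

```ocaml
(* Link the str library, e.g. ocamlfind ocamlopt -package str -linkpkg *)

let pair = Str.regexp {|\([a-z]+\)=\([0-9]+\)|}
let s = "a=1, b=2"

(* Swap the two groups in every match, or only in the first one. *)
let swapped_all = Str.global_replace pair {|\2=\1|} s
let swapped_first = Str.replace_first pair {|\2=\1|} s

(* \0 stands for the whole match. *)
let bracketed = Str.global_replace pair {|[\0]|} s
```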

       val global_substitute : regexp -> (string -> string) -> string -> string

       global_substitute regexp subst s returns a string identical to s , except that all substrings of s that
       match regexp have been replaced by the result of function subst . The function subst is called once for
       each matching substring, and receives s (the whole text) as argument.

       val substitute_first : regexp -> (string -> string) -> string -> string

       Same as Str.global_substitute , except that only the first substring matching the regular  expression  is
       replaced.
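
       Since subst receives the whole text, it typically calls Str.matched_string (or Str.matched_group)
       on its argument to get at the current match. A sketch with illustrative names:

```ocaml
(* Link the str library, e.g. ocamlfind ocamlopt -package str -linkpkg *)

let word = Str.regexp "[a-z]+"

(* subst gets the whole text; matched_string extracts the current match. *)
let upper whole = String.uppercase_ascii (Str.matched_string whole)

let shout_all = Str.global_substitute word upper "hi there, ok"
let shout_first = Str.substitute_first word upper "hi there, ok"
```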

       val replace_matched : string -> string -> string

       replace_matched repl s returns the replacement text repl in which \1 , \2 , etc. have been replaced by
       the text matched by the corresponding groups in the regular expression that was matched by the last call
       to a matching or searching function (see Str.matched_string for details). s must be the same string that
       was passed to the matching or searching function.
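
       Unlike replace_first, replace_matched returns only the expanded template, not the surrounding text.
       The date pattern and the reorder helper below are hypothetical.

```ocaml
(* Link the str library, e.g. ocamlfind ocamlopt -package str -linkpkg *)

let date = Str.regexp {|\([0-9]+\)-\([0-9]+\)-\([0-9]+\)|}

(* Hypothetical helper: rewrite a leading Y-M-D date as D/M/Y. *)
let reorder s =
  if Str.string_match date s 0 then Some (Str.replace_matched {|\3/\2/\1|} s)
  else None

let plain = reorder "2025-06-12"
(* Text after the match is not part of the result. *)
let trailing = reorder "2025-06-12 (Thursday)"
```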

   Splitting

       val split : regexp -> string -> string list

       split r s splits s into substrings, taking as delimiters the substrings that match r , and returns the
       list of substrings. For instance, split (regexp "[ \t]+") s splits s into blank-separated words. An
       occurrence of the delimiter at the beginning or at the end of the string is ignored.

       val bounded_split : regexp -> string -> int -> string list

       Same as Str.split , but splits into at most n substrings, where n is the extra integer parameter.

       val split_delim : regexp -> string -> string list

       Same  as  Str.split  but  occurrences  of the delimiter at the beginning and at the end of the string are
       recognized and returned as empty strings in the result. For instance, split_delim (regexp " ") " abc "
       returns [""; "abc"; ""] , while split with the same arguments returns ["abc"] .

       val bounded_split_delim : regexp -> string -> int -> string list

       Same  as  Str.bounded_split  ,  but  occurrences  of the delimiter at the beginning and at the end of the
       string are recognized and returned as empty strings in the result.
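
       The splitting variants side by side, as a sketch on illustrative inputs:

```ocaml
(* Link the str library, e.g. ocamlfind ocamlopt -package str -linkpkg *)

let blanks = Str.regexp "[ \t]+"

(* Leading and trailing delimiters are ignored by split. *)
let words = Str.split blanks "  one two\tthree "

(* At most 2 pieces: the remainder is left unsplit. *)
let two = Str.bounded_split blanks "one two three" 2

(* split_delim keeps the empty strings at both ends. *)
let space = Str.regexp " "
let plain = Str.split space " abc "
let with_empties = Str.split_delim space " abc "
```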

       type split_result =
        | Text of string
        | Delim of string

       val full_split : regexp -> string -> split_result list

       Same as Str.split_delim , but returns  the  delimiters  as  well  as  the  substrings  contained  between
       delimiters.   The former are tagged Delim in the result list; the latter are tagged Text .  For instance,
       full_split (regexp "[{}]") "{ab}" returns [Delim "{"; Text "ab"; Delim "}"] .

       val bounded_full_split : regexp -> string -> int -> split_result list

       Same as Str.bounded_split_delim , but returns the delimiters as well as the substrings contained  between
       delimiters.  The former are tagged Delim in the result list; the latter are tagged Text .
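
       The tagged result can be consumed with ordinary pattern matching. A sketch that reuses the braces
       example and keeps only the Text pieces. List.filter_map assumes OCaml 4.08 or later.

```ocaml
(* Link the str library, e.g. ocamlfind ocamlopt -package str -linkpkg *)

let braces = Str.regexp "[{}]"
let parts = Str.full_split braces "{ab}"

(* Keep only the text between delimiters. *)
let texts =
  List.filter_map
    (function Str.Text t -> Some t | Str.Delim _ -> None)
    parts
```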

   Extracting substrings

       val string_before : string -> int -> string

       string_before s n returns the substring of all characters of s that precede position n (excluding the
       character at position n ).

       val string_after : string -> int -> string

       string_after s n returns the substring of all characters of s that follow position n (including the
       character at position n ).

       val first_chars : string -> int -> string

       first_chars s n returns the first n characters of s . This is the same function as Str.string_before .

       val last_chars : string -> int -> string

       last_chars s n returns the last n characters of s .
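
       The four extraction functions on one sample string, as a sketch:

```ocaml
(* Link the str library, e.g. ocamlfind ocamlopt -package str -linkpkg *)

let s = "hello world"

let before = Str.string_before s 5   (* characters 0 to 4 *)
let after = Str.string_after s 6     (* character 6 onwards *)
let first = Str.first_chars s 5
let last = Str.last_chars s 5
```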

OCamldoc                                           2025-06-12                                            Str(3o)


See Also