Simple Interface
char *cmark_markdown_to_html(const char *text, size_t len, int options)
Convert text (assumed to be a UTF-8 encoded string with length len) from CommonMark Markdown to HTML,
returning a null-terminated, UTF-8-encoded string. It is the caller's responsibility to free the returned
buffer.
Node Structure
typedef enum {
  /* Error status */
  CMARK_NODE_NONE,
  /* Block */
  CMARK_NODE_DOCUMENT,
  CMARK_NODE_BLOCK_QUOTE,
  CMARK_NODE_LIST,
  CMARK_NODE_ITEM,
  CMARK_NODE_CODE_BLOCK,
  CMARK_NODE_HTML_BLOCK,
  CMARK_NODE_CUSTOM_BLOCK,
  CMARK_NODE_PARAGRAPH,
  CMARK_NODE_HEADING,
  CMARK_NODE_THEMATIC_BREAK,
  CMARK_NODE_FIRST_BLOCK = CMARK_NODE_DOCUMENT,
  CMARK_NODE_LAST_BLOCK = CMARK_NODE_THEMATIC_BREAK,
  /* Inline */
  CMARK_NODE_TEXT,
  CMARK_NODE_SOFTBREAK,
  CMARK_NODE_LINEBREAK,
  CMARK_NODE_CODE,
  CMARK_NODE_HTML_INLINE,
  CMARK_NODE_CUSTOM_INLINE,
  CMARK_NODE_EMPH,
  CMARK_NODE_STRONG,
  CMARK_NODE_LINK,
  CMARK_NODE_IMAGE,
  CMARK_NODE_FIRST_INLINE = CMARK_NODE_TEXT,
  CMARK_NODE_LAST_INLINE = CMARK_NODE_IMAGE
} cmark_node_type;
typedef enum {
  CMARK_NO_LIST,
  CMARK_BULLET_LIST,
  CMARK_ORDERED_LIST
} cmark_list_type;
typedef enum {
  CMARK_NO_DELIM,
  CMARK_PERIOD_DELIM,
  CMARK_PAREN_DELIM
} cmark_delim_type;
Custom memory allocator support
typedef struct cmark_mem {
  void *(*calloc)(size_t, size_t);
  void *(*realloc)(void *, size_t);
  void (*free)(void *);
} cmark_mem;
Defines the memory allocation functions to be used by CMark when parsing and allocating a document tree.
cmark_mem *cmark_get_default_mem_allocator()
Returns a pointer to the default memory allocator.
Creating and Destroying Nodes
cmark_node *cmark_node_new(cmark_node_type type)
Creates a new node of type type. Note that the node may have other required properties, which it is the
caller's responsibility to assign.
cmark_node *cmark_node_new_with_mem(cmark_node_type type, cmark_mem *mem)
Same as cmark_node_new, but explicitly listing the memory allocator used to allocate the node. Note: be
sure to use the same allocator for every node in a tree, or bad things can happen.
void cmark_node_free(cmark_node *node)
Frees the memory allocated for a node and any children.
Tree Traversal
cmark_node *cmark_node_next(cmark_node *node)
Returns the next node in the sequence after node, or NULL if there is none.
cmark_node *cmark_node_previous(cmark_node *node)
Returns the node that precedes node in the sequence, or NULL if there is none.
cmark_node *cmark_node_parent(cmark_node *node)
Returns the parent of node, or NULL if there is none.
cmark_node *cmark_node_first_child(cmark_node *node)
Returns the first child of node, or NULL if node has no children.
cmark_node *cmark_node_last_child(cmark_node *node)
Returns the last child of node, or NULL if node has no children.
Iterator
An iterator will walk through a tree of nodes, starting from a root node, returning one node at a time,
together with information about whether the node is being entered or exited. The iterator will first
descend to a child node, if there is one. When there is no child, the iterator will go to the next
sibling. When there is no next sibling, the iterator will return to the parent (but with a
cmark_event_type of CMARK_EVENT_EXIT). The iterator will return CMARK_EVENT_DONE when it reaches the root
node again. One natural application is an HTML renderer, where an ENTER event outputs an open tag and an
EXIT event outputs a close tag. An iterator might also be used to transform an AST in some systematic
way, for example, turning all level-3 headings into regular paragraphs.
void
usage_example(cmark_node *root) {
  cmark_event_type ev_type;
  cmark_iter *iter = cmark_iter_new(root);
  while ((ev_type = cmark_iter_next(iter)) != CMARK_EVENT_DONE) {
    cmark_node *cur = cmark_iter_get_node(iter);
    // Do something with `cur` and `ev_type`
  }
  cmark_iter_free(iter);
}
Iterators will never return EXIT events for leaf nodes, which are nodes of type:
• CMARK_NODE_HTML_BLOCK
• CMARK_NODE_THEMATIC_BREAK
• CMARK_NODE_CODE_BLOCK
• CMARK_NODE_TEXT
• CMARK_NODE_SOFTBREAK
• CMARK_NODE_LINEBREAK
• CMARK_NODE_CODE
• CMARK_NODE_HTML_INLINE
Nodes must only be modified after an EXIT event, or an ENTER event for leaf nodes.
typedef enum {
  CMARK_EVENT_NONE,
  CMARK_EVENT_DONE,
  CMARK_EVENT_ENTER,
  CMARK_EVENT_EXIT
} cmark_event_type;
cmark_iter *cmark_iter_new(cmark_node *root)
Creates a new iterator starting at root. The current node and event type are undefined until
cmark_iter_next is called for the first time. The memory allocated for the iterator should be released
using cmark_iter_free when it is no longer needed.
void cmark_iter_free(cmark_iter *iter)
Frees the memory allocated for an iterator.
cmark_event_type cmark_iter_next(cmark_iter *iter)
Advances to the next node and returns the event type (CMARK_EVENT_ENTER, CMARK_EVENT_EXIT or
CMARK_EVENT_DONE).
cmark_node *cmark_iter_get_node(cmark_iter *iter)
Returns the current node.
cmark_event_type cmark_iter_get_event_type(cmark_iter *iter)
Returns the current event type.
cmark_node *cmark_iter_get_root(cmark_iter *iter)
Returns the root node.
void cmark_iter_reset(cmark_iter *iter, cmark_node *current, cmark_event_type event_type)
Resets the iterator so that the current node is current and the event type is event_type. The new current
node must be a descendant of the root node or the root node itself.
Accessors
void *cmark_node_get_user_data(cmark_node *node)
Returns the user data of node.
int cmark_node_set_user_data(cmark_node *node, void *user_data)
Sets arbitrary user data for node. Returns 1 on success, 0 on failure.
cmark_node_type cmark_node_get_type(cmark_node *node)
Returns the type of node, or CMARK_NODE_NONE on error.
const char *cmark_node_get_type_string(cmark_node *node)
Like cmark_node_get_type, but returns a string representation of the type, or "<unknown>".
const char *cmark_node_get_literal(cmark_node *node)
Returns the string contents of node, or an empty string if none is set. Returns NULL if called on a node
that does not have string content.
int cmark_node_set_literal(cmark_node *node, const char *content)
Sets the string contents of node. Returns 1 on success, 0 on failure.
int cmark_node_get_heading_level(cmark_node *node)
Returns the heading level of node, or 0 if node is not a heading.
int cmark_node_set_heading_level(cmark_node *node, int level)
Sets the heading level of node, returning 1 on success and 0 on error.
cmark_list_type cmark_node_get_list_type(cmark_node *node)
Returns the list type of node, or CMARK_NO_LIST if node is not a list.
int cmark_node_set_list_type(cmark_node *node, cmark_list_type type)
Sets the list type of node, returning 1 on success and 0 on error.
cmark_delim_type cmark_node_get_list_delim(cmark_node *node)
Returns the list delimiter type of node, or CMARK_NO_DELIM if node is not a list.
int cmark_node_set_list_delim(cmark_node *node, cmark_delim_type delim)
Sets the list delimiter type of node, returning 1 on success and 0 on error.
int cmark_node_get_list_start(cmark_node *node)
Returns the starting number of node if it is an ordered list, otherwise 0.
int cmark_node_set_list_start(cmark_node *node, int start)
Sets the starting number of node, if it is an ordered list. Returns 1 on success, 0 on failure.
int cmark_node_get_list_tight(cmark_node *node)
Returns 1 if node is a tight list, 0 otherwise.
int cmark_node_set_list_tight(cmark_node *node, int tight)
Sets the "tightness" of a list. Returns 1 on success, 0 on failure.
const char *cmark_node_get_fence_info(cmark_node *node)
Returns the info string from a fenced code block.
int cmark_node_set_fence_info(cmark_node *node, const char *info)
Sets the info string in a fenced code block, returning 1 on success and 0 on failure.
const char *cmark_node_get_url(cmark_node *node)
Returns the URL of a link or image node, or an empty string if no URL is set. Returns NULL if called on a
node that is not a link or image.
int cmark_node_set_url(cmark_node *node, const char *url)
Sets the URL of a link or image node. Returns 1 on success, 0 on failure.
const char *cmark_node_get_title(cmark_node *node)
Returns the title of a link or image node, or an empty string if no title is set. Returns NULL if called
on a node that is not a link or image.
int cmark_node_set_title(cmark_node *node, const char *title)
Sets the title of a link or image node. Returns 1 on success, 0 on failure.
const char *cmark_node_get_on_enter(cmark_node *node)
Returns the literal "on enter" text for a custom node, or an empty string if no on_enter is set. Returns
NULL if called on a non-custom node.
int cmark_node_set_on_enter(cmark_node *node, const char *on_enter)
Sets the literal text to render "on enter" for a custom node. Any children of the node will be rendered
after this text. Returns 1 on success, 0 on failure.
const char *cmark_node_get_on_exit(cmark_node *node)
Returns the literal "on exit" text for a custom node, or an empty string if no on_exit is set. Returns
NULL if called on a non-custom node.
int cmark_node_set_on_exit(cmark_node *node, const char *on_exit)
Sets the literal text to render "on exit" for a custom node. Any children of the node will be rendered
before this text. Returns 1 on success, 0 on failure.
int cmark_node_get_start_line(cmark_node *node)
Returns the line on which node begins.
int cmark_node_get_start_column(cmark_node *node)
Returns the column at which node begins.
int cmark_node_get_end_line(cmark_node *node)
Returns the line on which node ends.
int cmark_node_get_end_column(cmark_node *node)
Returns the column at which node ends.
Tree Manipulation
void cmark_node_unlink(cmark_node *node)
Unlinks a node, removing it from the tree, but not freeing its memory. (Use cmark_node_free for that.)
int cmark_node_insert_before(cmark_node *node, cmark_node *sibling)
Inserts sibling before node. Returns 1 on success, 0 on failure.
int cmark_node_insert_after(cmark_node *node, cmark_node *sibling)
Inserts sibling after node. Returns 1 on success, 0 on failure.
int cmark_node_replace(cmark_node *oldnode, cmark_node *newnode)
Replaces oldnode with newnode and unlinks oldnode (but does not free its memory). Returns 1 on success, 0
on failure.
int cmark_node_prepend_child(cmark_node *node, cmark_node *child)
Adds child to the beginning of the children of node. Returns 1 on success, 0 on failure.
int cmark_node_append_child(cmark_node *node, cmark_node *child)
Adds child to the end of the children of node. Returns 1 on success, 0 on failure.
void cmark_consolidate_text_nodes(cmark_node *root)
Consolidates adjacent text nodes.
Parsing
Simple interface:
cmark_node *document = cmark_parse_document("Hello *world*", 13,
CMARK_OPT_DEFAULT);
Streaming interface:
char buffer[4096];
size_t bytes;
cmark_parser *parser = cmark_parser_new(CMARK_OPT_DEFAULT);
FILE *fp = fopen("myfile.md", "rb"); /* error handling omitted */
/* Feed the file to the parser one buffer at a time. */
while ((bytes = fread(buffer, 1, sizeof(buffer), fp)) > 0) {
  cmark_parser_feed(parser, buffer, bytes);
  if (bytes < sizeof(buffer)) {
    break;
  }
}
fclose(fp);
cmark_node *document = cmark_parser_finish(parser);
cmark_parser_free(parser);
cmark_parser *cmark_parser_new(int options)
Creates a new parser object.
cmark_parser *cmark_parser_new_with_mem(int options, cmark_mem *mem)
Creates a new parser object with the given memory allocator.
void cmark_parser_free(cmark_parser *parser)
Frees memory allocated for a parser object.
void cmark_parser_feed(cmark_parser *parser, const char *buffer, size_t len)
Feeds a string of length len to parser.
cmark_node *cmark_parser_finish(cmark_parser *parser)
Finish parsing and return a pointer to a tree of nodes.
cmark_node *cmark_parse_document(const char *buffer, size_t len, int options)
Parse a CommonMark document in buffer of length len. Returns a pointer to a tree of nodes. The memory
allocated for the node tree should be released using cmark_node_free when it is no longer needed.
cmark_node *cmark_parse_file(FILE *f, int options)
Parse a CommonMark document in file f, returning a pointer to a tree of nodes. The memory allocated for
the node tree should be released using cmark_node_free when it is no longer needed.
Rendering
char *cmark_render_xml(cmark_node *root, int options)
Render a node tree as XML. It is the caller's responsibility to free the returned buffer.
char *cmark_render_html(cmark_node *root, int options)
Render a node tree as an HTML fragment. It is up to the user to add an appropriate header and footer. It
is the caller's responsibility to free the returned buffer.
char *cmark_render_man(cmark_node *root, int options, int width)
Render a node tree as a groff man page, without the header. It is the caller's responsibility to free the
returned buffer.
char *cmark_render_commonmark(cmark_node *root, int options, int width)
Render a node tree as a CommonMark document. It is the caller's responsibility to free the returned
buffer.
char *cmark_render_latex(cmark_node *root, int options, int width)
Render a node tree as a LaTeX document. It is the caller's responsibility to free the returned buffer.
Options
#define CMARK_OPT_DEFAULT 0
Default options.
Options affecting rendering
#define CMARK_OPT_SOURCEPOS (1 << 1)
Include a data-sourcepos attribute on all block elements.
#define CMARK_OPT_HARDBREAKS (1 << 2)
Render softbreak elements as hard line breaks.
#define CMARK_OPT_SAFE (1 << 3)
CMARK_OPT_SAFE is defined here for API compatibility, but it no longer has any effect. "Safe" mode is now
the default: set CMARK_OPT_UNSAFE to disable it.
#define CMARK_OPT_UNSAFE (1 << 17)
Render raw HTML and unsafe links (javascript:, vbscript:, file:, and data:, except for image/png,
image/gif, image/jpeg, or image/webp mime types). By default, raw HTML is replaced by a placeholder HTML
comment. Unsafe links are replaced by empty strings.
#define CMARK_OPT_NOBREAKS (1 << 4)
Render softbreak elements as spaces.
Options affecting parsing
#define CMARK_OPT_NORMALIZE (1 << 8)
Legacy option (no effect).
#define CMARK_OPT_VALIDATE_UTF8 (1 << 9)
Validate UTF-8 in the input before parsing, replacing illegal sequences with the replacement character
U+FFFD.
#define CMARK_OPT_SMART (1 << 10)
Convert straight quotes to curly, --- to em dashes, -- to en dashes.
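Each option occupies its own bit, so several options can be combined with bitwise OR and tested with bitwise AND. The sketch below copies the values from the definitions in this section so that it compiles without cmark.h. Real code would include the header instead. The helper names trusted_input_options and option_is_set are hypothetical.

```c
/* Values copied from the definitions in this section; skipped when
 * cmark.h has already defined them. */
#ifndef CMARK_OPT_DEFAULT
#define CMARK_OPT_DEFAULT 0
#define CMARK_OPT_SOURCEPOS (1 << 1)
#define CMARK_OPT_HARDBREAKS (1 << 2)
#define CMARK_OPT_SMART (1 << 10)
#define CMARK_OPT_UNSAFE (1 << 17)
#endif

/* Hypothetical helper: an options value for trusted input, with smart
 * punctuation and raw HTML enabled. */
int trusted_input_options(void) {
  return CMARK_OPT_SMART | CMARK_OPT_UNSAFE;
}

/* Hypothetical helper: reports whether flag is set in options. */
int option_is_set(int options, int flag) {
  return (options & flag) != 0;
}
```

The combined value is what gets passed as the options argument of the parsing and rendering functions.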
Version information
int cmark_version(void)
The library version as integer for runtime checks. Also available as macro CMARK_VERSION for compile time
checks.
• Bits 16-23 contain the major version.
• Bits 8-15 contain the minor version.
• Bits 0-7 contain the patchlevel.
In hexadecimal format, the number 0x010203 represents version 1.2.3.
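The bit layout above can be unpacked with shifts and masks. The following sketch uses a hypothetical struct version_parts and helper split_version. It works on any integer in that layout, such as the value returned by cmark_version.

```c
/* Hypothetical container for the three version components. */
typedef struct {
  int major;
  int minor;
  int patch;
} version_parts;

/* Hypothetical helper: splits a packed version number into its parts. */
version_parts split_version(int version) {
  version_parts parts;
  parts.major = (version >> 16) & 0xFF; /* bits 16-23 */
  parts.minor = (version >> 8) & 0xFF;  /* bits 8-15 */
  parts.patch = version & 0xFF;         /* bits 0-7 */
  return parts;
}
```

With the example from the text, split_version(0x010203) yields major 1, minor 2, and patch 3.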
const char *cmark_version_string(void)
The library version string for runtime checks. Also available as macro CMARK_VERSION_STRING for compile
time checks.