Extensions

Extensions can extend the parser, the HTML renderer, the formatter, or any combination of these. To use an extension, configure the builder objects with a list of extensions. Because extensions are optional, they live in separate artifacts, so additional dependencies must be added to the project.

Let's look at how to enable tables from GitHub Flavored Markdown or MultiMarkdown, whichever you prefer. First, add all modules as a dependency (see Maven Central for individual modules):

<dependency>
    <groupId>com.vladsch.flexmark</groupId>
    <artifactId>flexmark-all</artifactId>
    <version>0.64.8</version>
</dependency>

Configure the extension on the builders:

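import com.vladsch.flexmark.ext.gfm.strikethrough.StrikethroughExtension;
import com.vladsch.flexmark.ext.tables.TablesExtension;
import com.vladsch.flexmark.html.HtmlRenderer;
import com.vladsch.flexmark.parser.Parser;
import com.vladsch.flexmark.util.data.DataHolder;
import com.vladsch.flexmark.util.data.MutableDataSet;
import java.util.Arrays;

class SomeClass {
    // Shared, immutable options with both extensions registered
    static final DataHolder OPTIONS = new MutableDataSet()
            .set(Parser.EXTENSIONS, Arrays.asList(TablesExtension.create(), StrikethroughExtension.create()))
            .toImmutable();

    static final Parser PARSER = Parser.builder(OPTIONS).build();
    static final HtmlRenderer RENDERER = HtmlRenderer.builder(OPTIONS).build();
}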

Configuring Options

A generic options API allows easy configuration of the parser, renderer and extensions. It consists of DataKey<T> instances defined by various components. Each data key defines the type of its value and a default value.

The values are accessed via the DataHolder and MutableDataHolder interfaces, with the former being a read-only container. Because each data key is a unique identifier for its data, options from different components cannot collide.
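To make the idea of typed keys with defaults concrete, here is a minimal sketch in plain Java. OptionKey and OptionSet are hypothetical stand-ins written for this illustration; they are not flexmark-java classes, and the defaults used are chosen for the sketch only. The real API uses DataKey<T>, DataHolder and MutableDataSet.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a typed key: it carries a name and a default value.
final class OptionKey<T> {
    final String name;
    final T defaultValue;

    OptionKey(String name, T defaultValue) {
        this.name = name;
        this.defaultValue = defaultValue;
    }
}

// Hypothetical stand-in for a mutable option container.
// Keys are compared by identity, so two components cannot collide by accident.
final class OptionSet {
    private final Map<OptionKey<?>, Object> values = new HashMap<>();

    <T> OptionSet set(OptionKey<T> key, T value) {
        values.put(key, value);
        return this;
    }

    @SuppressWarnings("unchecked")
    <T> T get(OptionKey<T> key) {
        // Fall back to the key's own default when nothing was set.
        return values.containsKey(key) ? (T) values.get(key) : key.defaultValue;
    }
}

final class OptionKeyDemo {
    // Key names borrowed from the text; the defaults here are assumptions of the sketch.
    static final OptionKey<Integer> INDENT_SIZE = new OptionKey<>("INDENT_SIZE", 0);
    static final OptionKey<Boolean> PERCENT_ENCODE_URLS = new OptionKey<>("PERCENT_ENCODE_URLS", false);

    public static void main(String[] args) {
        OptionSet options = new OptionSet().set(INDENT_SIZE, 2);
        System.out.println(options.get(INDENT_SIZE));         // the value that was set
        System.out.println(options.get(PERCENT_ENCODE_URLS)); // the key's default
    }
}
```

The point of the design is that the key, not the container, knows the value type and the fallback, so reading an option never needs a cast or a null check at the call site.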

To configure the parser or renderer, pass a data holder to the builder() method with the desired options configured, including extensions.

import com.vladsch.flexmark.ext.tables.TablesExtension;
import com.vladsch.flexmark.html.HtmlRenderer;
import com.vladsch.flexmark.parser.Parser;
import com.vladsch.flexmark.util.ast.KeepType;
import com.vladsch.flexmark.util.data.DataHolder;
import com.vladsch.flexmark.util.data.MutableDataSet;
import java.util.Arrays;

public class SomeClass {
    static final DataHolder OPTIONS = new MutableDataSet()
            .set(Parser.REFERENCES_KEEP, KeepType.LAST)
            .set(HtmlRenderer.INDENT_SIZE, 2)
            .set(HtmlRenderer.PERCENT_ENCODE_URLS, true)

            // for full GFM table compatibility add the following table extension options:
            .set(TablesExtension.COLUMN_SPANS, false)
            .set(TablesExtension.APPEND_MISSING_COLUMNS, true)
            .set(TablesExtension.DISCARD_EXTRA_COLUMNS, true)
            .set(TablesExtension.HEADER_SEPARATOR_COLUMN_MATCH, true)
            .set(Parser.EXTENSIONS, Arrays.asList(TablesExtension.create()))
            .toImmutable();

    static final Parser PARSER = Parser.builder(OPTIONS).build();
    static final HtmlRenderer RENDERER = HtmlRenderer.builder(OPTIONS).build();
}

In the code sample above, Parser.REFERENCES_KEEP defines the behavior of references when duplicate references are defined in the source. In this case it is configured to keep the last value, whereas the default behavior is to keep the first value.

HtmlRenderer.INDENT_SIZE and HtmlRenderer.PERCENT_ENCODE_URLS define options to use for rendering. Extension options can be added in the same way. Any option that is not set takes the default defined by its data key.

All markdown element reference types should be stored using a subclass of NodeRepository<T> as is the case for references, abbreviations and footnotes. This provides a consistent mechanism for overriding the default behavior of these references for duplicates from keep first to keep last.
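The difference between keep first and keep last can be sketched in a few lines of plain Java. KeepPolicy and ReferenceStore are hypothetical names used only for this illustration; the library's own types are KeepType and NodeRepository<T>, whose implementation is not shown here.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical duplicate-handling policy, mirroring the keep first / keep last idea.
enum KeepPolicy { FIRST, LAST }

// Hypothetical repository of reference definitions keyed by reference id.
final class ReferenceStore {
    private final Map<String, String> definitions = new LinkedHashMap<>();
    private final KeepPolicy policy;

    ReferenceStore(KeepPolicy policy) {
        this.policy = policy;
    }

    void define(String id, String url) {
        if (policy == KeepPolicy.FIRST) {
            definitions.putIfAbsent(id, url); // later duplicates are ignored
        } else {
            definitions.put(id, url);         // later duplicates overwrite earlier ones
        }
    }

    String resolve(String id) {
        return definitions.get(id);
    }
}

final class ReferenceStoreDemo {
    public static void main(String[] args) {
        ReferenceStore first = new ReferenceStore(KeepPolicy.FIRST);
        first.define("ref", "https://example.com/a");
        first.define("ref", "https://example.com/b");
        System.out.println(first.resolve("ref")); // https://example.com/a

        ReferenceStore last = new ReferenceStore(KeepPolicy.LAST);
        last.define("ref", "https://example.com/a");
        last.define("ref", "https://example.com/b");
        System.out.println(last.resolve("ref"));  // https://example.com/b
    }
}
```

Keeping the policy inside the repository means every reference-like element (references, abbreviations, footnotes) can share one mechanism for handling duplicates.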

By convention, data keys are defined in the extension class and, in the case of the core, in the Parser or HtmlRenderer.

Data keys are described in their respective extension classes and in Parser and HtmlRenderer.

Core

Core implements parser, HTML renderer and formatter functionality for CommonMark markdown elements.

Parser

Unified options handling was added. The options are also used to selectively disable loading of core parsers and processors.

Parser.builder() now implements MutableDataHolder so you can use get/set to customize properties directly on it or pass it a DataHolder with predefined options.

Defined in Parser class:

ℹ️ In parsers, use state.getParsing().CODE_BLOCK_INDENT to ensure that all parsers have the same setting. Parsing copies the setting from options on creation so having this option changed after parsing phase has started will have no effect.

ℹ️ Parser.USE_HARDCODED_LINK_ADDRESS_PARSER set to true is the default because the regex based parsing requires much more stack space and will cause a StackOverflowError when attempting to parse link URLs larger than about 1.5k characters. The option to turn it off is available only for backwards compatibility and in case someone customizes the regex for parsing. Performance of the hard-coded parser is on par with the regex one while requiring no stack space for parsing.

| Test | Regex Parser | Hardcoded Parser | Options |
|:-----|-------------:|-----------------:|:--------|
| emphasisClosersWithNoOpeners | 205 ms | 221 ms | Default |
| emphasisOpenersWithNoClosers | 162 ms | 170 ms | Default |
| linkClosersWithNoOpeners | 61 ms | 65 ms | Default |
| linkOpenersAndEmphasisClosers | 277 ms | 286 ms | Default |
| linkOpenersWithNoClosers | 87 ms | 89 ms | Default |
| StackOverflow longImageLinkTest | 136 ms | 738 ms | Default |
| longLinkTest | 77 ms | 63 ms | Default |
| mismatchedOpenersAndClosers | 264 ms | 342 ms | Default |
| nestedBrackets | 85 ms | 72 ms | Default |
| nestedStrongEmphasis | 8 ms | 6 ms | Default |
| emphasisClosersWithNoOpeners | 173 ms | 113 ms | Space in URLs |
| emphasisOpenersWithNoClosers | 163 ms | 123 ms | Space in URLs |
| linkClosersWithNoOpeners | 55 ms | 56 ms | Space in URLs |
| linkOpenersAndEmphasisClosers | 216 ms | 229 ms | Space in URLs |
| linkOpenersWithNoClosers | 85 ms | 85 ms | Space in URLs |
| longImageLinkTest | Stack Overflow | 684 ms | Space in URLs |
| longLinkTest | Stack Overflow | 71 ms | Space in URLs |
| mismatchedOpenersAndClosers | 214 ms | 327 ms | Space in URLs |
| nestedBrackets | 55 ms | 79 ms | Space in URLs |
| nestedStrongEmphasis | 5 ms | 6 ms | Space in URLs |

List Parsing Options

List parsing is the greatest discrepancy between Markdown parser implementations. Before CommonMark there was no hard specification for parsing lists, and every implementation took artistic license with how it determined what the list should look like.

flexmark-java implements four parser families based on their list processing characteristics. In addition to the ParserEmulationProfile setting, each of the families has a standard set of common options that control list processing, with defaults set by each family but modifiable by the end user.

There are a few ways to configure the list parsing options:

  1. the recommended way is to apply ParserEmulationProfile to options via MutableDataHolder.setFrom(ParserEmulationProfile) to have all options configured for a particular parser.
  2. start with the ParserEmulationProfile.getOptions() and modify defaults for the family and then pass it to MutableDataHolder.setFrom(MutableDataSetter)
  3. by configuring an instance of MutableListOptions and then passing it to MutableDataHolder.setFrom(MutableDataSetter)
  4. via individual keys

List Item Options

⚠️ If both LISTS_ITEM_TYPE_MISMATCH_TO_NEW_LIST and LISTS_ITEM_TYPE_MISMATCH_TO_SUB_LIST are set to true then a new list will be created if the item had a blank line, otherwise a sub-list is created.

List Item Paragraph Interruption Options

Renderer

Unified options handling was added. Existing configuration options were kept, but they now modify the corresponding unified option.

Renderer Builder() now has an indentSize(int) method to set the size of indentation for hierarchical tags. This is the same as setting the HtmlRenderer.INDENT_SIZE data key in options.

Defined in HtmlRenderer class:

Suppressed links will render only the child nodes, effectively [Something New](javascript:alert(1)) will render as if it was Something New.

Link suppression based on URL prefixes does not apply to HTML inline or block elements. Use HTML suppression options for this.
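The effect of link suppression can be sketched as follows. This is a simplified illustration in plain Java, not the renderer's code: LinkSuppressionSketch, renderLink and SUPPRESSED_PREFIXES are hypothetical names, and the way the library actually matches URLs is not shown here.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

final class LinkSuppressionSketch {
    // Assumed for this sketch only: suppression is decided by a list of URL prefixes.
    static final List<String> SUPPRESSED_PREFIXES = Arrays.asList("javascript:");

    // Renders a link, or only its text when the URL is suppressed.
    static String renderLink(String text, String url) {
        String lower = url.toLowerCase(Locale.ROOT);
        for (String prefix : SUPPRESSED_PREFIXES) {
            if (lower.startsWith(prefix)) {
                return text; // suppressed: only the child content is rendered
            }
        }
        // HTML escaping is omitted to keep the sketch short.
        return "<a href=\"" + url + "\">" + text + "</a>";
    }

    public static void main(String[] args) {
        System.out.println(renderLink("Something New", "javascript:alert(1)")); // Something New
        System.out.println(renderLink("Docs", "https://example.com"));
    }
}
```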

❗ All the escape and suppress options have dynamic default values. This allows you to set ESCAPE_HTML and have all HTML escaped. If you set a value for a specific key then the set value will be used for that key. Similarly, comment affecting keys take their values from their non-comment counterparts. If you want to exclude comments from being affected by suppression or escaping then you need to set the corresponding comment key to false and set the non-comment key to true.
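A dynamic default is a default that is computed from other options at the time the key is read. The following plain Java sketch shows the idea; DynamicKey and DynamicOptions are hypothetical classes, and the three key names and their dependency chain are used for illustration only and should not be read as the exact definitions in HtmlRenderer.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical key whose default is computed from the option set it is read from.
final class DynamicKey<T> {
    final String name;
    final Function<DynamicOptions, T> defaultValue;

    DynamicKey(String name, Function<DynamicOptions, T> defaultValue) {
        this.name = name;
        this.defaultValue = defaultValue;
    }
}

final class DynamicOptions {
    private final Map<DynamicKey<?>, Object> values = new HashMap<>();

    <T> DynamicOptions set(DynamicKey<T> key, T value) {
        values.put(key, value);
        return this;
    }

    @SuppressWarnings("unchecked")
    <T> T get(DynamicKey<T> key) {
        // An explicitly set value wins; otherwise the default is derived on demand.
        return values.containsKey(key) ? (T) values.get(key) : key.defaultValue.apply(this);
    }
}

final class DynamicDefaultsDemo {
    // Illustrative chain: general key -> block key -> comment block key.
    static final DynamicKey<Boolean> ESCAPE_HTML =
            new DynamicKey<>("ESCAPE_HTML", options -> false);
    static final DynamicKey<Boolean> ESCAPE_HTML_BLOCKS =
            new DynamicKey<>("ESCAPE_HTML_BLOCKS", options -> options.get(ESCAPE_HTML));
    static final DynamicKey<Boolean> ESCAPE_HTML_COMMENT_BLOCKS =
            new DynamicKey<>("ESCAPE_HTML_COMMENT_BLOCKS", options -> options.get(ESCAPE_HTML_BLOCKS));

    public static void main(String[] args) {
        // Setting only the general key affects every dependent key.
        DynamicOptions all = new DynamicOptions().set(ESCAPE_HTML, true);
        System.out.println(all.get(ESCAPE_HTML_COMMENT_BLOCKS)); // true

        // Escape blocks but leave comments alone: set the comment key explicitly.
        DynamicOptions noComments = new DynamicOptions()
                .set(ESCAPE_HTML_BLOCKS, true)
                .set(ESCAPE_HTML_COMMENT_BLOCKS, false);
        System.out.println(noComments.get(ESCAPE_HTML_COMMENT_BLOCKS)); // false
    }
}
```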

Formatter

Formatter renders the AST as markdown with various formatting options to clean up and make the source consistent and possibly convert from one indentation rule set to another. Formatter API allows extensions to provide and handle formatting options for custom nodes.

ℹ️ In versions prior to 0.60.0, formatter functionality was implemented in the flexmark-formatter module and required an additional dependency.

See: Markdown Formatter

Formatter can also be used to help translate the markdown document to another language by extracting translatable strings, replacing non-translatable strings with an identifier and finally replacing the translatable text spans with translated versions.

See: Translation Helper API

Formatter can be used to merge multiple markdown documents into a single document while preserving reference resolution to references within each document, even when reference ids conflict between merged documents.

See: Markdown Merge API

PDF Output Module

HTML to PDF conversion is done using the Open HTML To PDF library by PdfConverterExtension in the flexmark-pdf-converter module.

See: PDF Renderer Converter

Usage PDF Output

Available Extensions

The following extensions are developed with this library, each in their own artifact.

Extension options are defined in their extension class.

Abbreviation

Allows you to define abbreviations which are replaced in plain text with <abbr></abbr> tags, or optionally with <a></a> tags, whose titles contain the abbreviation expansion.

Use class AbbreviationExtension from artifact flexmark-ext-abbreviation.

The following options are available:

Defined in AbbreviationExtension class:

| Static Field | Default Value | Description |
|:-------------|:--------------|:------------|
| ABBREVIATIONS | new repository | repository for document's abbreviation definitions |
| ABBREVIATIONS_KEEP | KeepType.FIRST | which duplicates to keep |
| USE_LINKS | false | use <a> instead of <abbr> tags for rendering html |
| ABBREVIATIONS_PLACEMENT | ElementPlacement.AS_IS | formatting option see: Markdown Formatter |
| ABBREVIATIONS_SORT | ElementPlacement.AS_IS | formatting option see: Markdown Formatter |

Admonition

Creates block-styled side content. Based on Admonition Extension, Material for MkDocs (Personal opinion: Material for MkDocs is eye-candy. If you have not taken a look at it, you are missing out on a visual treat.). See Admonition Extension

Use class AdmonitionExtension from artifact flexmark-ext-admonition.

CSS and JavaScript must be included in your page

Default CSS and JavaScript are contained in the jar as resources:

Their content is also available by calling the AdmonitionExtension.getDefaultCSS() and AdmonitionExtension.getDefaultScript() static methods.

The script should be included at the bottom of the body of the document and is used to toggle open/closed state of collapsible admonition elements.

AnchorLink

Automatically adds anchor links to headings, using the GitHub id generation algorithm.

⚠️ This extension will only render an anchor link for headers that have an id attribute associated with them. You need to have the HtmlRenderer.GENERATE_HEADER_ID option set to true so that header ids are generated.

Use class AnchorLinkExtension from artifact flexmark-ext-anchorlink.

The following options are available:

Defined in AnchorLinkExtension class:

| Static Field | Default Value | Description |
|:-------------|:--------------|:------------|
| ANCHORLINKS_SET_ID | true | whether to set the id attribute to the header id, if true |
| ANCHORLINKS_SET_NAME | false | whether to set the name attribute to the header id, if true |
| ANCHORLINKS_WRAP_TEXT | true | whether to wrap the heading text in the anchor, if true |
| ANCHORLINKS_TEXT_PREFIX | "" | raw html prefix. Added before heading text, wrapped or unwrapped |
| ANCHORLINKS_TEXT_SUFFIX | "" | raw html suffix. Added after heading text, wrapped or unwrapped |
| ANCHORLINKS_ANCHOR_CLASS | "" | class for the a tag |

Aside

Same as block quotes but uses | for prefix and generates <aside> tags. To make it compatible with the table extension, aside block lines cannot have | as the last character of the line, and if using this extension the tables must have the lines terminate with a | otherwise they will be treated as aside blocks.
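The rule that separates aside lines from table rows can be sketched as a small classifier. This is an illustration of the rule as stated above, written in plain Java with hypothetical names (AsideLineSketch, classify); it is not the extension's parser and ignores indentation and other block rules.

```java
final class AsideLineSketch {
    enum Kind { ASIDE, TABLE_CANDIDATE, OTHER }

    // Classifies a single line by the rule: an aside line starts with '|'
    // and must not have '|' as its last character.
    static Kind classify(String line) {
        String trimmed = line.trim();
        if (!trimmed.startsWith("|")) {
            return Kind.OTHER;
        }
        return trimmed.endsWith("|") ? Kind.TABLE_CANDIDATE : Kind.ASIDE;
    }

    public static void main(String[] args) {
        System.out.println(classify("| some aside text"));     // ASIDE
        System.out.println(classify("| cell one | cell two |")); // TABLE_CANDIDATE
        System.out.println(classify("plain paragraph"));        // OTHER
    }
}
```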

Use class AsideExtension from artifact flexmark-ext-aside.

Defined in AsideExtension class:

| Static Field | Default Value | Description |
|:-------------|:--------------|:------------|
| IGNORE_BLANK_LINE | false | aside block will include blank lines between aside blocks and treat them as if the blank lines are also preceded by the aside block marker |
| EXTEND_TO_BLANK_LINE | false | aside blocks extend to blank line when true. Enables more customary a la block quote parsing than commonmark strict standard |
| ALLOW_LEADING_SPACE | true | when true leading spaces before > are allowed |
| INTERRUPTS_ITEM_PARAGRAPH | true | when true block quotes can interrupt list item text, else need blank line before to be included in list items |
| INTERRUPTS_PARAGRAPH | true | when true block quote can interrupt a paragraph, else needs blank line before |
| WITH_LEAD_SPACES_INTERRUPTS_ITEM_PARAGRAPH | true | when true block quotes with leading spaces can interrupt list item text, else need blank line before or no leading spaces |

AsideExtension option keys are dynamic data keys dependent on corresponding Parser block quote options for their defaults. If they are not explicitly set then they will take their default value from the value of the corresponding block quote value (prefix BLOCK_QUOTE_ to the AsideExtension key name to get the Parser block quote key name).

⚠️ This can potentially break code relying on versions of the extension before 0.40.20 because parsing rules can change depending on which block quote options are changed from their default values.

To ensure independent options for aside blocks and block quotes, set aside options explicitly. The following will set all aside options to default values, independent from block quote options:

.set(EXTEND_TO_BLANK_LINE, false)
.set(IGNORE_BLANK_LINE, false)
.set(ALLOW_LEADING_SPACE, true)
.set(INTERRUPTS_PARAGRAPH, true)
.set(INTERRUPTS_ITEM_PARAGRAPH, true)
.set(WITH_LEAD_SPACES_INTERRUPTS_ITEM_PARAGRAPH, true)

Attributes

Converts attributes {...} syntax into attributes AST nodes and adds an attribute provider to set attributes for the immediately preceding sibling element during HTML rendering. See Attributes Extension

Defined in AttributesExtension from artifact flexmark-ext-attributes

Use class AttributesExtension from artifact flexmark-ext-attributes.

Full spec: ext_attributes_ast_spec

Autolink

Turns plain links such as URLs and email addresses into links (based on autolink-java).

⚠️ The current implementation has a significant performance impact on large files.

Use class AutolinkExtension from artifact flexmark-ext-autolink.

Defined in AutolinkExtension class:

Definition Lists

Converts the definition syntax of PHP Markdown Extra Definition List to <dl></dl> HTML and corresponding AST nodes.

Definition items can be preceded by : or ~, depending on the configured options.

Use class DefinitionExtension from artifact flexmark-ext-definition.

The following options are available:

Defined in DefinitionExtension class:

| Static Field | Default Value | Description |
|:-------------|:--------------|:------------|
| COLON_MARKER | true | enable use of : as definition item marker |
| MARKER_SPACES | 1 | minimum number of spaces after definition item marker for valid definition item |
| TILDE_MARKER | true | enable use of ~ as definition item marker |
| DOUBLE_BLANK_LINE_BREAKS_LIST | false | when true double blank line between definition item and next definition term will break a definition list |
| FORMAT_MARKER_SPACES | 3 | formatting option see: Markdown Formatter |
| FORMAT_MARKER_TYPE | DefinitionMarker.ANY | formatting option see: Markdown Formatter |

ℹ️ This extension uses list parsing and indentation rules and will do its best to align list item and definition item parsing according to the selected options. For the non-fixed indent families of parsers it uses the definition item content indent column for sub-items, otherwise it uses the Parser.LISTS_ITEM_INDENT value for sub-items.

Wiki: Definition List Extension

Docx Converter

Renders the parsed Markdown AST to docx format using the docx4j library.

artifact: flexmark-docx-converter

See the DocxConverterCommonMark Sample for code and Customizing Docx Rendering for an overview and information on customizing the styles.

Pegdown version can be found in DocxConverterPegdown Sample

For details see Docx Renderer Extension

Emoji

Allows you to create image links to emoji images from emoji shortcuts using Emoji-Cheat-Sheet.com or the GitHub Emoji API, and optionally to replace shortcuts with their unicode equivalent character, with mapping to the GitHub shortcut or Emoji-Cheat-Sheet.com shortcut based on the file name.

Use class EmojiExtension from artifact flexmark-ext-emoji.

The following options are available:

Defined in EmojiExtension class:

Enumerated Reference

Used to create numbered references and numbered text labels for elements in the document. See: Enumerated References Extension

Use class EnumeratedReferenceExtension from artifact flexmark-ext-enumerated-reference.

❗ Note: the Attributes extension is needed in order for references to be properly resolved for rendering.

Footnotes

Creates footnote references in the document. Footnotes are placed at the bottom of the rendered document with links from footnote references to footnotes and vice versa. See: Footnotes Extension

Converts: [^footnote] to footnote references and [^footnote]: footnote definition to footnotes in the document.

Gfm-Issues

Enables Gfm issue reference parsing in the form of #123

Use class GfmIssuesExtension in artifact flexmark-ext-gfm-issues.

The following options are available:

Defined in GfmIssuesExtension class:

Gfm-Strikethrough/Subscript

Enables strikethrough of text by enclosing it in ~~. For example, in hey ~~you~~, you will be rendered as strikethrough text.

Use class StrikethroughExtension in artifact flexmark-ext-gfm-strikethrough.

Enables subscript of text by enclosing it in ~. For example, in hey ~you~, you will be rendered as subscript text.

Use class SubscriptExtension in artifact flexmark-ext-gfm-strikethrough.

To enable both subscript and strikethrough:

Use class StrikethroughSubscriptExtension in artifact flexmark-ext-gfm-strikethrough.

⚠️ Only one of these extensions can be included in the extension list. If you want both strikethrough and subscript use the StrikethroughSubscriptExtension.

The following options are available:

Defined in StrikethroughSubscriptExtension class:

Gfm-TaskList

Enables list items based task lists whose text begins with: [ ], [x] or [X]
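Detecting the three markers can be sketched as a simple prefix check. This is a plain Java illustration with hypothetical names (TaskItemSketch, taskState); it is a simplification and not the extension's parser, which may apply further rules such as requiring whitespace after the marker.

```java
import java.util.Optional;

final class TaskItemSketch {
    // Returns Optional.of(true) for a done item, Optional.of(false) for a
    // not done item, and Optional.empty() when the text is not a task item.
    static Optional<Boolean> taskState(String itemText) {
        if (itemText.startsWith("[ ]")) {
            return Optional.of(false);
        }
        if (itemText.startsWith("[x]") || itemText.startsWith("[X]")) {
            return Optional.of(true);
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        System.out.println(taskState("[ ] write docs"));  // Optional[false]
        System.out.println(taskState("[X] ship release")); // Optional[true]
        System.out.println(taskState("regular item"));     // Optional.empty
    }
}
```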

Use class TaskListExtension in artifact flexmark-ext-gfm-tasklist.

The following options are available:

Defined in TaskListExtension class:

| Static Field | Default Value | Description |
|:-------------|:--------------|:------------|
| ITEM_DONE_MARKER | | string to use for the item done marker html |
| ITEM_NOT_DONE_MARKER | | string to use for the item not done marker html |
| ITEM_CLASS | "task-list-item" | tight list item class attribute |
| ITEM_ITEM_DONE_CLASS | "" | list item class for done task list item |
| ITEM_ITEM_NOT_DONE_CLASS | "" | list item class for not done task list item |
| LOOSE_ITEM_CLASS | value of ITEM_CLASS | loose list item class attribute, if not set then will use value of tight item class |
| PARAGRAPH_CLASS | "" | p tag class attribute, only applies to loose list items |
| FORMAT_LIST_ITEM_CASE | TaskListItemCase.AS_IS | formatting option see: Markdown Formatter |
| FORMAT_LIST_ITEM_PLACEMENT | TaskListItemPlacement.AS_IS | formatting option see: Markdown Formatter |

Gfm-Users

Enables Gfm user reference parsing in the form of @username

Use class GfmUsersExtension in artifact flexmark-ext-gfm-users.

The following options are available:

Defined in GfmUsersExtension class:

GitLab Flavoured Markdown

Parses and renders GitLab Flavoured Markdown.

<div class="video-container">  
<video src="video.mp4" width="400" controls="true"></video>  
<p><a href="video.mp4" target="_blank" rel="noopener noreferrer" title="Download 'Sample Video'">Sample Video</a></p>  
</div>  

Use class GitLabExtension in artifact flexmark-ext-gitlab.

The following options are available:

Defined in GitLabExtension class:

ℹ️ To have Math and Mermaid properly rendered, the Katex and Mermaid scripts must be included in the HTML page.

If you have the files in the same directory as the HTML page, somewhere between the <head> and </head> tags, you need to include:

In addition to the Katex script you need to add JavaScript to the bottom of the page body to convert math elements when the DOM is loaded: