Refactor fenced code attributes by waylan · Pull Request #816 · Python-Markdown/markdown

I guess hl_lines is special.

Yes, but only for backward compatibility. Originally, only the language was supported. At some point we added support for hl_lines. Now that we are adding support for all of Pygments' options, hl_lines does not make sense in its current form. So as not to break existing documents, we continue to support it outside of brackets. Of course, you can define the language and hl_lines within brackets like anything else. Therefore, we may deprecate support outside of brackets in the future.
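To make the two spellings concrete, here is a rough sketch of how a fence header could be reduced to the same language and hl_lines values whether they are written outside or inside brackets. It is not the code from this PR: `parse_fence_header` is a hypothetical helper, it uses `shlex` rather than the attr_list parser, and it is more permissive than the extension, which only accepts a single-word language and hl_lines outside of brackets.

```python
import shlex


def parse_fence_header(line):
    """Return (lang, hl_lines) for a fence's opening line.

    Hypothetical helper for illustration only; the real extension
    matches the header with a regular expression.
    """
    # Drop the fence characters, then any surrounding whitespace.
    rest = line.lstrip('`~').strip()
    if rest.startswith('{') and rest.endswith('}'):
        # Bracketed form: everything is an attribute-list token.
        tokens = shlex.split(rest[1:-1])
    else:
        # Legacy form: the first bare word is the language.
        tokens = shlex.split(rest)
        if tokens and '=' not in tokens[0]:
            tokens[0] = '.' + tokens[0].lstrip('.')

    lang = None
    hl_lines = []
    for token in tokens:
        if token.startswith('.') and lang is None:
            # Assumption: the first class names the language.
            lang = token[1:]
        elif token.startswith('hl_lines='):
            hl_lines = [int(n) for n in token.split('=', 1)[1].split()]
    return lang, hl_lines


legacy = parse_fence_header('~~~python hl_lines="1 3"')
bracketed = parse_fence_header('~~~ {.python hl_lines="1 3"}')
```

Both calls produce the same pair, which is why the form outside of brackets can be kept purely for backward compatibility.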

Is there any way that the attribute handling could be more common across extensions? For example, I think that Pandoc attributes are the same wherever they're supported (and one of the places where they're supported is fenced code blocks). From https://pandoc.org/MANUAL.html:
Optionally, you may attach attributes to fenced or backtick code block using this syntax:

~~~~ {#mycode .haskell .numberLines startFrom="100"}
qsort []     = []
qsort (x:xs) = qsort (filter (< x) xs) ++ [x] ++
              qsort (filter (>= x) xs)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Here mycode is an identifier, haskell and numberLines are classes,
and startFrom is an attribute with value 100. Some output formats can
use this information to do syntax highlighting.

[...] the code block above will appear as follows:

<pre id="mycode" class="haskell numberLines" startFrom="100">
 <code>
 ...
 </code>
</pre>
In fact (and this is probably the last thing you want to hear!) is there any way that the fenced code block extension could depend on the attr_list extension for processing its attribute lists?

Actually, the list parser in the attr_list extension is imported and used by fenced-code as of this PR. However, it is only used for what is defined within the brackets. But as only a single-word language and hl_lines are supported outside of brackets (as explained above), for all intents and purposes, it is an attr_list.

However, it does differ from an attr_list in at least one way. As the key=value pairs are used as Pygments options, it does not make sense to include them in the syntax-highlighted output. The exception is classes, which can also be defined with keys. We need a way to differentiate between values that should be passed to Pygments and values that should be assigned as attributes. Additionally, Pygments only provides a way to assign classes. Therefore, as many .classnames as are defined will get added to the class attribute. IDs get ignored because Pygments doesn't provide that option. Everything else is a Pygments option which affects the output but doesn't get included in the output.
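As an illustration of that split, the sketch below sorts the tokens inside the brackets into an ID, a list of classes, and a dict of everything else (the candidates for Pygments options). It is a stand-in, not the PR's code: `split_fence_attrs` is a hypothetical name, tokenizing is done with `shlex` instead of the attr_list parser, and treating a bare word as a `True` flag is my assumption.

```python
import shlex


def split_fence_attrs(attr_text):
    """Split the text inside {...} into (id, classes, options).

    Hypothetical helper for illustration; not the attr_list parser.
    """
    elem_id = None
    classes = []
    options = {}
    for token in shlex.split(attr_text):
        if token.startswith('#'):
            # "#name" sets the ID.
            elem_id = token[1:]
        elif token.startswith('.'):
            # Every ".name" is collected as a class.
            classes.append(token[1:])
        elif '=' in token:
            # key=value pairs; shlex has already removed the quotes.
            key, value = token.split('=', 1)
            options[key] = value
        else:
            # Assumption: a bare word is kept as a boolean flag.
            options[token] = True
    return elem_id, classes, options


elem_id, classes, options = split_fence_attrs(
    '#mycode .haskell .numberLines startFrom="100"'
)
```

With the Pandoc header quoted earlier, this gives `mycode` as the ID, `haskell` and `numberLines` as classes, and `startFrom` as the only remaining key.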

Now, that behavior depends on syntax highlighting being enabled (that is, the codehilite extension is enabled). Without syntax highlighting, the ID gets set on the pre tag and the class gets set on the code tag. I suppose we could also assign key=value pairs. However, in its current state, that doesn't happen; key=value pairs are simply ignored. This was done to match PHP Markdown Extra's behavior of only allowing class and id to be set. After all, we claim to imitate that implementation. It also happens to match GitHub's implementation.

However, an argument could be made that including arbitrary attributes in the output would allow JavaScript highlighting libs to make use of them. And we have Pandoc as another implementation which already supports that. We have also received bug reports from time to time from users who expected any attribute to work when the attr_list extension is enabled. I suppose we could change behavior based on that by only assigning id and class by default, but assigning all defined attributes if attr_list is also enabled (and code highlighting is off).
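The rules described in the last few paragraphs, plus the possible attr_list behavior, could be summarized roughly as follows. This is only a sketch of the decision logic: `plan_output` and its return shape are hypothetical, and putting the extra key=value pairs on the code tag is my assumption, since the discussion above does not say which tag they would go on.

```python
def plan_output(elem_id, classes, options, highlighting=False, attr_list=False):
    """Decide where each parsed value ends up.

    Hypothetical helper; the returned dict is for illustration only.
    """
    plan = {'pre': {}, 'code': {}, 'pygments': {}, 'highlight_classes': []}
    if highlighting:
        # With highlighting on: classes are handed to the highlighter,
        # the ID is dropped, and key=value pairs become Pygments options.
        plan['highlight_classes'] = list(classes)
        plan['pygments'] = dict(options)
        return plan
    # With highlighting off: ID on <pre>, classes on <code>.
    if elem_id:
        plan['pre']['id'] = elem_id
    if classes:
        plan['code']['class'] = ' '.join(classes)
    if attr_list:
        # Proposed behavior; placing these on <code> is an assumption.
        plan['code'].update(options)
    return plan


default = plan_output('mycode', ['haskell'], {'startFrom': '100'})
with_attr_list = plan_output(
    'mycode', ['haskell'], {'startFrom': '100'}, attr_list=True
)
```

By default only id and class reach the output, which matches the current state. The `attr_list=True` case is the change being floated here, and it has no effect while highlighting is on.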