Pre-PEP: d-string / Dedented Multiline Strings with Optional Language Hinting

Hi everyone, I’d like to open a PEP, and I may have made mistakes in the process.

Appreciate any feedback.
Thanks,


PEP: NNNN
Title: Dedented Multiline Strings with Optional Language Hinting
Author:
Discussions-To: Pending
Status: Draft
Type: Standards Track
Created: 06-May-2025
Python-Version: 3.15

Abstract

This PEP proposes a new string prefix d for Python that enables automatic
dedenting of multiline strings.
Optionally, a language hint may be included immediately after the opening
triple quotes to assist editors and tooling with syntax highlighting.

Motivation

Writing readable, properly formatted multiline strings currently requires
the use of textwrap.dedent, adding verbosity and cognitive overhead.
Developers often embed structured content (SQL, HTML, …) within strings,
but Python does not natively support lightweight language hinting.

Combining dedenting with an optional language hint improves both code
readability and tooling support without impacting runtime behavior.

Additionally, this proposal complements the upcoming t-string
feature (PEP 750), which would allow template strings with placeholders.
By combining the d-string with t-strings, developers can easily format and
manage indented content (like SQL queries or template code) while keeping
everything neat and readable.

The d-string’s dedenting behavior would work seamlessly with t-strings,
enabling both automatic formatting and interpolation in a single workflow,
streamlining code maintenance and readability.

Rationale

Specification

Syntax examples

Dedented string without hint:

.. code-block:: python

    d"""
    Some plain text
    across multiple lines
    """

Dedented string with language hint:

.. code-block:: python

    d"""sql
    SELECT *
    FROM users
    """

Additional example with another language hint:

.. code-block:: python

    d"""jinja2

    {{ title }}

    """

In both cases, the hint (“sql”, “jinja2”) is purely for tooling and has
no impact at runtime.
It enables syntax highlighting and linting, which improves the developer experience.

Invalid usage examples

.. code-block:: python

    d"Just a single-line string"  # Error: must use triple quotes

Backwards Compatibility

This proposal introduces a new string prefix d.
It does not affect any existing code.

Security Implications

Security considerations for this feature would mostly involve data injection,
especially since we’re dealing with string manipulation and possibly embedding
SQL, HTML, or any templating languages.

Mitigation: While the d-string itself doesn’t execute the content, users
should be aware of security concerns when the content is later executed or
rendered.

How to Teach This

The d prefix will be documented as part of the language standard.
Code editors should update their syntax highlighting to support d-string
embedded language hints.

Reference Implementation

A reference implementation may be found at:

Footnotes

.. [1] CommonMark Spec
.. [2] Reference | Cucumber

This document is placed in the public domain or under the
CC0-1.0-Universal license, whichever is more permissive.

Nineteendo (Nice Zombies) May 6, 2025, 5:18pm 2

This reminds me of this discussion: D-string vs str.dedent()

rrolls (Rob Rolls) May 6, 2025, 6:34pm 3

t-strings should be used for this.

Instead of your

some_sql = d'''sql
select * from
...long_table_expression...
...long_table_expression...
...long_table_expression...
where foo
'''

you should be writing

some_sql = some_sql_library.SQL(t'''
select * from
...long_table_expression...
...long_table_expression...
...long_table_expression...
where foo
''')

The idea here (though I don’t know if this has actually been implemented anywhere yet) is that if some_sql_library.SQL is appropriately typed and your editor has appropriate support, the content of the t-string will be automatically known to be SQL and highlighted appropriately. We shouldn’t be adding a second way to do this, and a Markdown-style tag immediately before or after the leading (triple-)quote isn’t as useful/flexible as an actual expression that can be evaluated at type-checking/static-analysis time or indeed at runtime (such as a function or class, here some_sql_library.SQL).

For my reasoning, please see my earlier explanation at PEP 750: Tag Strings For Writing Domain-Specific Languages - #161 by rrolls (and various other posts I made later in that thread).


Does it?

This is how I use triple-quoted strings:

# suppose there's some reason I'd be indented multiple times...
def foo(items):
    for item in items:
        if item.bar:
            do_something_with_string('''
contents
of a rather long
triple-quoted
string
''')
            do_something_else_conditionally()
        do_something_else_with_item(item)
    do_stuff_after_the_loop()

and I’ve never had any trouble with it.

On the contrary, if I were to align the contents of my string with the code, it starts to look like my string is Python code, and it can get very difficult to understand what is Python and what is string contents.

It’s also much more convenient to have my string contents left-aligned in the source code, because then I know I can use the same full width of a line (whatever the line width happens to be for the code style of the project I’m working on) regardless of the indentation level of the code where the triple-quoted string is placed.

I always place a newline at the beginning and end of my triple-quoted strings, so I don’t have anything after the opening ''' on its line, nor before the closing ''' on its line. If my string needs to not contain that extra pair of newlines, I just put [1:-1] immediately after the closing '''.
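A minimal sketch of the pattern described above. The function name `render_message` and the string contents are made up for illustration; only the flush-left layout and the `[1:-1]` slice come from the post.

```python
def render_message(flag):
    if flag:
        # The string body is deliberately flush-left, whatever the code indentation is.
        return '''
first line
second line
'''[1:-1]  # slice off the newline after the opening quotes and before the closing ones
    return ''


message = render_message(True)
```

Here `message` holds the two content lines with no leading or trailing newline.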

So maybe this would help people with certain code styles that call for the content of triple-quoted strings to be lined up with the code - but I’d argue that any potential benefit of adding something for that purpose would need to be weighed against the alternative of just suggesting that people adopt the pattern I’ve described here - which already works just fine.


Since these two points are the only purported benefits, I’d give a -1 overall.

bwoodsend (Brénainn Woodsend) May 6, 2025, 8:28pm 4

If we could have an indentation friendly way of writing multi-line strings then it would honestly be my favorite change since Python 3.6’s adding os.PathLike(). Having to choose between textwrap.dedent() [1], disrupting the indentation flow by literally writing the string without indentation [2] or making them globals [3] is a real pain point for me.

It would also make long literals so much cleaner. Every time I want to write a halfway nontrivial error message I always wish I could just write…

        raise SomeException(d"""
            Some explanation that is longer than the ~60 characters of width \
            I normally have left at this point because I'm already a few \
            levels of indentation deep. Blah de blah de blah.
        """)

… but I can’t because that indentation propagates into the output. Instead we just have to write a sequence of single line strings which are a pain to format and an even bigger pain to rewrap should you go back and edit one.
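To make the problem concrete, here is a small sketch (the function name `build_message` and the text are hypothetical) showing that the source indentation ends up inside the string, and what `textwrap.dedent` does about it today:

```python
import textwrap


def build_message():
    # Indented to line up with the surrounding code, as one would like to write it.
    return """
        Some explanation.
        More detail.
    """


raw = build_message()                    # still carries the 8-space indentation
cleaned = textwrap.dedent(raw).strip()   # common indentation and outer whitespace removed
```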

That said, I do prefer the version from the previous thread. It’s explicit about stripping the first newline and provides a way to preserve some indentation.


  1. which doesn’t play well with multi-line f"string" substitutions ↩︎
  2. why not go all out and replace all indentation in Python with braces if you find indented strings unnecessary? ↩︎
  3. yay! I get more fragmented code for every string substitution! ↩︎

mardiros (Guillaume Gauvrit) May 6, 2025, 8:57pm 5

I’m afraid I was not aware that this had already been discussed.
Many thanks for the link.

JamesParrott (James Parrott) May 6, 2025, 10:27pm 6

If d-strings do something simple like automatically have textwrap.dedent called on them, and are otherwise unproblematic, then I want this feature so bad!

bryevdv (Bryan Van de Ven) May 7, 2025, 12:11am 7

        raise SomeException(
            "Some explanation that is longer than the ~60 characters of width "
            "I normally have left at this point because I'm already a few "
            "levels of indentation deep. Blah de blah de blah."
        )

jamestwebber (James Webber) May 7, 2025, 12:27am 8

How about from textwrap import dedent as d?
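A sketch of what that alias looks like in use. The function name `usage_text` and the help text are invented for illustration. Note that the call happens at runtime and that the newline after the opening quotes is kept.

```python
from textwrap import dedent as d


def usage_text():
    # d(...) is just textwrap.dedent under a shorter name.
    return d("""
        usage: tool [options]
          -h  show help
    """)


text = usage_text()
```

Relative indentation (the two spaces before `-h`) survives, since only the common leading whitespace is removed.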

blhsing (Ben Hsing) May 7, 2025, 1:17am 9

+1. This would make a lot of embedded multi-line strings much easier to read without the added noise of a call to textwrap.dedent.

This doesn’t allow implicit newlines and isn’t copy-and-paste-friendly though.
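As a small illustration of the first point (the message text is made up): adjacent string literals are joined with nothing in between, so every line break has to be written out as `\n`.

```python
message = (
    "first line "
    "still the first line\n"  # the newline must be spelled explicitly
    "second line"
)
```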

methane (Inada Naoki) May 7, 2025, 1:22am 10

Thank you for writing the PEP.
Since t-string is accepted, I think d-string is more important than str.dedent().

I haven’t had enough time since the previous discussion. I would sponsor this PEP if you updated it to reflect the previous discussion.

One idea: How about using ``` instead of d prefix?
Pros: It looks similar to Markdown, which many people know.
Cons: It is difficult to write Markdown containing a code fence inside the multiline literal.

blhsing (Ben Hsing) May 7, 2025, 1:33am 11

Very interesting. Using the line continuation marker as a way to suppress newlines while dedenting the next lines goes beyond what textwrap.dedent currently has to offer and actually requires changes to the parser/tokenizer for this to happen. Adds complexity to the implementation for sure but would actually make the proposal a clear win over the status quo of using textwrap.dedent. Would love to see this feature included.
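A short sketch of why `textwrap.dedent` alone cannot do this today (the text is made up). In a regular string literal the backslash-newline pair is removed when the literal is parsed, so the indentation of the following line is already glued into the middle of the joined line before `dedent` ever runs:

```python
import textwrap

text = textwrap.dedent("""
    first part \
    second part
    third line
""")

# The continuation joined the first two source lines, keeping the
# four spaces of indentation from the second one in the middle.
joined_line = text.splitlines()[1]
```

`dedent` only strips whitespace at the start of each line, so the extra spaces inside `joined_line` remain.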

blhsing (Ben Hsing) May 7, 2025, 2:08am 12

The pros sound good, while the cons may be alleviated if we allow the d prefix as an alternative syntax, much like how people are already free to choose between ' and ", and ''' and """, depending on what’s enclosed.

blhsing (Ben Hsing) May 7, 2025, 2:35am 13

On second thought, it is usually unlikely that you would have some lines with line continuation markers and some without, in a single string literal. So instead of a line continuation marker at the end of every line, which adds noise to both reading and writing, we may allow a “line continuation mode” through a different prefix such as D.

        raise SomeException(D"""
            Some explanation that is longer than the ~60 characters of width 
            I normally have left at this point because I'm already a few 
            levels of indentation deep. Blah de blah de blah.
        """) # results in a dedented string with no newlines

The line continuation marker feature may still be useful when there are multiple paragraphs:

        raise SomeException(d"""
            Some explanation that is longer than the ~60 characters of width \
            I normally have left at this point because I'm already a few \
            levels of indentation deep.

            - detail #1: ...
            - detail #2: ...
        """)

although the above may also be accomplished with concatenation of two string literals, depending on personal preferences:

        raise SomeException(D"""
            Some explanation that is longer than the ~60 characters of width 
            I normally have left at this point because I'm already a few 
            levels of indentation deep.
        """
        d"""
            - detail #1: ...
            - detail #2: ...
        """)

Nineteendo (Nice Zombies) May 7, 2025, 5:04am 14

Maybe not if we’re using raw strings:

from textwrap import dedent

print(dedent(r"""
    Some explanation that is longer than the ~60 characters of width \
    I normally have left at this point because I'm already a few \
    levels of indentation deep. Blah de blah de blah.
""").replace("\\\n", ""))

mardiros (Guillaume Gauvrit) May 7, 2025, 5:57am 15

Thank you for the link. My view is that t-strings will be very specific and will not handle
all the use cases. I doubt Jinja2 will be rewritten using t-strings.

When I write multi-line strings, I usually place them at the top of the file to avoid indentation issues. I’ve never really had trouble distinguishing between string content, Python code, or even docstrings. The dedent function was created to keep text left-aligned, especially when defined inside indented blocks, which helps improve readability.

mardiros (Guillaume Gauvrit) May 7, 2025, 6:01am 16

The idea is also to provide meta-information about the nature of the string:
its content type, or a language hint for linters and code editors.

AA-Turner (Adam Turner) May 7, 2025, 9:57am 17

I see value in some mechanism to make removing leading whitespace from strings easier. I don’t entirely understand why this proposal also includes ‘language hinting’, which was discussed and rejected in the t-string proposal, as e.g. ‘SQL’ could be any number of dialects.

I would be willing to sponsor a simple PEP that added e.g. str.dedent(), but I would be hesitant to add a new string prefix, and I don’t support the language tags.

A

mardiros (Guillaume Gauvrit) May 7, 2025, 12:25pm 18

I agree that SQL could be any number of dialects, but my code editor already knows how to highlight it. The idea is that the language marker is purely syntactic sugar for the code editor or linter, not for the Python interpreter itself.

AA-Turner (Adam Turner) May 7, 2025, 12:30pm 19

Considering from another direction, some editors (e.g. PyCharm) support ‘language injection’ via special comments, such as # language=SQL. What problem does this proposal solve that specially recognised comments can’t? I still fail to see the connexion between dedented strings and hinting of arbitrary embedded programming languages, and why they should be in the same proposal.
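For readers unfamiliar with the comment-based approach, a sketch of what it looks like. The table and column names are invented, and whether a particular editor honors the comment depends on that editor. To Python itself it is an ordinary comment with no runtime effect.

```python
# language=SQL
query = """
SELECT id, name
FROM users
WHERE active = 1
"""
```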

A

pf_moore (Paul Moore) May 7, 2025, 12:34pm 20

Are there any examples of existing programming languages[1] with a construct like this? I don’t dispute that it would be convenient to have something like this, but if no other language has ever implemented it (and in my experience, that’s the case) then I’d have to question why.


  1. Markdown isn’t a programming language ↩︎