gh-121650: Encode newlines in headers, and verify headers are sound by encukou · Pull Request #122233 · python/cpython

encukou and others added 2 commits

July 24, 2024 15:30

@encukou

This should fail for custom fold() implementations that aren't careful about newlines.
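
As an illustration of what such a check might look like, here is a minimal sketch. It is not the code from this PR: `is_sound_header`, `careless_fold`, and the regular expression are hypothetical names written for this example. The assumed rule is that a folded header must end with the line separator, and that any earlier line break must be followed by a space or tab (a continuation line).

```python
import re

# Assumption for this sketch: a line break inside a folded header is only
# acceptable when the next character is a space or tab (a continuation line).
NEWLINE_WITHOUT_WHITESPACE = re.compile(r"\r\n[^ \t]|\r[^ \n\t]|\n[^ \t]")


def is_sound_header(folded, linesep="\n"):
    """Return True if `folded` looks like exactly one well-formed header."""
    if not folded.endswith(linesep):
        return False
    # Drop the terminating separator, then look for stray line breaks.
    body = folded[: -len(linesep)]
    return NEWLINE_WITHOUT_WHITESPACE.search(body) is None


def careless_fold(name, value, linesep="\n"):
    """Hypothetical custom fold() that pastes the value in unchanged."""
    return f"{name}: {value}{linesep}"


good = careless_fold("Subject", "Hello")
continued = "Subject: a long value\n continued on the next line\n"
bad = careless_fold("Subject", "Hello\nBcc: attacker@example.com")

print(is_sound_header(good))       # True
print(is_sound_header(continued))  # True
print(is_sound_header(bad))        # False: the newline starts a second header
```

A fold() that copies a newline straight into its output produces what looks like two headers, which is the case the check is meant to reject.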

@encukou @basbloemsaat

Per RFC 2047:

[...] these encoding schemes allow the encoding of arbitrary octet values, mail readers that implement this decoding should also ensure that display of the decoded data on the recipient's terminal will not cause unwanted side-effects

It seems that the "quoted-word" scheme is a valid way to include a newline character in a header value, just like we already allow undecodable bytes or control characters. They do need to be properly quoted when serialized to text, though.


Credit for an earlier attempt:

Co-Authored-By: Bas Bloemsaat <bas@bloemsaat.org>
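
The scheme described in this commit message is what RFC 2047 calls an encoded-word. As a rough sketch of why it can carry a newline safely, the example below encodes a value with the "Q" encoding and decodes it again with the standard library's `email.header.decode_header`. The helper `q_encode_word` and the `SAFE_CHARS` set are hypothetical and written only for this illustration; they are not the encoder the `email` package uses, and they ignore the length limit on encoded-words.

```python
import string
from email.header import decode_header

# Characters left unencoded in this sketch; everything else becomes =XX.
SAFE_CHARS = set(string.ascii_letters + string.digits + "-!*+/")


def q_encode_word(text, charset="utf-8"):
    """Hypothetical helper: wrap `text` in one RFC 2047 'Q' encoded-word."""
    pieces = []
    for byte in text.encode(charset):
        char = chr(byte)
        if char in SAFE_CHARS:
            pieces.append(char)
        elif char == " ":
            pieces.append("_")  # 'Q' encoding writes a space as an underscore
        else:
            pieces.append(f"={byte:02X}")  # covers CR (=0D) and LF (=0A)
    return f"=?{charset}?q?{''.join(pieces)}?="


encoded = q_encode_word("Hello\nWorld")
decoded = decode_header(encoded)

print(encoded)        # =?utf-8?q?Hello=0AWorld?=
print(decoded[0][0])  # b'Hello\nWorld'
```

The serialized form contains no raw line break, so it cannot start a new header line, while a decoder still recovers the original newline.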

@encukou

@encukou

@encukou

@encukou

I'm not touching other instances in this file, since this PR might be backported to very old versions.

@encukou encukou marked this pull request as ready for review

July 29, 2024 13:18

@encukou

serhiy-storchaka

@encukou @serhiy-storchaka

Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>

serhiy-storchaka

ambv pushed a commit to ambv/cpython that referenced this pull request

Aug 2, 2024

… are sound (pythonGH-122233)

(cherry picked from commit 0976339)

Co-authored-by: Petr Viktorin <encukou@gmail.com>
Co-authored-by: Bas Bloemsaat <bas@bloemsaat.org>
Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>

Yhg1s pushed a commit that referenced this pull request

Aug 6, 2024

…sound (GH-122233) (#122484)

gh-121650: Encode newlines in headers, and verify headers are sound (GH-122233)

Encode header parts that contain newlines

Verify that email headers are well-formed

(cherry picked from commit 0976339)

Co-authored-by: Petr Viktorin <encukou@gmail.com>
Co-authored-by: Bas Bloemsaat <bas@bloemsaat.org>
Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>

Yhg1s pushed a commit that referenced this pull request

Aug 6, 2024

@encukou

…sound (GH-122233) (#122599)

Co-authored-by: Bas Bloemsaat <bas@bloemsaat.org>
Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>

(cherry picked from commit 0976339)

hroncok pushed a commit to fedora-python/cpython that referenced this pull request

Aug 6, 2024

…s are sound

pythongh-121650: Encode newlines in headers, and verify headers are sound (pythonGH-122233)

Encode header parts that contain newlines

Verify that email headers are well-formed

(cherry picked from commit 0976339)

Co-authored-by: Petr Viktorin <encukou@gmail.com>
Co-authored-by: Bas Bloemsaat <bas@bloemsaat.org>
Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>

frenzymadness pushed a commit to frenzymadness/cpython that referenced this pull request

Aug 13, 2024

…s are sound (pythonGH-122233)

(cherry picked from commit 0976339)

Co-authored-by: Petr Viktorin <encukou@gmail.com>
Co-authored-by: Bas Bloemsaat <bas@bloemsaat.org>
Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>

frenzymadness pushed a commit to fedora-python/cpython that referenced this pull request

Aug 15, 2024

…s are sound (pythonGH-122233)

(cherry picked from commit 0976339)

Co-authored-by: Petr Viktorin <encukou@gmail.com>
Co-authored-by: Bas Bloemsaat <bas@bloemsaat.org>
Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>

stratakis pushed a commit to stratakis/cpython that referenced this pull request

Aug 15, 2024

…s are sound (pythonGH-122233)

(cherry picked from commit 0976339)

Co-authored-by: Petr Viktorin <encukou@gmail.com>
Co-authored-by: Bas Bloemsaat <bas@bloemsaat.org>
Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>

hrnciar added a commit to hrnciar/cpython that referenced this pull request

Aug 16, 2024

@hrnciar @bsiem

headers are sound (pythonGH-122233)

(cherry picked from commit 0976339)

Co-authored-by: Petr Viktorin <encukou@gmail.com>
Co-authored-by: Bas Bloemsaat <bas@bloemsaat.org>
Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>

This patch also contains a modified commit cherry-picked from c5bba85.

That commit was backported to simplify the backport of the other commit, which fixes the CVE. The only modification is the removal of one test case that tests multiple changes from Python 3.7; it did not work properly on Python 3.6, where only one of those changes was backported.

Co-authored-by: bsiem <52461103+bsiem@users.noreply.github.com>

hrnciar added a commit to fedora-python/cpython that referenced this pull request

Aug 16, 2024

headers are sound (pythonGH-122233)

(cherry picked from commit 0976339)

Co-authored-by: Petr Viktorin <encukou@gmail.com>
Co-authored-by: Bas Bloemsaat <bas@bloemsaat.org>
Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
Co-authored-by: bsiem <52461103+bsiem@users.noreply.github.com>

hrnciar added a commit to fedora-python/cpython that referenced this pull request

Aug 20, 2024

headers are sound (pythonGH-122233)

(cherry picked from commit 0976339)

Co-authored-by: Petr Viktorin <encukou@gmail.com>
Co-authored-by: Bas Bloemsaat <bas@bloemsaat.org>
Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
Co-authored-by: bsiem <52461103+bsiem@users.noreply.github.com>

blhsing pushed a commit to blhsing/cpython that referenced this pull request

Aug 22, 2024

…ound (pythonGH-122233)

Encode header parts that contain newlines

Verify that email headers are well-formed

Co-authored-by: Bas Bloemsaat <bas@bloemsaat.org>
Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>

ambv added a commit that referenced this pull request

Sep 4, 2024

…ound (GH-122233) (#122611)

(cherry picked from commit 0976339)

Co-authored-by: Petr Viktorin <encukou@gmail.com>
Co-authored-by: Bas Bloemsaat <bas@bloemsaat.org>
Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>

ambv added a commit that referenced this pull request

Sep 4, 2024

…sound (GH-122233) (#122608)

Verify that email headers are well-formed.

(cherry picked from commit 0976339)

Co-authored-by: Petr Viktorin <encukou@gmail.com>
Co-authored-by: Bas Bloemsaat <bas@bloemsaat.org>
Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>

ambv added a commit that referenced this pull request

Sep 4, 2024

…sound (GH-122233) (#122609)

(cherry picked from commit 0976339)

Co-authored-by: Petr Viktorin <encukou@gmail.com>
Co-authored-by: Bas Bloemsaat <bas@bloemsaat.org>
Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>

ambv added a commit that referenced this pull request

Sep 4, 2024

…ound (GH-122233) (#122610)

(cherry picked from commit 0976339)

Co-authored-by: Petr Viktorin <encukou@gmail.com>
Co-authored-by: Bas Bloemsaat <bas@bloemsaat.org>
Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>

brainhoard-github pushed a commit to distro-core-curated-mirrors/poky-contrib that referenced this pull request

Sep 16, 2024

@anusurivijay @sakoman

hrnciar added a commit to fedora-python/cpython that referenced this pull request

Apr 23, 2025

…s are sound (pythonGH-122233)

(cherry picked from commit 0976339)

Co-authored-by: Petr Viktorin <encukou@gmail.com>
Co-authored-by: Bas Bloemsaat <bas@bloemsaat.org>
Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
Co-authored-by: bsiem <52461103+bsiem@users.noreply.github.com>

hroncok pushed a commit to fedora-python/cpython that referenced this pull request

Jul 4, 2025

…s are sound (pythonGH-122233)

(cherry picked from commit 0976339)

Co-authored-by: Petr Viktorin <encukou@gmail.com>
Co-authored-by: Bas Bloemsaat <bas@bloemsaat.org>
Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
Co-authored-by: bsiem <52461103+bsiem@users.noreply.github.com>

frenzymadness pushed a commit to fedora-python/cpython that referenced this pull request

Aug 12, 2025

…s are sound (pythonGH-122233)

(cherry picked from commit 0976339)

Co-authored-by: Petr Viktorin <encukou@gmail.com>
Co-authored-by: Bas Bloemsaat <bas@bloemsaat.org>
Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
Co-authored-by: bsiem <52461103+bsiem@users.noreply.github.com>

sethmlarson pushed a commit to sethmlarson/cpython that referenced this pull request

Jan 21, 2026

pythonGH-122233 added an implementation to Generator to refuse to serialize (write) headers that are unsafely folded or delimited.

This revision adds the same implementation to BytesGenerator, so it gets the same safety protections for unsafely folded or delimited headers.

Co-authored-by: Denis Ledoux <5822488+beledouxdenis@users.noreply.github.com>
Co-authored-by: Petr Viktorin <302922+encukou@users.noreply.github.com>
Co-authored-by: Bas Bloemsaat <1586868+basbloemsaat@users.noreply.github.com>

sethmlarson pushed a commit to sethmlarson/cpython that referenced this pull request

Jan 21, 2026

Co-authored-by: Denis Ledoux <5822488+beledouxdenis@users.noreply.github.com>
Co-authored-by: Petr Viktorin <302922+encukou@users.noreply.github.com>
Co-authored-by: Bas Bloemsaat <1586868+basbloemsaat@users.noreply.github.com>

sethmlarson pushed a commit to sethmlarson/cpython that referenced this pull request

Jan 21, 2026

Co-authored-by: Denis Ledoux <5822488+beledouxdenis@users.noreply.github.com>
Co-authored-by: Petr Viktorin <302922+encukou@users.noreply.github.com>
Co-authored-by: Bas Bloemsaat <1586868+basbloemsaat@users.noreply.github.com>

hroncok pushed a commit to fedora-python/cpython that referenced this pull request

Feb 3, 2026

…s are sound (pythonGH-122233)

(cherry picked from commit 0976339)

Co-authored-by: Petr Viktorin <encukou@gmail.com>
Co-authored-by: Bas Bloemsaat <bas@bloemsaat.org>
Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
Co-authored-by: bsiem <52461103+bsiem@users.noreply.github.com>
