gh-121650: Encode newlines in headers, and verify headers are sound by encukou · Pull Request #122233 · python/cpython (original) (raw)
Re: #121812
Hello @basbloemsaat,
I've spent the day reading through the email module, and RFCs, and I believe I found a better place to fix the issue.
This involved lots of experimentation, so I'm sending an alternative PR rather than a review on yours.
- The generator (writer) verifies that the representation of each header is sound (a parser won't treat it as multiple headers, start-of-body, or part of another header). That should cover custom
fold()
implementations orHeader
subclasses.- However, some user out there is probably misusing such header injection in working code, so, I added a policy attribute to turn it back.
- Newlines are encoded in
fold()
, just like undecodable bytes and other special characters.
Overall, this means that we treat newlines as valid content of headers, but “escape” them when such a header is serialized to text.
This PR is a proof of concept. It needs tests and documentation, but I'm out of time for today, and I wanted to share what I have.
Does this look reasonable to you?