Issue 1923: meaningful whitespace can be lost in rfc822_escape

Created on 2008-01-24 15:43 by stephenemslie, last changed 2022-04-11 14:56 by admin. This issue is now closed.

Files
File name, uploaded by, date
distutils_metadata_whitespace.diff, stephenemslie, 2008-01-28 11:09
Messages (7)
msg61633 - Author: Stephen Emslie (stephenemslie) Date: 2008-01-24 15:43
distutils.util.rfc822_escape strips each line of its whitespace before indenting, but this can mean losing meaningful whitespace, such as in reStructuredText. distutils uses rfc822_escape to escape fields in metadata, such as PKG-INFO. This unfortunately means that you can't use reStructuredText formatting in your long description (suggested in PEP 345), or are limited to a subset that doesn't require indentation (no block quotes, etc.).

For example:

>>> rest = """
... a literal python block::
...     >>> import this
... """
>>> print distutils.util.rfc822_escape(rest)

        a literal python block::
        >>> import this

I would be expecting this to look something like:

        a literal python block::
            >>> import this

It looks like this behavior was intentionally added in rev 20099, but that was about 7 years ago, before reStructuredText and eggs. I wonder if it makes sense to re-think that implementation with this sort of metadata in mind, assuming this behavior isn't required to be rfc822 compliant. I think it would certainly be a shame to miss out on a good thing like proper (renderable) reST in our metadata.

Is distutils being over-cautious in flattening out all whitespace? A W3C discussion on multiple lines in rfc822 [1] seems to suggest that whitespace can be 'unfolded' safely, so it seems a shame to be throwing it away when it can have important meaning.

[1] http://www.w3.org/Protocols/rfc822/3_Lexical.html
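The difference between the reported behavior and the requested one can be sketched in a few lines. This is a minimal illustration, assuming the pre-fix function strips every line and then joins the lines with a newline plus eight spaces, as the report describes. The names `escape_stripping` and `escape_preserving` are hypothetical and are not part of distutils.

```python
def escape_stripping(header):
    """Sketch of the reported behavior: strip each line, then indent."""
    lines = [line.strip() for line in header.split("\n")]
    sep = "\n" + 8 * " "
    return sep.join(lines)


def escape_preserving(header):
    """Sketch of the requested behavior: indent without stripping."""
    lines = header.split("\n")
    sep = "\n" + 8 * " "
    return sep.join(lines)


rest = "\na literal python block::\n    >>> import this\n"

# The stripping variant flattens the nested line to the same
# indentation as its parent, so the reST literal block is lost.
print(escape_stripping(rest))

# The preserving variant keeps the four extra spaces on top of the
# eight-space continuation indent.
print(escape_preserving(rest))
```

Only lines after the first receive the eight-space prefix, because the first line follows the field name directly in the metadata file.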
msg61653 - Author: Christian Heimes (christian.heimes) * (Python committer) Date: 2008-01-24 20:37
Can you provide a patch with doc updates and a unit test?
msg61775 - Author: Stephen Emslie (stephenemslie) Date: 2008-01-28 11:09
Here's a patch that keeps the whitespace intact, along with a simple test. This doesn't patch docs as the existing documentation_ already describes the long string as multiple lines of "plain text in reStructuredText format", which is what this fixes.

.. _documentation: http://docs.python.org/dev/distutils/setupscript.html#additional-meta-data
msg72104 - Author: Simon Cross (hodgestar) Date: 2008-08-28 18:38
I've just checked that the patch still applies cleanly to 2.6: it does, and the tests still pass. It looks like the patch has already been applied to 3.0 but without the test. The test part of the patch applies cleanly to 3.0 too.
msg72419 - Author: Simon Cross (hodgestar) Date: 2008-09-03 21:22
Poking the issue.
msg95979 - Author: Tarek Ziadé (tarek) * (Python committer) Date: 2009-12-05 02:22
Notice that we are also losing something else that can mean a lot in reST: empty lines. They also need to be escaped. But we can't do it properly unless we encode empty lines with something other than an 8-space line, because when rfc822.Message reads it, it removes it.
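To make the empty-line concern concrete, here is a minimal sketch of what a whitespace-preserving escape produces for a description containing a blank line. The function name `escape_preserving` is hypothetical and only mirrors the join-with-eight-spaces approach discussed in this issue. The sketch does not exercise rfc822.Message itself; that the reader drops such a line is taken from the message above.

```python
def escape_preserving(header):
    """Indent continuation lines by eight spaces without stripping them."""
    sep = "\n" + 8 * " "
    return sep.join(header.split("\n"))


escaped = escape_preserving("first paragraph\n\nsecond paragraph")

# The blank line separating the paragraphs becomes a line holding only
# the eight-space continuation indent, i.e. a whitespace-only line.
for line in escaped.split("\n"):
    print(repr(line))
```

Because the paragraph break is represented by nothing but the indent, a reader that discards whitespace-only continuation lines has no way to restore it, which is why a different encoding for empty lines would be needed.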
msg96022 - Author: Tarek Ziadé (tarek) * (Python committer) Date: 2009-12-06 09:31
I will treat the empty line problem in another issue because I won't apply it in 2.6/3.1. This one is fixed in r76684, r76685, r76686, r76687. Thanks!
History
Date User Action Args
2022-04-11 14:56:30 admin set github: 46217
2009-12-06 09:31:42 tarek set status: open -> closed; messages: +; versions: + Python 2.6, Python 3.1, Python 2.7, Python 3.2, - Python 2.5
2009-12-05 02:22:10 tarek set messages: +
2009-12-04 13:26:24 pitrou set assignee: tarek; nosy: + tarek
2008-09-03 21:22:59 hodgestar set messages: +
2008-08-28 18:38:20 hodgestar set nosy: + hodgestar; messages: +
2008-01-28 11:09:38 stephenemslie set files: + distutils_metadata_whitespace.diff; messages: +
2008-01-24 20:37:59 christian.heimes set priority: low; keywords: + easy; messages: +; nosy: + christian.heimes
2008-01-24 15:43:59 stephenemslie create