msg278820 - (view) |
Author: Константин Волков (Константин Волков) |
Date: 2016-10-17 17:08 |
There is a strange thing with long headers when they are serialized: the value gets a "\n" prefix. This example fails on Python 3.4/3.5:

    from email.message import Message
    from email import message_from_bytes

    x = '<147672320775.19544.6718708004153358411@mkren-spb.root.devdomain.local>'
    header = 'Message-ID'
    msg = Message()
    msg[header] = x
    data = msg.as_bytes()
    msg2 = message_from_bytes(data)
    print(x)
    print(msg2[header])
    assert msg2[header] == x

The Message-ID was generated by the email.utils.make_msgid function. |
|
|
msg278821 - (view) |
Author: Константин Волков (Константин Волков) |
Date: 2016-10-17 17:10 |
Something went wrong with the copy-paste. The value should be: x = '<147672320775.19544.6718708004153358411@mkren-spb.root.devdomain.local>' |
|
|
msg278822 - (view) |
Author: Константин Волков (Константин Волков) |
Date: 2016-10-17 17:12 |
Something goes wrong when inserting long strings here: they get duplicated for some reason. Adding the example as an attachment. |
|
|
msg278823 - (view) |
Author: R. David Murray (r.david.murray) *  |
Date: 2016-10-17 17:22 |
Ah, interesting case. Both the old folder/parser and the new folder/parser fail, in slightly different ways. I'll have to add this test case to the tests as I finish rewriting the folder. Thanks for the report. |
|
|
msg278894 - (view) |
Author: Mariusz Masztalerczuk (mmasztalerczuk) * |
Date: 2016-10-18 15:57 |
I think this is not a bug; it is just the RFC ;) According to https://www.ietf.org/rfc/rfc2822.txt: A message consists of header fields, optionally followed by a message body. Lines in a message MUST be a maximum of 998 characters excluding the CRLF, but it is RECOMMENDED that lines be limited to 78 characters excluding the CRLF. Because your line (header name plus value) is longer than 78 characters, Python tries to break it into two lines. Maybe there should be an option to raise this limit above 78? (The maximum is 998 according to the RFC.) |
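To illustrate the length argument, here is a minimal sketch that measures the unfolded header line from the report. The 78-character figure is the recommendation quoted above; the names `unfolded` and `RECOMMENDED_LIMIT` are only illustrative and do not come from the email package.

```python
# Measure the unfolded "Message-ID" header line from the original report.
header = 'Message-ID'
x = '<147672320775.19544.6718708004153358411@mkren-spb.root.devdomain.local>'

unfolded = '{}: {}'.format(header, x)
RECOMMENDED_LIMIT = 78  # recommended line length, excluding CRLF

# The value contains no whitespace, so the only place a folder can break
# the line is right after the "Message-ID:" label.
exceeds_limit = len(unfolded) > RECOMMENDED_LIMIT
print(len(unfolded), exceeds_limit)
```

The line is over the recommended limit but far below the 998-character maximum, which is why the folder moves the whole value to a continuation line.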
|
|
msg278904 - (view) |
Author: Константин Волков (Константин Волков) |
Date: 2016-10-18 16:22 |
But the Message-ID has its own syntax in https://www.ietf.org/rfc/rfc2822.txt:

3.6.4. Identification fields

    message-id = "Message-ID:" msg-id CRLF
    msg-id = [CFWS] "<" id-left "@" id-right ">" [CFWS]

3.2.3. Folding white space and comments

    However, where CFWS occurs in this standard, it MUST NOT be inserted in such a way that any line of a folded header field is made up entirely of WSP characters and nothing else.

It's not obvious, but it seems that there must be no CRLF before the Message-ID. |
|
|
msg278909 - (view) |
Author: R. David Murray (r.david.murray) *  |
Date: 2016-10-18 16:49 |
It is a bug, but it is not a bug that the message-id body gets put on a second line. The old (compat32) folder introduces an extra space while folding, which then gets preserved when the re-parsing is done. The new folder (policy=default) folds correctly (putting the id on a separate line), but the parser fails to remove the leading blank from the value when it is parsed. It should remove the leading blank because that blank "belongs" to the header label (the "Message-Id:" part). The RFC caution about whitespace-only lines applies to whole lines; the first line in the present example is not blank because it has the header label on it. I also need to add a test with a Message-Id that is in itself longer than 77 characters. Such a header can't be folded, so it will have to be emitted with a length longer than the default. (And yes, the default can be changed to any value you like, see Policy.max_line_length). |
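As a possible workaround until the folder and parser are fixed, the line-length limit can be raised so that the header is never folded. This is only a sketch: the value 200 and the names `wide` and `parsed` are arbitrary choices for illustration, and it uses the compat32 policy because that is what Message() uses by default.

```python
from email import message_from_bytes
from email.message import Message
from email.policy import compat32

x = '<147672320775.19544.6718708004153358411@mkren-spb.root.devdomain.local>'

msg = Message()  # Message() uses the compat32 policy by default
msg['Message-ID'] = x

# Assumption: 200 is just "wide enough" for this header; any limit longer
# than the unfolded line avoids the fold.
wide = compat32.clone(max_line_length=200)
data = msg.as_bytes(policy=wide)

# Parse the serialized bytes back and compare with the original value.
parsed = message_from_bytes(data)
roundtrip_ok = parsed['Message-ID'] == x
print(data)
print(roundtrip_ok)
```

With the wider limit the label and the id stay on one line, so no leading whitespace is introduced on the round trip.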
|
|