Issue 10281: Exception raised when an NNTP overview field is absent (original) (raw)

Created on 2010-11-01 19:58 by jelie, last changed 2022-04-11 14:57 by admin. This issue is now closed.

Files
File name Uploaded Description Edit
nntpover.patch pitrou,2010-11-02 22:52
nntpover2.patch pitrou,2010-11-02 23:19
Messages (12)
msg120158 - (view) Author: Julien ÉLIE (jelie) Date: 2010-11-01 19:58
Following the first example of the documentation: import nntplib s = nntplib.NNTP('news.trigofacile.com') resp, count, first, last, name = s.group('fr.comp.lang.python') print('Group', name, 'has', count, 'articles, range', first, 'to', last) resp, overviews = s.over((last - 9, last)) for id, over in overviews: print(id, nntplib.decode_header(over['subject'])) s.quit() An exception is raised: "OVER/XOVER response doesn't include names of additional headers" I believe the issue comes from the fact that the source code does not handle the case described in Section 8.3.2 of RFC 3977: For all fields, the value is processed by first removing all CRLF pairs (that is, undoing any folding and removing the terminating CRLF) and then replacing each TAB with a single space. If there is no such header in the article, no such metadata item, or no header or item stored in the database for that article, the corresponding field MUST be empty. Example of a successful retrieval of overview information for a range of articles: [C] GROUP misc.test [S] 211 1234 3000234 3002322 misc.test [C] OVER 3000234-3000240 [S] 224 Overview information follows [S] 3000234|I am just a test article "Demo User" <nobody@example.com> 6 Oct 1998 04:38:40 -0500
msg120272 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2010-11-02 22:41
I am wondering how to return the corresponding information. Should the field be totally absent from the returned dictionary, should it map to the empty string, or should it map to None? I'm leaning towards the latter (map to None), but perhaps the empty string is better?
msg120275 - (view) Author: Julien ÉLIE (jelie) Date: 2010-11-02 22:45
The empty string would mean the header exists, and is empty (though not RFC-compliant). For instance: "User-Agent: \r\n" I believe None is better.
msg120278 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2010-11-02 22:52
Here is a patch for returning None on absent fields. (works with trigofacile.com)
msg120282 - (view) Author: Julien ÉLIE (jelie) Date: 2010-11-02 23:05
OK, thanks. By the way, why is the token stripped? token = token[len(h):].lstrip(" ") "X-Header: test \r\n" in an header is kept in the overview as-is. I do not see why " test " should not be the value returned. Also, with: token = token or None "X-Header: \r\n" becomes None if I understand how the source code works... Yet, it is a real '', not None.
msg120283 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2010-11-02 23:08
> OK, thanks. > By the way, why is the token stripped? > token = token[len(h):].lstrip(" ") > > "X-Header: test \r\n" in an header is kept in the overview as-is. > I do not see why " test " should not be the value returned. It's a simple way of handling "Xref: foo" and returning "foo" rather than " foo". If spaces are supposed to be significant I can just strip the first one, though. > Also, with: > token = token or None > > "X-Header: \r\n" becomes None if I understand how the source code > works... Yet, it is a real '', not None. Er, so you're disagreeing with your previous message? Or am I missing something? :)
msg120286 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2010-11-02 23:18
Here is a patch trying to better handle whitespace. Would it be ok for you?
msg120287 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2010-11-02 23:19
Oops, sorry.
msg120299 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2010-11-03 02:00
My conclusion in working on the email package is that only the first space after the ':', if it exists, should be stripped. That is, even though the RFC (for email) reads as if the space after the colon is part of the value, in practice it is part of the delimiter, but is optional (and almost always present, in email). Whether additional leading spaces are significant depends on why they are there. Since they are an unusual case, I would choose to preserve them on the theory that someone might care, and that someone who doesn't care can strip them.
msg120332 - (view) Author: Julien ÉLIE (jelie) Date: 2010-11-03 17:55
> Er, so you're disagreeing with your previous message? > Or am I missing something? :) I was saying that if an empty string is returned, then it means that the header exists and is empty. An example was "User-Agent: \r\n". And my remark "I believe None is better." concerned your initial question "Should the field be totally absent [...]" regarding how to deal with a header that does not exist. Therefore, "User-Agent: \r\n" becomes a real '', not None. None is only when the User-Agent: header field is absent from the headers. > Here is a patch trying to better handle whitespace. > Would it be ok for you? Yes Antoine, thanks!
msg120333 - (view) Author: Julien ÉLIE (jelie) Date: 2010-11-03 18:01
> My conclusion in working on the email package is that only > the first space after the ':', if it exists, should be stripped. > That is, even though the RFC (for email) reads as if the space > after the colon is part of the value, in practice it is part > of the delimiter, but is optional (and almost always present, > in email). That is why the RFC (for netnews) explicitly mentions that the space after the colon is not part of the value. See the grammar for OVER in RFC 3977: hdr-n-content = [(header-name ":" / metadata-name) SP hdr-content] So yes, only the first space should be stripped.
msg120335 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2010-11-03 18:19
Ok, committed in r86139.
History
Date User Action Args
2022-04-11 14:57:08 admin set github: 54490
2010-11-03 18:19:27 pitrou set status: open -> closedresolution: fixedmessages: + stage: resolved
2010-11-03 18:01:08 jelie set messages: +
2010-11-03 17:55:54 jelie set messages: +
2010-11-03 02:30:10 rhettinger set messages: -
2010-11-03 02:13:20 rhettinger set nosy: + rhettingermessages: +
2010-11-03 02:00:49 r.david.murray set nosy: + r.david.murraymessages: +
2010-11-02 23:19:31 pitrou set files: + nntpover2.patchmessages: +
2010-11-02 23:19:20 pitrou set files: - nntpover2.patch
2010-11-02 23🔞46 pitrou set files: + nntpover2.patchmessages: +
2010-11-02 23:08:37 pitrou set messages: +
2010-11-02 23:05:08 jelie set messages: +
2010-11-02 22:52:33 pitrou set files: + nntpover.patchkeywords: + patchmessages: +
2010-11-02 22:45:10 jelie set messages: +
2010-11-02 22:41:23 pitrou set messages: +
2010-11-01 20:02:43 pitrou set nosy: + pitrou
2010-11-01 19:58:22 jelie create