Issue 1210: imaplib does not run under Python 3

Created on 2007-09-27 05:49 by rtmq, last changed 2022-04-11 14:56 by admin. This issue is now closed.

Files
File name Uploaded Description
imaplib_bytes-4.patch vstinner, 2008-11-04 18:34
Messages (26)
msg56154 - (view) Author: Robert T McQuaid (rtmq) Date: 2007-09-27 05:49
imaplib does not run under Python 3. The following two-line Python program, named testimap.py, works when run from a Windows XP system shell prompt using Python 2.5.1, but fails with Python 3.0. It appears that the logic does not follow the distinction between characters and bytes in Python 3.

    import imaplib
    mail=imaplib.IMAP4("mail.rtmq.infosathse.com")

    e:\python25\python testimap.py
    e:\python30\python testimap.py 2>f:syserr

The last line produced the trace:

    Traceback (most recent call last):
      File "testimap.py", line 10, in <module>
        mail=imaplib.IMAP4("mail.rtmq.infosathse.com")
      File "e:\python30\lib\imaplib.py", line 184, in __init__
        self.welcome = self._get_response()
      File "e:\python30\lib\imaplib.py", line 962, in _get_response
        self._append_untagged(typ, dat)
      File "e:\python30\lib\imaplib.py", line 800, in _append_untagged
        if typ in ur:
    TypeError: unhashable type: 'bytes'
msg56156 - (view) Author: Martin v. Löwis (loewis) * (Python committer) Date: 2007-09-27 06:10
Would you like to work on a patch?
msg56163 - (view) Author: Raghuram Devarakonda (draghuram) (Python triager) Date: 2007-09-27 14:39
Just to further understand the issue, I added "imaplib.Debug=5" and here is the output preceding the exception stack trace (I replaced the real IMAP server name):

    ***************
    20:19.52 imaplib version 2.58
    20:19.52 new IMAP4 connection, tag=LOLD
    20:19.52 < * OK Microsoft Exchange Server 2003 IMAP4rev1 server version 6.5.7638.1 (imapserver.com) ready.
    20:19.52 matched r'\* (?P<type>[A-Z-]+)( (?P<data>.*))?' => (b'OK', b' Microsoft Exchange Server 2003 IMAP4rev1 server version 6.5.7638.1 (imapserver.com) ready.', b'Microsoft Exchange Server 2003 IMAP4rev1 server version 6.5.7638.1 (imapserver.com) ready.')
    ***************

So it appears that the response is of type "bytes", which in turn is due to reading the socket in binary mode (self.file = self.sock.makefile('rb')). I would like to see how the problem can be fixed, but any pointers are appreciated.
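To illustrate the observation above, here is a minimal sketch (run against a current Python 3, not the 2007 py3k snapshot discussed in this thread) showing that a file object created with makefile('rb') yields bytes. The helper name read_greeting_line and the greeting text are made up for the example, and a local socket pair stands in for a real IMAP server.

```python
import socket


def read_greeting_line(greeting: bytes) -> bytes:
    """Send `greeting` through a local socket pair and read it back
    through a binary file object, the way imaplib reads the server greeting."""
    server, client = socket.socketpair()
    try:
        server.sendall(greeting)
        server.close()  # signal EOF to the reading side
        with client.makefile('rb') as f:
            return f.readline()
    finally:
        server.close()  # harmless if already closed
        client.close()


line = read_greeting_line(b'* OK server ready.\r\n')
print(type(line), line)
```

Anything parsed out of such a line (the response type, the data) is therefore bytes as well, unless it is decoded explicitly.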
msg56193 - (view) Author: Raghuram Devarakonda (draghuram) (Python triager) Date: 2007-09-28 18:41
I have gone through the python-3000 discussions about similar problems in other stdlib modules (email, imghdr, sndhdr etc) and found PEP 3137 (Immutable Bytes and Mutable Buffer). Since that work is in progress, I don't think it is worthwhile to fix this problem at this point.
msg57242 - (view) Author: Christian Heimes (christian.heimes) * (Python committer) Date: 2007-11-08 13:53
The transition is done. Can you work on a patch and maybe add some tests, too? It helps when you start Python with the -bb flag:

    $ ./python -bb -c 'import imaplib; imaplib.Debug=5; imaplib.IMAP4("mail.rtmq.infosathse.com")'
    52:01.86 imaplib version 2.58
    52:01.86 new IMAP4 connection, tag=PNFO
    Traceback (most recent call last):
      File "<string>", line 1, in <module>
      File "/home/heimes/dev/python/py3k/Lib/imaplib.py", line 184, in __init__
        self.welcome = self._get_response()
      File "/home/heimes/dev/python/py3k/Lib/imaplib.py", line 907, in _get_response
        resp = self._get_line()
      File "/home/heimes/dev/python/py3k/Lib/imaplib.py", line 1009, in _get_line
        self._mesg('< %s' % line)
      File "/home/heimes/dev/python/py3k/Lib/warnings.py", line 62, in warn
        globals)
      File "/home/heimes/dev/python/py3k/Lib/warnings.py", line 102, in warn_explicit
        raise message
    BytesWarning: str() on a bytes instance
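As a rough sketch of what the -bb flag does, the snippet below starts a child interpreter with -bb and formats a bytes object with '%s', much like the _mesg() call in the trace. The helper name run_with_bb is made up for this example, and it assumes the running interpreter still accepts the -b option.

```python
import subprocess
import sys


def run_with_bb(snippet: str) -> subprocess.CompletedProcess:
    """Run `snippet` in a child interpreter started with -bb, so that
    implicit str() conversions of bytes raise BytesWarning as an error."""
    return subprocess.run(
        [sys.executable, '-bb', '-c', snippet],
        capture_output=True,
        text=True,
    )


# '%s' formatting calls str() on the bytes object, which -bb turns into an error.
result = run_with_bb("line = b'* OK ready'; print('< %s' % line)")
print(result.returncode)
print(result.stderr.strip().splitlines()[-1])
```

Without -bb the same snippet silently prints the repr of the bytes object, which is how such mix-ups go unnoticed.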
msg57254 - (view) Author: Raghuram Devarakonda (draghuram) (Python triager) Date: 2007-11-08 14:59
I will see what I can do but it may take a while.
msg57430 - (view) Author: Raghuram Devarakonda (draghuram) (Python triager) Date: 2007-11-12 21:42
    Index: Lib/imaplib.py
    ===================================================================
    --- Lib/imaplib.py (revision 58956)
    +++ Lib/imaplib.py (working copy)
    @@ -228,7 +228,7 @@
             self.port = port
             self.sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
             self.sock.connect((host, port))
    -        self.file = self.sock.makefile('rb')
    +        self.file = self.sock.makefile('r', encoding='ASCII', newline='')

         def read(self, size):
    -------------

This patch fixes the issue, but I am not entirely sure that it is correct. I quickly looked at the IMAP RFC and there does seem to be a spec for CHARSET, in which case that will have to be used instead of ASCII. It requires more research and IMAP knowledge, which I can't claim. As for the tests, we need an IMAP server to connect to. Perhaps Google wouldn't mind being used for this purpose?
msg59609 - (view) Author: Jean-Paul Calderone (exarkun) * (Python committer) Date: 2008-01-09 16:17
You're correct in pointing out that IMAP4 supports arbitrary encodings, so simply hard-coding ASCII is not correct. The encoding isn't connection-level, but applies to particular sequences of bytes in the connection stream. To correctly interpret the bytes as characters, decoding must be integrated with the rest of the protocol implementation.
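A small sketch of why a single connection-level encoding is fragile: wrapping the byte stream in a text layer with a fixed ASCII codec fails as soon as a message contains non-ASCII bytes. The helper name decode_stream and the sample header are invented for the illustration; io.BytesIO stands in for the socket.

```python
import io


def decode_stream(raw: bytes, encoding: str) -> str:
    """Read `raw` through a text wrapper with one fixed encoding,
    similar in spirit to makefile('r', encoding=...)."""
    wrapper = io.TextIOWrapper(io.BytesIO(raw), encoding=encoding, newline='')
    return wrapper.read()


# A header line containing one non-ASCII character, encoded as UTF-8.
raw = 'Subject: café\r\n'.encode('utf-8')

try:
    decode_stream(raw, 'ascii')
    ascii_ok = True
except UnicodeDecodeError:
    ascii_ok = False

utf8_text = decode_stream(raw, 'utf-8')
print(ascii_ok, repr(utf8_text))
```

Picking UTF-8 instead only moves the problem, since another message on the same connection may use a different charset.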
msg61918 - (view) Author: Bill Janssen (janssen) * (Python committer) Date: 2008-01-31 18:03
IMAP doesn't really support multiple charsets (just looked at RFC 3501). There are two places where character sets other than ASCII are used. One is in the SEARCH command; there's an optional parameter which can indicate that the search strings are in a non-ASCII character set. The other is in transmission of message literals (email messages) back and forth. So setting the default encoding at this level probably isn't quite right, as you should definitely be reading raw bytes from the socket, not characters, but it isn't too far off. It looks like _command() needs a bit of work (it shouldn't try to quote bytes, only strings), and the documentation needs to be improved to say that non-ASCII search strings and message bodies should be passed as bytes encoded according to the specified CHARSET, but with those fixes it should work, assuming that bytes are hashable in Python 3K.
msg71894 - (view) Author: Neal Norwitz (nnorwitz) * (Python committer) Date: 2008-08-24 22:22
Is this still a problem?
msg71989 - (view) Author: Ismail Donmez (donmez) * Date: 2008-08-26 17:50
Still fails with beta2:

    >>> import imaplib
    >>> mail=imaplib.IMAP4("mail.rtmq.infosathse.com")
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/usr/local/lib/python3.0/imaplib.py", line 185, in __init__
        self.welcome = self._get_response()
      File "/usr/local/lib/python3.0/imaplib.py", line 912, in _get_response
        if self._match(self.tagre, resp):
      File "/usr/local/lib/python3.0/imaplib.py", line 1021, in _match
        self.mo = cre.match(s)
    TypeError: can't use a string pattern on a bytes-like object
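The TypeError above can be reproduced in isolation. The sketch below uses an illustrative untagged-response pattern (an assumption for the example, not necessarily the exact one imaplib compiled at the time), once as a str pattern and once as a bytes pattern, against a bytes greeting.

```python
import re

greeting = b'* OK IMAP4rev1 server ready.'

# The same expression compiled from a str and from a bytes literal.
str_pattern = re.compile(r'\* (?P<type>[A-Z-]+)( (?P<data>.*))?')
bytes_pattern = re.compile(br'\* (?P<type>[A-Z-]+)( (?P<data>.*))?')

try:
    str_pattern.match(greeting)
    str_pattern_failed = False
except TypeError:
    # A str pattern cannot be applied to a bytes object.
    str_pattern_failed = True

match = bytes_pattern.match(greeting)
print(str_pattern_failed, match.group('type'), match.group('data'))
```

If the socket is read in binary mode, the patterns have to be bytes patterns too, and the matched groups come back as bytes.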
msg71992 - (view) Author: Neal Norwitz (nnorwitz) * (Python committer) Date: 2008-08-26 18:37
This may not be a real release blocker, but I want to raise the priority. It is a regression and we should try to fix it, especially if it's easy.
msg72459 - (view) Author: Barry A. Warsaw (barry) * (Python committer) Date: 2008-09-04 02:12
This should be fixed but it's not a release blocker.
msg72479 - (view) Author: Bill Janssen (janssen) * (Python committer) Date: 2008-09-04 04:58
Take a look at the thread here: http://mailman2.u.washington.edu/mailman/htdig/imap-protocol/2008-February/000811.html I think the summary is, arbitrary bytes may occur in some places, but they're likely to be UTF-8. Otherwise, it's mainly ASCII, but purposely left vague to see what convention developed.
msg74731 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2008-10-14 11:27
Here is a patch for imaplib:
 - add an encoding attribute to the IMAP4 class (as in ftplib; see also issue 3727 for my poplib patch)
 - use makefile('r', encoding=self.encoding) instead of a binary file (mode='rb')
 - remove duplicate code in IMAP4_SSL

I chose ISO-8859-1 as the default charset. I tested the library on my local IMAP4 server using the IMAP4 and IMAP4_SSL classes. But the library needs more unit tests, as was done for poplib.
msg74752 - (view) Author: Bill Janssen (janssen) * (Python committer) Date: 2008-10-14 15:57
Victor, what kind of content have you tried this with? For instance, have you passed unencoded (Content-Transfer-Encoding: binary) binary data through it, by mailing a JPEG, for instance? These things are strings really only at the application level; the data is still bytes. In addition, the use of Latin-1 goes against the explicit directives of the IMAP group, doesn't it? They're pushing UTF-8. Bill
msg74760 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2008-10-14 18:14
IMAP4_stream is also broken because it uses os.popen2(), which has long been deprecated and is now replaced by subprocess. Here is a patch that replaces os.popen2() with subprocess, and also uses transparent conversion from/to unicode via io.TextIOWrapper().
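For readers unfamiliar with the replacement, here is a hedged sketch of getting a write/read pipe pair from subprocess.Popen, which is the role os.popen2() used to play. The helper name open_stream is made up, and a tiny Python child that echoes one line stands in for a real IMAP command such as an rsh or ssh invocation; this is not the code from the patch.

```python
import subprocess
import sys


def open_stream(argv):
    """Start `argv` and return (process, writefile, readfile), both pipes binary."""
    proc = subprocess.Popen(argv, stdin=subprocess.PIPE, stdout=subprocess.PIPE)
    return proc, proc.stdin, proc.stdout


# Stand-in child process: echo a single line from stdin to stdout.
child = [
    sys.executable,
    '-c',
    'import sys; sys.stdout.buffer.write(sys.stdin.buffer.readline())',
]

proc, writefile, readfile = open_stream(child)
writefile.write(b'A001 NOOP\r\n')
writefile.close()          # flush and signal EOF to the child
echoed = readfile.readline()
readfile.close()
proc.wait()
print(echoed)
```

Both pipes are binary here; wrapping them in io.TextIOWrapper would add the transparent decoding described in the message.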
msg74761 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2008-10-14 18:21
> what kind of content have you tried this with? I only tried the most basic commands like capability(). I retried with search() and... hey, search() has a charset argument!? It should reuse self.encoding. Same for sort(). Then I tried to get the content of an email but fetch(num, '(RFC822)') fails with "imaplib.abort: command: FETCH => unexpected response: 'Return-Path: <example@example.com'". RFC822 is not supported by imaplib? The test also fails with Python 2.5.
msg74767 - (view) Author: Bill Janssen (janssen) * (Python committer) Date: 2008-10-14 19:31
Maybe the first thing to do is to expand the Lib/test/test_imaplib.py file, which right now is pretty darn minimal. We really need an IMAP server somewhere to test against, with a standard library of varied messages. Perhaps Python.org is running an IMAP server?
msg74775 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2008-10-14 22:14
The server can send raw 8-bit email in any charset (the charset is specified in the email headers). That's why I think that it's better to keep bytes instead of converting to unicode with a fixed charset. Each email can use a different charset.

Types used in my new patch:
 - unicode:
   * IMAP commands (charset=ASCII)
   * untagged_responses keys (charset=ASCII)
 - bytes:
   * answer
   * regex
   * tagre attribute
   * untagged_responses values

I chose to keep unicode for some variables to minimize the changes in the imaplib library and to keep the code readable.

Patch TODO:
 - Remove the asserts (added for quicker debugging)
 - Test more functions
 - Restore _checkquote() in the _command() method, or use _quote()/_checkquote() in the methods which need it. login() already quotes the password (but why not the login?)

I also wrote a patch for a "pure bytes string" version, but the patch is complex and long, and the resulting module source code is hard to read.
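The split described in this message (ASCII text for commands and response keys, bytes for everything read from the server) could look roughly like the sketch below. The names UNTAGGED, append_untagged and encode_command are hypothetical helpers written for this illustration; they are not taken from the patch or from imaplib.

```python
import re

# Bytes pattern, because lines read from the socket are bytes.
UNTAGGED = re.compile(br'\* (?P<type>[A-Z-]+)( (?P<data>.*))?')


def append_untagged(responses: dict, line: bytes) -> None:
    """Store an untagged response: str key (ASCII), bytes value."""
    mo = UNTAGGED.match(line)
    if mo is None:
        raise ValueError('not an untagged response: %r' % (line,))
    typ = mo.group('type').decode('ascii')   # key becomes text
    dat = mo.group('data') or b''            # value stays bytes
    responses.setdefault(typ, []).append(dat)


def encode_command(tag: str, name: str, *args: str) -> bytes:
    """Build a command line from text parts and encode it as ASCII."""
    return ' '.join((tag, name) + args).encode('ascii') + b'\r\n'


responses = {}
append_untagged(responses, b'* OK server ready')
append_untagged(responses, b'* CAPABILITY IMAP4rev1')
print(responses)
print(encode_command('A001', 'SELECT', 'INBOX'))
```

Keeping the keys as text lets calling code keep writing responses['OK'], while message data is never decoded with a guessed charset.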
msg74778 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2008-10-14 22:34
New version of my bytes patch:
 - fix IMAP4_stream: use subprocess.Popen() as in my previous imaplib_stream.patch, but use bytes instead of characters
 - fix IMAP4_SSL: sslobj wasn't set in IMAP4_SSL.open() but was used, for example, in the read() method; remove the duplicate method (simplifies the code)
 - IMAP4.read(): call file.read() multiple times if the result is smaller than size (needed especially for the SSL version); FIXME: does this function raise an error on EOF or just loop forever? Should we stop the loop if data is b''?
msg74779 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2008-10-14 22:43
Oops, my previous patch didn't include changes to the documentation. New patch changes:
 - fix the documentation: os.popen2() => subprocess.Popen(); no more ssl() method: use socket()
 - use a buffer of 4096 bytes in the read() method (as suggested in the socket documentation)
 - break the read() loop if read() returns an empty bytes string
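The read() behaviour described in the last two messages (repeat short reads, cap each request at 4096 bytes, stop on an empty result) can be sketched as follows. The function read_exact and the TrickleReader test double are invented for this illustration and are not the code from the patch.

```python
import io


def read_exact(fileobj, size, chunk_size=4096):
    """Read up to `size` bytes, calling fileobj.read() repeatedly because a
    single call may return fewer bytes than requested. Stops early on EOF."""
    chunks = []
    remaining = size
    while remaining > 0:
        data = fileobj.read(min(remaining, chunk_size))
        if not data:
            break  # empty bytes means EOF: do not loop forever
        chunks.append(data)
        remaining -= len(data)
    return b''.join(chunks)


class TrickleReader:
    """File-like object that returns at most `step` bytes per read() call,
    imitating short reads from an SSL or network stream."""

    def __init__(self, payload, step):
        self._buf = io.BytesIO(payload)
        self._step = step

    def read(self, n):
        return self._buf.read(min(n, self._step))


print(read_exact(TrickleReader(b'0123456789', 3), 8))
print(read_exact(TrickleReader(b'abc', 2), 10))
```

The second call shows the EOF case: the caller gets the three available bytes instead of blocking on the remaining seven.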
msg75282 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2008-10-28 15:02
Can anyone review my last patch (imaplib_bytes-3.patch)?
msg75479 - (view) Author: Barry A. Warsaw (barry) * (Python committer) Date: 2008-11-03 23:58
The assertion on line 813 is indented incorrectly. Please fix that. I'm concerned we really need better test coverage for this code, but it's doubtful we'll get that before 3.0 final is released. I think this is the best we're going to do, and nothing else about the code jumps out at me. Go ahead and land it after that minor fix.
msg75501 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2008-11-04 18:34
On Tuesday 04 November 2008 00:59:02, Barry A. Warsaw wrote:
> The assertion on line 813 is indented incorrectly. Please fix that.

Oops. I'm using the following command because my editor is configured to remove trailing spaces:

    svn diff --diff-cmd="/usr/bin/diff" -x "-ub"

Line 813 was an assertion. I added many assertions to check types (for easier debugging), but they are not needed anymore (my code is bug-free, haha, no, it's a joke). The new attached patch has no more assertions.
msg75527 - (view) Author: Christian Heimes (christian.heimes) * (Python committer) Date: 2008-11-05 19:40
Committed in r67107
History
Date User Action Args
2022-04-11 14:56:27 admin set github: 45551
2008-11-05 19:40:11 christian.heimes set status: open -> closed; resolution: accepted -> fixed; messages: +
2008-11-04 18:34:03 vstinner set files: + imaplib_bytes-4.patch; messages: +
2008-11-04 18:30:11 draghuram set nosy: - draghuram
2008-11-04 18:29:07 exarkun set nosy: - exarkun
2008-11-04 18:27:16 vstinner set files: - imaplib_bytes-3.patch
2008-11-04 02:41:35 benjamin.peterson set assignee: benjamin.peterson; nosy: + benjamin.peterson
2008-11-03 23:59:01 barry set keywords: - needs review; messages: +
2008-10-28 15:02:02 vstinner set keywords: + needs review; messages: +
2008-10-14 22:43:31 vstinner set files: + imaplib_bytes-3.patch; messages: +
2008-10-14 22:41:32 vstinner set files: - imaplib_bytes-2.patch
2008-10-14 22:34:57 vstinner set files: - imaplib_bytes.patch
2008-10-14 22:34:51 vstinner set files: + imaplib_bytes-2.patch; messages: +
2008-10-14 22:20:33 vstinner set files: - imaplib_stream.patch
2008-10-14 22:14:05 vstinner set files: + imaplib_bytes.patch; messages: +
2008-10-14 21:55:01 vstinner set files: - imaplib_unicode.patch
2008-10-14 19:31:11 janssen set messages: +
2008-10-14 18:21:13 vstinner set messages: +
2008-10-14 18:14:21 vstinner set files: + imaplib_stream.patch; messages: +
2008-10-14 17:36:08 vstinner set files: - unnamed
2008-10-14 15:57:11 janssen set files: + unnamed; messages: +
2008-10-14 11:27:46 vstinner set files: + imaplib_unicode.patch; nosy: + vstinner; messages: +; keywords: + patch
2008-10-02 12:54:03 barry set priority: deferred blocker -> release blocker
2008-09-26 22:18:07 barry set priority: release blocker -> deferred blocker
2008-09-18 05:42:32 barry set priority: deferred blocker -> release blocker
2008-09-04 04:58:21 janssen set messages: +
2008-09-04 02:12:11 barry set priority: release blocker -> deferred blocker; nosy: + barry; messages: +
2008-08-26 18:37:30 nnorwitz set priority: normal -> release blocker; messages: +
2008-08-26 17:50:39 donmez set nosy: + donmez; messages: +
2008-08-24 22:22:34 nnorwitz set nosy: + nnorwitz; type: crash -> behavior; messages: +
2008-01-31 18:03:17 janssen set nosy: + janssen; messages: +
2008-01-09 16:17:32 exarkun set nosy: + exarkun; messages: +
2008-01-06 22:29:45 admin set keywords: - py3k; versions: Python 3.0
2007-11-12 21:42:33 draghuram set messages: +
2007-11-08 14:59:57 draghuram set messages: +
2007-11-08 13:53:28 christian.heimes set nosy: + christian.heimes; messages: +
2007-11-04 13:49:32 christian.heimes set priority: normal; keywords: + py3k; resolution: accepted
2007-09-28 18:41:35 draghuram set messages: +
2007-09-27 14:39:47 draghuram set messages: +
2007-09-27 14:22:43 draghuram set nosy: + draghuram
2007-09-27 06:10:00 loewis set nosy: + loewis; messages: +
2007-09-27 05:49:34 rtmq create