msg97759 - (view) |
Author: STINNER Victor (vstinner) *  |
Date: 2010-01-14 11:56 |
binascii_b2a_uu() estimates the output string length as 2+bin_len*2. That is almost correct... except for bin_len=1, where the result is a write into unallocated memory:

$ ./python -c "import binascii; binascii.b2a_uu('x')"
Debug memory block at address p=0x87da568: API 'o'
    33 bytes originally requested
    The 3 pad bytes at p-3 are FORBIDDENBYTE, as expected.
    The 4 pad bytes at tail=0x87da589 are not all FORBIDDENBYTE (0xfb):
        at tail+0: 0x0a *** OUCH
        at tail+1: 0xfb
        at tail+2: 0xfb
        at tail+3: 0xfb
    The block was made by call #25195 to debug malloc/realloc.
    Data at p: 00 00 00 00 00 00 00 00 ... 00 00 00 21 3e 20 20 20
Fatal Python error: bad trailing pad byte
Abandon

Actual output length compared with the current estimate, for input lengths 0..9:

>>> [len(binascii.b2a_uu("x"*bin_len)) for bin_len in xrange(10)]
[2, 6, 6, 6, 10, 10, 10, 14, 14, 14]
>>> [(2+bin_len*2) for bin_len in xrange(10)]
[2, 4, 6, 8, 10, 12, 14, 16, 18, 20]

The estimate is large enough for all lengths... except for bin_len=1. And it's oversized for bin_len >= 9.

The exact length is: 2+ceil(bin_len*8/6) <=> 2+(bin_len+5)*8//6 <=> 2+(bin_len+2)*4//3

Example with lengths 0..9:

>>> [len(binascii.b2a_uu("x"*bin_len)) for bin_len in xrange(10)]
[2, 6, 6, 6, 10, 10, 10, 14, 14, 14]
>>> [(2+(bin_len+2)*4//3) for bin_len in xrange(10)]
[4, 6, 7, 8, 10, 11, 12, 14, 15, 16]

The attached patch uses the corrected estimate. |
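The comparison in this report can be reproduced on a current interpreter. What follows is only a minimal sketch, assuming Python 3 (where b2a_uu takes bytes instead of str and xrange is spelled range). The helper names old_estimate, actual_length and underallocated_lengths are made up for illustration and are not part of binascii or of the patch.

```python
import binascii

MAX_UU_LINE = 45  # b2a_uu accepts at most 45 input bytes per call


def old_estimate(bin_len):
    """Buffer size guess quoted in the report: 2 + bin_len*2."""
    return 2 + bin_len * 2


def actual_length(bin_len):
    """Length of the line really produced by binascii.b2a_uu."""
    return len(binascii.b2a_uu(b"x" * bin_len))


def underallocated_lengths(estimate, max_len=MAX_UU_LINE):
    """Input lengths for which the estimate is smaller than the real output."""
    return [n for n in range(max_len + 1) if estimate(n) < actual_length(n)]


if __name__ == "__main__":
    print([actual_length(n) for n in range(10)])
    print([old_estimate(n) for n in range(10)])
    print(underallocated_lengths(old_estimate))
```

Under these assumptions the only length reported as under-allocated is 1, which matches the single overwritten pad byte in the debug dump.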
|
|
msg97760 - (view) |
Author: Antoine Pitrou (pitrou) *  |
Date: 2010-01-14 12:32 |
>>> [len(binascii.b2a_uu("x"*bin_len)) for bin_len in xrange(10)]
[2, 6, 6, 6, 10, 10, 10, 14, 14, 14]
>>> [(2+(bin_len+2)*4//3) for bin_len in xrange(10)]
[4, 6, 7, 8, 10, 11, 12, 14, 15, 16]

How is this the correct estimation? The results are different.
Try the following:

>>> [(2+(bin_len+2)//3*4) for bin_len in xrange(10)]
[2, 6, 6, 6, 10, 10, 10, 14, 14, 14] |
|
|
msg97764 - (view) |
Author: STINNER Victor (vstinner) *  |
Date: 2010-01-14 13:43 |
> How is this the correct estimation? The results are different.

The estimate has to be greater than or equal to the real length, never smaller.

> Try the following:
> >>> [(2+(bin_len+2)//3*4) for bin_len in xrange(10)]
> [2, 6, 6, 6, 10, 10, 10, 14, 14, 14]

Cool, it's not an estimation but the exact result :-) I prefer to leave the resize unchanged. The new patch uses your "estimation" ;-) |
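The distinction between an estimate that is too small, one that is merely oversized, and one that is exact can be checked mechanically. The sketch below assumes Python 3 and bytes input. The classify helper and the FORMULAS table are illustrative names only and do not come from the patch.

```python
import binascii


def classify(estimate, max_len=45):
    """Classify a buffer-size formula against the real b2a_uu output.

    Returns "unsafe" if it is ever too small, "exact" if it always
    matches, and "oversized" if it is never too small but sometimes larger.
    """
    diffs = [
        estimate(n) - len(binascii.b2a_uu(b"x" * n))
        for n in range(max_len + 1)  # 45 bytes is the b2a_uu input limit
    ]
    if any(d < 0 for d in diffs):
        return "unsafe"
    if all(d == 0 for d in diffs):
        return "exact"
    return "oversized"


# The three formulas that appear in this thread.
FORMULAS = {
    "2+bin_len*2": lambda n: 2 + n * 2,
    "2+(bin_len+2)*4//3": lambda n: 2 + (n + 2) * 4 // 3,
    "2+(bin_len+2)//3*4": lambda n: 2 + (n + 2) // 3 * 4,
}

if __name__ == "__main__":
    for name, formula in FORMULAS.items():
        print(name, "->", classify(formula))
```

The last two formulas differ only in operator order: dividing by 3 before multiplying by 4 rounds up to whole 3-byte groups first, which is why it matches the encoder's 4-character output groups exactly.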
|
|
msg97775 - (view) |
Author: Antoine Pitrou (pitrou) *  |
Date: 2010-01-14 17:09 |
The patch doesn't apply cleanly against trunk (due to today's commits I fear, sorry). Also, it would be nice to add a test. |
|
|
msg97777 - (view) |
Author: STINNER Victor (vstinner) *  |
Date: 2010-01-14 17:25 |
> The patch doesn't apply cleanly against trunk

Because of r77497 (issue #770). No problem, here is the new patch. I'm now using a git-svn repository to keep all my patches. It's much easier to update them to trunk ;-)

> Also, it would be nice to add a test.

Done. |
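The test added by the patch is not shown in this thread. A regression check in the same spirit might look like the sketch below. It assumes Python 3, and check_b2a_uu_all_lengths is a hypothetical name, not the test that was committed.

```python
import binascii


def check_b2a_uu_all_lengths(max_len=45):
    """Encode every allowed input length and verify size and round trip."""
    for n in range(max_len + 1):
        data = b"x" * n
        line = binascii.b2a_uu(data)
        # Exact length from the thread: length char + 4 chars per 3 bytes + newline.
        assert len(line) == 2 + (n + 2) // 3 * 4, n
        # Decoding must give back the original bytes.
        assert binascii.a2b_uu(line) == data, n
    return True


if __name__ == "__main__":
    print(check_b2a_uu_all_lengths())
```

A pure-Python check like this only exercises the lengths. The original overflow was detected by the debug memory allocator, so running it on a debug build is what would actually catch a regression.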
|
|
msg97796 - (view) |
Author: Antoine Pitrou (pitrou) *  |
Date: 2010-01-15 00:31 |
Patch committed in r77506, r77507, r77508 and r77509. Thank you! |
|
|