Issue 6409: 2to3 -j 4 generates malformed diffs (original) (raw)

Created on 2009-07-03 15:53 by exarkun, last changed 2022-04-11 14:56 by admin. This issue is now closed.

Files
File name Uploaded Description Edit
2to3.patch exarkun,2009-07-03 15:52
output_lock.diff gboutsioukis,2010-04-12 11:49
Messages (6)
msg90064 - (view) Author: Jean-Paul Calderone (exarkun) * (Python committer) Date: 2009-07-03 15:52
I ran 2to3 over Twisted r27084 like this: time ~/Projects/python/trunk/python /home/exarkun/Projects/python/trunk/Tools/scripts/2to3 -j 4 twisted/ > 2to3.patch Visual inspection revealed some curious defects, and attempting to apply it failed like this: $ patch -p0 < 2to3.patch patching file twisted/application/app.py patching file twisted/conch/client/connect.py patching file twisted/conch/client/default.py patch: **** malformed patch at line 129: @@ -1,6 +1,6 @@ The resulting file is attached.
msg90065 - (view) Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * (Python committer) Date: 2009-07-03 16:01
You seem to use the 'parallel' option. Some flush() or synchronization missing?
msg98590 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2010-01-31 02:05
I'm able to reproduce the bug. The problem is that "-j 4" option creates 4 working processes, and they are all writing to stdout at the same time. Main process: main() => refactor() => refactor_file() sends tasks to child processes Children: _child() => queue.get ~> refactor_file() => processed_file() => print_output() writes the diff to stdout using print() A solution would be to use a lock to ensure that only one process is writing to stdout at the same time... But we may loose all advantages of using different Python processes, it's a new kind of GIL :) A better solution would be to send output (the whole diff for one file) to the main process which will be the only process writing the stdout. I don't know multiprocessing enough to propose a patch. Note: flush() in each child process doesn't ensure that no process is writing to stdout at the same time.
msg102913 - (view) Author: George Boutsioukis (gboutsioukis) (Python committer) Date: 2010-04-12 02:00
Flushing stdout is still necessary, though not enough. The processes will still have to use some kind of synchronization, and the performance toll of adding a lock to synchronize output is negligible, given that printing to stdout takes a tiny amount of time compared to the rest of the 2to3 code. This patch seems to fix the issue without affecting performance.
msg102938 - (view) Author: George Boutsioukis (gboutsioukis) (Python committer) Date: 2010-04-12 11:49
I updated the patch to keep the new code as local as possible.
msg102974 - (view) Author: Benjamin Peterson (benjamin.peterson) * (Python committer) Date: 2010-04-12 21:12
Fixed in r80018. Thanks for the patch.
History
Date User Action Args
2022-04-11 14:56:50 admin set github: 50658
2010-04-12 21:12:27 benjamin.peterson set status: open -> closedresolution: fixedmessages: +
2010-04-12 11:49:08 gboutsioukis set files: + output_lock.diffmessages: +
2010-04-12 11:38:27 gboutsioukis set files: - output_lock.diff
2010-04-12 02:00:31 gboutsioukis set files: + output_lock.diffnosy: + gboutsioukismessages: +
2010-04-10 22:16:06 loewis set components: + 2to3 (2.x to 3.x conversion tool), - Library (Lib)
2010-04-03 15:46:06 vstinner set title: 2to3 generates malformed diffs -> 2to3 -j 4 generates malformed diffs
2010-01-31 02:05:44 vstinner set nosy: + vstinnermessages: +
2009-07-04 03:05:13 benjamin.peterson set assignee: benjamin.petersonnosy: + benjamin.peterson
2009-07-03 16:01:06 amaury.forgeotdarc set nosy: + amaury.forgeotdarcmessages: +
2009-07-03 15:53:44 exarkun create