Issue 8702: difflib: unified_diff produces wrong patches (again)

I think difflib is behaving as intended here; changing to feature request.

Could you please clarify where the information loss occurs? I'm not seeing it. As far as I can tell, because unified_diff produces a list of lines rather than a single string (as GNU diff effectively does), all the necessary information about newlines is preserved:

newton:py3k dickinsm$ echo -n "one
two" > 1.txt
newton:py3k dickinsm$ echo -n "one
two
" > 2.txt
newton:py3k dickinsm$ ./python.exe
Python 3.2a0 (py3k:81084:81085M, May 12 2010, 14:16:52)
[GCC 4.2.1 (Apple Inc. build 5659)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> from difflib import unified_diff
[47745 refs]
>>> list(unified_diff(list(open('1.txt')), list(open('2.txt'))))
['--- \n', '+++ \n', '@@ -1,2 +1,2 @@\n', ' one\n', '-two', '+two\n']
[53249 refs]

It looks to me as though the diff picks up the missing newline just fine.

The one problem with the above is that you can't do a ''.join() on it to give a meaningful diff, but I don't see that as a problem with the unified_diff function itself.

I'd be -1 on adding the "\ No newline at end of file" by default, since it complicates the unified_diff format unnecessarily (and would also affect backwards compatibility). I wouldn't have any objections to an extra option for this, though.
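
For what it's worth, something along these lines can already be done outside difflib. This is only a rough sketch: unified_diff_with_eol_marker and NO_EOL are hypothetical names, not part of difflib, and it assumes the default lineterm='\n' and input lines that all end in '\n' except possibly the last line of each file.

```python
from difflib import unified_diff

# Marker line that GNU diff prints after a line lacking a trailing newline.
NO_EOL = '\\ No newline at end of file\n'


def unified_diff_with_eol_marker(a, b, **kwargs):
    """Hypothetical wrapper (not a difflib API).

    Yields the lines from unified_diff; any line without a trailing
    newline is terminated and followed by the NO_EOL marker line.
    Assumes lineterm is left at its default of '\n'.
    """
    for line in unified_diff(a, b, **kwargs):
        if line.endswith('\n'):
            yield line
        else:
            yield line + '\n'
            yield NO_EOL


result = list(unified_diff_with_eol_marker(['one\n', 'two'],
                                           ['one\n', 'two\n']))
print(''.join(result), end='')
```

With this wrapper every emitted line is newline-terminated, so the ''.join() of the result keeps each diff line on its own physical line.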