[Tutor] Re: removing line ends from Word text files

Lee Harr missive at hotmail.com
Fri Jul 16 23:40:18 CEST 2004


Right now I have the problem that I want to remove the MS Word line end token from text files. When I save a file as 'text only', the line ends are displayed as '^M' in a shell (SGI IRIX (tcsh) and Mac (tcsh or bash)). I want to get rid of these characters for further processing of the file and have no idea how to access them in a Python script. Any idea how to replace the '^M' with a simple '\n'? (I already tried '\r\n' and various other combinations of characters, but apparently none of them is '^M'.) '^M' is one character.

As supplementary information: I'm using MacOSX (version 10.3.4) with Python 2.3, or SGI Irix with Python 2.1.1.

See if your systems have a script called dos2unix. That is the one that I always use when I get one of those dossy files.

If you cannot find that, try searching through the lines for characters with codes around 10 to 13 (chr(10), chr(11), chr(12), chr(13)), as I am pretty sure that ^M is somewhere in there. It should be chr(13), the carriage return, which Python writes as '\r'.
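
If dos2unix is not around, something along these lines might do the job from Python. This is only a rough sketch, written for a current Python 3 rather than the 2.1/2.3 versions mentioned above, and the function names normalize_line_endings and convert_file are made up for illustration. It assumes the file uses either '\r\n' or a lone '\r' as its line end; a lone '\r' would explain why searching for '\r\n' found nothing.

```python
def normalize_line_endings(data: bytes) -> bytes:
    """Convert '\\r\\n' and lone '\\r' line ends to '\\n' (hypothetical helper)."""
    # Replace the two-byte DOS form first, so that a '\r\n' pair
    # does not end up as two separate newlines.
    return data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")


def convert_file(src_path: str, dst_path: str) -> None:
    """Read src_path, normalize its line ends, and write the result to dst_path."""
    # Binary mode stops Python from translating line ends by itself.
    with open(src_path, "rb") as src:
        data = src.read()
    with open(dst_path, "wb") as dst:
        dst.write(normalize_line_endings(data))
```

Calling convert_file("word_export.txt", "clean.txt") (both file names are placeholders) should leave a copy with plain '\n' line ends and no '^M' showing up in the shell.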

