[Tutor] removing line ends from Word text files
Michael Janssen Janssen at rz.uni-frankfurt.de
Sat Jul 17 15:55:50 CEST 2004
On Fri, 9 Jul 2004, Christian Meesters wrote:
> Right now I have the problem that I want to remove the MS Word line end token from text files: When saving a text file as 'text only', line ends are displayed as '^M' in a shell (SGI IRIX (tcsh) and Mac (tcsh or bash)). I want to get rid of these elements for further processing of the file and have no idea how to access them in a Python script. Any idea how to replace the '^M' with a simple '\n'? (I already tried '\r\n' and various other combinations of characters, but apparently none of them is '^M'.) '^M' is one character.
You can always ask Python when you want to know how it will represent this character: read one line with "readline" and print its repr-string:
    # newline="" (Python 3) hands back line endings untranslated
    fo = open("filename", newline="")
    line = fo.readline()
    print(repr(line))
    fo.close()
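repr gives you an alternative string representation of any object. Used on a string, it doesn't output control characters such as newline or carriage return as they are, but shows them as visible backslash sequences like \n or \r. The '^M' your shell displays is the usual way of showing a carriage return (Ctrl-M, character 13), and as you are on a Mac, I would guess your newline character is a simple "\r".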
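To illustrate what repr shows, here is a small sketch. The sample strings and the names `samples` and `reprs` are made up for the illustration and do not come from the original file:

```python
# Hypothetical sample lines, one per common line-ending convention.
samples = {
    "unix": "some text\n",
    "classic mac": "some text\r",
    "dos/windows": "some text\r\n",
}

reprs = {}
for name, line in samples.items():
    # repr shows the line ending as a visible escape sequence
    reprs[name] = repr(line)
    print(name, reprs[name])
```

Whichever escape sequence shows up at the end of your own line is the one you need to replace.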
You can also ask Python for the character's ordinal:

    print(ord(line[-2]))  # just in case one newline consists of two chars
    print(ord(line[-1]))
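As a quick check of what those ordinals mean, a minimal sketch with a made-up line (the string and the name `last_two` are only for illustration):

```python
line = "some text\r\n"  # hypothetical line with a two-character line ending

# Ordinals of the last two characters of the line
last_two = [ord(ch) for ch in line[-2:]]
print(last_two)  # 13 is carriage return (shown as ^M), 10 is line feed
```

If only the last value is 13, the file uses a bare carriage return as its line end.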
It's probably best to do such investigations in an interactive Python session. But I've only now realized that the readline module (the line-editing support, not the file method used above) is Unix-only, so I don't think interactive mode is that much fun on Mac/Windows: without it you can't use the cursor keys to recall and edit earlier commands, and you have to type them again and again. Perhaps IDLE offers elaborate line editing even on those systems.
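Once repr or ord has confirmed that the '^M' really is a carriage return ("\r"), the replacement asked about in the original question could be sketched roughly as follows. The function names `normalize_line_endings` and `convert_file` and the file paths are hypothetical, and the sketch assumes the file is small enough to read into memory at once:

```python
def normalize_line_endings(data):
    # Replace the two-character ending first, so that a carriage return
    # followed by a line feed does not turn into two line feeds.
    return data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")


def convert_file(src_path, dst_path):
    # Binary mode, so Python does not translate line endings on its own.
    with open(src_path, "rb") as src:
        data = src.read()
    with open(dst_path, "wb") as dst:
        dst.write(normalize_line_endings(data))
```

Calling convert_file("word_export.txt", "cleaned.txt") (both names made up) would then write a copy that uses plain "\n" line ends.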
regards
Michael