Issue 984714: unknown parsing error

Created on 2004-07-03 19:39 by gazum, last changed 2022-04-11 14:56 by admin. This issue is now closed.

Files
File name: coding.patch
Uploaded: nnorwitz, 2004-07-21 03:16
Description: patch 1 to address missing break, more strict -*- coding check
Messages (3)
msg21389 - Author: Igor Sidorenkov (gazum) Date: 2004-07-03 19:39
I am getting "unknown parsing error" when trying to run a script with the following first line: #@+leo-encoding=cp1251. If I add a couple of empty lines or # -*- coding: cp1251 -*- then everything is OK. I am using ActiveState Python 2.3.3 on a Win2K server.

---------- Python ----------
error=22
  File "test.py", line 1
SyntaxError: unknown parsing error

Output completed (0 sec consumed) - Normal Termination
------------------------------

#@+leo-encoding=cp1251.
#@+node:0::@file test.py
#@+body
for i in range(5):
    print i
#@-body
#@-node:0::@file test.py
#@-leo
msg21390 - Author: Neal Norwitz (nnorwitz) * (Python committer) Date: 2004-07-21 03:16
Logged In: YES user_id=33168

Martin, I hope you don't mind me assigning this to you. I think you implemented the coding spec. I briefly read the PEP, and while the code does what the PEP states (i.e., use a regex), the behaviour doesn't match the examples. It also seems like it could be error prone to allow r'#.*coding[:=]'.

I think there are two issues:
1) in pythonrun.c, in E_DECODE, there is a missing break
2) the check for # -*- coding is not strict enough

The patch makes the check r'# (-\*-)? coding[:=]'.

The attached patch addresses both issues, although I'm not sure you will agree that #2 is a problem. Feel free to check in, assign back to me, or whatever.

I'm not sure what the error message in pythonrun should be; right now it's "unknown decode error." Perhaps that should be "invalid encoding" or something?
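To make the difference between the two checks concrete, here is a minimal Python sketch. The two patterns are copied from the message above as plain regular expressions. The real check is implemented in C, so this only approximates its behaviour, and the helper name `matches` is made up for illustration.

```python
import re

# Patterns as quoted in the message; the actual check is implemented in C.
LOOSE = re.compile(r'#.*coding[:=]')
STRICT = re.compile(r'# (-\*-)? coding[:=]')


def matches(pattern, line):
    """Return True if the pattern is found anywhere in the line (hypothetical helper)."""
    return pattern.search(line) is not None


leo_line = '#@+leo-encoding=cp1251.'
emacs_line = '# -*- coding: cp1251 -*-'

# Print, for each line, whether the loose and the strict pattern match it.
for line in (leo_line, emacs_line):
    print(line, matches(LOOSE, line), matches(STRICT, line))
```

The loose pattern accepts the Leo line because "coding=" occurs inside "encoding=", while the stricter pattern only accepts the Emacs-style declaration.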
msg21391 - Author: Martin v. Löwis (loewis) * (Python committer) Date: 2004-07-21 05:36
Logged In: YES user_id=21627

The patch is wrong. The PEP deliberately allows for arbitrary occurrences of the substring "coding", in particular inside "encoding". This was done so that other editors, like vi or LEO, can continue to use their own encoding declarations, and Python would recognize them.

Unfortunately, LEO decided to add a full stop at the end of the line, so Python looks for an encoding named "cp1251.". We agree with the LEO author that this is a problem in LEO, and it will be fixed. Alternatively, we could amend the PEP and declare that trailing dots are not part of the encoding name.

The other part of the patch is correct; I have applied it as pythonrun.c 2.195.6.6 and 2.207. It would be even better if we could display the actual cause of the problem, but that is currently not supported in the parser.
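The trailing-dot problem can be reproduced with a short Python sketch. The pattern below is the `coding[:=]\s*([-\w.]+)` expression given in early versions of PEP 263. The names `declared_encoding` and `is_known_codec`, and the dot-stripping step, are hypothetical and only illustrate the alternative mentioned above. They are not what the parser does.

```python
import codecs
import re

# Declaration pattern from early versions of PEP 263; '.' is allowed in the name.
CODING_RE = re.compile(r'coding[:=]\s*([-\w.]+)')


def declared_encoding(line):
    """Return the encoding name declared on a comment line, or None."""
    m = CODING_RE.search(line)
    return m.group(1) if m else None


def is_known_codec(name):
    """Return True if Python can find a codec registered under this name."""
    try:
        codecs.lookup(name)
        return True
    except LookupError:
        return False


# The Leo sentinel line: the full stop ends up inside the captured name.
name = declared_encoding('#@+leo-encoding=cp1251.')
print(name, is_known_codec(name))

# Hypothetical amendment: treat trailing dots as not part of the name.
stripped = name.rstrip('.')
print(stripped, is_known_codec(stripped))
```

Because the character class includes ".", the captured name keeps the full stop and the codec lookup fails. Stripping trailing dots yields a name that resolves.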
History
Date User Action Args
2022-04-11 14:56:05 admin set github: 40502
2004-07-03 19:39:22 gazum create