Message 259718 - Python tracker (original) (raw)
Commas aren't legal characters in cookie keys, yet in Python 3.5, they're allowed:
bool(http.cookies._is_legal_key(',')) True
The issue lies in the use of _LegalChars constructing a regular expression.
"Some people, when confronted with a problem, think 'I know, I'll use regular expressions.' Now they have two problems."
The issue arises in this line:
_is_legal_key = re.compile('[%s]+' % _LegalChars).fullmatch
Which was added in 88e1151e8e0242 referencing .
The problem is that in a regular expression, and in a character class in particular, the '-' character has a special meaning if not the first character in the class, which is "span all characters between the leading and following characters". As a result, the pattern has the unintended effect of including the comma in the pattern:
http.cookies.is_legal_key.self re.compile("[abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789!#$%&'*+-.^`|~:]+") pattern = _ pattern.fullmatch(',') <_sre.SRE_Match object; span=(0, 1), match=','> ord('+') 43 ord('.') 46 ''.join(map(chr, range(43,47))) '+,-.'
That's how the comma creeped in.
This issue is the underlying cause of https://bitbucket.org/cherrypy/cherrypy/issues/1405/testcookies-fails-on-python-35 and possible other cookie-related bugs in Python.
While I jest about regular expressions, I like the implementation. It just needs to account for the extraneous comma, perhaps by escaping the dash:
_is_legal_key = re.compile('[%s]+' % _LegalChars.replace('-', '\-')).fullmatch
Also, regression tests for keys containing invalid characters should be added as well.