Issue 28391: Multiple occurrences of: Closing quotes separate words

I am currently writing an extended options scanner that includes a shlex-compatibility mode, along with numerous extended unit tests for compatibility verification.

I am not sure whether the following behaviour of shlex.split() is correct:

Quote from manual: Parsing Rules: Closing quotes separate words ("Do"Separate is parsed as "Do" and Separate);

Case-0: Works

    import shlex

    sopts = """-a "Do"Separate """
    resx = ["-a", '"Do"', 'Separate', ]
    res = shlex.split(sopts, posix=False)
    assert res == resx

Case-1: Fails - should work?

    sopts = """-a "Do"Separate"this" """
    resx = ["-a", '"Do"', 'Separate', '"this"', ]
    res = shlex.split(sopts, posix=False)
    assert res == resx

Case-2: Works - should fail?

    sopts = """-a "Do"Separate"this" """  #@UnusedVariable
    resx = ["-a", '"Do"', 'Separate"this"', ]
    res = shlex.split(sopts, posix=False)
    assert res == resx

Case-0 behaves as defined in the manual.

Is Case-1 or Case-2 the correct behaviour? Which of Case-1, Case-2 is an error?

REMARK: I haven't found an earlier issue, so I am filing this here.

Case 2 (the actual behavior) is correct. Quotes within words are ignored; only a leading quoted string results in a separate word. (That's recursive: try '"Do""This""Separate).
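A quick way to check this rule is to run the reported input through shlex.split() in non-posix mode. The sketch below uses the string from Case-1/Case-2. The second input is a hypothetical variant chosen for illustration (it is not the string quoted above): two leading quoted strings followed by a bare word.

```python
import shlex

# Input from Case-1/Case-2 of the report: the quote inside the word
# Separate"this" does not start a new token.
tokens = shlex.split('-a "Do"Separate"this" ', posix=False)
print(tokens)

# Hypothetical variant (an assumption for illustration): each leading
# quoted string becomes its own word, then the bare word follows.
chained = shlex.split('"Do""This"Separate', posix=False)
print(chained)
```

The first call yields the Case-2 list, and the second splits into '"Do"', '"This"' and 'Separate', which is the recursive behaviour of leading quoted strings.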

That said, we don't really care about non-posix mode; it's just there for backward compatibility. I don't know why we haven't changed the default for shlex.shlex to posix=True.
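For comparison, here is a minimal sketch of how the same input behaves in the two modes, and of the differing defaults mentioned above. It reads the defaults with inspect.signature rather than relying on instance attributes. The variable names are only illustrative.

```python
import inspect
import shlex

s = '-a "Do"Separate"this"'

# POSIX mode (the default for shlex.split): quotes are stripped and
# adjacent quoted and unquoted parts are joined into one word.
posix_tokens = shlex.split(s)
print(posix_tokens)

# Non-posix (legacy) mode: quotes are kept, and only a leading quoted
# string is split off as its own word.
legacy_tokens = shlex.split(s, posix=False)
print(legacy_tokens)

# The two entry points have different defaults for the posix parameter.
split_default = inspect.signature(shlex.split).parameters["posix"].default
shlex_default = inspect.signature(shlex.shlex).parameters["posix"].default
print(split_default, shlex_default)
```

In POSIX mode the whole second argument collapses to the single word DoSeparatethis, which matches what a shell would do with the same text.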