Re: ls output changes considered unacceptable




From: Pádraig Brady
Subject: Re: ls output changes considered unacceptable
Date: Thu, 18 Feb 2016 01:17:09 -0800
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.3.0

On 17/02/16 17:46, Michael Stone wrote:

I backed out the default ls quoting style change for now in Debian. I admit I simply missed the original RFC, and I apologize for that. I've been hesitant to wade into this because the tone of much of the criticism is frankly unpleasant,

Heh that discussion wasn't too bad, compared to the suggestions I got privately to commit suicide. The vocal minority feel strongly about change.

The comments below are not disagreeing with your decision, just clarifying some points.

but I don't want to change a downstream default without some explanation. Here are my thoughts:

It's not possible to make an older version of ls output in this style. There are two problems in that. First, I'd prefer a change this dramatic to be something that could be configured consistently across an environment with differing releases.

That would mean something like this could not be added until end of support for Jessie (2020?). More proactively one could backport the option to Jessie, but leave the default as is there.

Second, I think that something this dramatic should be evaluated in the wild for a bit before becoming the default.

The main point about not being the default is that new users who arguably most benefit from this, will not know to enable it.

The main point for me about the feature itself is that even for experienced users, it allows them to always be able to easily use the output from ls. Usually the quoting isn't even visible, but sometimes I change to a "messy" directory of files, in which case I really want this enabled, so that I can actually interact with those files. Note messy files include those with spaces and quotes, but also with encoding errors, or just different file name encodings.

It's not an interoperable solution. It happens to work for some narrow use cases in some shells, but won't work for other use cases (e.g., copying from commandline to gui file dialog)

Copying from command line to GUI is not generally supported now anyway, as the GUI app often doesn't have the directory context. If passing the file to a (GUI) command on the command line then you'll generally need the quotes anyway. Also consider when ls outputs ?, file names having encoding errors, or just differing encodings, or even edge cases like xterm always using normalized composed (NFKC) for output, which all break cut and paste. ls using shell quoting by default avoids all these issues, by presenting a form that can be pasted to the vast majority of interactive shells.

or shells (csh, dash, posh, yash).

Fair point, but this quoting format is in discussion to be POSIX standardized, so ls is being a bit ahead of the curve here, given it's supported by the vast majority of interactive shells in use.

These are quite rare:

https://qa.debian.org/popcon.php?package=csh
https://qa.debian.org/popcon.php?package=tcsh
https://qa.debian.org/popcon.php?package=posh
https://qa.debian.org/popcon.php?package=yash

dash is rarely used as an interactive shell

Some of those shells are defaults on some OSs,

yes it would be appropriate to disable this feature by default on those OSs.

or worse, they could lead to dramatically different results between syntax tested interactively and syntax used in a shell script.

ls is really quite difficult to parse programmatically. I expanded on an example elsewhere, showing how awkward and non practical it is.
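To make the parsing problem concrete, here is a minimal Python sketch. It does not run ls at all; it only assumes a tool that prints one literal name per line, and the file names are made up for the example.

```python
# Hypothetical directory contents; the second name contains a newline.
names = ["report.txt", "two\nlines.txt", "notes.txt"]

# What a tool printing one literal name per line would emit.
listing = "\n".join(names) + "\n"

# A naive consumer splits on newlines and can no longer tell
# where one name ends and the next begins.
parsed = listing.splitlines()

print(len(names), "files, but", len(parsed), "parsed entries")
print(parsed)
```

A script that needs the names themselves is usually better served by a directory-reading API (os.listdir in Python, for instance) than by text output meant for people.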

On no shell that I tested can you copy part of a quoted filename and tab complete it.

works on bash 4.3.32 here

Support for tab completing through a quoted directory was extremely inconsistent even in shells that recognize the basic syntax. Ironically, the original solution (the ?'s) was more interoperable for many use cases.

I find the ? dangerous to use, as it can match many other files.
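A small Python sketch of that risk, using the standard fnmatch module to stand in for shell globbing. The file names are hypothetical, and the assumption is that the older display would typically show the tab and the newline as ?.

```python
import fnmatch

# Hypothetical names that differ only in their middle character.
names = ["a\tb", "a\nb", "a b", "a_b"]

# What the user sees for the first two names and then types back in.
pattern = "a?b"

# As a glob, "?" matches any single character, so the pattern
# selects every name in the list, not just the intended one.
matches = [n for n in names if fnmatch.fnmatchcase(n, pattern)]

print(len(matches), "of", len(names), "names match", repr(pattern))
```

So something like rm a?b, typed with one file in mind, could act on all of them.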

It's ambiguous. It isn't possible to tell from the output whether you're looking at a really weird filename or something that's been escaped. Recall the earlier point about the inability to configure a heterogenous environment consistently, and the lack of visual indication of the current QUOTINGSTYLE.

I agree with this. An improvement suggested by someone else, would improve both the ambiguity and the small alignment issue, by adding an extra space to ls -l output if any names are quoted. That would also give a better indication that ls was adding the quotes. I.E.:

  -rw-rw-r--. 1 padraig padraig 580 Dec 14 04:01  blah_blah
  -rw-rw-r--. 1 padraig padraig 580 Dec 14 04:01 'blah blah'

You could with --color, even surround added quotes with $(tput dim)...$(tput sgr0) but that's probably going too far.

It's inconsistent. Some filenames are surrounded by quotes and some aren't. I don't think that satisfies the "least surprise" principle.

True, though that's not the case when parsing ls output (ignoring the fact how bad an idea that is)

(This is also true for some of the specifics raised in the point about interoperability; things will "just work" sometimes and completely fail at other times in ways which are not necessarily obvious.)

It's ugly and overly verbose. This one is completely subjective, but there seems to be some strong sentiment around it. As a practical matter, it doesn't matter much to me at all because I'm an old school unix guy and my filenames are lower case ascii with no spaces, except for the occasional Makefile or README. But I've seen what happens when you run ls with this quoting style on a bunch of filenames from a GUI system full of spaces and single quotes--the results are not pretty:

  'I don'\''t think this is optimal.docx'
  'My eye'\''s bleedin'\''.docx'
  Simple.docx

Again these are edge cases. As Eric suggests it might be appropriate to use double quotes to simplify this "common" case.
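For reference, a rough Python sketch of the two forms being compared. The helper names quote_single and quote_prefer_double are made up for this example, the set of characters treated as unsafe inside double quotes is an assumption, and none of this is the actual ls code. shlex.split is used only to check that a POSIX-style parser reads each quoted form back as the original name.

```python
import shlex

def quote_single(name):
    # Wrap in single quotes; each embedded ' is written as '\''
    # (close the quote, add an escaped quote, reopen the quote).
    return "'" + name.replace("'", "'\\''") + "'"

def quote_prefer_double(name):
    # Use double quotes when the name has a ' but none of the
    # characters assumed here to be special inside "...".
    if "'" in name and not any(c in name for c in '"$`\\!'):
        return '"' + name + '"'
    return quote_single(name)

for name in ["I don't think this is optimal.docx", "My eye's bleedin'.docx"]:
    for quoted in (quote_single(name), quote_prefer_double(name)):
        # Both forms must parse back to the original name.
        assert shlex.split(quoted) == [name]
        print(quoted)
```

Unlike ls, this sketch quotes every name it is given; it is only meant to show how much shorter the double-quoted form is for names containing apostrophes.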

In contrast, the case for the change doesn't seem that compelling:

"It's less ambiguous" -- Maybe in some ways, but are there real-world cases where it's a useful distinction?

Yes. I've hit this many times with spaces in names.

This was never a real barrier for experienced users, who could just ls | xxd or somesuch, and in practical terms inexperienced users are going to just use a gui to delete that wacky file anyway.

"It's possible to copy and paste filenames with unprintable characters at the command line" -- do people really do this on a routine basis? What is the relative size of that set compared to the set who will find it necessary to change the default back?

"Many of the criticisms are irrelevant for experienced users/admins" -- experienced users/admins weren't stymied by the original output/need to set a quoting style, either. The change is most beneficial to inexperienced users with certain use cases, which means it's valid to question whether inexperienced users are more likely to run into cases where the new output eases confusion or causes confusion.

I think the change is beneficial for all users. Again if you're in a dir with filenames that require various quotings to reference, then it's very awkward.

"If people don't like it, they can turn it off" -- well, yeah, but they can also just turn it on if they do like it, right?

answered above (new users won't know)

For myself, I consider ls to primarily be a visualization tool, so having output which maximizes human readability is a sine qua non.

This is a key point. I think one should also be able to easily use the output from ls. Hopefully these two goals aren't mutually exclusive, especially with the tweaks mentioned above.

If reducing the ambiguity of the output is really a goal, then the quoting should be pervasive/consistent rather than changing on a per-file basis. It should also be minimally verbose. I'd rather see the default be --quoting-style=c than --quoting-style=shell-escape.

Each to their own :) But that would be less directly usable on the shell.
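A sketch of why, in Python. The c_quote helper is a hypothetical, much simplified stand-in for C-style quoting (double quotes plus a few backslash escapes), not what ls does internally, and shlex.split again plays the role of a POSIX-style shell parser.

```python
import shlex

def c_quote(name):
    # Simplified C-style quoting, for illustration only.
    escapes = {"\\": "\\\\", '"': '\\"', "\t": "\\t", "\n": "\\n"}
    return '"' + "".join(escapes.get(c, c) for c in name) + '"'

name = "tab\there.txt"      # contains a real tab character
quoted = c_quote(name)
print(quoted)

# Inside double quotes a POSIX shell keeps \t as backslash plus t,
# so the pasted word no longer names the original file.
pasted = shlex.split(quoted)[0]
print("round trips:", pasted == name)
```

The C form is unambiguous to read, but for names like this one it has to be decoded by something other than the shell before it refers to the file again.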

I'm likely to set the quoting style to literal for my own account regardless of the default, because I'd rather save a few characters per filename when looking at files on an SMB share and I've never hit a case where the ? was insufficient.

Where to go with this? I feel pretty strongly that changing the default is a bad idea right now. Maybe with some broader & longer term experience we'd have a better idea of whether it's truly useful, which we can't have with something that's brand new. That said, I feel pretty strongly that Debian's coreutils package shouldn't needlessly diverge from upstream because I think consistency across distributions is valuable in itself. I'm kicking the can down the road for now, but in the long term I'd rather not carry a patch just to change an upstream default. I hope maintainers from some of the other distributions are following this, and that we can get a wider discussion; I suspect I'm not the only one who dropped the ball on the RFC. I'd be especially interested in seeing use cases from people who have found the new syntax to be a workflow improvement, and the specifics thereof.

thanks for your detailed thoughts on this,
Pádraig.