Raise an error on redundant definition of separator in read_csv
Is your feature request related to a problem?
When calling read_csv with both sep and delim_whitespace, or with both delimiter and delim_whitespace, I get a ValueError in pandas 1.3.0.dev0+713.g9f792cd903. For example, both
    df = pd.read_csv("my_data.csv", sep=' ', delim_whitespace=True)
and
    df = pd.read_csv("my_data.csv", delimiter=' ', delim_whitespace=True)
raise an error. However, when I specify both sep and delimiter, for example
    df = pd.read_csv("my_data.csv", sep=' ', delimiter='.')
sep is silently ignored. I think it would make sense to raise a ValueError in this case as well.
Moreover, the error raised today has the message "Specified a delimiter with both sep and delim_whitespace=True; you can only specify one." regardless of whether I pass sep or delimiter together with delim_whitespace. When delimiter is used, I think it should instead read "Specified a delimiter with both delimiter and delim_whitespace=True; you can only specify one."
Describe the solution you'd like
Raise a ValueError when both sep and delimiter are used to specify the separator for read_csv.
Change the message "Specified a delimiter with both sep and delim_whitespace=True; you can only specify one." to "Specified a delimiter with both delimiter and delim_whitespace=True; you can only specify one." when both delimiter and delim_whitespace are specified.
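The two requested checks could be sketched roughly as follows. This is only an illustration, not pandas' actual implementation: check_separator_args is a hypothetical helper, None stands in for "argument not given" (an assumption about how unset arguments are detected), and the wording of the sep/delimiter message is invented for the example.

```python
def check_separator_args(sep=None, delimiter=None, delim_whitespace=False):
    """Hypothetical validation helper; None means the argument was not given."""
    # Proposed new check: sep and delimiter are aliases, so passing both is redundant.
    if sep is not None and delimiter is not None:
        # Message wording is an assumption made for this sketch.
        raise ValueError(
            "Specified a separator with both sep and delimiter; "
            "you can only specify one."
        )
    # Existing check, but naming the parameter the caller actually used.
    if delim_whitespace:
        for name, value in (("sep", sep), ("delimiter", delimiter)):
            if value is not None:
                raise ValueError(
                    f"Specified a delimiter with both {name} and "
                    "delim_whitespace=True; you can only specify one."
                )
```

With this sketch, check_separator_args(delimiter=' ', delim_whitespace=True) would mention delimiter rather than sep in its message, and check_separator_args(sep=' ', delimiter='.') would raise instead of silently dropping sep.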
API breaking implications
This will "break" code that specifies both sep and delimiter. However, it is consistent with the behavior when one of those parameters is specified together with delim_whitespace. Moreover, a similar change was made at some point between pandas 0.25.3 (the latest version provided by aptitude) and the development version. In the former,
    pd.read_csv("my_data.csv", delim_whitespace=True, sep=',')
doesn't cause a ValueError, but in the latter it does.
Describe alternatives you've considered
An alternative could be to issue a warning instead of an error, but an error is more consistent with the current behavior for the combination of delim_whitespace and (sep or delimiter).
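For comparison, the warning-based alternative might look something like the sketch below. resolve_separator is a hypothetical helper, the warning category and message are assumptions, and the "delimiter wins" rule simply mirrors the behavior described above, where sep is ignored when both are given.

```python
import warnings


def resolve_separator(sep=None, delimiter=None):
    """Hypothetical helper: warn, rather than raise, on redundant arguments."""
    if sep is not None and delimiter is not None:
        # Category and wording are assumptions made for this sketch.
        warnings.warn(
            "Both sep and delimiter were specified; sep is ignored.",
            UserWarning,
            stacklevel=2,
        )
    # delimiter takes precedence, matching the behavior reported in this issue.
    return delimiter if delimiter is not None else sep
```

Existing calls would keep working under this variant, but read_csv would then treat sep plus delimiter differently from the delim_whitespace combinations, which already raise.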