Defaulting to_csv to infer compression · Issue #22004 · pandas-dev/pandas
This issue follows up on #17900 (thanks @Dobatymo and @gfyoung, with review from @jreback). #17900 added an `'infer'` option to `compression` in `_get_handle`. The main user-facing benefit is that `df.to_csv` will be able to infer compression from the path extension, just like `pandas.read_csv`. However, unlike `read_csv`, the default value for `compression` in `to_csv` is `None` rather than `'infer'`.

Unfortunately, much of the convenience of `compression='infer'` is lost if you have to specify it explicitly. In summary, I think it would be a major convenience for a plain `df.to_csv` call on a path with a gzip extension to work and automatically perform gzip compression.
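To make the request concrete, here is a minimal sketch of the behavior being asked for. It passes `compression='infer'` explicitly so it does not depend on what the default is in any given pandas version. The DataFrame contents and the file name `example.csv.gz` are made up for illustration.

```python
import os
import tempfile

import pandas as pd

# Hypothetical sample data, only for illustration.
df = pd.DataFrame({"a": [1, 2, 3], "b": ["x", "y", "z"]})

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "example.csv.gz")  # hypothetical file name

    # With 'infer', the '.gz' extension selects gzip compression.
    df.to_csv(path, index=False, compression="infer")

    # gzip streams start with the magic bytes 0x1f 0x8b.
    with open(path, "rb") as fh:
        is_gzipped = fh.read(2) == b"\x1f\x8b"

    # read_csv infers compression from the extension by default.
    round_tripped = pd.read_csv(path)
```

If `'infer'` were the default, the `compression="infer"` argument above could simply be dropped.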
Compatibility assessment
Defaulting to `'infer'` would only affect users who currently write to paths with compression extensions without actually compressing the output. That's pretty bad practice IMO. Hence, I'm in favor of breaking backwards compatibility and changing the default for `compression` to `'infer'`. It looks like this would go into the next major release, 0.24?
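For reference, the case that would change looks roughly like the sketch below: a path with a `.gz` extension written with `compression=None`, which produces a plain-text file despite its name. The data and the file name `plain.csv.gz` are hypothetical.

```python
import os
import tempfile

import pandas as pd

# Hypothetical sample data, only for illustration.
df = pd.DataFrame({"a": [1, 2, 3]})

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "plain.csv.gz")  # hypothetical file name

    # compression=None ignores the extension and writes uncompressed text.
    df.to_csv(path, index=False, compression=None)

    with open(path, "rb") as fh:
        head = fh.read(2)

    # The file does not start with the gzip magic bytes 0x1f 0x8b.
    looks_gzipped = head == b"\x1f\x8b"

    # Reading it back requires overriding read_csv's extension inference.
    plain_read = pd.read_csv(path, compression=None)
```

Anyone relying on this pattern would need to pass `compression=None` explicitly after the default changes.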