[DISCUSS] Reducing cadence of major arrow-rs releases introducing patch releases · Issue #5368 · apache/arrow-rs

Is your feature request related to a problem or challenge? Please describe what you are trying to do.

As more people use arrow, the overall burden on users from frequent major releases is increasing. Furthermore, the pace of breaking API changes is decreasing, so the burden on maintainers to avoid breaking changes is decreasing as well.

As the arrow crate becomes more widely used in the ecosystem by projects other than DataFusion and other early adopters, the frequent major releases cause several issues:

  1. Crates must match the major arrow version. For example, if a crate uses DataFusion, that forces everything in the entire project onto exactly the version of arrow-rs that DataFusion uses.
  2. parquet and arrow releases are coupled, so releasing a new version of parquet requires releasing a new version of arrow.
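
To illustrate why a major bump forces every crate in a project to move together, here is a minimal sketch of the compatibility rule that Cargo's default (caret) version requirements follow. The `Version` type and `semver_compatible` function are hypothetical names made up for this example, not part of arrow-rs or Cargo, and the sketch ignores pre-release and build metadata.

```rust
/// A parsed `major.minor.patch` version.
/// Hypothetical helper type for illustration only; pre-release and
/// build metadata are not handled.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Version {
    major: u64,
    minor: u64,
    patch: u64,
}

impl Version {
    /// Parses a plain `major.minor.patch` string, returning `None`
    /// for anything else.
    fn parse(s: &str) -> Option<Version> {
        let mut parts = s.split('.');
        let major = parts.next()?.parse().ok()?;
        let minor = parts.next()?.parse().ok()?;
        let patch = parts.next()?.parse().ok()?;
        if parts.next().is_some() {
            return None; // more than three components
        }
        Some(Version { major, minor, patch })
    }
}

/// Returns true if two versions are compatible under the caret rule:
/// the left-most non-zero component must be equal. Compatible versions
/// can be unified into one copy of the crate; incompatible ones are
/// treated as distinct crates with distinct types.
fn semver_compatible(a: Version, b: Version) -> bool {
    if a.major != 0 || b.major != 0 {
        a.major == b.major
    } else if a.minor != 0 || b.minor != 0 {
        a.minor == b.minor
    } else {
        a.patch == b.patch
    }
}
```

Under this rule, `semver_compatible` returns false for `49.0.0` and `50.0.0`, so two dependencies pinned to those versions cannot share arrow types, while `50.0.0` and `50.1.0` are compatible and resolve to a single copy.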

The major version bumps impose non-trivial overhead on user crates. Some crates like serde_arrow have implemented clever, though complex, workarounds such as having a feature flag for each arrow version (see the recent discussion with @chmp on serde_arrow: chmp/serde_arrow#131).

Also, from what I can see, many of the recent arrow-rs changes aren't really adding new APIs; they are more like filling in feature gaps and fixing bugs, which is also reflected in the slower pace of the last few releases.

Describe the solution you'd like
I propose we set a more regular major release cadence (e.g. every 3 months) and only do minor, compatible releases between those major releases.

This would absolutely require more maintainer effort, but at this stage in the project the effort may be more manageable, as I think the APIs are in a pretty good place.

Describe alternatives you've considered
I think there are various alternatives for what triggers a release and at what cadence. I don't have a hugely strong opinion on this matter.

Additional context
At some point in the past we actually had fewer major releases -- see #1120

There was non-trivial process overhead, so we (well, really I) abandoned it and went YOLO on major releases, as there wasn't really any maintenance bandwidth to do anything else.