ENH: Multi-Column explode · Issue #39240 · pandas-dev/pandas
Is your feature request related to a problem?
I have often looked for the option to use the explode function on multiple columns at the same time.
Example:
A | B | C | D
------------------------
1 | a | [1,2] | [7,8]
2 | b | [3,4] | [9,1]
3 | c | [5,6] | [2,3]
===>
A | B | C | D
------------------
1 | a | 1 | 7
1 | a | 2 | 8
2 | b | 3 | 9
2 | b | 4 | 1
3 | c | 5 | 2
3 | c | 6 | 3
Exploding one column and then the other does not yield the result we want here: the second explode duplicates every row produced by the first, so each original row ends up with every combination of its C and D values (four rows instead of two).
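To make the problem concrete, here is a minimal sketch that rebuilds the example frame and chains two single-column explodes. The variable names are only illustrative.

```python
import pandas as pd

# The example frame from the tables above.
df = pd.DataFrame(
    {
        "A": [1, 2, 3],
        "B": ["a", "b", "c"],
        "C": [[1, 2], [3, 4], [5, 6]],
        "D": [[7, 8], [9, 1], [2, 3]],
    }
)

# Explode C first, then D: every C value is paired with every D value
# of the same original row.
chained = df.explode("C").explode("D")

print(len(chained))  # 12 rows rather than the desired 6
print(chained.head(4))
```

For the first original row this produces the pairs (1, 7), (1, 8), (2, 7), (2, 8) instead of just (1, 7) and (2, 8).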
Describe the solution you'd like
Either explode could accept a list of columns, as in df.explode(["C","D"]), or there could be a separate method, df.multi_explode(["C","D"]). Either way, the implementation could reuse more or less the same approach as the existing explode function.
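As a rough sketch of the requested behaviour, the helper below explodes several list columns in lockstep. Note that multi_explode is a hypothetical standalone function written for this illustration, not an existing pandas method, and it assumes the lists in each row have equal lengths.

```python
import numpy as np
import pandas as pd


def multi_explode(df, columns):
    """Hypothetical helper: explode several list columns in lockstep."""
    lengths = df[columns[0]].map(len)
    for col in columns[1:]:
        if (df[col].map(len) != lengths).any():
            raise ValueError(
                f"list lengths in {col!r} differ from those in {columns[0]!r}"
            )
    # Repeat each row position once per list element.
    positions = np.repeat(np.arange(len(df)), lengths.to_numpy())
    result = df.drop(columns=columns).iloc[positions].copy()
    for col in columns:
        # Flatten the lists; the order matches the repeated rows.
        result[col] = [item for cell in df[col] for item in cell]
    # Restore the original column order.
    return result[list(df.columns)]


df = pd.DataFrame(
    {
        "A": [1, 2, 3],
        "B": ["a", "b", "c"],
        "C": [[1, 2], [3, 4], [5, 6]],
        "D": [[7, 8], [9, 1], [2, 3]],
    }
)
out = multi_explode(df, ["C", "D"])
print(out)
```

Unlike the existing explode, this sketch drops rows whose lists are empty rather than emitting a NaN row; a real implementation would need to decide how to treat that case.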
API breaking implications
The only issue I see is the case where the lists in columns C and D of the same row differ in length; we would then need to pad the shorter one.
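One possible way to do that padding, sketched under the assumption that short lists are filled on the right with a placeholder value. The function name pad_row_lists and the fill_value parameter are hypothetical, chosen only for this example.

```python
import pandas as pd


def pad_row_lists(df, columns, fill_value=None):
    """Hypothetical helper: pad the lists in each row to a common length."""
    # Longest list per row across the selected columns.
    targets = [
        max(len(cell) for cell in row)
        for row in zip(*(df[col] for col in columns))
    ]
    padded = df.copy()
    for col in columns:
        padded[col] = [
            list(cell) + [fill_value] * (n - len(cell))
            for cell, n in zip(df[col], targets)
        ]
    return padded


ragged = pd.DataFrame(
    {
        "A": [1, 2],
        "C": [[1, 2], [3]],
        "D": [[7], [8, 9]],
    }
)
padded = pad_row_lists(ragged, ["C", "D"])
print(padded)
```

After padding, every row has lists of equal length in C and D, so the columns can be exploded together. Whether None, NaN, or an error is the right response to mismatched lengths is a design decision this sketch does not settle.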
Context
I don't know whether this has been discussed previously, but if there is interest in this feature, I could try to implement it in a pull request.