DataFrame self-joins · Issue #2996 · pandas-dev/pandas

Given the following DataFrame

area  point  test     value
A        11     0   1234234
A        11     1  12341234
A        16     0    234234
A        16     1      2343
A        16     2    234234
C         4     0    234234
C         4     1    234234

it would be nice if there were a way of grouping by, say, the columns area and point and, within each group, comparing the value for each test after the first with the value for the preceding test (test - 1).

This can be done by iterating over df.groupby(['area', 'point', 'test']) and relying on the sort order that groupby() applies to the specified columns to compare each value with the previous one. However, it would be neat if this could also be done in a more Pandas-esque way, using something akin to a SQL self-join.
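
For illustration, here is a rough sketch of what such a self-join could look like with DataFrame.merge. It assumes a reasonably recent pandas and that test numbers are consecutive integers within each (area, point) group. The names prev, prev_value and change are made up for this example and are not part of any pandas API.

```python
import pandas as pd

df = pd.DataFrame({
    "area":  ["A", "A", "A", "A", "A", "C", "C"],
    "point": [11, 11, 16, 16, 16, 4, 4],
    "test":  [0, 1, 0, 1, 2, 0, 1],
    "value": [1234234, 12341234, 234234, 2343, 234234, 234234, 234234],
})

# Right-hand side of the self-join: the same rows with `test` moved forward
# by one, so that each row lines up with its successor in the original frame.
# (`prev` and `prev_value` are illustrative names.)
prev = df.assign(test=df["test"] + 1).rename(columns={"value": "prev_value"})

# Left join on (area, point, test). Rows with no predecessor (test == 0 here)
# get NaN in `prev_value`.
joined = df.merge(prev, on=["area", "point", "test"], how="left")
joined["change"] = joined["value"] - joined["prev_value"]

# Alternative without a join, under the same consecutive-test assumption:
# shift values down by one row within each (area, point) group.
shifted = (
    df.sort_values(["area", "point", "test"])
      .groupby(["area", "point"])["value"]
      .shift(1)
)
```

The merge version matches on the actual test number, so a gap in the test sequence simply leaves prev_value missing for the row after the gap. The shift version compares each row with whatever row precedes it in the group, which only gives the same answer when there are no gaps.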

NB: this request was first made in the pystatsmodels Google Group; Wes asked me to create a GitHub issue for it.