DataFrame self-joins · Issue #2996 · pandas-dev/pandas

Given the following DataFrame:

area	point	test	value
A	11	0	1234234
A	11	1	12341234
A	16	0	234234
A	16	1	2343
A	16	2	234234
C	4	0	234234
C	4	1	234234

It would be nice if there were a way to group by, say, the columns area and point, and then compare the value for each test >= 1 with the value for test - 1 within the same group.

This can be done by iterating over df.groupby(['area', 'point', 'test']) and relying on the sorting that groupby() applies to the specified columns to compare the current and previous values. However, it would be neat if this could also be done in a more Pandas-esque way, using something akin to a SQL self-join.
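For illustration, here is a minimal sketch of what such a self-join could look like with the existing `DataFrame.merge`. It is not a proposed API, just one way to express the comparison. The names `prev`, `joined`, `shifted`, `prev_value`, and `delta` are made up for this example, and the sketch assumes that test numbers within each (area, point) group are consecutive integers.

```python
import pandas as pd

# The example data from this issue.
df = pd.DataFrame({
    "area":  ["A", "A", "A", "A", "A", "C", "C"],
    "point": [11, 11, 16, 16, 16, 4, 4],
    "test":  [0, 1, 0, 1, 2, 0, 1],
    "value": [1234234, 12341234, 234234, 2343, 234234, 234234, 234234],
})

# Copy of the frame whose "test" key is shifted up by one, so that the row
# for (area, point, test=k) in `prev` lines up with test=k+1 in `df`.
prev = df.assign(test=df["test"] + 1).rename(columns={"value": "prev_value"})

# Self-join: the inner merge keeps only rows that have a predecessor,
# i.e. rows with test >= 1 whose test - 1 exists in the same group.
joined = df.merge(prev, on=["area", "point", "test"], how="inner")
joined["delta"] = joined["value"] - joined["prev_value"]

# Alternative without a join: shift values within each (area, point) group.
# Assumption: after sorting, the previous row is always test - 1.
shifted = df.sort_values(["area", "point", "test"]).copy()
shifted["prev_value"] = shifted.groupby(["area", "point"])["value"].shift(1)
```

The merge version only pairs a row with test - 1 if that row actually exists, whereas the `shift` version pairs each row with whatever row precedes it in the group, so the two differ when test numbers have gaps.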

NB: This request was first made in the pystatsmodels Google Group; Wes asked me to create a GitHub issue for it.