ERR: better error reporting with .transform and an invalid output from a UDF · Issue #10165 · pandas-dev/pandas
I think we can detect the returned values and, if they are not scalars, give a nicer error message. Further, a doc note would be helpful.
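As a rough illustration of the kind of check meant here, the sketch below validates what a user function returns for one group. `check_transform_result` is a hypothetical helper, not part of pandas, and the rule it enforces (a scalar, or a Series/DataFrame with the same shape as the group) is an assumption about what a valid transform output looks like.

```python
import numpy as np
import pandas as pd


def check_transform_result(group, result):
    """Hypothetical validator (not a pandas API) for one group's UDF output."""
    # A scalar can be broadcast back over the whole group.
    if np.isscalar(result):
        return result
    # Assumption: otherwise the output must have the same shape as its input.
    if isinstance(result, (pd.Series, pd.DataFrame)) and result.shape == group.shape:
        return result
    shape = getattr(result, "shape", None)
    raise ValueError(
        "transform function must return a scalar or an object shaped like "
        f"its input {group.shape}; got {type(result).__name__} with shape {shape}"
    )


group = pd.Series([1, 2, 3])
check_transform_result(group, group * 2)  # same shape: accepted
check_transform_result(group, 5)          # scalar: accepted
```

Passing something like `group.iloc[:2]` would raise a `ValueError` that names both shapes, which is easier to act on than a bare length mismatch.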
In [4]: cols = pd.MultiIndex.from_tuples([('syn', 'A'), ('mis', 'A'), ('non', 'A'), ('syn', 'C'), ('mis', 'C'), ('non', 'C'),
   ...:                                   ('syn', 'T'), ('mis', 'T'), ('non', 'T'), ('syn', 'G'), ('mis', 'G'), ('non', 'G')])
In [5]: df = pd.DataFrame(np.random.randint(1, 10, (4,12)), columns=cols, index=['A', 'C', 'G', 'T'])
In [6]: df.groupby(axis=1, level=1).transform(lambda z: z.div(z.sum(axis=1), axis=0))
ValueError: Length mismatch: Expected axis has 4 elements, new values have 12 elements
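Until the message improves, one possible workaround is sketched below. It assumes the intent of the reproduction was to divide each value by its row's total within the same level-1 column group. It avoids `groupby(axis=1)`, which newer pandas releases deprecate, by grouping the transposed frame and using the built-in `"sum"` transform instead of a lambda. The names `group_totals` and `normalized` are only illustrative.

```python
import numpy as np
import pandas as pd

# Same layout as the reproduction: (kind, base) column pairs.
cols = pd.MultiIndex.from_tuples(
    [(kind, base) for base in "ACTG" for kind in ("syn", "mis", "non")]
)
df = pd.DataFrame(
    np.random.randint(1, 10, (4, 12)), columns=cols, index=["A", "C", "G", "T"]
)

# Group the transposed frame by the second column level, broadcast each
# group's sum back to every member, then transpose to the original layout.
group_totals = df.T.groupby(level=1).transform("sum").T

# Element-wise division; both frames share the same index and columns.
normalized = df / group_totals
```

Within each row, the three values belonging to one base should now sum to 1.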