API: Add equals method to NDFrames. by unutbu · Pull Request #5283 · pandas-dev/pandas

Here's a non-unique example. Essentially, the placement is an explicitly set mapping from a block's items to locations in the ref_items. In the unique case, .ref_locs computes the indexer; here it is 'set' by the calling function. You need this for the non-unique case to map the items in a block to the ref_items, since both could be non-unique (even across blocks).

This may not answer your question about the unique case. My thinking there is that the reindex actually DOES guarantee ordering, certainly on the items and maybe on the blocks as well (as I said, I cannot prove that it does not work).
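To make the unique versus non-unique distinction concrete before the pandas example that follows, here is a minimal numpy-only sketch. It does not use pandas internals; `locate`, `unique_items`, and `dup_items` are hypothetical names chosen for illustration. It only shows why positions can be computed from labels when the labels are unique, and why an explicit placement is needed once they repeat.

```python
import numpy as np

# Hypothetical stand-ins for a manager's items (plain arrays, not pandas objects).
unique_items = np.array(list("abcdef"))
dup_items = np.array(list("aaabbb"))


def locate(label, items):
    """Return every position in `items` whose label equals `label`."""
    return np.flatnonzero(items == label)


# Unique labels: each label resolves to exactly one position, so an
# indexer can be computed from the labels alone.
unique_hits = [locate(label, unique_items) for label in "dab"]
assert all(len(hit) == 1 for hit in unique_hits)
computed_locs = np.concatenate(unique_hits)

# Duplicated labels: every "a" matches three positions, so the labels
# cannot say which row of a block belongs at which position.
ambiguous = locate("a", dup_items)

# An explicit placement removes the ambiguity: row i of the block goes
# to position placement[i] of the items.
placement = np.array([0, 1, 2])

print(computed_locs)
print(ambiguous)
print(dup_items[placement])
```

With unique labels the lookup yields one position per label; with duplicates the same lookup returns several candidates, which is the situation the set placement is meant to resolve.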

from pandas.core.internals import make_block, BlockManager
import numpy as np
from pandas import Index

# Non-unique items: three 'a' labels followed by three 'b' labels.
index = Index(list('aaabbb'))

# Each block carries an explicit placement: its positions within the items.
block1 = make_block(np.arange(12).reshape(3,4), list('aaa'), index, placement=[0,1,2])
block2 = make_block(np.arange(12).reshape(3,4)*10, list('bbb'), index, placement=[3,4,5])
block1.ref_items = block2.ref_items = index

# Same blocks, opposite block order.
bm1 = BlockManager([block1, block2], [index, np.arange(block1.shape[1])])
bm2 = BlockManager([block2, block1], [index, np.arange(block1.shape[1])])

print "before consolidation"
print bm1
print bm1.blocks[0]._ref_locs
print bm2.blocks[0]._ref_locs
print bm2
print bm1.blocks[0]._ref_locs
print bm2.blocks[0]._ref_locs

bm1._consolidate_inplace()
bm2._consolidate_inplace()

print "\nafter consolidation"
print bm1
print bm1.blocks[0]._ref_locs
print bm2
print bm2.blocks[0]._ref_locs

Output:

before consolidation
BlockManager
Items: Index([u'a', u'a', u'a', u'b', u'b', u'b'], dtype='object')
Axis 1: Int64Index([0, 1, 2, 3], dtype='int64')
IntBlock: [a, a, a], 3 x 4, dtype: int64
IntBlock: [b, b, b], 3 x 4, dtype: int64
[0 1 2]
[3 4 5]
BlockManager
Items: Index([u'a', u'a', u'a', u'b', u'b', u'b'], dtype='object')
Axis 1: Int64Index([0, 1, 2, 3], dtype='int64')
IntBlock: [b, b, b], 3 x 4, dtype: int64
IntBlock: [a, a, a], 3 x 4, dtype: int64
[0 1 2]
[3 4 5]

after consolidation
BlockManager
Items: Index([u'a', u'a', u'a', u'b', u'b', u'b'], dtype='object')
Axis 1: Int64Index([0, 1, 2, 3], dtype='int64')
IntBlock: [a, a, a, b, b, b], 6 x 4, dtype: int64
[0 1 2 3 4 5]
BlockManager
Items: Index([u'a', u'a', u'a', u'b', u'b', u'b'], dtype='object')
Axis 1: Int64Index([0, 1, 2, 3], dtype='int64')
IntBlock: [b, b, b, a, a, a], 6 x 4, dtype: int64
[3 4 5 0 1 2]
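The after-consolidation output can be read as follows: the two managers hold the same rows in a different block-internal order, and the locations record where each row belongs among the items. The numpy-only sketch below illustrates that reading. It assumes the locations mean "row i of the block sits at item position locs[i]", and `in_item_order` is a hypothetical helper, not a pandas function. It is one way to compare the two layouts, not a description of how the `equals` method in this PR is implemented.

```python
import numpy as np

# Same values as block1 and block2 in the example above.
a_rows = np.arange(12).reshape(3, 4)
b_rows = np.arange(12).reshape(3, 4) * 10

# Consolidated block of bm1: rows a, a, a, b, b, b with locations 0..5.
values1 = np.vstack([a_rows, b_rows])
locs1 = np.array([0, 1, 2, 3, 4, 5])

# Consolidated block of bm2: rows b, b, b, a, a, a with locations 3 4 5 0 1 2.
values2 = np.vstack([b_rows, a_rows])
locs2 = np.array([3, 4, 5, 0, 1, 2])


def in_item_order(values, locs):
    """Scatter block rows to their item positions (hypothetical helper)."""
    out = np.empty_like(values)
    out[locs] = values
    return out


# Comparing the raw block values depends on the block-internal order.
same_raw = np.array_equal(values1, values2)

# Comparing after scattering rows to their item positions does not.
same_ordered = np.array_equal(
    in_item_order(values1, locs1), in_item_order(values2, locs2)
)

print(same_raw, same_ordered)
```

Under that assumption the raw comparison reports a difference while the reordered comparison does not, which is why the locations matter when deciding whether two managers hold the same data.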