BUG: CategoricalIndex.format by topper-123 · Pull Request #35440 · pandas-dev/pandas


Thanks @topper-123 for the PR.

The regression was caused by #35118. Categorical types other than object were also affected, so the test may need to be parametrised with other values for cols.

>>> pd.__version__
'1.2.0.dev0+10.g3b1d4f1ee'
>>>
>>> data = [[4, 2], [3, 2], [4, 3]]
>>> cols = [1, None]
>>> res = pd.DataFrame(data, columns=cols)
>>> print(res)
   1  NaN
0  4    2
1  3    2
2  4    3
>>>
>>> res = pd.DataFrame(data, columns=pd.CategoricalIndex(cols))
>>> print(res)
   1    NaN
0    4    2
1    3    2
2    4    3
>>>

>>> pd.__version__
'1.0.5'
>>>
>>> data = [[4, 2], [3, 2], [4, 3]]
>>> cols = [1, None]
>>> res = pd.DataFrame(data, columns=cols)
>>> print(res)
   1  NaN
0  4    2
1  3    2
2  4    3
>>>
>>> res = pd.DataFrame(data, columns=pd.CategoricalIndex(cols))
>>> print(res)
   1.0  NaN
0    4    2
1    3    2
2    4    3
>>>

I've temporarily put the whatsnew entry in the v1.1.0 release notes, because there isn't a v1.1.1 release note yet.

doc\source\whatsnew\v1.1.1.rst is now merged to master.

@@ -197,9 +197,6 @@ def _format_data(self, name=None):
        # we are formatting thru the attributes
        return None

    def _format_with_header(self, header, na_rep="NaN") -> List[str]:
        return header + [pprint_thing(x) for x in self._range]

The added tests revealed that this method in master made the output from RangeIndex.format different than for Int64Index.format:

>>> pd.RangeIndex(0, 18, 2).format()
['0', '2', '4', '6', '8', '10', '12', '14', '16']
>>> pd.Int64Index(range(0, 18, 2)).format()
['0 ', '2 ', '4 ', '6 ', '8 ', '10', '12', '14', '16']

Notice the extra space for one-digit scalars in the Int64Index case. The outputs from the two methods are identical after merging this PR.
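
For readers who want to see the difference without a pandas checkout, here is a minimal pure-Python sketch of the two behaviours. The helper names `format_plain` and `format_justified` are hypothetical and are not pandas internals. The sketch assumes the Int64Index path left-justifies every rendered value to the width of the longest one, which is what the output above suggests.

```python
from typing import Iterable, List


def format_plain(values: Iterable[int]) -> List[str]:
    """Render each value on its own, with no padding (the RangeIndex output above)."""
    return [str(v) for v in values]


def format_justified(values: Iterable[int]) -> List[str]:
    """Left-justify each rendered value to the longest width (assumed Int64Index behaviour)."""
    rendered = [str(v) for v in values]
    # default=0 keeps the empty case from raising in max()
    width = max((len(s) for s in rendered), default=0)
    return [s.ljust(width) for s in rendered]


values = range(0, 18, 2)
plain = format_plain(values)
justified = format_justified(values)

print(plain)      # ['0', '2', '4', '6', '8', '10', '12', '14', '16']
print(justified)  # ['0 ', '2 ', '4 ', '6 ', '8 ', '10', '12', '14', '16']
```

The only difference is the padding step, which is why one-digit values pick up a trailing space once a two-digit value is present.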

meeseeksmachine pushed a commit to meeseeksmachine/pandas that referenced this pull request

Aug 4, 2020

simonjayhawkins pushed a commit that referenced this pull request

Aug 4, 2020

Co-authored-by: Terji Petersen <contribute@tensortable.com>