ENH: support .strftime for datetimelikes (closes #10086) by mortada · Pull Request #10110 · pandas-dev/pandas

mortada

@jreback

This works for DatetimeIndex, but something is broken for PeriodIndex. It should not work for TimedeltaIndex (I don't think the strftime codes are meaningful there). We also want to define .strftime on the index itself (maybe def strftime(self, format):).

needs tests for the same

@mortada

that makes sense

I've added to both DatetimeIndex and PeriodIndex. Also added a few tests.

@jreback

pls add strftime to API.rst
also I think a mini section in timeseries.rst would be ok too

jreback

@@ -694,6 +694,21 @@ def astype(self, dtype):
         else:  # pragma: no cover
             raise ValueError('Cannot cast DatetimeIndex to dtype %s' % dtype)
+    def strftime(self, date_format):

hmm, not to make this more complicated, but maybe move this to tseries/base/DatelikeOps (a new mixin) that both DatetimeIndex and PeriodIndex, but NOT TimedeltaIndex, inherit from.

I thought about something like that too, good idea for code reuse! Will update shortly
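
The mixin idea discussed above can be sketched in plain Python. This is only an illustration of the pattern, not the code from the PR: the `Toy*` classes and the `values` attribute are hypothetical stand-ins, and only the name `DatelikeOps` comes from the review comment.

```python
from datetime import datetime, timedelta


class DatelikeOps:
    """Mixin holding formatting shared by date-like containers (sketch)."""

    def strftime(self, date_format):
        # Assumes the host class stores its elements in ``self.values``.
        return [v.strftime(date_format) for v in self.values]


class ToyDatetimeIndex(DatelikeOps):
    # Hypothetical stand-in for an index of datetimes.
    def __init__(self, values):
        self.values = list(values)


class ToyTimedeltaIndex:
    # Deliberately does NOT inherit DatelikeOps: strftime codes are not
    # meaningful for durations, so the method is simply absent here.
    def __init__(self, values):
        self.values = list(values)


dti = ToyDatetimeIndex([datetime(2015, 5, 1), datetime(2015, 5, 2)])
formatted = dti.strftime("%Y/%m/%d")

tdi = ToyTimedeltaIndex([timedelta(days=1)])
has_strftime = hasattr(tdi, "strftime")
```

Keeping the method on a mixin means a class opts in by inheritance, so the timedelta container never exposes `.strftime` at all rather than raising at call time.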

@mortada

@mortada

@jreback I added a mini section to basics.rst instead, but otherwise this is ready for review

jreback

@@ -1164,12 +1164,28 @@ You can also chain these types of operations:
s.dt.tz_localize('UTC').dt.tz_convert('US/Eastern')
You can also format the datetime values with ``Series.dt.strftime``:

maybe be explicit and say that this is a string

@jreback

can you update for comments. This is also related to #10442. I want this method to be the 'primary' date-string formatting (though .astype(str) will be supported by calling what this calls, namely .format), but this should be the go-to method

copy your example that you added to basics.rst and put in 0.17.0 (in a separate section, maybe datetime-string formatting or somesuch)
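
For readers following along, a minimal sketch of the usage being discussed, assuming a pandas version that includes this feature. The sample dates and format strings are made up for illustration. The container type returned by the index-level call has varied across pandas versions, so the results are converted to plain lists before inspection.

```python
import pandas as pd

# Series of datetimes: format through the .dt accessor.
s = pd.Series(pd.date_range("2013-01-01", periods=3))
as_text = s.dt.strftime("%Y/%m/%d")

# The same method is available on the index classes themselves.
idx = pd.date_range("2013-01-01", periods=3)
idx_text = idx.strftime("%Y/%m/%d")

# PeriodIndex is also covered by this PR.
pidx = pd.period_range("2013-01", periods=2, freq="M")
p_text = pidx.strftime("%Y-%m")

series_strings = as_text.tolist()
index_strings = list(idx_text)
period_strings = list(p_text)
```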

@mortada

@jreback updated according to your comments except for one thing: I couldn't actually find an invalid formatting string that would cause strftime to raise an exception

jreback

@@ -1238,12 +1238,28 @@ You can also chain these types of operations:
s.dt.tz_localize('UTC').dt.tz_convert('US/Eastern')
You can also format datetime values as strings with ``Series.dt.strftime``:
.. ipython:: python

can u add a reference to the standard library docs for the strftime format here
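
Since the format codes are the ones from the standard library, a short standard-library sketch shows what a few of them produce. The timestamp is an arbitrary example, and only locale-independent codes are used.

```python
from datetime import datetime

moment = datetime(2015, 7, 4, 13, 5, 9)

# %Y year, %m month, %d day, %H hour (24h), %M minute, %S second
iso_like = moment.strftime("%Y-%m-%d %H:%M:%S")

# Codes can be concatenated without separators.
compact = moment.strftime("%Y%m%d")

# %j is the zero-padded day of the year.
day_of_year = moment.strftime("%j")
```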

jreback

@@ -32,6 +32,23 @@ Other enhancements
^^^^^^^^^^^^^^^^^^
- ``.as_blocks`` will now take a ``copy`` optional argument to return a copy of the data, default is to copy (no change in behavior from prior versions), (:issue:`9607`)
- Support ``.strftime`` for datetime-likes (:issue:`10110`)
For example:

I think this all needs to be indented 2 spaces (to form a sub-block)

@jorisvandenbossche

We also have a tslib.format_array_from_datetime function. Can you check if this is faster? (also uses strftime, but the loop is cythonized)

@mortada

jreback

imask = ~mask
values[imask] = np.array([u('%s') % dt for dt in values[imask]])
formatter = lambda dt: dt.strftime(date_format) if date_format else u('%s') % dt

put the if on the outside (and lambdas on the inside), then check is only done once
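
A minimal sketch of that suggestion, using plain `datetime` objects rather than the pandas internals. `make_formatter` is a hypothetical helper name, not something from the PR. The branch on `date_format` runs once when the formatter is chosen, instead of once per element inside the lambda.

```python
from datetime import datetime


def make_formatter(date_format=None):
    # The check happens here, once, rather than inside the per-element call.
    if date_format:
        return lambda dt: dt.strftime(date_format)
    return lambda dt: "%s" % dt


values = [datetime(2015, 1, 1), datetime(2015, 1, 2)]

formatter = make_formatter("%Y%m%d")
formatted = [formatter(dt) for dt in values]

default_formatter = make_formatter()
default_formatted = [default_formatter(dt) for dt in values]
```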

@jreback

@jorisvandenbossche this calls .format(date_format=...) which ultimately calls the cython tslib.format_array_from_datetime, so this is just syntactic sugar.

@mortada

@jreback reordered the lambdas and the if/else as you suggested, also added more unit tests

@jreback

ok, lgtm. ping when green.

@mortada

jorisvandenbossche

@@ -1280,12 +1280,29 @@ You can also chain these types of operations:
s.dt.tz_localize('UTC').dt.tz_convert('US/Eastern')
You can also format datetime values as strings with ``Series.dt.strftime`` which

you can use :meth:`Series.dt.strftime` here as well, then it makes the link to the doc page

although that is maybe not that important here as it is exactly the same as standard library

sure will do

@mortada

@mortada

jreback

@@ -2362,7 +2362,7 @@ def test_map(self):
         f = lambda x: x.strftime('%Y%m%d')
         result = rng.map(f)
         exp = [f(x) for x in rng]
-        self.assert_numpy_array_equal(result, exp)
+        np.testing.assert_array_equal(result, exp)

change back. we don't use np.testing.* directly

@jreback I changed it because assert_numpy_array_equal() actually would fail in some of the testing environments. It seems like a problem with the older numpy version (1.8.x)

what should I change it to instead?

ahh, you are not comparing numpy arrays (they are actually a list and an array), use exp = np.asarray(....) and I think it will work, you can also try tm.assert_almost_equal(...)

scratch that first part, use tm.assert_almost_equal; I thought you were comparing the output of .strftime which was an np.array

ah cool I think tm.assert_almost_equal works, thanks! will update shortly
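
The list-versus-array mismatch mentioned in this thread can be illustrated with NumPy alone. This is a sketch with made-up values and does not use the pandas test helpers: one operand is an `ndarray`, the other a plain list, so a comparison that checks container types will see them as different even though the elements match.

```python
import numpy as np

# Stand-ins for the test's operands: an array result and a list of expected values.
result = np.array(["20150101", "20150102"], dtype=object)
exp = ["20150101", "20150102"]

# The containers differ in type even though the contents agree.
same_type = type(result) is type(exp)

# Converting the expected values puts both sides on equal footing.
exp_arr = np.asarray(exp, dtype=object)
same_type_after = type(result) is type(exp_arr)
same_values = bool((result == exp_arr).all())
```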

@jreback

@mortada couple of comments. ping when green.

@mortada

@mortada

@jreback