PERF: Release GIL on some datetime ops by chris-b1 · Pull Request #11263 · pandas-dev/pandas
This is a WIP, but far enough along I thought I'd share and see if the approach was reasonable.
This releases the GIL on most vectorized field accessors (e.g. dt.year) and on conversion to and from Period. There may be other places where it could be done; it would obviously be nice for parsing, but I'm not sure that's possible.
ohh nice!
can you share some timings?
move nogil outside the loop
Here are some timings - getting a pretty nice speedup with four threads. In the single-threaded case things are looking about flat.
In [1]: from pandas.util.testing import test_parallel

In [2]: dti = pd.date_range('1900-1-1', periods=100000)

In [3]: def f():
   ...:     for i in range(4):
   ...:         dti.year

In [4]: @test_parallel(4)
   ...: def g():
   ...:     dti.year

In [8]: %timeit f()
10 loops, best of 3: 25.8 ms per loop

In [9]: %timeit g()
100 loops, best of 3: 7.71 ms per loop
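For readers unfamiliar with the helper used in the timings above, here is a minimal sketch of what a decorator like test_parallel might do: run the wrapped function once in each of N threads and wait for all of them to finish. The name run_parallel and its implementation are assumptions for illustration, not the pandas source. Threads only overlap in a useful way when the wrapped function releases the GIL, which is what this PR does for accessors such as dt.year.

```python
import threading
from functools import wraps


def run_parallel(num_threads):
    """Hypothetical stand-in for a helper like test_parallel.

    Runs the decorated function once in each of `num_threads` threads
    and blocks until every thread has finished.
    """
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            threads = [
                threading.Thread(target=func, args=args, kwargs=kwargs)
                for _ in range(num_threads)
            ]
            for t in threads:
                t.start()
            for t in threads:
                t.join()
        return wrapper
    return decorator


calls = []


@run_parallel(4)
def work():
    # Placeholder workload; in the PR's benchmark this would be `dti.year`.
    calls.append(1)


work()  # runs the body of `work` in four threads
```

With this shape, f() in the session above does four sequential accessor calls, while g() does the same four calls concurrently, so the two timings compare equal amounts of work.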
It probably makes sense to define this as a C function and make it nogil (the days_per_month......)
If you declared field as char[:] instead would you be able to nogil the whole thing until raise?
@chris-b1 looks good. Can you add a whatsnew note (perf) and squash?
chris-b1 changed the title from "(WIP) PERF: Release GIL on some datetime ops" to "PERF: Release GIL on some datetime ops"
jreback added a commit that referenced this pull request
PERF: Release GIL on some datetime ops
@chris-b1 can you add these (clean then make again to see them)
warning: pandas/src/period.pyx:144:24: Use boundscheck(False) for faster access
warning: pandas/src/period.pyx:145:23: Use boundscheck(False) for faster access
warning: pandas/src/period.pyx:147:55: Use boundscheck(False) for faster access
warning: pandas/src/period.pyx:148:19: Use boundscheck(False) for faster access
warning: pandas/src/period.pyx:169:24: Use boundscheck(False) for faster access
warning: pandas/src/period.pyx:170:19: Use boundscheck(False) for faster access
warning: pandas/src/period.pyx:172:15: Use boundscheck(False) for faster access
warning: pandas/src/period.pyx:172:53: Use boundscheck(False) for faster access
building 'pandas._period' extension