PERF: Improve Styler to_excel Performance by tehunter · Pull Request #47371 · pandas-dev/pandas
- closes PERF: Excel Styler treatment of CSS side expansion is slow #47352
- Tests added and passed if fixing a bug or adding a new feature
- All code checks passed.
- Added type annotations to new arguments/methods/functions.
- Added an entry in the latest `doc/source/whatsnew/vX.X.X.rst` file if fixing a bug or adding a new feature.
Thomas Hunter and others added 5 commits
Hello @tehunter! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:
There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻
Comment last updated at 2022-06-29 14:18:30 UTC
ASV Results:
.. before after ratio
[6f0be79b] [db365209]
<styler-performance>
- 915±200ms 594±30ms 0.65 io.excel.WriteExcel.time_write_excel('openpyxl')
- 658±200ms 323±20ms 0.49 io.excel.WriteExcel.time_write_excel('xlsxwriter')
- 3.08±0.1s 934±40ms 0.30 io.excel.WriteExcelStyled.time_write_excel_style('xlsxwriter')
SOME BENCHMARKS HAVE CHANGED SIGNIFICANTLY.
PERFORMANCE INCREASED.
tehunter changed the title from "PERF: Improve Styler Performance" to "PERF: Improve Styler to_excel Performance"
@github-actions pre-commit
tehunter marked this pull request as ready for review
Overview of changes:
- Moved the `CSSResolver.expand_*` methods (which are accessed in `.atomize`) to a dictionary called `CSS_EXPANSIONS`, which is keyed directly by the CSS property instead of requiring a string replace and concatenation.
- `CSSResolver.__call__` now accepts an iterable of declaration tuples `(property, value)` in addition to the previous string type.
  - When an iterable is passed, `.parse` does not need to be called.
- `CSSExcelCell.__init__` no longer converts the css_styles (i.e. `Styler.ctx`) from a list of tuples to a CSS string. Instead, the list is converted to a frozenset via a dictionary (to keep only the final instance of each property declaration).
  - The list of tuples was already obtained from the initial CSS string(s) in `Styler._update_ctx`, so we were previously parsing, reforming into a string, and parsing again.
- `CSSToExcelConverter.__call__` is now cached using `lru_cache`. The frozenset was necessary to make all the arguments hashable for the caching mechanism.
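The changes listed above can be illustrated with a minimal standalone sketch. This is not the pandas implementation: `EXPANSIONS`, `atomize`, `resolve`, `unique_declarations`, and `convert` are hypothetical stand-ins, and only the side-shorthand expansion is modelled. It shows the dictionary lookup keyed by property, the acceptance of either a CSS string or an iterable of `(property, value)` tuples, and the dict-then-frozenset step that makes the input hashable for `functools.lru_cache`.

```python
from functools import lru_cache
from typing import Callable, Dict, FrozenSet, Iterable, Iterator, Tuple, Union

Declaration = Tuple[str, str]

SIDES = ("top", "right", "bottom", "left")
# CSS side shorthand: number of values given -> value index used for each side.
SIDE_MAPPING = {1: (0, 0, 0, 0), 2: (0, 1, 0, 1), 3: (0, 1, 2, 1), 4: (0, 1, 2, 3)}


def _side_expander(prop_fmt: str) -> Callable[[str], Iterator[Declaration]]:
    """Build an expander for a shorthand such as ``margin`` or ``border-width``."""

    def expand(value: str) -> Iterator[Declaration]:
        tokens = value.split()
        mapping = SIDE_MAPPING.get(len(tokens))
        if mapping is None:
            return  # malformed shorthand: emit nothing in this sketch
        for side, idx in zip(SIDES, mapping):
            yield prop_fmt.format(side), tokens[idx]

    return expand


# Hypothetical stand-in for a table keyed directly by the CSS property.
EXPANSIONS: Dict[str, Callable[[str], Iterator[Declaration]]] = {
    "margin": _side_expander("margin-{}"),
    "padding": _side_expander("padding-{}"),
    "border-width": _side_expander("border-{}-width"),
}


def parse(css: str) -> Iterator[Declaration]:
    """Split a CSS string into (property, value) pairs."""
    for decl in css.split(";"):
        prop, sep, value = decl.partition(":")
        if sep and prop.strip():
            yield prop.strip().lower(), value.strip()


def atomize(declarations: Iterable[Declaration]) -> Iterator[Declaration]:
    """Expand shorthands with a single dict lookup per declaration."""
    for prop, value in declarations:
        prop = prop.lower()
        expander = EXPANSIONS.get(prop)
        if expander is None:
            yield prop, value
        else:
            yield from expander(value)


def resolve(declarations: Union[str, Iterable[Declaration]]) -> Dict[str, str]:
    """Accept a CSS string or pre-parsed tuples; only strings are parsed."""
    if isinstance(declarations, str):
        declarations = parse(declarations)
    return dict(atomize(declarations))


def unique_declarations(styles: Iterable[Declaration]) -> FrozenSet[Declaration]:
    """Keep the last declaration per property, then freeze for hashing."""
    return frozenset({prop.lower(): val for prop, val in styles}.items())


@lru_cache(maxsize=None)
def convert(declarations: FrozenSet[Declaration]) -> Dict[str, str]:
    # The cached dict is shared between calls, so callers should not mutate it.
    return resolve(declarations)
```

Because a frozenset has no order, two cells that carry the same final declarations in a different order map to the same cache entry. The sketch does not define what happens when a shorthand and one of its longhands (for example `margin` and `margin-top`) appear together in one frozenset.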
Benchmarks as of 8e56402
```
       before           after         ratio
     [2b1184dd]       [8e56402e]
     <styler-performance>
-        260±10ms          233±5ms     0.90  io.excel.WriteExcel.time_write_excel('xlsxwriter')
-        620±70ms         450±20ms     0.73  io.excel.WriteExcel.time_write_excel('openpyxl')
-      2.51±0.02s        1.42±0.3s     0.57  io.excel.WriteExcelStyled.time_write_excel_style('openpyxl')
-      1.69±0.07s         608±10ms     0.36  io.excel.WriteExcelStyled.time_write_excel_style('xlsxwriter')
```
yehoshuadimarsky pushed a commit to yehoshuadimarsky/pandas that referenced this pull request
Move CSS expansion lookup to dictionary
Implement simple CSSToExcelConverter cache
Eliminate list -> str -> list in CSSResolver
Allow for resolution of duplicate properties
Add performance benchmark for styled Excel
CLN: Clean up PEP8 issues
DOC: Update PR documentation
CLN: Clean up PEP8 issues
Fixes from pre-commit [automated commit]
Make Excel CSS case-insensitive
Test for ordering and caching
Pre-commit fixes
Remove built-in filter
Increase maxsize of Excel cache
Co-authored-by: Thomas Hunter <Thomas.Hunter@ibm.com>