BLD: Build wheels using cibuildwheel by lithomas1 · Pull Request #48283 · pandas-dev/pandas (original) (raw)

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service andprivacy statement. We’ll occasionally send you account related emails.

Already on GitHub?Sign in to your account

Conversation36 Commits10 Checks0 Files changed

Conversation

This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.Learn more about bidirectional Unicode characters

[ Show hidden characters]({{ revealButtonHref }})

lithomas1

We can't switch to this instantly since there's still a lot of work to be done, but this is a good start(the wheels build).
This PR can be merged anytime, though, and separate PRs can be put up to address issues.

The upload script probably won't work(it should, though, it is copy pasted from numpy's), but the only way to check is to merge this.

Current issues:

@lithomas1

@lithomas1

mroeschke

mroeschke

mroeschke

mroeschke

mroeschke

mroeschke

@lithomas1 @mroeschke

Co-Authored-By: Matthew Roeschke 10647082+mroeschke@users.noreply.github.com

@lithomas1

WillAyd

success = True
exception = None
repaired_wheel_path = os.path.join(dest_dir, wheel_name)
with zipfile.ZipFile(repaired_wheel_path, "a") as zipf:

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason for using zipfile versus wheel? Not sure what implementation differences there are but likely safer to use wheel unpack

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wheels are zip files underneath. I don't think using wheel is a good idea since wheel has no public Python API, and I don't want to do a bunch of subprocess calls here.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yea that's a bit unfortunate with wheel. The other consideration point is that the zipfile is mostly an implementation detail of wheel at the moment; may or may not last forever. There's been some talk about this over the past few years

https://discuss.python.org/t/making-the-wheel-format-more-flexible-for-better-compression-speed/3810

There's also apparently some file hashes that wheel takes care of that you wouldn't get this way (saw in wheel documentation). Not sure of the ultimate impacts of that

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can stick with this for now. I'm just not a huge fan of The file hashes might be important, but given that pip installs the wheel without complaint, I'm going to ignore it for now. If anything fails, it will fail loudly in the future.

try:
# TODO: figure out how licensing works for the redistributables
base_redist_dir = (
f"C:/Program Files (x86)/Microsoft Visual Studio/2019/"

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should maybe consider using a package to do this unless we have in house expertise on Windows libraries. delvewheel might be an option

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Last time I tried(~1 year back), delvewheel was doing creating extra directories in the repaired wheels.

Note: We test the pandas wheels on the windowsservercore docker images, which does not have the visual c++ redistributables. Any missing DLLS should fail the job there.

if wheel_path is None:
raise ValueError("Wheel path must be passed in if on 64-bit Windows")
print(f"Pulling docker image to test Windows 64-bit Python {py_ver}")
subprocess.run(f"docker pull python:{py_ver}-windowsservercore", check=True)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know if there will be too much improvement here(have not used that package myself). Looking at the examples, though, (e.g. client.containers.run("ubuntu:latest", "echo hello world")), it seems like the syntax is very similar to a subprocess call.

@lithomas1

@mroeschke

Looks pretty good to start. Could you show a commit where the workflow ran (and maybe save the wheel as an artifact)?

@lithomas1

@lithomas1

@lithomas1

If you scroll down on https://github.com/pandas-dev/pandas/actions/runs/3029167054, you can find the artifacts.

Note that the manylinux wheels are much larger, since I haven't passed the flags to strip wheels yet. I'll try to match flags in a follow up PR.
(The macosx wheels are normal size, even though they seem bigger since both the x86_64 and arm64 wheels are included in the artifact)

mroeschke

mroeschke

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One small comment otherwise looks okay to merge for now and do the follow ups in subsequent PRs.

@lithomas1

The last remaining blocker I think is for someone with access to the Anaconda org to add the keys to this repo.

@lithomas1

@lithomas1

@TomAugspurger Sorry for the ping, but do you know how to get the tokens for the staging and nightly anaconda uploads?
I think you were the one to add the tokens originally to the pandas-wheels repo.

@TomAugspurger

@TomAugspurger

Oh, or I can just set the environment variable in this repo. Let me do that quick.

@TomAugspurger

@lithomas1 there's now a PANDAS_NIGHTLY_UPLOAD_TOKEN secret in GitHub that should have permission to upload to that group in anaconda.org.

@lithomas1

@lithomas1 there's now a PANDAS_NIGHTLY_UPLOAD_TOKEN secret in GitHub that should have permission to upload to that group in anaconda.org.

Thanks, can you also add the token for multibuild-wheels-staging under PANDAS_STAGING_UPLOAD_TOKEN?

@TomAugspurger

@lithomas1

@WillAyd Can you re-review again when you have time?
I think your review is blocking the merge.

(No rush though, this can go in anytime before 1.6/2.0)

WillAyd

@WillAyd

@lithomas1 sorry about that. Let's see how this goes - thanks for the ongoing improvements

mroeschke

This was referenced

Sep 21, 2022

@lithomas1

@lithomas1

@lithomas1

Going to put this in now given the approvals. Let's see how the nightlies work out this night.

phofl pushed a commit to phofl/pandas that referenced this pull request

Sep 22, 2022

Co-Authored-By: Matthew Roeschke 10647082+mroeschke@users.noreply.github.com

Co-authored-by: Matthew Roeschke 10647082+mroeschke@users.noreply.github.com

mroeschke added a commit that referenced this pull request

Sep 26, 2022

…8662)

Co-authored-by: Matthew Roeschke 10647082+mroeschke@users.noreply.github.com

scorecard-action does not have a major version tag.

Temporarily disabling github.repository check to ensure action now works.

https://pandas.pydata.org/docs/reference/api/pandas.merge.html It should be "str | bool" instead of just string

fixed type hint in merge.py

Update indicator type hint in _MergeOperation

Added type hint _MergeOperation init

DOC: Document default value for options.display.max_cols

display.max_cols has a default value of 20 when not running in a terminal such as Jupyter Notebook

Co-Authored-By: Matthew Roeschke 10647082+mroeschke@users.noreply.github.com

Co-authored-by: Matthew Roeschke 10647082+mroeschke@users.noreply.github.com

Co-authored-by: Matthew Roeschke 10647082+mroeschke@users.noreply.github.com Co-authored-by: Ralf Gommers ralf.gommers@gmail.com Co-authored-by: Marc Garcia garcia.marc@gmail.com Co-authored-by: Luke Manley lukemanley@gmail.com Co-authored-by: Siddhartha Gandhi siddhartha.a.gandhi@gmail.com Co-authored-by: Torsten Wörtwein twoertwein@users.noreply.github.com Co-authored-by: Xiao Yuan yuanx749@gmail.com Co-authored-by: paradox-lab 57354735+paradox-lab@users.noreply.github.com Co-authored-by: Pedro Nacht 15221358+pnacht@users.noreply.github.com Co-authored-by: dataxerik dsshar@gmail.com Co-authored-by: jbrockmendel jbrockmendel@gmail.com Co-authored-by: Pablo 48098178+PabloRuizCuevas@users.noreply.github.com Co-authored-by: tmoschou 5567550+tmoschou@users.noreply.github.com Co-authored-by: Thomas Li 47963215+lithomas1@users.noreply.github.com Co-authored-by: Richard Shadrach 45562402+rhshadrach@users.noreply.github.com

noatamir pushed a commit to noatamir/pandas that referenced this pull request

Nov 9, 2022

Co-Authored-By: Matthew Roeschke 10647082+mroeschke@users.noreply.github.com

Co-authored-by: Matthew Roeschke 10647082+mroeschke@users.noreply.github.com

noatamir pushed a commit to noatamir/pandas that referenced this pull request

Nov 9, 2022

…ndas-dev#48662)

Co-authored-by: Matthew Roeschke 10647082+mroeschke@users.noreply.github.com

scorecard-action does not have a major version tag.

Temporarily disabling github.repository check to ensure action now works.

https://pandas.pydata.org/docs/reference/api/pandas.merge.html It should be "str | bool" instead of just string

fixed type hint in merge.py

Update indicator type hint in _MergeOperation

Added type hint _MergeOperation init

DOC: Document default value for options.display.max_cols

display.max_cols has a default value of 20 when not running in a terminal such as Jupyter Notebook

Co-Authored-By: Matthew Roeschke 10647082+mroeschke@users.noreply.github.com

Co-authored-by: Matthew Roeschke 10647082+mroeschke@users.noreply.github.com

Co-authored-by: Matthew Roeschke 10647082+mroeschke@users.noreply.github.com Co-authored-by: Ralf Gommers ralf.gommers@gmail.com Co-authored-by: Marc Garcia garcia.marc@gmail.com Co-authored-by: Luke Manley lukemanley@gmail.com Co-authored-by: Siddhartha Gandhi siddhartha.a.gandhi@gmail.com Co-authored-by: Torsten Wörtwein twoertwein@users.noreply.github.com Co-authored-by: Xiao Yuan yuanx749@gmail.com Co-authored-by: paradox-lab 57354735+paradox-lab@users.noreply.github.com Co-authored-by: Pedro Nacht 15221358+pnacht@users.noreply.github.com Co-authored-by: dataxerik dsshar@gmail.com Co-authored-by: jbrockmendel jbrockmendel@gmail.com Co-authored-by: Pablo 48098178+PabloRuizCuevas@users.noreply.github.com Co-authored-by: tmoschou 5567550+tmoschou@users.noreply.github.com Co-authored-by: Thomas Li 47963215+lithomas1@users.noreply.github.com Co-authored-by: Richard Shadrach 45562402+rhshadrach@users.noreply.github.com

Labels

Build

Library building on various platforms

CI

Continuous Integration

Release