gh-115399: Upgrade bundled libexpat to 2.6.0 by sethmlarson · Pull Request #115431 · python/cpython (original) (raw)

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service andprivacy statement. We’ll occasionally send you account related emails.

Already on GitHub?Sign in to your account

Conversation16 Commits2 Checks0 Files changed

Conversation

This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.Learn more about bidirectional Unicode characters

[ Show hidden characters]({{ revealButtonHref }})

sethmlarson

Closes #115399

cc @hartwork

ambv

@@ -3720,7 +3720,7 @@ AS_VAR_IF([with_system_expat], [yes], [
LIBEXPAT_LDFLAGS=${LIBEXPAT_LDFLAGS-"-lexpat"}
LIBEXPAT_INTERNAL=
], [
LIBEXPAT_CFLAGS="-I\$(srcdir)/Modules/expat"
LIBEXPAT_CFLAGS="-I\$(srcdir)/Modules/expat -DXML_GE=1"

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what's that?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have to make a choice, error message from the Windows builds that are now failing cuz I need to update them as well:

XML_GE (for general entities) must be defined, non-empty, either 1 or 0 (0 to disable, 1 to enable; 1 is a common default

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should probably update the default value in line 3719 as well.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what's that?

@ambv "GE" is short for general entities. The flag is documented in detail at https://libexpat.github.io/doc/api/latest/#XML_GE .

@sethmlarson I would have expected (meaning considered likely) a new line #define XML_GE 1 to go into CPython's own expat_config.h. Did you avoid that file on purpose? Its sibling XML_DTD seems to live there.

You should probably update the default value in line 3719 as well.

For anyone else wondering where (or what the line content is), that seems to be:

LIBEXPAT_CFLAGS=${LIBEXPAT_CFLAGS-""}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@hartwork Thank you for this, I don't know how but my grep XML_DTD turned up no results. That's definitely where it should go.

@hartwork

[..] I set it to 1 to keep general entities support. Our test suite didn't fail either way with this setting but I'm assuming we want to keep support?

@sethmlarson good call, yes. On a sidenote XML_DTD requires XML_GE so it's the only choice here with XML_DTD active anyway. Good 👍

@sethmlarson

Thanks for the guidance @hartwork! I rearranged the configuration to do as you said.

I also split the update itself into a separate commit from the SBOM update (which removes expat/expat_config.h from being a part of the expat project) so it's easier to backport to older branches. This PR can directly be backported to 3.12 but 3.11 and earlier will need only a011b5f

hartwork

Comment on lines 67 to 71

/* Namespace external symbols to allow multiple libexpat version to
co-exist. */
#include "pyexpatns.h"

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sethmlarson could it be that this removal is accidental?

PS: Here's the Dockerfile that made me spot this, based on earlier version #98742 (review):

Copyright (c) 2022-2024 Sebastian Pipping sebastian@pipping.org

Licensed under the Apache License version 2.0

FROM alpine RUN apk add --update
diffutils
git
sed
&&
git clone --depth 1 https://github.com/python/cpython cpython-main
&&
( cd cpython-main && git rev-parse HEAD )
&&
git clone --depth 1 --branch libexpat-2.6.0 https://github.com/sethmlarson/cpython cpython-pr
&&
( cd cpython-pr && git rev-parse HEAD )
&&
git config --global advice.detachedHead false
&&
git clone --depth 1 --branch R_2_5_0 https://github.com/libexpat/libexpat libexpat_2_5_0
&&
git clone --depth 1 --branch R_2_6_0 https://github.com/libexpat/libexpat libexpat_2_6_0
&&
diff -r -u libexpat_2_5_0/expat/lib/ cpython-main/Modules/expat/ | tee 2-5-0.diff
&&
diff -r -u libexpat_2_6_0/expat/lib/ cpython-pr/Modules/expat/ | tee 2-6-0.diff
&&
sed -e '/^Only in /d' -e '/^(+++|---) /d' -e '/^diff /d' -i 2-5-0.diff 2-6-0.diff
&&
diff -u 2-5-0.diff 2-6-0.diff
&&
echo 'Diff is good.'

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@hugovk It feels like now we have to automate this diff comparison. Do you have any good ideas on when to trigger the workflow smartly?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@corona10 did you mean me or @hugovk?

To automate this, you'd need to extract an old and a new version from the diff between the push HEAD and the base commit and so on. I'm not sure if Expat releases are frequent enough to build automation like that and keep it working.

Pull request #100539 is about something related but different, but maybe still of interest to you.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@corona10 PS: maybe a cheap version to cover the core need (unless misunderstood) would be to make CI assert that that file still includes file pyexpatns.h using grep? That would be orders of magnitude simpler 😃

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for catching this, I've updated my commit to not remove these lines.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@hartwork
I meant @hugovk since he has good insight into automation workflow in CPython.
But in any case, the link you introduced looks cool to me, I think that it would be worth to take a look at :)

@sethmlarson

@sethmlarson

Yhg1s

hartwork

ambv

ambv approved these changes Feb 14, 2024

@sethmlarson

Backport to 3.11 and prior of this PR: #115468

3.12 will be handled automatically once this PR merges.

@miss-islington-app

Thanks @sethmlarson for the PR, and @Yhg1s for merging it 🌮🎉.. I'm working now to backport this PR to: 3.12.
🐍🍒⛏🤖

miss-islington pushed a commit to miss-islington/cpython that referenced this pull request

Feb 14, 2024

@sethmlarson @miss-islington

(cherry picked from commit 4b2d178)

Co-authored-by: Seth Michael Larson seth@python.org

@bedevere-app

gpshead pushed a commit that referenced this pull request

Feb 14, 2024

@miss-islington @sethmlarson

)

gh-115399: Upgrade bundled libexpat to 2.6.0 (GH-115431) (cherry picked from commit 4b2d178)

Co-authored-by: Seth Michael Larson seth@python.org

Reviewers

@ambv ambv ambv approved these changes

@hartwork hartwork hartwork approved these changes

@Yhg1s Yhg1s Yhg1s approved these changes

@erlend-aasland erlend-aasland Awaiting requested review from erlend-aasland

@corona10 corona10 Awaiting requested review from corona10

@ericsnowcurrently ericsnowcurrently Awaiting requested review from ericsnowcurrently ericsnowcurrently is a code owner