Alpha node swap support by ehashman · Pull Request #102823 · kubernetes/kubernetes

Conversation 57 · Commits 10 · Checks 0 · Files changed

ehashman

What type of PR is this?

/kind feature
/kind api-change
/sig node

What this PR does / why we need it:

Adds swap support per KEP-2400.

Which issue(s) this PR fixes:

Fixes #53533.

Special notes for your reviewer:

Design details in https://github.com/kubernetes/enhancements/blob/master/keps/sig-node/2400-node-swap/README.md#design-details

Does this PR introduce a user-facing change?

Alpha swap support can now be enabled on Kubernetes nodes with the NodeSwapEnabled feature flag. See <website link> for details.

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:

https://github.com/kubernetes/enhancements/blob/master/keps/sig-node/2400-node-swap/

@k8s-ci-robot

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@k8s-ci-robot k8s-ci-robot added release-note

Denotes a PR that will be considered when it comes time to generate release notes.

do-not-merge/work-in-progress

Indicates that a PR should not merge because it is a work in progress.

kind/feature

Categorizes issue or PR as related to a new feature.

size/M

Denotes a PR that changes 30-99 lines, ignoring generated files.

kind/api-change

Categorizes issue or PR as related to adding, removing, or otherwise changing an API

sig/node

Categorizes an issue or PR as relevant to SIG Node.

cncf-cla: yes

Indicates the PR's author has signed the CNCF CLA.

needs-triage

Indicates an issue or PR lacks a `triage/foo` label and requires one.

needs-priority

Indicates a PR lacks a `priority/foo` label and requires one.

labels

Jun 12, 2021

@ehashman

@k8s-ci-robot

@ehashman: The specified target(s) for /test were not found.
The following commands are available to trigger jobs:

Use /test all to run the following jobs:

In response to this:

/test verify

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@ehashman

/test pull-kubernetes-verify

@k8s-ci-robot k8s-ci-robot added size/XXL

Denotes a PR that changes 1000+ lines, ignoring generated files.

and removed size/M

Denotes a PR that changes 30-99 lines, ignoring generated files.

labels

Jun 16, 2021

@ehashman

/test pull-kubernetes-verify

@ehashman

@ehashman ehashman marked this pull request as ready for review

June 16, 2021 01:12

@ehashman

/test pull-kubernetes-node-kubelet-swap-ubuntu
/test pull-kubernetes-node-kubelet-swap-fedora

@k8s-ci-robot

@ehashman: The specified target(s) for /test were not found.
The following commands are available to trigger jobs:

Use /test all to run the following jobs:

In response to this:

/test pull-kubernetes-node-kubelet-swap-ubuntu
/test pull-kubernetes-node-kubelet-swap-fedora

@ehashman

derekwaynecarr

Two updates requested:

Thanks!

switch m.memorySwapBehavior {
case kubelettypes.UnlimitedSwap:
	// -1 = unlimited swap
	lc.Resources.MemorySwapLimitInBytes = -1

Just recording this, as it swaps out of my own mental cache sometimes.

If a container has a defined memory limit X, it will still have MemoryLimitInBytes=X, but if UnlimitedSwap is enabled it may now use unbounded additional swap, because MemorySwapLimitInBytes is set to -1. This is consistent with the existing behavior when --fail-swap-on was false, because no kubelet-enforced limit was written.
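The rule in this comment can be sketched in isolation. This is a minimal illustration, not the kubelet code: `SwapBehavior`, `LinuxResources`, and `applySwapBehavior` are hypothetical stand-ins, and the `default` branch (memory+swap capped at the memory limit) is an assumption that the snippet under review does not show.

```go
package main

import "fmt"

// SwapBehavior is a hypothetical stand-in for the kubelet's swap behavior type.
type SwapBehavior string

const (
	UnlimitedSwap SwapBehavior = "UnlimitedSwap"
	LimitedSwap   SwapBehavior = "LimitedSwap"
)

// LinuxResources keeps only the two fields discussed in the review comment.
type LinuxResources struct {
	MemoryLimitInBytes     int64
	MemorySwapLimitInBytes int64
}

// applySwapBehavior sets the combined memory+swap limit.
// The memory limit itself is never modified.
func applySwapBehavior(r *LinuxResources, behavior SwapBehavior) {
	switch behavior {
	case UnlimitedSwap:
		// -1 = unlimited swap, as in the snippet under review.
		r.MemorySwapLimitInBytes = -1
	default:
		// Assumption: any other behavior caps memory+swap at the memory
		// limit, leaving no swap headroom above that limit.
		r.MemorySwapLimitInBytes = r.MemoryLimitInBytes
	}
}

func main() {
	r := LinuxResources{MemoryLimitInBytes: 512 << 20}
	applySwapBehavior(&r, UnlimitedSwap)
	fmt.Println(r.MemoryLimitInBytes, r.MemorySwapLimitInBytes)
}
```

Keeping the memory limit and the memory+swap limit as separate fields is what lets a container keep its limit X while swap beyond it is left unbounded.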

@@ -89,6 +90,21 @@ func (m *kubeGenericRuntimeManager) generateLinuxContainerConfig(container *v1.C
lc.Resources.HugepageLimits = GetHugepageLimitsFromResources(container.Resources)
if utilfeature.DefaultFeatureGate.Enabled(kubefeatures.NodeSwapEnabled) {

I think we need a similar change in ResourceConfigForPod for the pod-level cgroup settings created by the pod cgroup manager; I would expect them to match the container settings. I think memory-backed volumes could ultimately use swap, but would like @sjenning to confirm. Either way, the cgroup memory settings should match at pod and container scopes.

@derekwaynecarr

@ehashman

@derekwaynecarr

thanks for updates.

/lgtm
/approve

@k8s-ci-robot k8s-ci-robot added the lgtm

"Looks good to me", indicates that a PR is ready to be merged.

label

Jul 6, 2021

@ehashman

@ehashman

@liggitt

/approve
for API/config bits

@k8s-ci-robot

@k8s-ci-robot k8s-ci-robot added the approved

Indicates a PR has been approved by an approver from all required OWNERS files.

label

Jul 6, 2021

@fejta-bot

/retest
This bot automatically retries jobs that failed/flaked on approved PRs (send feedback to fejta).

Review the full test history for this PR.

Silence the bot with an /lgtm cancel or /hold comment for consistent failures.

@k8s-ci-robot

@ehashman: The following tests failed, say /retest to rerun all failed tests:

Test name | Commit | Details | Rerun command
--- | --- | --- | ---
pull-kubernetes-e2e-gce-alpha-features | 5584725 | link | /test pull-kubernetes-e2e-gce-alpha-features
pull-kubernetes-node-swap-fedora | 5584725 | link | /test pull-kubernetes-node-swap-fedora
pull-kubernetes-node-kubelet-swap-ubuntu | 5584725 | link | /test pull-kubernetes-node-kubelet-swap-ubuntu

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

tengqm added a commit to tengqm/website that referenced this pull request

Jul 8, 2021

@tengqm

tengqm added a commit to tengqm/website that referenced this pull request

Jul 9, 2021

@tengqm

tengqm added a commit to tengqm/website that referenced this pull request

Jul 11, 2021

@tengqm

tengqm added a commit to tengqm/website that referenced this pull request

Jul 18, 2021

@tengqm

timebertt added a commit to timebertt/gardener that referenced this pull request

Oct 14, 2021

@timebertt

timebertt added a commit to gardener/gardener that referenced this pull request

Oct 14, 2021

@timebertt

github.com/go-openapi/spec seems to be orphaned after previous make generate

Also, upgrade setup-envtest (doesn't have a tagged release yet, so use release commit instead)

Fix linting errors: Assertion redeclared in this block (typecheck)

ref kubernetes-sigs/controller-runtime#1626

ref kubernetes/kubernetes#99494

Maps (e.g. labels, selectors, resource requirements) might be sorted differently than expected. Hence, use semantic equality instead of strict equality, as this is what matters to us. Also, DeepEqual outputs yaml and adds a nice diff indicator instead of printing some large confusing go struct representation.

ref kubernetes/kubernetes#102823

There were several changes in the fake clients that might cause the failure to happen just now.

These tests were not preparing the test objects correctly: they only updated them in memory but not on the fake client. This wasn't caught until now because the fake client mimicked the real json decoder, which didn't unset fields not present on the server. Now that the fake client zeroes fields, the tests started failing (which is correct). So fix the tests.

ref kubernetes-sigs/controller-runtime#1651
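
The decoding behavior described above can be reproduced with the standard library alone. This is a minimal sketch: the `Widget` type and the helpers `decodeInPlace` and `decodeZeroed` are hypothetical and not part of controller-runtime. `encoding/json` leaves a struct field untouched when the payload does not mention it, so a value set only in memory survives unless the target is zeroed first.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Widget is a hypothetical object used only for illustration.
type Widget struct {
	Name  string `json:"name"`
	Phase string `json:"phase,omitempty"`
}

// decodeInPlace decodes into the existing value; fields missing from the
// payload keep whatever the caller had set in memory.
func decodeInPlace(data []byte, w *Widget) error {
	return json.Unmarshal(data, w)
}

// decodeZeroed resets the target first, so fields missing from the payload
// end up as zero values.
func decodeZeroed(data []byte, w *Widget) error {
	*w = Widget{}
	return json.Unmarshal(data, w)
}

func main() {
	server := []byte(`{"name":"a"}`) // the "server" copy has no phase

	stale := Widget{Name: "a", Phase: "Ready"} // phase set only in memory
	if err := decodeInPlace(server, &stale); err != nil {
		panic(err)
	}
	fmt.Printf("in place: %q\n", stale.Phase)

	fresh := Widget{Name: "a", Phase: "Ready"}
	if err := decodeZeroed(server, &fresh); err != nil {
		panic(err)
	}
	fmt.Printf("zeroed:   %q\n", fresh.Phase)
}
```

A test that updates an object only in memory and then reads it back passes with the first helper and fails with the second, which matches the failure mode the commit message describes.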

Now that the c-r client zeroes fields before decoding into the object, we can drop our workarounds for this, so basically drop kutil.CreateResetObjectFunc and its usages.

ref kubernetes-sigs/controller-runtime#1640

webhookConfig.SetGroupVersionKind is not needed anymore with kubernetes-sigs/controller-runtime#1665

but with go 1.16.9

krgostev pushed a commit to krgostev/gardener that referenced this pull request

Apr 21, 2022

@timebertt

krgostev pushed a commit to krgostev/gardener that referenced this pull request

Jul 5, 2022

@timebertt

Labels

api-review

Categorizes an issue or PR as actively needing an API review.

approved

Indicates a PR has been approved by an approver from all required OWNERS files.

area/kubelet

cncf-cla: yes

Indicates the PR's author has signed the CNCF CLA.

kind/api-change

Categorizes issue or PR as related to adding, removing, or otherwise changing an API

kind/feature

Categorizes issue or PR as related to a new feature.

lgtm

"Looks good to me", indicates that a PR is ready to be merged.

priority/important-soon

Must be staffed and worked on either currently, or very soon, ideally in time for the next release.

release-note

Denotes a PR that will be considered when it comes time to generate release notes.

sig/node

Categorizes an issue or PR as relevant to SIG Node.

size/XXL

Denotes a PR that changes 1000+ lines, ignoring generated files.

triage/accepted

Indicates an issue or PR is ready to be actively worked on.