runc init hang up (original) (raw)

Description

In an environment with Kubernetes, containerd, and runc, creating a Pod fails due to a timeout. With debug level enabled, log.json is examined.

The runc init process first executes the C code void nsexec(void) and then the Go code init(). However, based on the captured logs, there is a 1-minute gap between the completion of void nsexec(void) and the entry log of init() (child process in init()). The load average is 7.43 with a 24-core CPU.

{"level":"debug","msg":"nsexec[191985]: =\u003e nsexec container setup","time":"2024-03-19T22:15:23+08:00"}
{"level":"debug","msg":"nsexec[191985]: update /proc/self/oom_score_adj to '-998'","time":"2024-03-19T22:15:23+08:00"}
{"level":"debug","msg":"nsexec-0[191985]: ~\u003e nsexec stage-0","time":"2024-03-19T22:15:23+08:00"}
{"level":"debug","msg":"nsexec-0[191985]: spawn stage-1","time":"2024-03-19T22:15:23+08:00"}
{"level":"debug","msg":"nsexec-1[192851]: ~\u003e nsexec stage-1","time":"2024-03-19T22:15:24+08:00"}
{"level":"debug","msg":"nsexec-1[192851]: unshare remaining namespaces (except cgroupns)","time":"2024-03-19T22:15:24+08:00"}
{"level":"debug","msg":"nsexec-0[191985]: -\u003e stage-1 synchronisation loop","time":"2024-03-19T22:15:25+08:00"}
{"level":"debug","msg":"nsexec-1[192851]: spawn stage-2","time":"2024-03-19T22:15:26+08:00"}
{"level":"debug","msg":"nsexec-1[192851]: request stage-0 to forward stage-2 pid (192912)","time":"2024-03-19T22:15:26+08:00"}
{"level":"debug","msg":"nsexec-0[191985]: stage-1 requested pid to be forwarded","time":"2024-03-19T22:15:27+08:00"}
{"level":"debug","msg":"nsexec-1[192851]: signal completion to stage-0","time":"2024-03-19T22:15:27+08:00"}
{"level":"debug","msg":"nsexec-2[192912]: ~\u003e nsexec stage-2","time":"2024-03-19T22:15:27+08:00"}
{"level":"debug","msg":"nsexec-0[191985]: forward stage-1 (192851) and stage-2 (192912) pids to runc","time":"2024-03-19T22:15:27+08:00"}
{"level":"debug","msg":"nsexec-1[192851]: \u003c~ nsexec stage-1","time":"2024-03-19T22:15:27+08:00"}
{"level":"debug","msg":"nsexec-0[191985]: stage-1 complete","time":"2024-03-19T22:15:28+08:00"}
{"level":"debug","msg":"nsexec-0[191985]: \u003c- stage-1 synchronisation loop","time":"2024-03-19T22:15:28+08:00"}
{"level":"debug","msg":"nsexec-0[191985]: -\u003e stage-2 synchronisation loop","time":"2024-03-19T22:15:28+08:00"}
{"level":"debug","msg":"nsexec-0[191985]: signalling stage-2 to run","time":"2024-03-19T22:15:28+08:00"}
{"level":"debug","msg":"nsexec-2[192912]: signal completion to stage-0","time":"2024-03-19T22:15:29+08:00"}
{"level":"debug","msg":"nsexec-0[191985]: stage-2 complete","time":"2024-03-19T22:15:29+08:00"}
{"level":"debug","msg":"nsexec-0[191985]: \u003c- stage-2 synchronisation loop","time":"2024-03-19T22:15:29+08:00"}
{"level":"debug","msg":"nsexec-0[191985]: \u003c~ nsexec stage-0","time":"2024-03-19T22:15:29+08:00"}
{"level":"debug","msg":"nsexec-2[192912]: \u003c= nsexec container setup","time":"2024-03-19T22:15:29+08:00"}
{"level":"debug","msg":"nsexec-2[192912]: booting up go runtime ...","time":"2024-03-19T22:15:29+08:00"}
{"level":"debug","msg":"child process in init()","time":"2024-03-19T22:17:02+08:00"}

Steps to reproduce the issue

Describe the results you received and expected

normal

What version of runc are you using?

runc version 1.1.12
commit: fa78728-dirty
spec: 1.0.2-dev
go: go1.20.7
libseccomp: 2.5.3

Host OS information

NAME="EulerOS"
VERSION="2.0 (SP12)"
ID="euleros"
VERSION_ID="2.0"
PRETTY_NAME="EulerOS 2.0 (SP12)"
ANSI_COLOR="0;31"

Host kernel information

Linux 5.10.0-136.12.0.86.h1425.eulerosv2r12.aarch64 #1 SMP Mon Dec 25 16:12:25 UTC 2023 aarch64 aarch64 aarch64 GNU/Linux