Reduce contention on updating runner_id in ci_builds (#36154)

From https://gitlab.com/gitlab-com/gl-infra/infrastructure/issues/8395#note_243107996:

On the master, UPDATE ci_builds .. queries accounted for 3/4 of all total_time today; see K003 for the master.

This is becoming more and more noticeable: every second, 8 out of the 12 seconds spent by the master's CPUs went to this work.

It's a contention problem: queries are blocked by other queries that work on the same records. Note how low shared_blks_hit and shared_blks_read are: there isn't much data to process, the queries just sit waiting while blocked. This leads to more and more Postgres backends being blocked while still showing up in the active state. We should avoid this.

Can we rework it to use SELECT .. FOR UPDATE SKIP LOCKED or SELECT .. FOR UPDATE NOWAIT? Neither form waits on a row that another transaction has locked, so this would completely solve the problem of queries waiting while blocked. Similar work was done recently for merge_requests by @stanhu, see !18481 (diffs).
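
To make the difference between the two locking modes concrete, here is a minimal in-memory sketch in Python. It is only an analogy, not the GitLab code and not real Postgres: `Row`, `claim_skip_locked`, `claim_nowait` and `LockNotAvailable` are hypothetical names, and a `threading.Lock` per row stands in for a Postgres row lock. SKIP LOCKED is modelled as "silently pass over rows someone else holds", NOWAIT as "fail immediately instead of waiting".

```python
import threading


class LockNotAvailable(Exception):
    """Hypothetical stand-in for the error Postgres raises under NOWAIT."""


class Row:
    """Hypothetical stand-in for one ci_builds row."""

    def __init__(self, build_id):
        self.build_id = build_id
        self.runner_id = None
        self.lock = threading.Lock()  # plays the role of the row lock


def claim_skip_locked(rows, runner_id):
    """Analogy of SELECT .. FOR UPDATE SKIP LOCKED: never wait, skip held rows."""
    for row in rows:
        if row.runner_id is not None:
            continue  # already assigned to a runner
        if not row.lock.acquire(blocking=False):
            continue  # row is locked by someone else: skip it, do not wait
        try:
            if row.runner_id is None:  # re-check now that we hold the lock
                row.runner_id = runner_id
                return row.build_id
        finally:
            row.lock.release()
    return None  # nothing claimable right now


def claim_nowait(rows, runner_id):
    """Analogy of SELECT .. FOR UPDATE NOWAIT: never wait, fail on a held row."""
    for row in rows:
        if row.runner_id is not None:
            continue
        if not row.lock.acquire(blocking=False):
            raise LockNotAvailable(row.build_id)  # caller decides whether to retry
        try:
            if row.runner_id is None:
                row.runner_id = runner_id
                return row.build_id
        finally:
            row.lock.release()
    return None


if __name__ == "__main__":
    rows = [Row(1), Row(2)]
    rows[0].lock.acquire()  # pretend another transaction holds build 1
    print(claim_skip_locked(rows, runner_id=20))  # picks build 2 without waiting
    try:
        claim_nowait(rows, runner_id=30)
    except LockNotAvailable as exc:
        print("NOWAIT failed fast on build", exc)
    rows[0].lock.release()
```

In both variants the caller returns immediately, so no backend sits blocked behind another one. The trade-off is that SKIP LOCKED may hand out a different row than strict ordering would, while NOWAIT pushes retry handling to the caller.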

This SQL query is likely happening in assign_runner!: https://gitlab.com/gitlab-org/gitlab/blob/febfd21ee6733f11b9713ab7ee968bae190c10ef/app%2Fservices%2Fci%2Fregister_job_service.rb#L95-116