Reference architectures

The GitLab reference architectures are validated, production-ready environment designs for deploying GitLab at scale. Each architecture provides detailed specifications that you can use or adapt based on your requirements.

Available reference architectures

The following reference architectures are available as recommended starting points for your environment.

The architectures are named in terms of peak load, based on user count or requests per second (RPS). RPS is calculated based on average real data.

Each architecture is designed to be scalable and elastic. You can adjust it upwards or downwards based on your workload, for example to accommodate known heavy scenarios such as large monorepos or notable additional workloads.

For details about what each reference architecture is tested against, see the Testing Methodology section of each page.

GitLab package (Omnibus)

The following is the list of Linux package based reference architectures:

Cloud native hybrid

The following is a list of Cloud Native Hybrid reference architectures, where select recommended components can be run in Kubernetes:

Before you start

First, consider whether a self-managed approach is the right choice for you and your requirements.

Running any application in production is complex, and the same applies to GitLab. While we aim to make this as smooth as possible, general complexities remain that depend on your design. Typically, you have to manage all aspects such as hardware, operating systems, networking, storage, security, GitLab itself, and more. This includes both the initial setup of the environment and the longer term maintenance.

You must have a working knowledge of running and maintaining applications in production if you decide to go down this route. If you aren’t in this position, our Professional Services team offers implementation services. Those who want a more managed solution long term can explore our other offerings, such as GitLab SaaS or GitLab Dedicated.

If you are considering using the GitLab Self-Managed approach, we encourage you to read through this page in full, specifically the following sections:

Deciding which architecture to start with

The reference architectures are designed to strike a balance between three important factors: performance, resilience, and cost. They are designed to make it easier to set up GitLab at scale. However, it can still be a challenge to know which one meets your requirements and where to start accordingly.

As a general guide, the more performant and/or resilient you want your environment to be, the more complex it is.

This section explains the things to consider when picking a reference architecture.

Expected load (RPS or user count)

The right architecture size depends primarily on your environment’s expected peak load. The most objective measure of this load is through peak Requests per Second (RPS) coming into the environment.

Each architecture is designed to handle specific RPS targets for different types of requests (API, Web, Git). These details are described in the Testing Methodology section on each page.

How you determine your RPS depends notably on your specific environment setup and monitoring stack. Some potential options include:

If you can’t determine your RPS, we provide an alternative sizing method based on equivalent User Count by Load Category. This count is mapped to typical RPS values, considering both manual and automated usage.
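If you can export request timestamps from your environment (for example, from load balancer or access logs), a rough peak RPS figure can also be derived offline. The following is a minimal sketch, not an official GitLab tool; it assumes one ISO 8601 timestamp per line and reports the busiest one-minute window as an average RPS.

```python
from collections import Counter
from datetime import datetime
import sys

def peak_rps(timestamp_lines, window_seconds=60):
    """Average RPS of the busiest fixed window.

    Assumes one ISO 8601 timestamp per request, one per line.
    """
    buckets = Counter()
    for line in timestamp_lines:
        line = line.strip()
        if not line:
            continue
        ts = datetime.fromisoformat(line)
        # Group requests into fixed windows (default: one minute).
        buckets[int(ts.timestamp()) // window_seconds] += 1
    return max(buckets.values()) / window_seconds if buckets else 0.0

if __name__ == "__main__":
    print(f"Peak RPS: {peak_rps(sys.stdin):.1f}")
```

Compare the resulting figure against the RPS targets in the Testing Methodology section of each architecture page.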

Initial sizing guide

To determine which architecture to pick for the expected load, see the following initial sizing guide table:

| Load Category | API RPS | Web RPS | Git Pull RPS | Git Push RPS | Typical User Count | Reference Architecture |
|---------------|---------|---------|--------------|--------------|--------------------|------------------------|
| X Small | 20 | 2 | 2 | 1 | 1,000 | Up to 20 RPS or 1,000 users |
| Small | 40 | 4 | 4 | 1 | 2,000 | Up to 40 RPS or 2,000 users |
| Medium | 60 | 6 | 6 | 1 | 3,000 | Up to 60 RPS or 3,000 users |
| Large | 100 | 10 | 10 | 2 | 5,000 | Up to 100 RPS or 5,000 users |
| X Large | 200 | 20 | 20 | 4 | 10,000 | Up to 200 RPS or 10,000 users |
| 2X Large | 500 | 50 | 50 | 10 | 25,000 | Up to 500 RPS or 25,000 users |
| 3X Large | 1000 | 100 | 100 | 20 | 50,000 | Up to 1000 RPS or 50,000 users |
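For illustration only, the table can be expressed as a simple lookup. The following hypothetical helper (not part of GitLab) picks the smallest class whose headline RPS (which corresponds to the API RPS target) and typical user count cover the expected load:

```python
# Hypothetical sizing helper mirroring the initial sizing guide table.
ARCHITECTURES = [
    # (headline RPS, typical user count, reference architecture)
    (20, 1_000, "Up to 20 RPS or 1,000 users"),
    (40, 2_000, "Up to 40 RPS or 2,000 users"),
    (60, 3_000, "Up to 60 RPS or 3,000 users"),
    (100, 5_000, "Up to 100 RPS or 5,000 users"),
    (200, 10_000, "Up to 200 RPS or 10,000 users"),
    (500, 25_000, "Up to 500 RPS or 25,000 users"),
    (1000, 50_000, "Up to 1000 RPS or 50,000 users"),
]

def suggest_architecture(peak_rps=None, user_count=None):
    """Smallest class covering the given load; other factors (HA, large
    monorepos, additional workloads) can push the choice higher."""
    for rps, users, name in ARCHITECTURES:
        if peak_rps is not None and peak_rps > rps:
            continue
        if user_count is not None and user_count > users:
            continue
        return name
    return "Beyond 3X Large: contact GitLab for guidance"

# Example: 3,000 users with automation measured at ~90 RPS peak.
print(suggest_architecture(peak_rps=90, user_count=3_000))
# -> Up to 100 RPS or 5,000 users
```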

Before you select an initial architecture, review this section thoroughly. Consider other factors such as High Availability (HA) or use of large monorepos, as they may impact the choice beyond just RPS or user count.

If in doubt, start large, monitor, and then scale down

If you’re uncertain about the required environment size, consider starting with a larger size, monitoring it, and then scaling down accordingly if the metrics support your situation.

Starting large and then scaling down is a prudent approach when:

For example, if you have 3,000 users but also know that there’s automation at play that would significantly increase the concurrent load, then you could start with a 100 RPS / 5k User class environment, monitor it, and if the metrics support it, scale down all components at once, or one by one.

Standalone (non-HA)

For environments serving 2,000 or fewer users, it’s generally recommended to follow a standalone approach by deploying a non-HA, single, or multi-node environment. With this approach, you can employ strategies such as automated backups for recovery. These strategies provide a good level of recovery time objective (RTO) or recovery point objective (RPO) while avoiding the complexities that come with HA.

With standalone setups, especially single node environments, various options are available for installation and management. The options include the ability to deploy directly by using select cloud provider marketplaces that reduce the complexity a little further.

High Availability (HA)

High Availability ensures every component in the GitLab setup can handle failures through various mechanisms. However, achieving this is complex, and the required environments can be sizable.

For environments serving 3,000 or more users, we generally recommend using an HA strategy. At this level, an outage has a bigger impact because it affects more users. All the architectures in this range have HA built in by design for this reason.

Do you need High Availability (HA)?

As mentioned previously, achieving HA comes at a cost. The environment requirements are sizable because each component must be multiplied, which comes with additional infrastructure and maintenance costs.

For a lot of our customers with fewer than 3,000 users, we’ve found that a backup strategy is sufficient and even preferable. While this does mean a slower recovery time, it also means you have a much smaller architecture and lower maintenance costs as a result.

As a general guideline, employ HA only in the following scenarios:

Scaled-down High Availability (HA) approach

If you still need HA for fewer users, you can achieve it with an adjusted 3K architecture.

Zero-downtime upgrades

Zero-downtime upgrades are available for standard environments with HA (Cloud Native Hybrid is not supported). This allows for an environment to stay up during an upgrade. However, this process is more complex as a result and has some limitations as detailed in the documentation.

When going through this process, it’s worth noting that there may still be brief moments of downtime when the HA mechanisms take effect.

In most cases, the downtime required for doing an upgrade shouldn’t be substantial. Use this approach only if it’s a key requirement for you.

Cloud Native Hybrid (Kubernetes HA)

As an additional layer of HA resilience, you can deploy select components in Kubernetes, known as a Cloud Native Hybrid reference architecture. For stability reasons, stateful components such as Gitaly cannot be deployed in Kubernetes.

Cloud Native Hybrid is an alternative and more advanced setup compared to a standard reference architecture. Running services in Kubernetes is complex. Use this setup only if you have strong working knowledge and experience in Kubernetes.

GitLab Geo (Cross Regional Distribution / Disaster Recovery)

With GitLab Geo, you can achieve distributed environments in different regions with a full Disaster Recovery (DR) setup in place. GitLab Geo requires at least two separate environments:

If the primary site becomes unavailable, you can fail over to one of the secondary sites.

Use this advanced and complex setup only if DR is a key requirement for your environment. You must also make additional decisions on how each site is configured. For example, if each secondary site would be the same architecture as the primary or if each site is configured for HA.

Large monorepos / Additional workloads

Large monorepos or significant additional workloads can affect the performance of the environment notably. Some adjustments may be required depending on the context.

If this situation applies to you, reach out to your GitLab representative or our Support team for further guidance.

Cloud provider services

For all the previously described strategies, you can run select GitLab components on equivalent cloud provider services such as the PostgreSQL database or Redis.

For more information, see the recommended cloud providers and services.

Decision Tree

Read through the above guidance in full before you refer to the following decision tree.

```mermaid
%%{init: { 'theme': 'base' } }%%
graph TD
   L0A(What Reference Architecture should I use?)
   L1A(What is your expected load?)

   L2A("60 RPS / 3,000 users or more?")
   L2B("40 RPS / 2,000 users or less?")

   L3A("Do you need HA?<br>(or zero-downtime upgrades)")
   L3B[Do you have experience with<br>and want additional resilience<br>with select components in Kubernetes?]

   L4A>Recommendation<br><br>60 RPS / 3,000 user architecture with HA<br>and supported reductions]
   L4B>Recommendation<br><br>Architecture closest to expected load with HA]
   L4C>Recommendation<br><br>Cloud Native Hybrid architecture<br>closest to expected load]
   L4D>"Recommendation<br><br>Standalone 20 RPS / 1,000 user or 40 RPS / 2,000 user<br>architecture with Backups"]

   L0A --> L1A
   L1A --> L2A
   L1A --> L2B
   L2A -->|Yes| L3B
   L3B -->|Yes| L4C
   L3B -->|No| L4B

   L2B --> L3A
   L3A -->|Yes| L4A
   L3A -->|No| L4D

   L5A("Do you need cross regional distribution<br>or disaster recovery?") --> |Yes| L6A>Additional Recommendation<br><br>GitLab Geo]
   L4A ~~~ L5A
   L4B ~~~ L5A
   L4C ~~~ L5A
   L4D ~~~ L5A

   L5B("Do you have Large Monorepos or expect<br>to have substantial additional workloads?") --> |Yes| L6B>Additional Recommendations<br><br>Start large, monitor and scale down<br><br>Contact GitLab representative or Support]
   L4A ~~~ L5B
   L4B ~~~ L5B
   L4C ~~~ L5B
   L4D ~~~ L5B

   classDef default fill:#FCA326
   linkStyle default fill:none,stroke:#7759C2
```

Requirements

Before implementing a reference architecture, see the following requirements and guidance.

Supported machine types

The architectures are designed to be flexible in terms of machine type selection while ensuring consistent performance. While we provide specific machine type examples in each reference architecture, these are not intended to be prescriptive defaults.

You can use any machine types that meet or exceed the specified requirements for each component, such as:

This guidance is also applicable for any Cloud Provider services such as AWS RDS.

Any “burstable” instance types are not recommended due to inconsistent performance.

Supported disk types

Most standard disk types are expected to work for GitLab. However, be aware of the following specific call-outs:

Other disk types are expected to work with GitLab. Choose based on your requirements such as durability or cost.

Supported infrastructure

GitLab should run on most infrastructure that meets both of the following, such as reputable cloud providers (AWS, GCP, Azure) and their services, or self-managed environments (ESXi):

However, this does not guarantee compatibility with every potential permutation.

See Recommended cloud providers and services for more information.

Large Monorepos

The architectures were tested with repositories of varying sizes that follow best practices.

**However, large monorepos (several gigabytes or more) can significantly impact the performance of Git and in turn the environment itself.** Their presence and how they are used can put a significant strain on the entire system, from Gitaly to the underlying infrastructure.

The performance implications are largely software in nature. Additional hardware resources lead to diminishing returns.

If this applies to you, we strongly recommend you follow the linked documentation and reach out to your GitLab representative or our Support team for further guidance.

Large monorepos come with notable costs. If you have such a repository, follow this guidance to ensure good performance and to keep costs in check:

Additional workloads

These architectures have been designed and tested for standard GitLab setups based on real data.

However, additional workloads can multiply the impact of operations by triggering follow-up actions. You may need to adjust the suggested specifications to compensate if you use:

Generally, you should have robust monitoring in place to measure the impact of any additional workloads and to inform any changes that need to be made. Reach out to your GitLab representative or our Support team for further guidance.

Load Balancers

The architectures make use of up to two load balancers depending on the class:

The specifics on which load balancer to use, or its exact configuration is beyond the scope of GitLab documentation. The most common options are to set up load balancers on machine nodes or to use a service such as one offered by cloud providers. If deploying a Cloud Native Hybrid environment, the charts can handle the external load balancer setup by using Kubernetes Ingress.

Each architecture class includes a recommended base machine size for deploying load balancers directly on machines. However, it may need adjustment based on factors such as the chosen load balancer and expected workload. Of note, machines can have varying network bandwidth, which should also be taken into consideration.

The following sections provide additional guidance for load balancers.

Balancing algorithm

To ensure equal spread of calls to the nodes and good performance, use a least-connection-based load balancing algorithm or equivalent wherever possible.

We don’t recommend the use of round-robin algorithms as they are known to not spread connections equally in practice.
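To illustrate the difference, the following toy simulation (not a load balancer configuration) sends a mix of short requests and long-lived connections to three nodes. When the arrival pattern happens to align with the rotation, round-robin piles the long-lived connections onto one node, while least-connections keeps the nodes even.

```python
NODES = 3
# A pathological but illustrative pattern: every third request is a
# long-lived connection (for example, a large Git clone); the rest are short.
durations = [30 if i % 3 == 0 else 1 for i in range(300)]

def simulate(pick_node):
    active = [[] for _ in range(NODES)]   # end times of open connections per node
    peak = [0] * NODES
    for t, duration in enumerate(durations):
        for conns in active:               # drop connections that have finished
            conns[:] = [end for end in conns if end > t]
        node = pick_node(active, t)
        active[node].append(t + duration)
        peak = [max(p, len(c)) for p, c in zip(peak, active)]
    return peak

round_robin = lambda active, t: t % NODES
least_connections = lambda active, t: min(range(NODES), key=lambda i: len(active[i]))

print("Peak concurrent connections per node")
print("round-robin:      ", simulate(round_robin))        # one node carries the long connections
print("least-connections:", simulate(least_connections))  # load stays even
```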

Network Bandwidth

The total network bandwidth available to a load balancer when deployed on a machine can vary notably across cloud providers. Some cloud providers, like AWS, may operate on a burst system with credits to determine the bandwidth at any time.

The required network bandwidth for your load balancers depends on factors such as data shape and workload. The recommended base sizes for each architecture class have been selected based on real data. However, in some scenarios such as consistent clones of large monorepos, the sizes may need to be adjusted accordingly.
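As a rough, hypothetical illustration of how clone traffic translates into bandwidth (all numbers below are assumptions, not measurements or recommendations):

```python
# Back-of-the-envelope bandwidth estimate for concurrent clones passing
# through a load balancer. Replace the assumptions with your own figures.
repo_size_gb = 5          # assumed size of a large monorepo
concurrent_clones = 20    # for example, CI jobs cloning at the same time
clone_duration_s = 300    # target time for each clone to complete

required_gbps = repo_size_gb * 8 * concurrent_clones / clone_duration_s
print(f"Sustained throughput needed: ~{required_gbps:.1f} Gbps")
# -> ~2.7 Gbps before any other traffic, which a low-bandwidth or
#    burst-credit instance may not sustain.
```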

No swap

Swap is not recommended in the reference architectures. It’s a failsafe that impacts performance greatly. The architectures are designed to have enough memory in most cases to avoid the need for swap.

Praefect PostgreSQL

Praefect requires its own database server. To achieve full HA, a third-party PostgreSQL database solution is required.

We hope to offer a built-in solution for these restrictions in the future. In the meantime, a non-HA PostgreSQL server can be set up using the Linux package as the specifications reflect. For more details, see the following issues:

Recommended cloud providers and services

The following lists are non-exhaustive. Other cloud providers not listed here may work with the same specifications, but they have not been validated. For cloud provider services not listed here, use caution, as each implementation can be notably different. Test thoroughly before using them in production.

The following architectures are recommended for the following cloud providers based on testing and real life usage:

| Reference Architecture | GCP | AWS | Azure | Bare Metal |
|------------------------|-----|-----|-------|------------|
| Linux package | 🟢 | 🟢 | 🟢 <sup>1</sup> | 🟢 |
| Cloud Native Hybrid | 🟢 | 🟢 | | |

Additionally, the following cloud provider services are recommended for use as part of the architectures:

| Cloud Service | GCP | AWS | Azure | Bare Metal |
|---------------|-----|-----|-------|------------|
| Object Storage | 🟢 Cloud Storage | 🟢 S3 | 🟢 Azure Blob Storage | 🟢 MinIO |
| Database | 🟢 Cloud SQL <sup>1</sup> | 🟢 RDS | 🟢 Azure Database for PostgreSQL Flexible Server | |
| Redis | 🟢 Memorystore | 🟢 ElastiCache | 🟢 Azure Cache for Redis (Premium) <sup>2</sup> | |
  1. For optimal performance, especially in larger environments (500 RPS / 25k users or higher), use the Enterprise Plus edition for GCP Cloud SQL. You might need to adjust the maximum connections higher than the service’s defaults, depending on your workload.
  2. To ensure good performance, deploy the Premium tier of Azure Cache for Redis.

Best practices for the database services

If you choose to use a third-party external database service, use one that runs a standard, performant, and supported PostgreSQL version, and take note of the following considerations:

  1. The HA Linux package PostgreSQL setup encompasses PostgreSQL, PgBouncer, and Consul. None of these components are required when you use a third-party external service.
  2. For optimal performance, enable Database Load Balancing with Read Replicas. Match the node counts to those used in standard Linux package deployments. This approach is particularly important for larger environments (more than 200 requests per second or 10,000+ users).
  3. Database connection poolers are not required for this setup because the options vary per service. As a result, connection count configuration may need to be adjusted depending on the environment size. If pooling is desired, a third-party option needs to be explored, as the PgBouncer bundled with the GitLab Linux package is only compatible with the package-bundled PostgreSQL. Database Load Balancing can also be used to spread the load accordingly.
    • Ensure that if a pooler is included in a Cloud Provider service, it can handle the total load without bottlenecks. For example, Azure Database for PostgreSQL flexible server can optionally deploy a PgBouncer pooler in front of the database. However, PgBouncer is single threaded, which may cause bottlenecks under heavy load. To mitigate this issue, you can use database load balancing to distribute the pooler across multiple nodes.
  4. The number of nodes required for HA may vary depending on the service. The requirements for one deployment may vary from those for Linux package installations.
  5. To use GitLab Geo, the service should support cross-region replication.

Unsupported database services

The following database cloud provider services are not recommended due to lack of support or known issues:

Best practices for Redis services

Use an external Redis service that runs a standard, performant, and supported version. The service must support:

Redis is primarily single threaded. For environments targeting the 200 RPS / 10,000 users class or larger, separate the instances into cache and persistent data to achieve optimum performance.

Serverless variants of Redis services are not supported at this time.

Best practices for object storage

GitLab has been tested against various object storage providers that are expected to work.

Use a reputable solution that has full S3 compatibility.
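If you are evaluating a third-party S3-compatible store, a quick round trip with a standard S3 client can catch obvious incompatibilities early. The following is a minimal sketch using boto3 against an assumed endpoint and credentials; it is not an exhaustive compatibility test.

```python
import boto3

# Assumed values for illustration; replace with your endpoint and credentials.
s3 = boto3.client(
    "s3",
    endpoint_url="https://objectstore.example.com",
    aws_access_key_id="ACCESS_KEY",
    aws_secret_access_key="SECRET_KEY",
)

bucket, key, payload = "gitlab-compat-check", "smoke-test.txt", b"hello"

s3.create_bucket(Bucket=bucket)                      # basic bucket operation
s3.put_object(Bucket=bucket, Key=key, Body=payload)  # upload
assert s3.get_object(Bucket=bucket, Key=key)["Body"].read() == payload
s3.delete_object(Bucket=bucket, Key=key)             # cleanup
s3.delete_bucket(Bucket=bucket)
print("Basic S3 round trip succeeded")
```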

Deviating from the suggested reference architectures

The further away you move from the reference architectures, the harder it is to get support. With each deviation, you introduce a layer of complexity that complicates troubleshooting potential issues.

These architectures use the official Linux packages or Helm Charts to install and configure the various components. The components are installed on separate machines (virtualized or Bare Metal). Machine hardware requirements are listed in the Configuration column, and equivalent VM standard sizes are listed in the GCP/AWS/Azure columns of each available architecture.

You can run GitLab components on Docker, including Docker Compose. Docker is well supported and provides consistent specifications across environments. However, it is still an additional layer and might add some support complexities, such as not being able to run strace in containers.

Unsupported designs

While we try to have a good range of support for GitLab environment designs, certain approaches don’t work effectively. The following sections detail these unsupported approaches.

Stateful components in Kubernetes

Running stateful components in Kubernetes, such as Gitaly Cluster, is not supported.

Gitaly Cluster is only supported on conventional virtual machines. Kubernetes strictly limits memory usage. However, the memory usage of Git is unpredictable, which can cause sporadic out of memory (OOM) termination of Gitaly pods. The OOM termination leads to significant disruptions and potential data loss. Hence, Gitaly is not tested or supported in Kubernetes. For more information, see epic 6127.

This applies to stateful components such as Postgres and Redis. You can use other supported cloud provider services, unless specifically called out as unsupported.

Autoscaling of stateful nodes

As a general guidance, only stateless components of GitLab can be run in autoscaling groups, namely GitLab Rails and Sidekiq. Other components that have state, such as Gitaly, are not supported in this fashion. For more information, see issue 2997.

This applies to stateful components such as Postgres and Redis. You can use other supported cloud provider services, unless specifically called out as unsupported.

Cloud Native Hybrid setups are generally preferred over autoscaling groups. Kubernetes better handles components that can only run on one node, such as database migrations and Mailroom.

Deploying one environment over multiple data centers

GitLab doesn’t support deploying a single environment across multiple data centers. These setups can result in significant issues, such as network latency or split-brain scenarios if a data center fails.

Several GitLab components require an odd number of nodes to function correctly, such as Consul, Redis Sentinel, and Praefect. Splitting these components across multiple data centers can negatively impact their functionality.

This limitation applies to all potential GitLab environment setups, including Cloud Native Hybrid alternatives.

For deploying GitLab over multiple data centers or regions, we offer GitLab Geo as a comprehensive solution.

Validation and test results

The GitLab Delivery: Framework team does regular smoke and performance tests for these architectures to ensure they remain compliant.

How we perform the tests

Testing is conducted using specific coded workloads derived from sample customer data, utilizing both the GitLab Environment Toolkit (GET) for environment deployment with Terraform and Ansible, and the GitLab Performance Tool (GPT) for performance testing with k6.

Testing is performed primarily on GCP and AWS using their standard compute offerings (n1 series for GCP, m5 series for AWS) as baseline configurations. These machine types were selected as a lowest common denominator target to ensure broad compatibility. Using different or newer machine types that meet the CPU and memory requirements is fully supported - see Supported Machine Types for more information. The architectures are expected to perform similarly on any hardware meeting the specifications, whether on other cloud providers or on-premises.

Performance targets

Each reference architecture is tested against specific throughput targets based on real customer data. For every 1,000 users, we test:

- API: 20 RPS
- Web: 2 RPS
- Git (pull): 2 RPS
- Git (push): 0.4 RPS (rounded up to the nearest integer)

The above RPS targets were selected based on real customer data of total environmental loads corresponding to the user count, including CI and other workloads.

Network latency between components in test environments was observed at <5 ms but note this is not intended as a hard requirement.
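As a worked check against the initial sizing guide table, scaling those per-1,000-user rates reproduces the targets for each class, for example the 100 RPS / 5,000 user class:

```python
import math

# Tested RPS per 1,000 users, as listed above.
RATES_PER_1000 = {"api": 20, "web": 2, "git_pull": 2, "git_push": 0.4}

def targets(user_count):
    return {kind: math.ceil(rate * user_count / 1000)
            for kind, rate in RATES_PER_1000.items()}

print(targets(5_000))
# -> {'api': 100, 'web': 10, 'git_pull': 10, 'git_push': 2}
```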

Test coverage and results

Testing is designed to be effective and provide good coverage for all reference architecture targets. Testing frequency varies by architecture type and size:

Our testing also includes prototype variations of these architectures being explored for potential future inclusion. Test results are publicly available on the Reference Architecture wiki.

Cost calculator templates

The following table lists initial cost templates for the different architectures across GCP, AWS, and Azure. These costs were calculated using each cloud provider’s official calculator.

However, be aware of the following caveats:

For an accurate estimate of costs for your environment, take the closest template and adjust it to match your specifications and expected usage.

| Reference Architecture | GCP (Linux package) | AWS (Linux package) | Azure (Linux package) |
|------------------------|---------------------|---------------------|-----------------------|
| Up to 20 RPS or 1,000 users | Calculated cost | Calculated cost | Calculated cost |
| Up to 40 RPS or 2,000 users | Calculated cost | Calculated cost | Calculated cost |
| Up to 60 RPS or 3,000 users | Calculated cost | Calculated cost | Calculated cost |
| Up to 100 RPS or 5,000 users | Calculated cost | Calculated cost | Calculated cost |
| Up to 200 RPS or 10,000 users | Calculated cost | Calculated cost | Calculated cost |
| Up to 500 RPS or 25,000 users | Calculated cost | Calculated cost | Calculated cost |
| Up to 1000 RPS or 50,000 users | Calculated cost | Calculated cost | Calculated cost |

Maintaining a reference architecture environment

Maintaining a reference architecture environment is generally the same as any other GitLab environment.

In this section you can find links to documentation for relevant areas and specific architecture notes.

Scaling an environment

The reference architectures are designed as a starting point, and are elastic and scalable throughout. You might want to adjust the environment for your specific needs after deployment for reasons such as additional performance capacity or reduced costs. This behavior is expected. Scaling can be done iteratively or wholesale to the next architecture size, if metrics suggest that a component is exhausted.

If a component is continuously exhausting its given resources, reach out to our Support team before performing any significant scaling.

For most components, vertical and horizontal scaling can be applied as usual. However, before doing so, be aware of the following caveats:

Conversely, if you have robust metrics in place that show the environment is over-provisioned, you can scale downwards. You should take an iterative approach when scaling downwards, to ensure there are no issues.

Scaling knock-on effects

In some cases, significantly scaling a component can have knock-on effects on downstream components, impacting performance. The architectures are designed with balance in mind to ensure components that depend on each other are congruent in terms of specifications. Notably, scaling a component may result in additional throughput being passed to the components it depends on, so they may need to be scaled as well.

The architectures have been designed to have elasticity to accommodate an upstream component being scaled. However, reach out to our Support team before you make any significant changes to your environment to be safe.

The following components can impact others when they have been significantly scaled:

Scaling from a non-HA to an HA architecture

In most cases, only vertical scaling is required to increase an environment’s resources. However, if you are moving to an HA environment, additional steps are required for the following components to switch over to their HA forms.

For more information, see the following documentation:

Upgrades

Upgrading a reference architecture environment is the same as upgrading any other GitLab environment. The main Upgrade GitLab section has detailed steps on how to approach this. Zero-downtime upgrades are also available.

You should upgrade a reference architecture in the same order as you created it.

Monitoring

You can monitor your infrastructure and GitLab using various options. See the selected monitoring solution’s documentation for more information.

Update history

The following is a history of notable updates for reference architectures (2021-01-01 onward, in descending order). We aim to update it at least once per quarter.

You can find a full history of changes on the GitLab project.

2025:

2024:

2023:

2022:

2021: