Database Lab and Postgres.ai | GitLab Docs (original) (raw)

Internal users at GitLab have access to the Database Lab Engine (DLE) andpostgres.ai for testing performance of database queries on replicated production data. Unlike a typical read-only production replica, in the DLE you can also create, update, and delete rows. You can also test the performance of schema changes, like additional indexes or columns, in an isolated copy of production data.

Database Lab quick start

  1. Visit the console.
  2. Select Sign in with Google. (Not GitLab, as you need Google SSO to connect with our project.)
  3. After signing in, select the GitLab organization, select “Joe Bot” in the sidebar, and then visit “Ask Joe”.
  4. Select the database you’re testing against:
    • Most queries for the GitLab project run against gitlab-production-main.
    • If the query is for a CI table, select gitlab-production-ci.
    • If the query is for the container registry, select gitlab-production-registry.
  5. Type explain <Query Text> in the chat box to get a plan.

Access Database Lab Engine

Access to the DLE is helpful for:

To access the DLE’s services, you can:

For more assistance, use the #database Slack channel.

Query testing

You can access Database Lab’s query analysis features either:

Generate query plans

Query plans are an essential part of the database review process. These plans enable us to decide quickly if a given query can be performant on GitLab.com. Running the explain command generates an explain plan and a link to the Postgres.ai console with more query analysis. For example, running EXPLAIN SELECT * FROM application_settingsdoes the following:

  1. Runs explain (analyze, buffers) select * from application_settings; against a database clone.
  2. Responds with timing and buffer details from the run.
  3. Provides a detailed, shareable report on the results.

Making schema changes

Sometimes when testing queries, a contributor may realize that the query needs an index or other schema change to make added queries more performant. To test the query, run the exec command. For example, running this command:

exec CREATE INDEX on application_settings USING btree (restricted_visibility_levels)

creates the specified index on the table. You can test queries leveraging the new index. exec does not return any results, only the time required to execute the query.

Reset the clone

After many changes, such as after a destructive query or an ineffective index, you must start over. To reset your designated clone, run reset.

Checking indexes

Use Database Lab to check the status of an index with the meta-command \d <index_name>.

Caveats:

For example: \d index_design_management_designs_on_project_id produces:

Index "public.index_design_management_designs_on_project_id"
   Column   |  Type   | Key? | Definition
------------+---------+------+------------
 project_id | integer | yes  | project_id
btree, for table "public.design_management_designs"

In the case of an invalid index, the output ends with invalid, like:

Index "public.index_design_management_designs_on_project_id"
   Column   |  Type   | Key? | Definition
------------+---------+------+------------
 project_id | integer | yes  | project_id
btree, for table "public.design_management_designs", invalid

If the index doesn’t exist, JoeBot throws an error like:

ERROR: psql error: psql:/tmp/psql-query-932227396:1: error: Did not find any relation named "no_index".

Migration testing

For information on testing migrations, review ourdatabase migration testing documentation.

Access the console with psql

You must have AllFeaturesUser psql access to access the console with psql.

To access the database lab instances, you must:

Host lb-bastion.db-lab.gitlab.com
  # Typically, the username is `name` in `name@gitlab.com`
  # or your GitLab's username.
  # Check with the access provisioner if it is not working.
  # If not provided, defaults to your system username.
  User YOUR_USERNAME_HERE

  # Path to your SSH key. Adjust or remove if using a different key or SSH agent.
  IdentityFile ~/.ssh/id_ed25519

Host *.gitlab-db-lab.internal
  User YOUR_USERNAME_HERE  # Same as above.
  PreferredAuthentications publickey
  IdentityFile ~/.ssh/id_ed25519  # Same as above.
  ProxyCommand ssh lb-bastion.db-lab.gitlab.com -W %h:%p

Manual access through the Postgres.ai instances page

Team members with psql access, can gain direct access to a clone via psql. Access to psql enables you to see data, not just metadata.

To connect to a clone using psql:

  1. Create a clone from the desired instance.
    1. Provide a Clone ID: Something that uniquely identifies your clone, such as yourname-testing-gitlabissue.
    2. Provide a Database username and Database password: Connects psql to your clone.
    3. Select Enable deletion protection if you need to preserve your clone. Avoid selecting this option. Clones are removed after 12 hours.
  2. In the Clone details page of the Postgres.ai web interface, copy and run the command to start SSH port forwarding for the clone.
    1. You may notice that it’s suggested to run the command with the -N flag, meaning no shell will be started, so you should not expect any output if it runs successfully.
    2. Optionally, you can add LogLevel DEBUG3 to your ~/.ssh/configto output detailed debugging information.
    3. After running the command, leave it running to keep the port forwarding active, and then you can open a new terminal tab to do the next step.
  3. In the Clone details page of the Postgres.ai web interface, copy and run the psql connection string. Use the password provided at setup and set the dbname to gitlabhq_dblab (or check what databases are available by using psql -l with the same query string but dbname=postgres).

After you connect, use clone like you would any psql console in production, but with the added benefit and safety of an isolated writeable environment.

Simplified access through pgai Ruby gem

For instructions on using the pgai Ruby gem, see: Database Lab access using the pgai Ruby gem.