pgbouncer: Types of PostgreSQL connection pooling
pgbouncer is one of the most widely used tools for connection pooling. At CYBERTEC, we've successfully deployed it in many different situations. It has proven to be both reliable and useful.
- Reducing connection overhead with connection pooling
- pgbouncer: the best tool for connection pooling
- Types of connection pooling modes
- pgbouncer session pooling
- pgbouncer transaction pooling
- pgbouncer statement pooling
- Finally …
Before we dive into different pooling modes and their implications, why do we need a connection pooler in the first place? The reason is that we want to reduce the overhead of new connections. That is right. Creating new connections is not free of charge.
Reducing connection overhead with connection pooling
As stated already, connections and their creation are not free. In PostgreSQL, we have to fork an entire process to create a connection. In case a connection lives for a very long time, this is no problem. However, forking a process for just a very short query can be really expensive. Those costs are often underestimated by developers and DBAs alike.
Let's run a test using a simple script:
```
HansJurgensMini:~ hs$ cat /tmp/sample.sql
SELECT 1;
```
To make the effect of connection overhead as visible as possible, I've used the most basic script possible.
Let's run a simple test: 10 seconds, 10 concurrent clients running our SQL script. Test hardware: Mac Mini (M1).
```
HansJurgensMini:~ hs$ pgbench -c 10 -T 10 -j 10 -f /tmp/sample.sql test
starting vacuum...end.
transaction type: /tmp/sample.sql
scaling factor: 1
query mode: simple
number of clients: 10
number of threads: 10
maximum number of tries: 1
duration: 10 s
number of transactions actually processed: 2957846
number of failed transactions: 0 (0.000%)
latency average = 0.034 ms
initial connection time = 5.708 ms
tps = 295825.128043 (without initial connection time)
```
This gave us 295825 transactions per second. But now let's run the same test again. This time, each transaction will open a separate connection (-C):
```
HansJurgensMini:~ hs$ pgbench -c 10 -T 10 -j 10 -f /tmp/sample.sql -C test
starting vacuum...end.
transaction type: /tmp/sample.sql
scaling factor: 1
query mode: simple
number of clients: 10
number of threads: 10
maximum number of tries: 1
duration: 10 s
number of transactions actually processed: 37713
number of failed transactions: 0 (0.000%)
latency average = 2.652 ms
average connection time = 2.439 ms
tps = 3770.311424 (including reconnection times)
```
Wow, throughput dropped to 3770 transactions per second - only about 1.3% of the previous result, or a 98.7% drop in performance.
It's logical if you think about it. “SELECT 1” is way cheaper than fork() plus all the other connection overhead. Therefore pooling makes a lot of sense, because it allows us to recycle connections.
pgbouncer can enter the picture to address the problem. The key here is that a pgbouncer connection carries only around 2 kB of overhead. In other words, it is really efficient, and it can hold thousands of connections at very little cost. Sure, there is a bit more latency - but overall it helps to greatly reduce the need to open and close database connections.
A typical setup looks as follows:
pgbouncer will sit between the database and the client. It keeps connections to the database open and makes sure that the number of fork() calls needed is significantly reduced.
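To make this more concrete, here is a minimal sketch of what such a setup could look like in pgbouncer.ini. Database names, paths, ports and pool sizes are placeholders - adjust them to your environment:

```
; minimal pgbouncer.ini sketch - names, paths and sizes are placeholders
[databases]
; clients connecting to "test" on pgbouncer are forwarded to the real server
test = host=127.0.0.1 port=5432 dbname=test

[pgbouncer]
listen_addr = 127.0.0.1
listen_port = 6432                       ; applications connect here instead of 5432
auth_type = md5
auth_file = /etc/pgbouncer/userlist.txt
pool_mode = session                      ; see the pooling modes discussed below
max_client_conn = 1000                   ; many cheap client connections ...
default_pool_size = 20                   ; ... funneled into a handful of real backends
```

Pointing the pgbench run from above at port 6432 instead of 5432 lets pgbouncer absorb the -C connection churn: opening a connection to pgbouncer is far cheaper than forking a new PostgreSQL backend.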
Types of connection pooling modes
pgbouncer has three pooling modes. This is important to understand, because it offers a lot of potential to adjust the tool's behavior to our exact needs.
The config file shows the following options:
```
;; When server connection is released back to pool:
;;   session - after client disconnects (default)
;;   transaction - after transaction finishes
;;   statement - after statement finishes
pool_mode = session
```
What do all those options mean? Let's discuss them in detail.
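Note that pool_mode does not have to be a global decision: pgbouncer also accepts it as a per-database override in the [databases] section. A small sketch, with made-up database names:

```
[databases]
; pool_mode can be overridden per database (names here are just examples)
reporting = host=127.0.0.1 dbname=reporting pool_mode=session
webapp    = host=127.0.0.1 dbname=webapp pool_mode=transaction
```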
pgbouncer session pooling
Session pooling means that pgbouncer keeps a set of connections to the server open. Each client picks one, and everything it does is routed through that connection to the database.
It is also possible to use pgbouncer to “narrow down” the number of connections really needed on the database side. This can be useful in case you have some crazy application-side connection pool that needs an unusually high number of open connections which should not make it to the backend database in the first place.
In case all pooled connections are busy, additional clients have to wait until a slot in the pool is available. Often, many applications using a connection pool access the same database. Many apps running pools that are too large might end up with far too many connections in the backend - which can in turn cause issues.
In session pooling, every client runs not just each transaction, but its entire session on the same “real” database connection.
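If you want to see the effect - many client connections funneled into a much smaller set of server connections - pgbouncer's admin console is handy. A quick sketch, assuming your user (here a placeholder called myadmin) is listed in admin_users or stats_users in pgbouncer.ini:

```
$ psql -h 127.0.0.1 -p 6432 -U myadmin pgbouncer

pgbouncer=# SHOW POOLS;    -- client vs. server connections per database/user
pgbouncer=# SHOW CLIENTS;  -- every client connection pgbouncer is holding
pgbouncer=# SHOW SERVERS;  -- the (far fewer) real backend connections
```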
pgbouncer transaction pooling
Sometimes a full connection is too much. So what about transaction pooling? Instead of mapping an entire client connection to a real database connection, it is enough to ensure that each transaction runs, from start to finish, on the same backend connection.
pgbouncer will therefore map many client connections to the same physical connection and separate them by transaction. Often this is entirely sufficient.
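One consequence is worth keeping in mind: in transaction mode, consecutive statements from the same client session may be served by different backend connections, so session-level state does not carry over reliably. A small illustration - the table and setting used here are just placeholders:

```
-- pool_mode = transaction: everything inside one transaction stays on the
-- same backend connection, so transaction-scoped settings are safe
BEGIN;
SET LOCAL work_mem = '256MB';
SELECT count(*) FROM phonebook;   -- sees the setting above
COMMIT;

-- outside an explicit transaction, the next statement may land on a different
-- backend connection, so a plain SET is not guaranteed to still be in effect
SET work_mem = '256MB';
SELECT count(*) FROM phonebook;
```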
pgbouncer statement pooling
Statement pooling is by far the most aggressive method. Often you don't need large transaction blocks. Suppose you want to look up names in a phone book 1 million times per second. Clearly those are quick, small queries and there are no large transactions spanning multiple statements. We therefore don’t have to worry about transactional visibility. We can simply load balance all those statements to any connection and pass the result on to the client.
The use case is as follows: Pump millions of short statements through the system which are not related to some large fancy business logic that needs heavy locking (e.g. SELECT FOR UPDATE) or anything of that kind.
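Switching to this mode is just a configuration change. Keep in mind that pgbouncer disallows transactions spanning multiple statements in statement pooling, because there is no guarantee that two statements end up on the same backend connection. A sketch:

```
; pgbouncer.ini - statement pooling for very short, independent queries
pool_mode = statement

; a transaction spanning several statements is not allowed in this mode:
;   BEGIN; SELECT ...; UPDATE ...; COMMIT;   -- pgbouncer will reject this
```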
Finally …
Depending on your needs, you can decide on the pooling method you want to use. Typically, session pooling is used, but we have also seen large-scale statement pooling out there.
If you want to know more about authentication with pgbouncer, check out Laurenz Albe's blog, pgbouncer Authentication Made Easy.
Also, you can check out our pgbouncer blog archive.
You can download pgbouncer for free from pgbouncer.org.