This is part of the Semicolon&Sons Code Diary - consisting of lessons learned on the job. You're in the databases category.
Last Updated: 2024-12-03
PostgreSQL handles each new connection by "forking" its main OS process into a dedicated child process. Forking an OS process is more expensive, in terms of time (e.g. doing authentication by verifying passwords, and establishing TLS) and especially memory, than it would be, for example, to spawn a new thread within an existing process. This might be irrelevant if the rate at which connections are opened and closed is low, but it becomes increasingly important as that rate grows. Thus the need for a connection pooler. In other words, Postgres will be faster and consume less RAM (estimate ~10MB per connection) if it has fewer direct connections and you buffer traffic through a pooler.
The connection pooler should accept many more connections from the web server than it opens to Postgres itself (e.g. a 10:1 ratio as an arbitrary starting point -- YMMV).
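As a sketch, that 10:1 ratio might look like this in pgbouncer.ini (the database alias, port, and exact numbers here are hypothetical placeholders, not recommendations):

```ini
[databases]
; route the "myapp" alias to a local Postgres instance
myapp = host=127.0.0.1 port=5432 dbname=myapp_production

[pgbouncer]
listen_port = 6432
max_client_conn = 200   ; connections accepted from the web server
default_pool_size = 20  ; actual connections opened to Postgres (10:1)
```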
Some frameworks like Ruby on Rails include their own, built-in connection pool mechanism.
The default pool size should probably be roughly equal to the number of available CPU cores (which can be 50+ on some servers).
Using pgbouncer will hurt your performance (perhaps by 2.5X) if you don't need it. This blogger found that under 60 concurrent users there is no point: https://www.percona.com/blog/2018/06/27/scaling-postgresql-with-pgbouncer-you-may-need-a-connection-pooler-sooner-than-you-expect/
Note that pgbouncer may need to do auth with the database if it is on another machine.
Note that pgbouncer has various timeout settings (that themselves have defaults).
- server_lifetime
- "Pgbouncer will close any server connection that has been connected longer than server_lifetime whenever possible according to pool_mode. Imagine a pool under constant load. Without server_lifetime, pgbouncer will open N server connections and use them for months. That's not so bad, but not always desirable. In some cases the database server processes will take more and more memory over time (cache of stored procedures, prepared statements, etc.). server_lifetime will close old server connections without the application being aware of it (a new server connection will be established later, if needed). This will happen regardless of how active the connection was."
- server_idle_timeout
- "On the other hand, server_idle_timeout will close connections to the server that have not been used by clients during this time. Use case: typically a few connections to the server are sufficient for this pool, but sometimes there is a peak of activity and we open 50 connections. Once the peak has been processed, these additional connections are no longer needed; we only need a few of them. server_idle_timeout allows you to close server connections that have not been used for longer than the specified time. server_lifetime will also close such unneeded connections, but a separate server_idle_timeout option allows a shorter interval to be set."
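Both settings above go in the [pgbouncer] section of pgbouncer.ini. A minimal sketch (the values shown are, as far as I know, pgbouncer's documented defaults; tune them for your own workload):

```ini
[pgbouncer]
server_lifetime = 3600      ; close server connections older than 1 hour
server_idle_timeout = 600   ; close server connections idle for 10+ minutes
```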
Pgbouncer modes
Transaction pooling mode seeks to reduce server connection idle time like so:
The pooler assigns a server connection to a client when the client begins a transaction, and releases that assignment as soon as the client's transaction completes.
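Enabling this is a one-line setting in pgbouncer.ini (session is the default mode; a statement pooling mode also exists but is rarely appropriate):

```ini
[pgbouncer]
pool_mode = transaction
```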
If you want to view the stats PgBouncer collects (how your application's connections are behaving, who's connecting, and so on), PgBouncer exposes its own virtual database for this purpose. Using one of the PgBouncer admin users, you'd connect like: psql -p 6432 -U app_admin -h 0.0.0.0 pgbouncer
and from there you can run SHOW POOLS;, SHOW CLIENTS;, and SHOW STATS; (i.e. a set of admin commands PgBouncer makes available through this pseudo-database).
Persistent connections are bad when you’re connecting directly to PostgreSQL. Persistent connections to pgBouncer are the best thing you can imagine, assuming you can run in transaction pooling mode.