This application serves as a wrapper around pgbench and performs the following tasks:
- Launches the pgbench workload;
- Collects system information:
  - CPU;
  - RAM;
  - The rest of the information is specified in the application configuration.
- Collects database information:
  - version;
  - DB settings;
  - The rest of the information is specified in the application configuration.
- Generates a report summarizing database performance, configuration, and the server environment.
- OS: Ubuntu 22.04
- Pre-installed PostgreSQL client applications: psql, pgbench. For example, using:
sudo apt install postgresql-client
sudo apt install postgresql-contrib
- Python 3.10+ and pip3.
- Create a Python virtual environment in the project's root directory for ease of use:
cd pg_perfbench
python3.10 -m venv venv
source venv/bin/activate
- Install additional packages for Python, for example, using:
pip install -r requirements.txt
- For the tool to work, the database must be accessible as the postgres user or as another specified user with SUPERUSER rights.
- Before running the tests, install and configure Docker access for the user who will be running the tool:
sudo apt-get install docker
sudo apt-get install docker.io
sudo chmod 666 /var/run/docker.sock
- To ensure successful use, first thoroughly explore the capabilities of the tool and run the tests.
pg_perfbench supports three modes: benchmark, collect info and join.
A report is generated in every mode.
- In benchmark mode, the application applies load to the configured database instance and collects information about the server environment and database configuration.
- In collect info mode, the application collects information about the server environment and database configuration.
- In join mode, the application compares reports with each other.
Parameter | Description |
---|---|
--help, -h | Lists all available options and flags with a short description of each, then exits. |
--log-level | Application logging level: info, debug, error. Default: info |
--clear-logs | Clears the logs from the tool's previous session. Logs are located in the 'logs' folder of the root directory. |
Note: Port forwarding to the target database occurs, so make sure to use an available local forwarding port for --pg-port (default value is 5432).
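Whether a local forwarding port is available can be checked before launching the tool. The snippet below is a minimal sketch of such a check and is not part of pg_perfbench; the helper name `is_port_free` is hypothetical, and the port 5435 is only an example value.

```python
import socket


def is_port_free(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if `port` can currently be bound on `host`."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            # Binding fails when another socket already holds the port.
            return False
        return True


if __name__ == "__main__":
    port = 5435  # example value for --pg-port
    state = "free" if is_port_free(port) else "already in use"
    print(f"Local port {port} is {state}")
```

The check is only a snapshot: another process may still claim the port between the check and the start of the benchmark.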
Parameter | Description |
---|---|
--connection-type | Connection types: ssh, docker, local |
- Bash scripts are used to collect system information, and the commands within them must be available for execution by the application's user. For example, to allow the postgres user to execute lshw without a password, add the following privileges:
# ========================================
# Actions on the database host
# ========================================
# ========================================
sudo visudo
>>>
postgres ALL=(root) NOPASSWD: /usr/bin/lshw
- Ensure that the postgres user has the privilege to clear the file system cache on the database server. Execute the following command on the database host:
# ========================================
# Actions on the database host
# ========================================
echo 'postgres ALL=(ALL) NOPASSWD: /bin/sh -c echo 3 | /usr/bin/tee /proc/sys/vm/drop_caches' | sudo EDITOR='tee -a' visudo -f /etc/sudoers
Parameter | Description |
---|---|
--ssh-port | Port for ssh connection (default: 22) |
--ssh-host | IP address of the remote server |
--ssh-key | Path to the ssh connection private key file (must be configured) |
--remote-pg-host | Host of the remote server's database (default: 127.0.0.1) |
--remote-pg-port | Port of the remote server's database (default: 5432) |
- While running, pg_perfbench sets environment variables in the session that connects to the database host. For an SSH connection, first update the AcceptEnv parameter in the SSH configuration file (/etc/ssh/sshd_config) on the database server. Specify the pattern 'ARG_*' so that multiple variables can be passed in:
/etc/ssh/sshd_config:
...
# Allow client to pass locale environment variables
AcceptEnv LANG LC_* ARG_*
...
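To confirm that a given variable would be let through by the configuration above, the AcceptEnv patterns can be matched against the variable name. The following is a rough sketch, not part of pg_perfbench: the function names are hypothetical, and it reads only plain `AcceptEnv` lines, ignoring `Match` blocks, `Include` directives and the `keyword=value` spelling.

```python
from fnmatch import fnmatchcase


def accepted_env_patterns(sshd_config_text: str) -> list[str]:
    """Collect every pattern listed on AcceptEnv lines, skipping comments."""
    patterns: list[str] = []
    for raw_line in sshd_config_text.splitlines():
        # Drop everything after '#', then split the rest on whitespace.
        parts = raw_line.split("#", 1)[0].split()
        # sshd_config keywords are case-insensitive.
        if parts and parts[0].lower() == "acceptenv":
            patterns.extend(parts[1:])
    return patterns


def is_env_accepted(sshd_config_text: str, variable: str) -> bool:
    """Return True if `variable` matches at least one AcceptEnv pattern."""
    patterns = accepted_env_patterns(sshd_config_text)
    return any(fnmatchcase(variable, pattern) for pattern in patterns)


SAMPLE = """
# Allow client to pass locale environment variables
AcceptEnv LANG LC_* ARG_*
"""

print(is_env_accepted(SAMPLE, "ARG_PG_PORT"))  # True
print(is_env_accepted(SAMPLE, "PGPASSWORD"))   # False
```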
- To archive the instance logs, install tar on the database server (Ubuntu example):
sudo apt update
sudo apt install tar
The SSH connection user is postgres. Configure SSH access keys on the database server to establish a connection as the postgres user:
# ========================================
# Actions on the pg_perfbench host
# ========================================
# create an ssh key at the path you specify
mkdir -p postgres_keys/.ssh
ssh-keygen -t rsa -b 4096 -C "postgres" -f postgres_keys/.ssh/id_rsa
ls postgres_keys/.ssh
>>
id_rsa id_rsa.pub
chmod 700 postgres_keys/.ssh
chmod 644 postgres_keys/.ssh/id_rsa.pub
chmod 600 postgres_keys/.ssh/id_rsa
scp postgres_keys/.ssh/id_rsa.pub <user>@<database_server_address>:/tmp
# ========================================
# Actions on the database server
# ========================================
cat >> /etc/ssh/sshd_config << EOL
PubkeyAcceptedKeyTypes=+ssh-rsa
EOL
systemctl restart sshd
mkdir /var/lib/postgresql/.ssh
cat /tmp/id_rsa.pub >> /var/lib/postgresql/.ssh/authorized_keys
chmod 700 /var/lib/postgresql/.ssh
chown -R postgres:postgres /var/lib/postgresql/.ssh
See more details on benchmark configuration over an SSH connection here.
For establishing a local connection to the database, you need to set the connection type:
--connection-type=local
When using a local connection, the application must be configured in the postgres user's environment.
su - postgres
git clone https://github.com/TantorLabs/pg_perfbench.git
cd pg_perfbench
For more details on configuring the local database load, see here.
Preconfigure access to Docker for the user who is running the tool.
Parameter | Description |
---|---|
--container-name | Name of the container being created |
See more details on benchmark configuration for a database in a Docker container here.
The flags pg_host and pg_port are optional parameters for forwarding the address and port from the current host to the database host; they are used directly by the tool.
Parameter | Description |
---|---|
--pg-port | Forwarded port (default 5432, relative to the current host) |
--pg-host | Forwarded address (default 127.0.0.1, relative to the current host) |
--pg-user | User of database (must be configured or set "postgres") |
--pg-database | Database used for testing |
--pg-user-password | Password for database connection (optional) |
--pg-data-path | Path to the PostgreSQL data directory (relative to the database host) |
--pg-bin-path | Path to the PostgreSQL bin directory (relative to the database host) |
--collect-pg-logs | Enable database logging (logging must be configured by the user) |
--custom-config | Use custom PostgreSQL config |
Parameter | Description |
---|---|
--report-name | Report name and chart time series |
Parameter | Description |
---|---|
--benchmark-type | The benchmark to use: default, custom |
--workload-path | Path to the load scripts directory |
--pgbench-clients | Values for the pgbench --clients argument, set as an array (e.g. 1,2,3) |
--pgbench-time | Values for the pgbench --time argument, set as an array (e.g. 1,2,3) |
--pgbench-path | Specify the pgbench path (relative to the current host) |
--psql-path | Specify the psql path (relative to the current host) |
--init-command | Terminal command to create a table schema (relative to the current host) |
--workload-command | Terminal command for loading the database (relative to the current host) |
- --pgbench-time: pgbench benchmarking arguments; the report diagram will display tps/clients;
- --pgbench-clients: pgbench benchmarking arguments; the report diagram will display tps/time.
- Use placeholders to set values in the table schema and load testing commands: configure placeholders like 'ARG_' + <DB/Workload options>.
For example, you can configure pgbench by specifying the path of the load files (this example describes the full set of arguments for ssh connection):
python -m pg_perfbench --mode=benchmark \
--collect-pg-logs \
--report-name=ssh-pg-custom-benchmark \
--custom-config=/tmp/user_postgresql.conf \
--log-level=debug \
--ssh-port=22 \
--ssh-key=path/to/private_key \
--ssh-host=10.128.0.141 \
--remote-pg-host=127.0.0.1 \
--remote-pg-port=5432 \
--pg-host=127.0.0.1 \
--pg-port=5435 \
--pg-user=postgres \
--pg-user-password=test_user_password \
--pg-database=tdb \
--pg-data-path=/var/lib/postgresql/data \
--pg-bin-path=/usr/lib/postgresql/15/bin \
--benchmark-type=custom \
--pgbench-clients=5,10,50 \
--workload-path=/path/to/workload \
--pgbench-path=/usr/bin/pgbench \
--psql-path=/usr/bin/psql \
--init-command="cd ARG_WORKLOAD_PATH && ARG_PSQL_PATH -p ARG_PG_PORT -h ARG_PG_HOST -U postgres ARG_PG_DATABASE -f ARG_WORKLOAD_PATH/table-schema.sql" \
--workload-command="ARG_PGBENCH_PATH -p ARG_PG_PORT -h ARG_PG_HOST -U ARG_PG_USER --no-vacuum --file=ARG_WORKLOAD_PATH/custom_script_1.sql --file=ARG_WORKLOAD_PATH/custom_script_2.sql ARG_PG_DATABASE -c ARG_PGBENCH_CLIENTS -j 20 -T 10"
or standard pgbench load (this example shows all arguments needed to load a database in a Docker container):
python -m pg_perfbench --mode=benchmark \
--log-level=debug \
--report-name=docker-pg-default-benchmark \
--collect-pg-logs \
--custom-config=/tmp/user_postgresql.conf \
--container-name=cntr_expected \
--pg-host=127.0.0.1 \
--pg-port=5435 \
--pg-user=postgres \
--pg-user-password=test_user_password \
--pg-database=tdb \
--pg-data-path=/var/lib/postgresql/tantor-se-1c-15/data \
--pg-bin-path=/opt/tantor/db/15/bin \
--benchmark-type=default \
--pgbench-time=600,1200 \
--init-command="ARG_PGBENCH_PATH -i --scale=100 --foreign-keys -p ARG_PG_PORT -h ARG_PG_HOST -U postgres ARG_PG_DATABASE" \
--workload-command="ARG_PGBENCH_PATH -p ARG_PG_PORT -h ARG_PG_HOST -U ARG_PG_USER ARG_PG_DATABASE -c 5 -j 5 -T ARG_PGBENCH_TIME --no-vacuum"
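One way to picture the ARG_ placeholders in the commands above is as tokens replaced by option values, with one command rendered per item of an array option such as --pgbench-clients. The sketch below only illustrates that idea under stated assumptions; it is not the tool's implementation (which may, for instance, pass the values as environment variables), and `render_command` and `expand_series` are hypothetical names.

```python
import re

# Matches tokens such as ARG_PG_PORT or ARG_WORKLOAD_PATH.
PLACEHOLDER = re.compile(r"ARG_[A-Z0-9_]+")


def render_command(template: str, values: dict[str, str]) -> str:
    """Replace each ARG_* token with its value; unknown tokens raise KeyError."""
    return PLACEHOLDER.sub(lambda match: values[match.group(0)], template)


def expand_series(template: str, values: dict[str, str], name: str, series: str) -> list[str]:
    """Render one command per comma-separated item of `series`, bound to `name`."""
    return [
        render_command(template, {**values, name: item.strip()})
        for item in series.split(",")
    ]


# Example values taken from the commands above.
values = {
    "ARG_PGBENCH_PATH": "/usr/bin/pgbench",
    "ARG_PG_HOST": "127.0.0.1",
    "ARG_PG_PORT": "5435",
    "ARG_PG_USER": "postgres",
    "ARG_PG_DATABASE": "tdb",
}
template = (
    "ARG_PGBENCH_PATH -p ARG_PG_PORT -h ARG_PG_HOST -U ARG_PG_USER "
    "ARG_PG_DATABASE -c ARG_PGBENCH_CLIENTS -j 20 -T 10"
)

for command in expand_series(template, values, "ARG_PGBENCH_CLIENTS", "5,10,50"):
    print(command)
```

Matching whole tokens with a regular expression, rather than chaining `str.replace` calls, avoids a shorter placeholder name clobbering part of a longer one.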
See more details about workload configuration here.
Example benchmark report - benchmark-report.html
The mode of information collection is described here.
Collection of remote server system information via an SSH connection:
python -m pg_perfbench --mode=collect-sys-info \
--report-name=benchmark-ssh-default-2 \
--log-level=debug \
--connection-type=ssh \
--ssh-port=22 \
--ssh-key=/tmp/private_key \
--ssh-host=10.177.143.88
Collection of all database information in a Docker container:
python -m pg_perfbench --mode=collect-db-info \
--report-name=test-collect-db-info-docker-container \
--log-level=debug \
--container-name=cntr_result \
--pg-host=127.0.0.1 \
--pg-port=5435 \
--pg-user=postgres \
--pg-database=tdb \
--pg-data-path=/var/lib/postgresql/data \
--pg-bin-path=/usr/lib/postgresql/15/bin
Collection of all information from the remote database server via an SSH connection:
python -m pg_perfbench --mode=collect-all-info \
--report-name=all-info-report \
--log-level=debug \
--connection-type=ssh \
--ssh-port=22 \
--ssh-key=/tmp/private_key \
--ssh-host=10.177.143.88 \
--remote-pg-host=127.0.0.1 \
--remote-pg-port=5432 \
--pg-host=127.0.0.1 \
--pg-port=5435 \
--pg-user=postgres \
--pg-database=tdb \
--pg-data-path=/var/lib/postgresql/16/main \
--pg-bin-path=/usr/lib/postgresql/16/bin
Examples of information collection reports - collect-all-info-docker.html
Parameter | Description |
---|---|
--report-name | Report name |
--join-task | A JSON file containing a set of merge criteria, which are items of report sections. It should be located in join_tasks at the root of the project |
--input-dir | Directory with reports on the load of a single database instance; files with a 'join' prefix are ignored. The default directory is 'report' |
--reference-report | The report used as the baseline for comparison with other reports. By default, the first report listed alphabetically in the --input-dir path is selected |
Template for the comparison criteria file:
{
"description": "Comparison of database performance across different configurations in the same environment using the same PostgreSQL version",
"items": [
"sections.<section_name_of_report_struct>.reports.<report_name>.data",
....
"sections.system.reports.uname_a.data"
]
}
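Each entry in "items" reads as a dotted path into the report structure. The sketch below shows one way such a path could be resolved and compared between two reports; it is an assumption-laden illustration, not the tool's join logic, and the report layout in `make_report` is a hypothetical, heavily trimmed stand-in. How pg_perfbench treats a difference is not covered here.

```python
from typing import Any


def resolve(report: dict[str, Any], dotted_path: str) -> Any:
    """Follow a path such as 'sections.system.reports.uname_a.data'."""
    node: Any = report
    for key in dotted_path.split("."):
        node = node[key]  # raises KeyError if the report lacks this key
    return node


def differing_items(reference: dict[str, Any], other: dict[str, Any], items: list[str]) -> list[str]:
    """Return the paths whose values differ between the two reports."""
    return [path for path in items if resolve(reference, path) != resolve(other, path)]


def make_report(uname: str) -> dict[str, Any]:
    # Hypothetical, heavily trimmed report structure.
    return {"sections": {"system": {"reports": {"uname_a": {"data": uname}}}}}


items = ["sections.system.reports.uname_a.data"]
print(differing_items(make_report("Linux 5.15"), make_report("Linux 5.15"), items))  # []
print(differing_items(make_report("Linux 5.15"), make_report("Linux 6.2"), items))   # ['sections.system.reports.uname_a.data']
```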
Example of argument configuration in join mode:
python -m pg_perfbench --mode=join \
--join-task=task_compare_dbs_on_single_host.json \
--reference-report=benchmark_report.json \
--input-dir=/path/to/some/reports
Example of a join report - join_report.html.
See more details about report configuration here.
You can configure the JSON report template file from pg_perfbench/reports/templates.
Add or remove reports of the following types:
- "shell_command_file" - a report with the result of executing, on the database host, the specified bash script from the pg_perfbench/commands/bash_commands directory, for example:
"example_bash_report": {
"header": "example_bash_header",
"state": "collapsed",
"item_type": "plain_text",
"shell_command_file": "bash_example.sh",
"data": ""
}
It is necessary to first ensure that the script executes correctly for the utility's user.
- "sql_command_file" - a report with the result of executing, in the database, the specified SQL script from the pg_perfbench/commands/sql_commands directory, for example:
"example_sql_report": {
"header": "example_sql_header",
"state": "collapsed",
"item_type": <"plain_text", "table">,
"sql_command_file": "sql_example.sql",
"data": ""
}
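Before adding an entry to a template, its shape can be checked against the two examples above. The sketch below assumes that every entry carries the keys shown in those examples and exactly one of the two command-file keys; this rule is inferred from the examples rather than stated by the project, and `check_entry` is a hypothetical helper.

```python
# Keys that appear in both example entries above.
COMMON_KEYS = {"header", "state", "item_type", "data"}
# An entry is assumed to name exactly one of these.
COMMAND_KEYS = {"shell_command_file", "sql_command_file"}


def check_entry(name: str, entry: dict) -> list[str]:
    """Return a list of problems found in one report entry (empty if none)."""
    keys = set(entry)
    problems = [f"{name}: missing key '{key}'" for key in sorted(COMMON_KEYS - keys)]
    if len(COMMAND_KEYS & keys) != 1:
        problems.append(f"{name}: expected exactly one of {sorted(COMMAND_KEYS)}")
    return problems


example = {
    "header": "example_bash_header",
    "state": "collapsed",
    "item_type": "plain_text",
    "shell_command_file": "bash_example.sh",
    "data": "",
}
print(check_entry("example_bash_report", example))  # []
```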
When testing the tool, a Docker connection is used. Preconfigure access to Docker for the user who is running the tool.
- Specify the user that runs the tool:
echo ' <pg_perfbench user> ALL=(ALL) NOPASSWD: /bin/sh -c echo 3 | /usr/bin/tee /proc/sys/vm/drop_caches' | sudo EDITOR='tee -a' visudo -f /etc/sudoers
- Set the PYTHONPATH variable:
cd pg_perfbench
>>
pg_perfbench requirements_dev.txt tests
LICENSE log README.md requirements.txt
export PYTHONPATH=$(pwd)
- Run a single test, for example:
python -m unittest tests.context.test_docker_connection_context -v --failfast
- Run all tests:
python -m unittest discover tests