pgbadger – an open-source log analyzer for PostgreSQL.
It parses the PostgreSQL log files and gives you a nice overview of the PostgreSQL cluster, including various query metrics like the number of query executions, their types and durations, session details, lock statistics, warnings, errors, and other event patterns. All of these details are presented in a nice HTML report that you can view in your web browser.
Installation:
Prerequisite:
pgbadger is written in pure Perl and uses a JavaScript library (flotr2) to draw graphs. Hence, you need to ensure that a modern Perl distribution is available on your system. The charts are rendered by the JavaScript library, and your web browser does all of that work, so nothing additional is required there.
Use the below command to install Perl if it is not already installed on your system.
$ yum install -y perl perl-devel
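To quickly confirm that a suitable Perl interpreter is present, you can check its version; any reasonably modern Perl 5 should be fine:
$ perl -v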
Either of the below two methods can be followed to install pgbadger.
Method 1:
The first step is to download the latest pgbadger installation package. Official releases are published on the GitHub Releases page of pgBadger. At the time of writing this blog post, the latest available version is 11.4.
Let’s run the below command on the Linux command line to download pgbadger version 11.4.
$ wget https://github.com/darold/pgbadger/archive/v11.4.tar.gz
Alternatively, if you have Git installed on your system, you can clone the pgbadger GitHub repo as well.
$ git clone https://github.com/darold/pgbadger.git
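Note that cloning gives you the current development snapshot. To build the same released version, you can check out the corresponding release tag after cloning (assuming the v<version> tag naming used on the Releases page):
$ cd pgbadger
$ git checkout v11.4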
Once the download is complete, extract the tarball.
$ tar xzvf v11.4.tar.gz
[user@server pgbadger_demo]$ tar xzvf v11.4.tar.gz
pgbadger-11.4/
pgbadger-11.4/.editorconfig
pgbadger-11.4/.gitignore
pgbadger-11.4/CONTRIBUTING.md
pgbadger-11.4/ChangeLog
pgbadger-11.4/HACKING.md
pgbadger-11.4/LICENSE
pgbadger-11.4/MANIFEST
pgbadger-11.4/META.yml
pgbadger-11.4/Makefile.PL
pgbadger-11.4/README
pgbadger-11.4/README.md
pgbadger-11.4/doc/
pgbadger-11.4/doc/pgBadger.pod
pgbadger-11.4/pgbadger
pgbadger-11.4/resources/
pgbadger-11.4/resources/.gitignore
pgbadger-11.4/resources/LICENSE
pgbadger-11.4/resources/README
pgbadger-11.4/resources/bean.js
pgbadger-11.4/resources/bootstrap.css
pgbadger-11.4/resources/bootstrap.js
pgbadger-11.4/resources/font/
pgbadger-11.4/resources/font/FontAwesome.otf
pgbadger-11.4/resources/font/fontawesome-webfont.eot
pgbadger-11.4/resources/fontawesome.css
pgbadger-11.4/resources/jqplot.barRenderer.js
pgbadger-11.4/resources/jqplot.canvasAxisTickRenderer.js
pgbadger-11.4/resources/jqplot.canvasTextRenderer.js
pgbadger-11.4/resources/jqplot.categoryAxisRenderer.js
pgbadger-11.4/resources/jqplot.cursor.js
pgbadger-11.4/resources/jqplot.dateAxisRenderer.js
pgbadger-11.4/resources/jqplot.highlighter.js
pgbadger-11.4/resources/jqplot.pieRenderer.js
pgbadger-11.4/resources/jqplot.pointLabels.js
pgbadger-11.4/resources/jquery.jqplot.css
pgbadger-11.4/resources/jquery.jqplot.js
pgbadger-11.4/resources/jquery.js
pgbadger-11.4/resources/patch-jquery.jqplot.js
pgbadger-11.4/resources/pgbadger.css
pgbadger-11.4/resources/pgbadger.js
pgbadger-11.4/resources/pgbadger_slide.js
pgbadger-11.4/resources/underscore.js
pgbadger-11.4/t/
pgbadger-11.4/t/01_lint.t
pgbadger-11.4/t/02_basics.t
pgbadger-11.4/t/03_consistency.t
pgbadger-11.4/t/exp/
pgbadger-11.4/t/exp/stmt_type.out
pgbadger-11.4/t/fixtures/
pgbadger-11.4/t/fixtures/cloudsql.log.gz
pgbadger-11.4/t/fixtures/light.postgres.log.bz2
pgbadger-11.4/t/fixtures/logplex.gz
pgbadger-11.4/t/fixtures/pg-syslog.1.bz2
pgbadger-11.4/t/fixtures/pgbouncer.log.gz
pgbadger-11.4/t/fixtures/rds.log.bz2
pgbadger-11.4/t/fixtures/stmt_type.log
pgbadger-11.4/tools/
pgbadger-11.4/tools/README.pgbadger_tools
pgbadger-11.4/tools/README.updt_embedded_rsc
pgbadger-11.4/tools/pgbadger_tools
pgbadger-11.4/tools/updt_embedded_rsc.pl
[user@server pgbadger_demo]$
Now go to the extracted directory.
$ cd pgbadger-11.4
Generate the Makefile.
$ perl Makefile.PL
[user@server pgbadger-11.4]$ perl Makefile.PL
Checking if your kit is complete...
Looks good
Writing Makefile for pgBadger
[user@server pgbadger-11.4]$
Now install pgbadger using the below command.
$ make && sudo make install
[user@server pgbadger-11.4]$ make && sudo make install
which: no pod2markdown in (/usr/local/rvm/gems/ruby-2.4.1/bin:/usr/local/rvm/gems/ruby-2.4.1@global/bin:/usr/local/rvm/rubies/ruby-2.4.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/usr/local/rvm/bin:/home/cloud_user/.local/bin:/home/user/bin)
cp pgbadger blib/script/pgbadger
/usr/bin/perl -MExtUtils::MY -e 'MY->fixin(shift)' -- blib/script/pgbadger
echo "=head1 SYNOPSIS" > doc/synopsis.pod
./pgbadger --help >> doc/synopsis.pod
echo "=head1 DESCRIPTION" >> doc/synopsis.pod
sed -i.bak 's/ +$//g' doc/synopsis.pod
rm doc/synopsis.pod.bak
sed -i.bak '/^=head1 SYNOPSIS/,/^=head1 DESCRIPTION/d' doc/pgBadger.pod
sed -i.bak '4r doc/synopsis.pod' doc/pgBadger.pod
rm doc/pgBadger.pod.bak
Manifying blib/man1/pgbadger.1p
rm doc/synopsis.pod
[sudo] password for user:
which: no pod2markdown in (/sbin:/bin:/usr/sbin:/usr/bin)
echo "=head1 SYNOPSIS" > doc/synopsis.pod
./pgbadger --help >> doc/synopsis.pod
echo "=head1 DESCRIPTION" >> doc/synopsis.pod
sed -i.bak 's/ +$//g' doc/synopsis.pod
rm doc/synopsis.pod.bak
sed -i.bak '/^=head1 SYNOPSIS/,/^=head1 DESCRIPTION/d' doc/pgBadger.pod
sed -i.bak '4r doc/synopsis.pod' doc/pgBadger.pod
rm doc/pgBadger.pod.bak
Manifying blib/man1/pgbadger.1p
Installing /usr/local/share/man/man1/pgbadger.1p
Installing /usr/local/bin/pgbadger
Appending installation info to /usr/lib64/perl5/perllocal.pod
rm doc/synopsis.pod
[user@server pgbadger-11.4]$
Finally, verify the installation by running the below command, which will display the installed pgbadger version 🎉
$ pgbadger -V
[user@server pgbadger-11.4]$ pgbadger -V
pgBadger version 11.4
[user@server pgbadger-11.4]$
Method 2:
The pgbadger package is also available in the official PostgreSQL YUM repository. It requires the repository configuration package from the PostgreSQL (PGDG) repository to be installed first.
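If you have not installed it yet, the repository configuration package can be fetched directly from the PGDG download site; the URL below assumes RHEL/CentOS 7 on x86_64, so adjust it for your OS version and architecture:
$ sudo yum install https://download.postgresql.org/pub/repos/yum/reporpms/EL-7-x86_64/pgdg-redhat-repo-latest.noarch.rpm
With the repository configuration in place, pgbadger can be installed by running the below command.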
$ sudo yum install pgbadger
[user@server ~]$ sudo yum install pgbadger
Loaded plugins: amazon-id, search-disabled-repos
Resolving Dependencies
--> Running transaction check
---> Package pgbadger.noarch 0:11.4-1.rhel7 will be installed
--> Finished Dependency Resolution
Dependencies Resolved
==============================================================================================================================================================================================================================================
Package Arch Version Repository Size
==============================================================================================================================================================================================================================================
Installing:
pgbadger noarch 11.4-1.rhel7 pgdg-common 346 k
Transaction Summary
==============================================================================================================================================================================================================================================
Install 1 Package
Total download size: 346 k
Installed size: 1.5 M
Is this ok [y/d/N]: y
Downloading packages:
warning: /var/cache/yum/x86_64/7Server/pgdg-common/packages/pgbadger-11.4-1.rhel7.noarch.rpm: Header V4 DSA/SHA1 Signature, key ID 442df0f8: NOKEY========================- ] 0.0 B/s | 224 kB --:--:-- ETA
Public key for pgbadger-11.4-1.rhel7.noarch.rpm is not installed
pgbadger-11.4-1.rhel7.noarch.rpm | 346 kB 00:00:01
Retrieving key from file:///etc/pki/rpm-gpg/RPM-GPG-KEY-PGDG
Importing GPG key 0x442DF0F8:
Userid : "PostgreSQL RPM Building Project <pgsqlrpms-hackers@pgfoundry.org>"
Fingerprint: 68c9 e2b9 1a37 d136 fe74 d176 1f16 d2e1 442d f0f8
Package : pgdg-redhat-repo-42.0-14.noarch (@/pgdg-redhat-repo-latest.noarch)
From : /etc/pki/rpm-gpg/RPM-GPG-KEY-PGDG
Is this ok [y/N]: y
Running transaction check
Running transaction test
Transaction test succeeded
Running transaction
Installing : pgbadger-11.4-1.rhel7.noarch 1/1
Verifying : pgbadger-11.4-1.rhel7.noarch 1/1
Installed:
pgbadger.noarch 0:11.4-1.rhel7
Complete!
[user@server ~]$
PostgreSQL Configuration:
Since pgbadger reads the PostgreSQL logs to pick up its information, it is essential to make some changes in the postgresql.conf file so that the necessary information is available in the PostgreSQL log file for pgbadger to read.
The first and foremost thing is to enable query logging. Change the log_min_duration_statement parameter value accordingly in the postgresql.conf file. The value 0 means that every statement will be logged, which may not be ideal for a very busy server; you can adjust the value based on your requirements. Suppose you want to log queries running for more than 2 seconds: then set the parameter value (in milliseconds) to 2000.
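For example, this postgresql.conf setting would log only statements that run for two seconds or longer:
log_min_duration_statement = 2000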
** Do not enable log_min_duration_statement, log_duration, and log_statement all together; this will result in wrong counter values and an excessive log size.
** Do not enable log_statement, as its log format will not be parsed by pgBadger. If you have log_statement set to 'all', nothing will be logged through the log_min_duration_statement directive and pgbadger will not be able to parse the log.
** If you don’t want all the details about queries, set log_min_duration_statement to -1 to disable it and enable log_duration in your postgresql.conf file, as in the sketch below. It will then only report the duration and number of queries.
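In that case, the relevant postgresql.conf settings would look like this:
log_min_duration_statement = -1
log_duration = on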
That said, using log_min_duration_statement is always the preferred (recommended) option.
log_min_duration_statement = 0
The next important parameter to look at is log_line_prefix, whose value should be chosen based on the value of another parameter, log_destination. The log_destination value tells PostgreSQL to generate logs in the specified format. By default, PostgreSQL generates logs in the standard error (stderr) format. In this format, each log message is prefixed with the information specified by the log_line_prefix parameter.
With the stderr log format, log_line_prefix must contain at least a time escape sequence (%t, %m, or %n) and a process-related escape sequence (%p or %c), as shown below.
log_line_prefix = '%t [%p]: '
It also lets you add the user, database name, application name, and client IP address, as shown below.
log_line_prefix = '%t [%p]: user=%u,db=%d,app=%a,client=%h '
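With that prefix, a query log entry would look roughly like the following (a hypothetical example; the timestamp, PID, and connection details will differ on your system):
2021-02-19 10:15:32 UTC [12345]: user=postgres,db=appdb,app=psql,client=192.168.1.10 LOG:  duration: 2345.678 ms  statement: select count(*) from orders;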
Another thing to remember is that your log messages must be in English, with or without locale support. Either of the below settings ensures that:
lc_messages='en_US.UTF-8'
lc_messages='C'
The usual postgresql.conf
configuration for pgBadger
looks like this:
log_min_duration_statement = 0
log_line_prefix = '%t [%p]: user=%u,db=%d,app=%a,client=%h '
log_checkpoints = on
log_connections = on
log_disconnections = on
log_lock_waits = on
log_temp_files = 0
log_autovacuum_min_duration = 0
log_error_verbosity = default
lc_messages='en_US.UTF-8'
lc_messages='C'
For the above changes to take effect, restart PostgreSQL.
systemctl restart postgresql
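Strictly speaking, the logging parameters above can also be applied with a configuration reload rather than a full restart, which may be preferable on a busy server. A minimal sketch, assuming you can connect locally with psql:
$ psql -c "SELECT pg_reload_conf();"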
Generating the pgbadger report:
Now that everything is in place, let’s proceed with generating a pgbadger report. The simplest way to analyze a particular log file and generate the pgbadger report is to pass the log file path to pgbadger; it will generate the report in your present working directory.
$ pgbadger <log file>
For example:
[user@server ~]$ sudo pgbadger /var/lib/pgsql/12/data/log/postgresql-Fri.log
[========================>] Parsed 1521 bytes of 1521 (100.00%), queries: 0, events: 1
LOG: Ok, generating html report...
[user@server ~]$
You can also analyze all your log files at once by using the wildcard * when specifying the log names. Likewise, you can specify the output directory and file name with the appropriate switches, as shown below.
[user@server]$ sudo pgbadger /var/lib/pgsql/12/data/log/postgresql-* -O /var/www/pgbadger -o pgbadger.html
[========================>] Parsed 15914 bytes of 15914 (100.00%), queries: 0, events: 20
LOG: Ok, generating html report...
[user@server]$
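Since the report in the example above is written under /var/www/pgbadger, any web server pointed at that directory can serve it. For a quick look without configuring a web server, you could serve the directory ad hoc (a sketch, assuming Python 3 is available; it listens on port 8000):
$ cd /var/www/pgbadger && python3 -m http.server 8000
Then browse to http://<server-ip>:8000/pgbadger.html to view the report.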
The pgbadger report can also be generated incrementally at a certain interval, and you can even specify the retention period of the report files. This can be accomplished by setting up a cron job. For example, the below cron job will generate the pgbadger report daily at 01:00 AM, with the retention set to one week.
$ crontab -e
0 1 * * * /usr/bin/pgbadger -R 1 -I -q /var/lib/pgsql/12/data/log/postgresql-* -O /var/www/pgbadger/
You can use the help command as shown below to get the complete list of pgbadger options and their usage.
$ pgbadger --help
[user@server ~]$ pgbadger --help
Usage: pgbadger [options] logfile [...]
PostgreSQL log analyzer with fully detailed reports and graphs.
Arguments:
logfile can be a single log file, a list of files, or a shell command
returning a list of files. If you want to pass log content from stdin
use - as filename. Note that input from stdin will not work with csvlog.
Options:
-a | --average minutes : number of minutes to build the average graphs of
queries and connections. Default 5 minutes.
-A | --histo-average min: number of minutes to build the histogram graphs
of queries. Default 60 minutes.
-b | --begin datetime : start date/time for the data to be parsed in log
(either a timestamp or a time)
-c | --dbclient host : only report on entries for the given client host.
-C | --nocomment : remove comments like /* ... */ from queries.
-d | --dbname database : only report on entries for the given database.
-D | --dns-resolv : client ip addresses are replaced by their DNS name.
Be warned that this can really slow down pgBadger.
-e | --end datetime : end date/time for the data to be parsed in log
(either a timestamp or a time)
-E | --explode : explode the main report by generating one report
per database. Global information not related to a
database are added to the postgres database report.
-f | --format logtype : possible values: syslog, syslog2, stderr, jsonlog,
cvs, pgbouncer, logplex, rds and redshift. Use this
option when pgBadger is not able to detect the log
format.
-G | --nograph : disable graphs on HTML output. Enabled by default.
-h | --help : show this message and exit.
-H | --html-outdir path: path to directory where HTML report must be written
in incremental mode, binary files stay on directory
defined with -O, --outdir option.
-i | --ident name : programname used as syslog ident. Default: postgres
-I | --incremental : use incremental mode, reports will be generated by
days in a separate directory, --outdir must be set.
-j | --jobs number : number of jobs to run at same time. Run as single
by default or when working with csvlog.
-J | --Jobs number : number of log file to parse in parallel. Process one
file at a time by default or when csvlog is used.
-l | --last-parsed file: allow incremental log parsing by registering the
last datetime and line parsed. Useful if you want
to watch errors since last run or if you want one
report per day with a log rotated each week.
-L | --logfile-list file:file containing a list of log file to parse.
-m | --maxlength size : maximum length of a query, it will be restricted to
the given size. Default truncate size is 100000.
-M | --no-multiline : do not collect multiline statement to avoid garbage
especially on errors that generate a huge report.
-n | --nohighlight : disable SQL code highlighting.
-N | --appname name : only report on entries for given application name
-o | --outfile filename: define the filename for the output. Default depends
on the output format: out.html, out.txt, out.bin,
out.json or out.tsung. This option can be used
multiple time to output several format. To use json
output the Perl module JSON::XS must be installed,
To dump output to stdout use - as filename.
-O | --outdir path : directory where out file must be saved.
-p | --prefix string : the value of your custom log_line_prefix as
defined in your postgresql.conf. Only use it if you
aren't using one of the standard prefixes specified
in the pgBadger documentation, such as if your
prefix includes additional variables like client ip
or application name. See examples below.
-P | --no-prettify : disable SQL queries prettify formatter.
-q | --quiet : don't print anything to stdout, not even a progress
bar.
-Q | --query-numbering : add numbering of queries to the output when using
options --dump-all-queries or --normalized-only.
-r | --remote-host ip : set the host where to execute the cat command on
remote logfile to parse locally the file.
-R | --retention N : number of weeks to keep in incremental mode. Default
to 0, disabled. Used to set the number of weeks to
keep in output directory. Older weeks and days
directory are automatically removed.
-s | --sample number : number of query samples to store. Default: 3.
-S | --select-only : only report SELECT queries.
-t | --top number : number of queries to store/display. Default: 20.
-T | --title string : change title of the HTML page report.
-u | --dbuser username : only report on entries for the given user.
-U | --exclude-user username : exclude entries for the specified user from
report. Can be used multiple time.
-v | --verbose : enable verbose or debug mode. Disabled by default.
-V | --version : show pgBadger version and exit.
-w | --watch-mode : only report errors just like logwatch could do.
-W | --wide-char : encode html output of queries into UTF8 to avoid
Perl message "Wide character in print".
-x | --extension : output format. Values: text, html, bin, json or
tsung. Default: html
-X | --extra-files : in incremental mode allow pgBadger to write CSS and
JS files in the output directory as separate files.
-z | --zcat exec_path : set the full path to the zcat program. Use it if
zcat or bzcat or unzip is not in your path.
-Z | --timezone +/-XX : Set the number of hours from GMT of the timezone.
Use this to adjust date/time in JavaScript graphs.
--pie-limit num : pie data lower than num% will show a sum instead.
--exclude-query regex : any query matching the given regex will be excluded
from the report. For example: "^(VACUUM|COMMIT)"
You can use this option multiple times.
--exclude-file filename: path of the file which contains all the regex to
use to exclude queries from the report. One regex
per line.
--include-query regex : any query that does not match the given regex will
be excluded from the report. You can use this
option multiple times. For example: "(tbl1|tbl2)".
--include-file filename: path of the file which contains all the regex of
the queries to include from the report. One regex
per line.
--disable-error : do not generate error report.
--disable-hourly : do not generate hourly report.
--disable-type : do not generate report of queries by type, database
or user.
--disable-query : do not generate query reports (slowest, most
frequent, queries by users, by database, ...).
--disable-session : do not generate session report.
--disable-connection : do not generate connection report.
--disable-lock : do not generate lock report.
--disable-temporary : do not generate temporary report.
--disable-checkpoint : do not generate checkpoint/restartpoint report.
--disable-autovacuum : do not generate autovacuum report.
--charset : used to set the HTML charset to be used.
Default: utf-8.
--csv-separator : used to set the CSV field separator, default: ,
--exclude-time regex : any timestamp matching the given regex will be
excluded from the report. Example: "2013-04-12 .*"
You can use this option multiple times.
--include-time regex : only timestamps matching the given regex will be
included in the report. Example: "2013-04-12 .*"
You can use this option multiple times.
--exclude-db name : exclude entries for the specified database from
report. Example: "pg_dump". Can be used multiple
time.
--exclude-appname name : exclude entries for the specified application name
from report. Example: "pg_dump". Can be used
multiple time.
--exclude-line regex : pgBadger will start to exclude any log entry that
will match the given regex. Can be used multiple
time.
--exclude-client name : exclude log entries for the specified client ip.
Can be used multiple time.
--anonymize : obscure all literals in queries, useful to hide
confidential data.
--noreport : prevent pgBadger to create reports in incremental
mode.
--log-duration : force pgBadger to associate log entries generated
by both log_duration = on and log_statement = 'all'
--enable-checksum : used to add a md5 sum under each query report.
--journalctl command : command to use to replace PostgreSQL logfile by
a call to journalctl. Basically it might be:
journalctl -u postgresql-9.5
--pid-dir path : set the path where the pid file must be stored.
Default /tmp
--pid-file file : set the name of the pid file to manage concurrent
execution of pgBadger. Default: pgbadger.pid
--rebuild : used to rebuild all html reports in incremental
output directories where there's binary data files.
--pgbouncer-only : only show PgBouncer related menu in the header.
--start-monday : in incremental mode, calendar's weeks start on
sunday. Use this option to start on monday.
--normalized-only : only dump all normalized query to out.txt
--log-timezone +/-XX : Set the number of hours from GMT of the timezone
that must be used to adjust date/time read from
log file before beeing parsed. Using this option
make more difficult log search with a date/time.
--prettify-json : use it if you want json output to be prettified.
--month-report YYYY-MM : create a cumulative HTML report over the specified
month. Requires incremental output directories and
the presence of all necessary binary data files
--noexplain : do not process lines generated by auto_explain.
--command CMD : command to execute to retrieve log entries on
stdin. pgBadger will open a pipe to the command
and parse log entries generated by the command.
--no-week : inform pgbadger to not build weekly reports in
incremental mode. Useful if it takes too much time.
--explain-url URL : use it to override the url of the graphical explain
tool. Default: http://explain.depesz.com/?is_public=0&is_anon=0&plan=
--tempdir DIR : set directory where temporary files will be written
Default: File::Spec->tmpdir() || '/tmp'
--no-process-info : disable changing process title to help identify
pgbadger process, some system do not support it.
--dump-all-queries : dump all queries found in the log file replacing
bind parameters are included in the queries at
their respective placeholders position.
pgBadger is able to parse a remote log file using a passwordless ssh connection.
Use the -r or --remote-host to set the host ip address or hostname. There's also
some additional options to fully control the ssh connection.
--ssh-program ssh path to the ssh program to use. Default: ssh.
--ssh-port port ssh port to use for the connection. Default: 22.
--ssh-user username connection login name. Default to running user.
--ssh-identity file path to the identity file to use.
--ssh-timeout second timeout to ssh connection failure. Default 10 secs.
--ssh-option options list of -o options to use for the ssh connection.
Options always used:
-o ConnectTimeout=$ssh_timeout
-o PreferredAuthentications=hostbased,publickey
Log file to parse can also be specified using an URI, supported protocol are
http[s] and [s]ftp. The curl command will be used to download the file and the
file will be parsed during download. The ssh protocol is also supported and will
use the ssh command like with the remote host use. See examples bellow.
Examples:
pgbadger /var/log/postgresql.log
pgbadger /var/log/postgres.log.2.gz /var/log/postgres.log.1.gz /var/log/postgres.log
pgbadger /var/log/postgresql/postgresql-2012-05-*
pgbadger --exclude-query="^(COPY|COMMIT)" /var/log/postgresql.log
pgbadger -b "2012-06-25 10:56:11" -e "2012-06-25 10:59:11" /var/log/postgresql.log
cat /var/log/postgres.log | pgbadger -
# Log prefix with stderr log output
pgbadger --prefix '%t [%p]: user=%u,db=%d,client=%h' /pglog/postgresql-2012-08-21*
pgbadger --prefix '%m %u@%d %p %r %a : ' /pglog/postgresql.log
# Log line prefix with syslog log output
pgbadger --prefix 'user=%u,db=%d,client=%h,appname=%a' /pglog/postgresql-2012-08-21*
# Use my 8 CPUs to parse my 10GB file faster, much faster
pgbadger -j 8 /pglog/postgresql-10.1-main.log
Use URI notation for remote log file:
pgbadger http://172.12.110.1//var/log/postgresql/postgresql-10.1-main.log
pgbadger ftp://username@172.12.110.14/postgresql-10.1-main.log
pgbadger ssh://username@172.12.110.14:2222//var/log/postgresql/postgresql-10.1-main.log*
You can use together a local PostgreSQL log and a remote pgbouncer log file to parse:
pgbadger /var/log/postgresql/postgresql-10.1-main.log ssh://username@172.12.110.14/pgbouncer.log
Generate Tsung sessions XML file with select queries only:
pgbadger -S -o sessions.tsung --prefix '%t [%p]: user=%u,db=%d ' /pglog/postgresql-10.1.log
Reporting errors every week by cron job:
30 23 * * 1 /usr/bin/pgbadger -q -w /var/log/postgresql.log -o /var/reports/pg_errors.html
Generate report every week using incremental behavior:
0 4 * * 1 /usr/bin/pgbadger -q `find /var/log/ -mtime -7 -name "postgresql.log*"` -o /var/reports/pg_errors-`date +\%F`.html -l /var/reports/pgbadger_incremental_file.dat
This supposes that your log file and HTML report are also rotated every week.
Or better, use the auto-generated incremental reports:
0 4 * * * /usr/bin/pgbadger -I -q /var/log/postgresql/postgresql.log.1 -O /var/www/pg_reports/
will generate a report per day and per week.
In incremental mode, you can also specify the number of week to keep in the
reports:
/usr/bin/pgbadger --retention 2 -I -q /var/log/postgresql/postgresql.log.1 -O /var/www/pg_reports/
If you have a pg_dump at 23:00 and 13:00 each day during half an hour, you can
use pgBadger as follow to exclude these period from the report:
pgbadger --exclude-time "2013-09-.* (23|13):.*" postgresql.log
This will help avoid having COPY statements, as generated by pg_dump, on top of
the list of slowest queries. You can also use --exclude-appname "pg_dump" to
solve this problem in a simpler way.
You can also parse journalctl output just as if it was a log file:
pgbadger --journalctl 'journalctl -u postgresql-9.5'
or worst, call it from a remote host:
pgbadger -r 192.168.1.159 --journalctl 'journalctl -u postgresql-9.5'
you don't need to specify any log file at command line, but if you have other
PostgreSQL log file to parse, you can add them as usual.
To rebuild all incremental html reports after, proceed as follow:
rm /path/to/reports/*.js
rm /path/to/reports/*.css
pgbadger -X -I -O /path/to/reports/ --rebuild
it will also update all resource files (JS and CSS). Use -E or --explode
if the reports were built using this option.
pgBadger also support Heroku PostgreSQL logs using logplex format:
heroku logs -p postgres | pgbadger -f logplex -o heroku.html -
this will stream Heroku PostgreSQL log to pgbadger through stdin.
pgBadger can auto detect RDS and cloudwatch PostgreSQL logs using
rds format:
pgbadger -f rds -o rds_out.html rds.log
CloudSQL Postgresql logs it's fairly normal PostgreSQL log but encapsulated in
JSON format. It is auto detected too by pgBagder but in case you need to force
the log format, use `jsonlog`
pgbadger -f jsonlog -o cloudsql_out.html cloudsql.log
This is the same than with the jsonlog extension, the json format is different
but pgBadger can parse both format.
To create a cumulative report over a month use command:
pgbadger --month-report 2919-05 /path/to/incremantal/reports/
this will add a link to the month name into the calendar view in
incremental reports to look at report for month 2019 May.
Use -E or --explode if the reports were built using this option.
[user@server ~]$
Finally, I would say it’s one of the best freely available tools for PostgreSQL log analysis and for identifying performance issues. The shareable, feature-rich HTML reports and graphs have made it the go-to tool for capturing and sharing database performance reports with stakeholders 📊