Thursday, July 15, 2021

Memory Notification: Library Cache Object loaded into SGA / ORA-600 [KGL-heap-size-exceeded] (Doc ID 330239.1)

 In this Document

Symptoms
Changes
Cause
Solution
References

APPLIES TO:

Gen 1 Exadata Cloud at Customer (Oracle Exadata Database Cloud Machine) - Version N/A and later
Oracle Cloud Infrastructure - Database Service - Version N/A and later
Oracle Database Backup Service - Version N/A and later
Oracle Database Cloud Exadata Service - Version N/A and later
Oracle Database Cloud Service - Version N/A and later
Information in this document applies to any platform.

SYMPTOMS

Messages like the following are reported in the alert log:

       Memory Notification: Library Cache Object loaded into SGA
       Heap size <heap size K> exceeds notification threshold (51200K)

       ...:KGL object name : <OBJECT_NAME>

CHANGES

 

CAUSE

These are warning messages that do not cause process failure. They are produced by the event messaging mechanism and memory manager introduced with the 10gR2 database release. Because large objects in the shared pool can potentially cause problems, a warning threshold was implemented: any item/SQL that allocates more space than this threshold writes a warning to the alert log. The warning only informs you that a given heap object exceeds the defined threshold size, and a trace file is generated so that a DBA can check objects that are potentially expensive from a shared memory point of view. These are purely warning messages and have no impact on database functionality, although they are designed to indicate possible tuning opportunities in customers' applications. The messages do not imply that an ORA-4031 is about to happen immediately, unless the shared pool is very small.

SOLUTION

A. _kgl_large_heap_warning_threshold

A hidden parameter, _kgl_large_heap_warning_threshold, which sets the KGL heap size warning threshold, was introduced in 10gR2.

Warnings are written when a heap in the shared pool exceeds this threshold:

_kgl_large_heap_warning_threshold => maximum heap size before KGL writes warnings to the alert log

 

Besides checking the memory management from the application code (recommended), you can set _kgl_large_heap_warning_threshold to a reasonably high value, or to zero, to prevent these warning messages. The value needs to be specified in bytes.

For example:

1. If using a SPFILE:
    =============
    (logged in as "/ as sysdba")

SQL> alter system set "_kgl_large_heap_warning_threshold"=83886080 scope=spfile;

SQL> shutdown immediate

SQL> startup

2. If using a PFILE:

    ============
    Edit the PFILE and add:

_kgl_large_heap_warning_threshold=83886080

 

The default threshold in 10.2.0.1 is only 2 MB. Starting with 10.2.0.2 the threshold was increased to 50 MB after regression tests, so this should be a reasonable and recommended value in most cases.

 

B. _kgl_large_heap_assert_threshold

The 12.1.0.2 database release includes the fix for Bug 15898589, an enhancement that restricts the size of SGA base library cache heaps in order to avoid running out of space inside the shared pool and hence ORA-4031 errors. With this fix, a new hidden parameter, _kgl_large_heap_assert_threshold, was also introduced.

_kgl_large_heap_assert_threshold => maximum heap size before KGL raises an internal error

Its value represents the maximum heap size before raising the ORA-00600 internal error like:

ORA-00600: internal error code, arguments: [KGL-heap-size-exceeded], [0x7FF91F844240], [6], [532279608], [], [], [], [], [], [], [], []

 

The default value of _kgl_large_heap_assert_threshold can be too small for objects with a large number of partitions/sub-partitions, in which case the value should be increased to avoid the ORA-600 error.

For example, setting "_kgl_large_heap_assert_threshold" = 1572864000 raises the assert threshold to 1500 MB.

 

To check the current value of the parameters, run the following query:

select nam.ksppinm NAME,
       nam.ksppdesc DESCRIPTION,
       val.KSPPSTVL
from   x$ksppi nam,
       x$ksppsv val
where  nam.indx = val.indx
and    nam.ksppinm like '%kgl_large_heap_%_threshold%';

For example, in 19.3.0.0 the default values for the two parameters are:

SQL> select
2 nam.ksppinm NAME,
3 nam.ksppdesc DESCRIPTION,
4 val.KSPPSTVL
5 from
6 x$ksppi nam,
7 x$ksppsv val
8 where nam.indx = val.indx and nam.ksppinm like '%kgl_large_heap_%_threshold%';

NAME
--------------------------------------------------------------------------------
DESCRIPTION
--------------------------------------------------------------------------------
KSPPSTVL
--------------------------------------------------------------------------------
_kgl_large_heap_warning_threshold
maximum heap size before KGL writes warnings to the alert log
52428800

 

_kgl_large_heap_assert_threshold
maximum heap size before KGL raises an internal error
524288000

 

Before increasing the warning thresholds you can inspect the generated trace file. The ORA-00600 generates both a trace file in the trace directory and an incident file under the incident ID within the incident directory.

The top part of the trace file shows the SQL that was being run when the error was hit:

----- Current SQL Statement for this session (sql_id=<Sql Id>) -----

This immediately shows the SQL and data objects that were accessed. The KGL object name in the trace file can be found by searching for 'LibraryHandle'.
After identifying the failing SQL/object(s), check whether the memory requirement can be reduced.
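
To locate the generated trace and incident files, the diagnostic directories can be queried from the database; the following is just a quick sketch using the V$DIAG_INFO view:

select name, value
from   v$diag_info
where  name in ('Diag Trace', 'Diag Incident', 'Default Trace File');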

A full list of known issues is given in:
    Note 1662971.1 - ORA-600 [KGL-heap-size-exceeded]
Each bug has a short description that indicates the circumstances where it is hit. The bug list can be shortened by selecting your database release to show only those issues that may affect you.

 

Please be aware that setting _kgl_large_heap_warning_threshold to 0 in 12.1.0.2 exposes the problem described in:

     Bug 22330282 - "Heap size 0K exceeds notification threshold" alert messages when "_kgl_large_heap_warning_threshold" is set to 0 (Document 22330282.8).

To fix this problem, either:

1. Apply Patch 22330282, or open a backport request with Oracle Support if the patch is not available for your database release and/or platform.

OR

2. Work around the issue in one of the following ways (see the sketch after this list):
    a. Set _kgl_large_heap_warning_threshold to a very large value.
    b. Set both _kgl_large_heap_warning_threshold and _kgl_large_heap_assert_threshold to zero.
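
For example, workaround (b) could be applied with an SPFILE as follows (a sketch that follows the same pattern as the earlier examples; adjust it to your environment):

SQL> alter system set "_kgl_large_heap_warning_threshold"=0 scope=spfile;

SQL> alter system set "_kgl_large_heap_assert_threshold"=0 scope=spfile;

SQL> shutdown immediate

SQL> startup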

* Before increasing the thresholds, consider increasing SGA_TARGET and SHARED_POOL_SIZE by 15% to make more free room in the SGA.

 

NOTE 1: In 11.2.0.4 or 12.1.0.1, in order to have the _kgl_large_heap_assert_threshold parameter available and avoid Bug 22330282, apply Patch 22330282 or open a backport request with Oracle Support if the patch is not available for your database release and/or platform.
NOTE 2: The maximum value that can be set for the two hidden parameters is 2147483647 bytes (approximately 2 GB). Any value beyond that limit raises the following error:
ERROR at line 1:
ORA-02017: integer value required

Monday, June 14, 2021

PG_SQUEEZE – SHRINKS TABLES BETTER THAN VACUUM

pg_squeeze is an open source PostgreSQL extension that enables automatic and transparent fixing of bloated tables – without extensive table locking. The process works fully in the background.

KEY BENEFITS

  • More aggressive space reduction
  • Close-to lock-free table reorganization
  • Ability to move tables between tablespaces (without downtime)
  • Ability to cluster tables (without downtime)
  • Built-in advanced scheduling
  • Fully Open Source

pg_squeeze is not a replacement for autovacuum – it is an add-on to perform cleanups even better.

PostgreSQL uses a mechanism called “MVCC” (Multi Version Concurrency Control) to store data. As the name already suggests, MVCC will hold various versions of a row to support as much concurrency as possible. At some point, those additional rows must be removed from the storage system and this is where VACUUM comes along.

Unlike with built-in commands “VACUUM FULL” or “CLUSTER”, with “pg_squeeze” there are no extended periods of full-table locking, and consequently reads and writes are not blocked during the rebuild. Also the rebuilding process is very efficient due to a novel approach of using transaction log files and logical decoding (instead of triggers) to capture possible data changes to the table being rebuilt. First of all, this helps save disk space and IO throughput, but even more importantly, it enables very short locking times, making it a perfect fit for mission-critical OLTP systems.

How does pg_squeeze work?

  1. The extension is implemented as a background-worker process (a framework introduced in PostgreSQL 9.4) that periodically monitors user-defined tables.
  2. When it detects that a table has exceeded the “bloat threshold”, it kicks in and rebuilds that table automatically.

 

 

Rebuilding happens concurrently in the background with minimal storage and computational overhead due to the use of Postgres’ built-in replication slots together with logical decoding to extract possible table changes happening during the rebuild from XLOG. Bloat threshold is of course configurable and the bloat ratio calculation is based on the Free Space Map (taking FILLFACTOR also into account) or under certain conditions on the “pgstattuple” extension when it’s available. Additionally, many customization parameters like “minimum table size” can be set with non-suitable tables being ignored. Moreover, reordering using an index or moving the table or indexes to a new tablespace are possible.

PG_SQUEEZE 1.3 DOWNLOAD

The newest version of pg_squeeze can be downloaded from github.com.

DOWNLOAD PG_SQUEEZE >>

INTRODUCING PG_SQUEEZE – A POSTGRESQL EXTENSION TO AUTO-REBUILD BLOATED TABLES

One of the few areas where PostgreSQL's out-of-the-box functionality is not 100% satisfying is the “bloat problem”. Until now, combating bloat, or just trying to ensure that your table data is physically ordered according to some column(s) (a.k.a. clustering), required accepting some inconvenient compromises: extended periods of full-table locking (no read or write activity) with the built-in VACUUM FULL or CLUSTER commands, or involving third-party tooling, usually meaning “pg_repack”. “pg_repack” offers good benefits such as a much shorter full-lock time and ordering by specific columns, but it needs a bit of fiddling around (installing the extension, identifying bloated tables, running its command-line client), and for larger tables it can also temporarily grow disk usage unnecessarily, as it uses triggers to store the modifications made to tables during the rebuild phase.

To alleviate the situation, on behalf of the Cybertec development team, I’m really glad to announce a new bloat-painkiller called “pg_squeeze”! I myself, with my stereotypically calm Nordic temper, don’t usually get too excited by a piece of software, but this time as a day-to-day PostgreSQL user I must say that I’m really impressed – absolutely great piece of work! And also I wonder why nothing like that came about earlier.

What does pg_squeeze do exactly?

pg_squeeze is a PostgreSQL extension implementing a background worker process (one per DB) that periodically monitors tables defined by the user, and when it detects a table crossing the “bloat threshold”, it kicks in and rebuilds that table automatically! Rebuilding happens concurrently in the background with minimal storage and computational overhead due to using Postgres’ built-in replication slots together with logical decoding to extract possible table changes happening during the rebuild from the XLOG. The bloat threshold is configurable and the bloat ratio calculation is based on the free space map or, under certain conditions, on concepts of the “pgstattuple” extension. Additionally, a minimum table size can be set, with smaller tables being ignored. An additional requirement for a table to be considered for rebuilding is that it must have a primary key or unique constraint defined.

Sample setup

# Download and install the extension
git clone …
export PG_CONFIG=/usr/bin/pg_config      # point it to your desired Postgres installation
make && sudo make install

# Enable logical decoding and preload the extension, then restart
cat <<-EOF >> testcluster/postgresql.conf
wal_level=logical
max_replication_slots = 1
shared_preload_libraries = 'pg_squeeze'
EOF
pg_ctl -D testcluster restart

# Create the extension and register a table for squeezing
psql -c "CREATE EXTENSION pg_squeeze"
psql -c "INSERT INTO squeeze.tables \
         (tabschema, tabname, first_check) \
         VALUES ('public', 'foo', now());"
psql -c "SELECT squeeze.start_worker()"  # PS! not needed when we define the list of "squeezed"
                                         # databases in postgresql.conf

Details

In addition to the above-mentioned option of listing the databases and tables taking part in the auto-rebuild, the following “rebuild decision” aspects can be configured for every table by adjusting values in the “squeeze.tables” table (a minimal sketch of adjusting these values follows the list). NB! This table and the start/stop_worker() calls are meant to be the only “user interface” provided by the extension; the other tables/functions are meant for internal use.

  • Bloat threshold in percentage (“free_space_extra” param), defaulting to 50%
  • Minimum disk space in megabytes the table must occupy to be eligible for processing
  • An index name to physically order tuples according to keys of that index
  • New tablespace for table/indexes can be specified
  • Minimum time between two rebuilds (safety)
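
As a minimal sketch, these settings can be adjusted with a plain UPDATE on “squeeze.tables”. Apart from “free_space_extra” (and the tabschema/tabname keys used in the sample setup above), the column names below are assumptions made for illustration, so check the actual definition of “squeeze.tables” in your installed version first:

update squeeze.tables
set    free_space_extra = 30,          /* bloat threshold in percent (default 50) */
       min_size         = 100,         /* minimum table size in MB to be processed */
       clustering_index = 'foo_pkey',  /* physically order tuples by this index */
       rel_tablespace   = 'fast_ssd'   /* move the rebuilt table to this tablespace */
where  tabschema = 'public'
and    tabname   = 'foo';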

Additionally, the following are provided at the global (database) level (a short query sketch follows the list):

  • squeeze.log – table storing rebuild events and their durations to see which tables get bloated the most.
  • squeeze.errors – table containing errors that happened during squeezing. Normally, errors should only be of the type “DDL being performed” (adding a new column, for example) or “max_xlock_time reached”.
  • squeeze.max_xlock_time – parameter specifying maximum exclusive lock holding time during the “table switch”.
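
These global objects can be inspected with plain SQL; a quick sketch follows (since the exact column layout of “squeeze.log” and “squeeze.errors” may vary between versions, a simple SELECT * is used here):

select * from squeeze.log;      /* rebuild events and their durations */

select * from squeeze.errors;   /* errors that happened during squeezing */

show squeeze.max_xlock_time;    /* maximum exclusive lock time during the "table switch" */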

Also, due to the use of logical decoding, only newer versions of PostgreSQL, starting from version 9.4, can be considered.

Currently “pg_squeeze” only supports the latest PostgreSQL 9.6, but we’re thinking of backporting it also to 9.5.

Configuring automatic start of “squeeze” workers

If you want the background worker to start automatically during startup of the whole PostgreSQL cluster, add entries like this to “postgresql.conf”, selecting appropriate databases and a role for the worker. More explanations on that are available from the README.

squeeze.worker_autostart = 'mydb1 mydb2'
squeeze.worker_role = postgres

Grab the code here and try it out! Questions and feedback are welcome.