Cluster Health Monitor (CHM) FAQ (Doc ID 1328466.1)

In this Document
Purpose
Questions and Answers
 What is the Cluster Health Monitor?
 What is the purpose of the Cluster Health Monitor?
 What platforms does the Cluster Health Monitor support and where can I get the Cluster Health Monitor?
 What is the resource name for Cluster Health Monitor in 11.2.0.2 or higher?
 Does stopping/starting ora.crf affect clusterware function or cluster database function?
 Can the Cluster Health Monitor be installed on a single node, non-RAC server?
 Do Engineered Systems like Exadata use CHM by default, and if so, in any specific version?
 Where is oclumon?
 How do I collect the Cluster Health Monitor data?
 Why does “diagcollection.pl --collect --chmos” return “Cannot parse master from output: ERROR : in reading init file” error?
 How do you get the syntax of different options and explanations for those options for diagcollection.pl and oclumon?
 What is IPD/OS?
 How is the Cluster Health Monitor different from OSWatcher?
 Is the Cluster Health Monitor replacing OSWatcher?
 How much overhead does the Cluster Health Monitor cause?
 Does CHM on Multiple Node configurations (e.g. 4 to 8 nodes) have scaling concerns?
 Will CDB and PDB result in any new information or special conditions using CHM?
 How much disk space is needed for the Cluster Health Monitor?
 How do I find out the size of data collected and saved by the Cluster Health Monitor in my system?
 How can I increase the size of the Cluster Health Monitor repository?
 What platforms can I run the Cluster Health Monitor?
 What steps are needed to install 11.2.0.2 when the Cluster Health Monitor from OTN is already running?
 Where is the Cluster Health Monitor from OTN installed on Linux?
 What logs and data should I gather before logging an SR for a Cluster Health Monitor error?
 How do I increase the trace level of the Cluster Health Monitor?
 Can I use procwatcher to get the pstack of the Cluster Health Monitor regularly?
 What are the processes and components for the Cluster Health Monitor?
 What is oclumon?
 What is the definition of files like *.bdb, _db.*, *.ldb, and log.* created by the tool in the BDB (Berkeley Database) location directory?
 Because it can take many days or weeks to resolve a problem such as a node reboot or performance degradation, is there any way to keep the Cluster Health Monitor data for that long so that it can be replayed later when needed?
 Where is the location for the log files for the Cluster Health Monitor from OTN (pre 11.2.0.2)?
 How do I fix the problem that the time in the oclumon report is in UTC time zone instead of the time zone of my server?
 Can I install CHM from OTN on 11.2.0.2? What if I stop and disable CHM resource (ora.crf) on 11.2.0.2?
 Where is the trace file for client like oclumon? How do I increase the trace level for oclumon?
 Can the directory path to the CHM repository be the same on all nodes if shared storage is used?
 How much data (how long in time) does a node store locally when it cannot communicate with the master?
 How often does CHM collect the system metric data? Can this be changed?
 What is the default CHM retention time? 
 How can you reduce the size of a bdb file that has become large for any reason?
 Can you set up CHM to run locally on each node?
 Can CHM be used on a single node non-RAC server?
 How do you start and stop CHM that is installed as part of GI in 11.2 and higher?
 What is the reason for the following alerts on all the nodes in a cluster?
 

 Database - RAC/Scalability Community

APPLIES TO:

Oracle Database - Enterprise Edition - Version 10.1.0.2 to 12.1.0.2 [Release 10.1 to 12.1]
Information in this document applies to any platform.

PURPOSE


The Cluster Health Monitor FAQ is an evolving document that answers common questions about the Cluster Health Monitor.

QUESTIONS AND ANSWERS

What is the Cluster Health Monitor?

The Cluster Health Monitor collects OS statistics (system metrics) such as memory and swap space usage, processes, IO usage, and network-related data. It collects this information in real time, usually once a second. It gathers OS statistics through OS APIs to gain performance and reduce CPU usage overhead, and it collects as many system metrics and as much data as feasible, restricted only by the acceptable level of resource consumption by the tool.

What is the purpose of the Cluster Health Monitor?

The Cluster Health Monitor was developed to provide system metrics and data for troubleshooting many different types of problems, such as node reboots and hangs, instance evictions and hangs, severe performance degradation, and any other problem that requires system metrics and data.

By monitoring the data constantly, users can use the Cluster Health Monitor to detect potential problem areas such as CPU load, memory constraints, and spinning processes before the problem causes an unwanted outage.

What platforms does the Cluster Health Monitor support and where can I get the Cluster Health Monitor?

The Cluster Health Monitor is NOT supported on Linux Itanium, IBM Linux Z, or HP-UX.

The Cluster Health Monitor is an integrated part of 11.2.0.2 Oracle Grid Infrastructure for Linux (not on Linux Itanium and IBM Linux Z) and Solaris (Sparc 64 and x86-64 only), so installing 11.2.0.2 Oracle Grid Infrastructure on those platforms will automatically install the Cluster Health Monitor. AIX has the Cluster Health Monitor starting from 11.2.0.3. The Cluster Health Monitor is also enabled for Windows (except Windows Itanium) in 11.2.0.3.

Prior to 11.2.0.2 on Linux (not on Linux Itanium and IBM Linux Z), the Cluster Health Monitor can be downloaded from OTN.

http://www-content.oracle.com/technetwork/products/clustering/downloads/ipd-download-homepage-087212.html

The OTN version for Windows is no longer available. Please upgrade to 11.2.0.3 if you need CHM for Windows.

What is the resource name for Cluster Health Monitor in 11.2.0.2 or higher?

ora.crf is the Cluster Health Monitor resource name that ohasd manages. Issue "crsctl stat res -t -init" to check the current status of the Cluster Health Monitor.
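
For example, to check only the CHM resource by name (run as the grid user, with <GI_HOME>/bin in the PATH):

crsctl stat res ora.crf -init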

Does stopping/starting ora.crf affect clusterware function or cluster database function?

No. Stopping/starting the ora.crf resource stops and starts the Cluster Health Monitor and its data collection; it does not affect clusterware or database functionality.

Can the Cluster Health Monitor be installed on a single node, non-RAC server?

The Cluster Health Monitor Standalone for Linux x86 and x86-64 can be downloaded from OTN and installed on a single node, non-RAC server without the need to install Grid Infrastructure or CRS. For other platforms, installing Grid Infrastructure or CRS is required to get CHM/OS.

Do Engineered Systems like Exadata use CHM by default, and if so, in any specific version?

Engineered systems use the default GI stack, which includes CHM functionality as shipped on standard platforms. At this time there are no specific extensions for Engineered systems, but this may change in future releases.

Where is oclumon?

If CHM is installed as part of an 11.2 installation on a supported platform, oclumon is located in the GI_HOME/bin directory.

If CHM is manually installed using the CHM file from OTN, oclumon is located in:
Linux : /usr/lib/oracrf/bin
Windows : C:\Program Files\oracrf\bin

How do I collect the Cluster Health Monitor data?

As the grid user, running "<GI_HOME>/bin/diagcollection.pl --collect --chmos" produces output for all data collected in the repository. This may be too much data and may take a long time, so the suggestion is to limit the query to the time interval of interest.

For example, issue “<GI_HOME>/bin/diagcollection.pl --collect --crshome $ORA_CRS_HOME --chmos --incidenttime <start time of interesting time period> --incidentduration 05:00”

The above outputs a report covering 5 hours from the time specified by incidenttime.
The incidenttime must be in MM/DD/YYYYHH:MN:SS format, where MM is the month, DD is the day, YYYY is the year, HH is the hour in 24-hour format, MN is the minute, and SS is the second. For example, to start the incident time at 10:15 PM on June 01, 2011, specify the incident time as 06/01/201122:15:00. The incidenttime and incidentduration can be changed to capture more data.

Alternatively, use 'oclumon dumpnodeview -allnodes -v -last "11:59:59" > your-filename' if diagcollection.pl fails for any reason. This will generate a report from the repository covering up to the last 12 hours. The -last value can be changed to get more or less data.

Another example of using oclumon is 'oclumon dumpnodeview -allnodes -v -s "2012-06-01 22:15:00" -e "2012-06-02 03:15:00" > /tmp/chm.log'. The difference in this command is that it specifies the start (-s flag) and end (-e flag) times.
In this case, the time format used is "YYYY-MM-DD HH24:MI:SS", for example "2007-11-12 23:05:00".

Why does “diagcollection.pl --collect --chmos” return “Cannot parse master from output: ERROR : in reading init file” error?

This is due to bug 10048487, which affects 11.2.0.2. The bug in the script prevents diagcollection.pl from ever retrieving the master node.

The workaround is to issue:
oclumon dumpnodeview -allnodes -v -last "<amount of time needed>"
For example,
oclumon dumpnodeview -allnodes -v -last "01:00:00"
will provide the last one hour of data from all nodes.

How do you get the syntax of different options and explanations for those options for diagcollection.pl and oclumon?

Issue "<GI_HOME>/bin/diagcollection.pl -h" and "oclumon -h". You may need to drill down further to get information for different options.

What is IPD/OS?

IPD/OS is the old name for the Cluster Health Monitor. The names can be used interchangeably, although Oracle now calls the tool the Cluster Health Monitor.

How is the Cluster Health Monitor different from OSWatcher?

OSWatcher collects OS statistics by running standard Unix commands such as vmstat, top, ps, iostat, netstat, mpstat, and meminfo. The private.net file can be configured in OSWatcher to issue the traceroute command over the private interconnect to test it. OSWatcher also runs at user priority, so it often cannot run when the CPU load is heavy.

Is the Cluster Health Monitor replacing OSWatcher?

The Cluster Health Monitor has many advantages over OSWatcher; the most significant is that it runs in real time, usually once a second, so it collects data even when OSWatcher cannot. However, there is some information, such as top, traceroute, and netstat output, that the Cluster Health Monitor does not collect, so running the Cluster Health Monitor alongside OSWatcher is ideal. The tools complement rather than replace each other.
On the other hand, if only one of the tools can be used, then Oracle recommends the Cluster Health Monitor.

How much overhead does the Cluster Health Monitor cause?

On today's servers, the Cluster Health Monitor typically uses less than 3% of a server's CPU capacity; the overhead is minimal. However, CHM on a server with a large number of disks or IO devices and more CPUs/memory will use more CPU than CHM on a server without many disks and CPUs/memory.

Does CHM on Multiple Node configurations (e.g. 4 to 8 nodes) have scaling concerns?

CHM functionality is designed to scale automatically with the cluster. While each node hosts an osysmond daemon, the ologgerd daemon services multiple osysmonds. Should a cluster grow large enough, another ologgerd daemon is spawned to manage the increased load. The user is responsible for increasing the CHM data repository size as nodes are added, to ensure sufficient retention time is maintained; 72 hours is recommended.

Will CDB and PDB result in any new information or special conditions using CHM?

As CHM collects OS metrics, there are currently no CDB- or PDB-specific metrics collected. There are also currently no special conditions triggered when hosting a multitenant (CDB) database.

How much disk space is needed for the Cluster Health Monitor?

The Cluster Health Monitor takes up 1GB of space by default on all nodes in the cluster. The approximate amount of data collected is 0.5GB per node per day. The size of the repository can be increased to collect and save up to 3 days of data, and this will increase the disk usage accordingly.

How do I find out the size of data collected and saved by the Cluster Health Monitor in my system?

“oclumon manage -get repsize” will show the size in seconds.
To estimate the space required, use the following formula:

# of nodes * 720MB * 3 = size required for 3 days of retention
e.g., for a 4-node cluster: 4 * 720 * 3 = 8,640MB (~8.4GB)

How can I increase the size of the Cluster Health Monitor repository?

“oclumon manage -repos resize <number of seconds, up to 259200>”. Setting the value to 259200 will collect and save the data for 72 hours (3 days). It is recommended to set 72 hours of retention, sized using the formula above. This space needs to be available on all nodes in the cluster. Please resize the repositories, or move them if necessary, to achieve 72 hours of retention.
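
For example, to set the maximum 72-hour (3-day) retention:

oclumon manage -repos resize 259200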

What platforms can I run the Cluster Health Monitor?

11.2.0.1 and earlier: Linux only (download from OTN)
11.2.0.2: Solaris (Sparc 64 and x86-64 only) and Linux.
11.2.0.3: AIX, Solaris (Sparc 64 and x86-64 only), Linux, and Windows.

Cluster Health Monitor is NOT available for any Itanium platform such as Linux Itanium and Windows Itanium.

What steps are needed to install 11.2.0.2 when the Cluster Health Monitor from OTN is already running?

Remove the Cluster Health Monitor from OTN before upgrading the CRS or installing Grid Infrastructure.

Where is the Cluster Health Monitor from OTN installed on Linux?

$CRF_HOME is set to /usr/lib/oracrf on Linux by default when the Cluster Health Monitor is installed from OTN. This is the Cluster Health Monitor home location.

What logs and data should I gather before logging an SR for a Cluster Health Monitor error?

Gather the following (a sketch of commands for items 1 through 3 follows this list):
1) 3-4 pstack outputs of osysmond.bin taken over a minute
2) Output of strace -v on osysmond.bin for about 2 minutes
3) strace -cp <osysmond.bin pid> for about 2 minutes
4) oclumon dumpnodeview -v output for that node for 2 minutes
5) Output of "uname -a"
6) Output of "ps -eLf | grep osysmond.bin"
7) The ologgerd and osysmond log files in the CRS_HOME/log/<host name> directory from all nodes
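
A minimal sketch for items 1 through 3, assuming a single osysmond.bin process and that the GNU coreutils timeout command is available; the output paths are illustrative:

pid=$(pgrep -f osysmond.bin)
# 1) four pstack snapshots spread over a minute
for i in 1 2 3 4; do pstack "$pid" >> /tmp/osysmond_pstack.txt; sleep 15; done
# 2) about 2 minutes of verbose strace output
timeout 120 strace -v -p "$pid" -o /tmp/osysmond_strace.txt
# 3) about 2 minutes of system call counts (the summary prints when strace detaches)
timeout 120 strace -cp "$pid" > /tmp/osysmond_strace_counts.txt 2>&1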

How do I increase the trace level of the Cluster Health Monitor?

Increase the log level for the daemons using:
oclumon debug log all allcomp:<trace level from 0 to 3>

The higher the trace level, the more detailed the tracing, so do not forget to reset the trace level back to 1 (the default trace level when CHM is first installed) by issuing "oclumon debug log all allcomp:1".

Can I use procwatcher to get the pstack of the Cluster Health Monitor regularly?

Procwatcher version 030810 and later can be used to monitor IPD processes; just add the process names to the CLUSTERPROCS list. Procwatcher is now smarter about picking the path of the executable, so it can find the IPD daemons when looking for them.

What are the processes and components for the Cluster Health Monitor?

Cluster Logger Service (ologgerd)
1) In 11.2, there is a master ologgerd that receives the data from the other nodes and saves it in the repository (Berkeley database). It compresses the data before persisting it to save disk space. In an environment with multiple nodes, a replica ologgerd is also started on a node where the master ologgerd is not running. The master ologgerd syncs the data with the replica ologgerd by sending the data to it. The replica ologgerd takes over if the master ologgerd dies, and a new replica ologgerd is started when the replica dies. There is only one master ologgerd and one replica ologgerd per cluster.

To find the master ologgerd, one can use the following command:
oclumon manage -get master

2) In 12.1 and above, CHM no longer stores the data in the BDB but in the management database, so ologgerd no longer needs to run on multiple nodes; it runs only on the node where the management database runs.

System Monitor Service (osysmond) – the osysmond process collects the system statistics of the local node and sends the data to the master ologgerd. An osysmond process runs on every node and collects system statistics including CPU, memory usage, platform info, disk info, NIC info, process info, and filesystem info.
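
To confirm which CHM daemons are running on a node, for example:

ps -ef | grep -E 'osysmond|ologgerd' | grep -v grep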

What is oclumon?

OCLUMON command-line tool - use the oclumon command line to query the CHM repository and display node-specific metrics for a specified time period.

You can also use oclumon to query and print the durations and the states for a resource on a node during a specified time period. These states are based on predefined thresholds for each resource metric and are denoted as red, orange, yellow, and green, indicating decreasing order of criticality.

What is the definition of files like *.bdb, _db.*, *.ldb, and log.* created by the tool in the BDB (Berkeley Database) location directory?

*.bdb & _db.* - These are files created for the Berkeley DB, which stores the collected data.

log.* - These are Berkeley DB log files, which preserve changes before they are applied to the db files. Checkpointing is set up, and the log files are reused.

*.ldb - This is the local logging file and MUST be present on all servers.

Do not delete the above files except when trying to reduce the size of a bdb file that has grown large. To do so, refer to the question "How can you reduce the size of a bdb file that has become large for any reason?" in this document.

Because it can take many days or weeks to resolve a problem such as a node reboot or performance degradation, is there any way to keep the Cluster Health Monitor data for that long so that it can be replayed later when needed?

The Cluster Health Monitor is designed to store data for up to 3 days as best it can by increasing the size of the repository up to 2GB. If you want to store data for longer than that, one way is to zip the output from 'oclumon dumpnodeview' or 'diagcollection' regularly (e.g., every hour).
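
For example, a minimal sketch of an hourly cron entry that keeps compressed dumps (the archive directory is illustrative, and note that % must be escaped in crontab):

0 * * * * <GI_HOME>/bin/oclumon dumpnodeview -allnodes -v -last "01:00:00" | gzip > /chm_archive/chm_$(date +\%Y\%m\%d\%H).log.gz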

Before 12.1.0.2, another way is to archive the whole BDB regularly (e.g., every day) by making a copy of the BDB files in the BDB location directory.

The way CHMOS reads an archived BDB is to start it in debug mode, using:
ologdbg -d <bdb location>
After it starts, issue oclumon dumpnodeview to get the data from the archived BDB.
For example, issue
oclumon dumpnodeview -n <node name> -s <start time> -e <end time> -v

Where is the location for the log files for the Cluster Health Monitor from OTN (pre 11.2.0.2)?

Check the directory /usr/lib/oracrf/log/* for the alert<nodename>.log, and the subdirectory for each daemon (SYSMOND, LOGGERD, OPROXYD) for its logs.

How do I fix the problem that the time in the oclumon report is in UTC time zone instead of the time zone of my server?

The time in the repository is in UTC, and by default oclumon shows the time in UTC. As noted in the README, oclumon shows UTC if ORACRF_TZ is not set. Setting ORACRF_TZ should fix the time zone issue.
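
For example, before running oclumon (the time zone value is illustrative; use your server's time zone):

export ORACRF_TZ='US/Pacific'
oclumon dumpnodeview -allnodes -v -last "01:00:00"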

Can I install CHM from OTN on 11.2.0.2? What if I stop and disable CHM resource (ora.crf) on 11.2.0.2?

You cannot install CHM from OTN if there is any conflicting installation, so installing CHM from OTN on servers that have 11.2.0.2 Grid Infrastructure will not work. Disabling the CHM resource (ora.crf) on 11.2.0.2 still leaves the installation in place; hence, the OTN install will still fail.

Where is the trace file for client like oclumon? How do I increase the trace level for oclumon?

The 'log' file for oclumon is in log/<hostname>/clients/oclumon.log.

Generally it is not generated because, at log level 0, there is no log data.
To see logs at a higher log level, do the following:
1. oclumon [enter the interactive mode]
2. query> debug log all allcomp:3

After this, any command execution will produce finer-grained logs in oclumon.log.

Can the directory path to the CHM repository be the same on all nodes if shared storage is used?

One can set the CHM repository on shared storage under the same directory, although it is recommended not to do so; one reason is performance. In such a case, each node's repository location is under a directory named after its hostname.

How much data (how long in time) does a node store locally when it cannot communicate with the master?

The local repository on each node is small; it holds the local CHM data saved when the node cannot communicate with the master.

With a sampling interval of 1 second, this is ideally around 1 hour of data. With 11.2.0.3, the sampling interval moved to 5 seconds; hence, the data that can be retained locally is 4-5 hours.

How often does CHM collect the system metric data? Can this be changed?

In releases before 11.2.0.3, the CHM collection interval is usually once a second, but this can change depending on the amount of data being collected. In 11.2.0.3, the CHM collection interval changed to once every 5 seconds.

Currently, the collection interval cannot be changed.

What is the default CHM retention time? 

In the pre-11.2.0.2 CHM available from OTN, the default data retention time was 24 hours.

In 11.2.0.2, the retention time is determined by the repository size, whose default has changed to 1GB. Depending on how large the cluster is, the default retention time differs; for example, it is usually 6.9 hours for a one-node cluster with a 1-second sampling interval. Issue "oclumon manage -get repsize" to find the retention time of your cluster; the output is in seconds.

With the sampling interval moving to 5 seconds in 11.2.0.3, the retention time becomes 5 times the retention time with a 1-second sampling interval.

It is recommended to set a 72-hour retention time.

How can you reduce the size of a bdb file that has become large for any reason?

You can manage the repository size in terms of space using the command below. This feature is present from 11.2.0.3:

oclumon manage -repos changesize <memsize in MB>

As a temporary workaround, you can kill ologgerd and delete the contents of the BDB directory. osysmond will respawn ologgerd, and a new bdb file will be created. The past data is lost when this is done.

Please note the minimum size must be >= 1024 MB (1 GB); otherwise, CRS-9100 "Error setting Cluster Health Monitor repository size" will be reported.
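
For example, to set the repository size to 2 GB:

oclumon manage -repos changesize 2048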

Can you set up CHM to run locally on each node?

With the OTN version, one can do that by installing CHM on each node independently, although it is not recommended.

The Cluster Health Monitor that comes with the Grid Infrastructure install image must run with only one master ologgerd, so it cannot be set up to run locally on each node.

Can CHM be used on a single node non-RAC server?

The CHM available on OTN can be used on a single node, non-RAC server, but the OTN version is available only for Linux (the Windows OTN version is no longer available). The CHM that comes with GI in 11.2 and higher must run with GI (RAC).

How do you start and stop CHM that is installed as part of GI in 11.2 and higher?

The ora.crf resource in 11.2 GI (and higher) is the resource for CHM, and it is managed by ohasd. Starting and stopping the ora.crf resource starts and stops CHM.

To stop CHM (or ora.crf resource managed by ohasd)
$GRID_HOME/bin/crsctl stop res ora.crf -init

To start CHM (or ora.crf resource managed by ohasd)
$GRID_HOME/bin/crsctl start res ora.crf -init
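
The ora.crf resource can also be disabled so that CHM stays stopped across restarts of the stack, and re-enabled later. A sketch, to be verified in your environment:

$GRID_HOME/bin/crsctl modify res ora.crf -attr ENABLED=0 -init
$GRID_HOME/bin/crsctl modify res ora.crf -attr ENABLED=1 -init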

What is the reason for the following alerts on all the nodes in a cluster?


ologgerd fails to start, so the agent repeatedly tries to restart it:

alert.log host01
===================
2017-01-11 17:55:23.636 [OLOGGERD(14559)]CRS-8500: Oracle Clusterware OLOGGERD process is starting with operating system process ID 14559
2017-01-11 17:57:03.675 [OLOGGERD(15133)]CRS-8500: Oracle Clusterware OLOGGERD process is starting with operating system process ID 15133
2017-01-11 17:58:48.689 [OLOGGERD(15930)]CRS-8500: Oracle Clusterware OLOGGERD process is starting with operating system process ID 15930
2017-01-11 18:00:33.705 [OLOGGERD(16687)]CRS-8500: Oracle Clusterware OLOGGERD process is starting with operating system process ID 16687
The issue is caused by the GIMR/mgmtdb not having been created.
It is mandatory to create this database in 12.1.0.2, as it is used to store CHM data (Ref: Doc ID 1568402.1).
Please follow step "3B. For 12.1.0.2 only:" in Doc ID 1589394.1 to create mgmtdb.





Database - RAC/Scalability Community

To discuss this topic further with Oracle experts and industry peers, we encourage you to review, join or start a discussion in the My Oracle Support Database - RAC/Scalability Community


