RIPS 3.2 introduces a health check feature that lets RIPS administrators query the health status of their installation. It provides a simple way to monitor important system parameters, such as memory and disk usage or the database response time. Nagios can check the health of a RIPS instance regularly by running a script that retrieves this information through the API, which makes it possible to tell whether the system is running as expected or whether some action is required.
You need a running Nagios installation and a reachable RIPS API instance with the corresponding credentials. The user must have ROLE_CHIEF to read the health information. You can get an overview of access controls here.
Note: On-Premises customers should update RIPS so that it runs the latest compatible API version.
The script can be downloaded from the following locations:
The script should be copied to /usr/local/bin/check_rips.sh with permissions to be executed by the Nagios user. The following commands can be used:
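As a minimal sketch of this step, the helper below wraps the standard `install` utility. The function name `install_check_script` and the source path in the usage comment are hypothetical and only serve this illustration. It assumes the script has already been downloaded and that the helper is run with sufficient privileges to write to /usr/local/bin.

```shell
# install_check_script SRC [DEST]
# Copies the check script to DEST (default: /usr/local/bin/check_rips.sh)
# with mode 0755, so every user, including the Nagios user, may execute it.
# Assumption: SRC is the path of the already downloaded script.
install_check_script() {
    src="$1"
    dest="${2:-/usr/local/bin/check_rips.sh}"
    install -m 0755 "$src" "$dest"
}

# Typical use, run as root (the source path is an assumption):
#   install_check_script ./check_rips.sh
```

Mode 0755 is one possible choice. A stricter setup could restrict execution to the group of the Nagios user instead.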
In Nagios you have to create a command and a service. The command uses our script so that Nagios can monitor and track the health of the RIPS instance. The code below is an example of how to configure a command in Nagios to use our script.
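A command object for the script might look like the following sketch. The command name `check_rips` is a placeholder chosen for this illustration. The script path is the one given above.

```cfg
# Assumption: "check_rips" is a hypothetical command name; any unique name works.
define command {
    command_name    check_rips
    command_line    /usr/local/bin/check_rips.sh
}
```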
The command has to be used by a service.
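A matching service object could be sketched as follows. It assumes that a command object named `check_rips` exists, that a host object called `rips-server` is defined (a hypothetical name), and that the `generic-service` template from the Nagios sample configuration is available.

```cfg
define service {
    use                     generic-service   ; assumption: template from the Nagios sample configuration
    host_name               rips-server       ; assumption: hypothetical host object
    service_description     RIPS Health
    check_command           check_rips        ; must match the command_name of the command object
    check_interval          5                 ; in time units (60 seconds each by default)
}
```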
You can find more information about how to configure Nagios at https://assets.nagios.com/downloads/nagioscore/docs/nagioscore/4/en/configobject.html.
The script needs to know the following information in order to get data from the API:
| Variable | Description | Example |
|---|---|---|
| RIPS_BASE_URI | URI that points to a RIPS API | |
| RIPS_USER | Username for the API (with ROLE_CHIEF) | firstname.lastname@example.org |
| RIPS_PASSWORD | Password for the API | userpassword |
All parameters have to be provided in a configuration file at /etc/rips/nagios_env in the format VARIABLE=VALUE, for example:
Example of a configuration file /etc/rips/nagios_env
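A configuration file in this format could look like the sketch below. The user name and password are the example values from the table above. The base URI is a placeholder and not a real endpoint. Whether the script tolerates comment lines is not stated here, so leave the comment out of a real file if in doubt.

```text
# Assumption: placeholder base URI; replace it with the address of your RIPS API.
RIPS_BASE_URI=https://rips-api.example.com
RIPS_USER=firstname.lastname@example.org
RIPS_PASSWORD=userpassword
```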
You can also use a different location for the configuration file through the environment variable RIPS_ENV_FILE. This file must be readable by the Nagios user.
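How check_rips.sh resolves the file internally is not shown here. The sketch below illustrates one common pattern for such a fallback. The function name `load_rips_env` is hypothetical, and the real script may behave differently.

```shell
# Sketch only: the real check_rips.sh may resolve its configuration differently.
# load_rips_env reads VARIABLE=VALUE pairs from the file named by RIPS_ENV_FILE,
# falling back to /etc/rips/nagios_env when the variable is unset or empty.
load_rips_env() {
    env_file="${RIPS_ENV_FILE:-/etc/rips/nagios_env}"
    if [ ! -r "$env_file" ]; then
        # Report an unreadable file the way a Nagios plugin reports UNKNOWN.
        echo "UNKNOWN - cannot read $env_file"
        return 3
    fi
    # Source the file so that its assignments become shell variables.
    . "$env_file"
}
```

Running the check once as the Nagios user, with RIPS_ENV_FILE pointing to the file, is a quick way to confirm that the file is readable for that account.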
Once RIPS is integrated into Nagios, the health check runs automatically at the check interval configured for the service. A successful check can look like the following in the Nagios web interface:
Script Exit Codes
The script gets the health information using the API and provides the following outputs:
| Exit code | Output | Description |
|---|---|---|
| 0 | OK - healthy | The health status was successfully acquired and everything is normal |
| 1 | WARNING - problems detected | The health status was successfully acquired, but some problems need to be checked as soon as possible |
| 2 | CRITICAL - problems detected | The health status was successfully acquired, but something critical demands immediate action |
| 3 | UNKNOWN - unexpected output | Cannot get the health status, because the API output is unexpected and cannot be parsed |
| 3 | UNKNOWN - curl error | Cannot get the health status, because a connection with the API cannot be made |
The health status is communicated through exit codes 0 to 2 (OK, WARNING, CRITICAL). If the exit code is 3, something unexpected has happened (e.g., invalid credentials) and the health status could not be retrieved.
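When running the script by hand, the exit code can be translated back into the state name. The helper below is a small sketch of that mapping. The function name `nagios_state` is hypothetical and not part of the script.

```shell
# nagios_state CODE
# Prints the Nagios state name for a plugin exit code.
# Codes 0 to 2 follow the table above; 3 and, in this sketch,
# any other value are reported as UNKNOWN.
nagios_state() {
    case "$1" in
        0) echo "OK" ;;
        1) echo "WARNING" ;;
        2) echo "CRITICAL" ;;
        *) echo "UNKNOWN" ;;
    esac
}

# Manual run, using the script path from this article:
#   /usr/local/bin/check_rips.sh; nagios_state "$?"
```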