Raymii.org
Corosync Pacemaker - Execute script on failover
Published: 20-11-2013 | Author: Remy van Elst
❗ This post is over ten years old. It may no longer be up to date. Opinions may have changed.
With Corosync/Pacemaker there is no easy way to simply run a script on failover. There are good reasons for this, but sometimes you want to do something simple. This tutorial describes how to change the Dummy OCF resource to execute a script on failover.
In this example it is a script which triggers a few SNMP traps, sends an alert to Nagios and sends some data to Graphite. The SNMP part alone could be done with the ocf:heartbeat:ClusterMon resource, but the rest could not.
This is a very simple way of doing it; I consider it more of a quick hack. For example, the script path is hard coded. For me that is not a problem, because both the script and the Dummy resource are managed via Ansible, so I can change them at any time.
Start by copying the Dummy resource over to a new resource. On Ubuntu the resource files are located here:
/usr/lib/ocf/resource.d/heartbeat/
In there, copy the Dummy file to a new resource, for example FailOverScript.
If you don't have the Dummy resource, you can also find it here.
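The copy step can be sketched as a small helper that refuses to overwrite an existing agent. The function name copy_agent is made up for this example; the paths in the usage comment are the Ubuntu ones from this article.

```shell
#!/bin/sh
# Sketch: copy an OCF agent to a new name, keeping its permissions.
# copy_agent is a hypothetical helper name, not part of any cluster tool.
copy_agent() {
    src="$1"
    dst="$2"
    if [ ! -f "$src" ]; then
        echo "source agent not found: $src" >&2
        return 1
    fi
    if [ -e "$dst" ]; then
        echo "refusing to overwrite: $dst" >&2
        return 1
    fi
    # -p preserves the file mode, so the copy stays executable
    cp -p "$src" "$dst"
}

# Usage on Ubuntu, with the paths from this article:
# copy_agent /usr/lib/ocf/resource.d/heartbeat/Dummy \
#            /usr/lib/ocf/resource.d/heartbeat/FailOverScript
```

The new agent has to exist under the same path on every node that can run the resource.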
Edit the name and description:
Name:
meta_data() {
cat <<END
<?xml version="1.0"?>
<!DOCTYPE resource-agent SYSTEM "ra-api-1.dtd">
<resource-agent name="FailOverScript" version="0.9">
<version>1.0</version>
Description:
<longdesc lang="en">
Script ran on Failover
</longdesc>
<shortdesc lang="en">Script ran on Failover</shortdesc>
Make sure the script you want to execute is placed on the host and is executable (chmod +x /usr/local/bin/failover.sh).
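If you do not have a script yet, something like the following could serve as a starting point. It is only a sketch: it writes one line to a log file, and the FAILOVER_LOG variable and its default path are assumptions made for this example, not something Pacemaker provides.

```shell
#!/bin/sh
# /usr/local/bin/failover.sh - minimal sketch of a script run on failover.
# FAILOVER_LOG and its default path are assumptions made for this example.
LOG_FILE="${FAILOVER_LOG:-/tmp/failover.log}"

# Record which node took over and when.
log_failover() {
    printf '%s failover: resources started on %s\n' \
        "$(date '+%Y-%m-%d %H:%M:%S')" "$(uname -n)" >> "$LOG_FILE"
}

log_failover
# Add your own notifications (SNMP trap, Nagios, Graphite) below this line.
```

Keep the script quick. The start action of the resource waits for it to finish, so a slow script can push the start past its timeout.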
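A bit lower in the file, edit the dummy_start function. Add the script path below the fi that closes the if [ $? = $OCF_SUCCESS ]; then check and above the touch ${OCF_RESKEY_state} line. That way $? still holds the result of dummy_monitor, and the script only runs when the resource is actually being started. Like so:
dummy_start() {
    dummy_monitor
    if [ $? = $OCF_SUCCESS ]; then
        # already running, nothing to do
        return $OCF_SUCCESS
    fi
    /usr/local/bin/failover.sh
    touch ${OCF_RESKEY_state}
}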
After that has been done, replace all instances of Dummy and dummy with your name of choice:
sed -i 's/Dummy/FailOverScript/g' /usr/lib/ocf/resource.d/heartbeat/FailOverScript
sed -i 's/dummy/failoverscript/g' /usr/lib/ocf/resource.d/heartbeat/FailOverScript
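To check that the rename caught everything, you can count the lines that still mention the old name. This is a small sketch; count_leftovers is a made-up helper name.

```shell
#!/bin/sh
# Sketch: count lines that still mention the old name after the rename.
# count_leftovers is a hypothetical helper name.
count_leftovers() {
    # grep -c prints the number of matching lines, -i ignores case.
    # grep exits 1 when nothing matches, so mask that for callers.
    grep -ci 'dummy' "$1" || true
}

# Usage:
# count_leftovers /usr/lib/ocf/resource.d/heartbeat/FailOverScript
```

A result of 0 means no line contains Dummy or dummy anymore.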
Test the script using the ocf-tester program to see if you made any mistakes:
ocf-tester -n resourcename /usr/lib/ocf/resource.d/heartbeat/FailOverScript
Output:
Beginning tests for /usr/lib/ocf/resource.d/heartbeat/FailOverScript...
/usr/sbin/ocf-tester: 214: /usr/sbin/ocf-tester: xmllint: not found
* rc=127: Your agent produces meta-data which does not conform to ra-api-1.dtd
* Your agent does not support the notify action (optional)
* Your agent does not support the demote action (optional)
* Your agent does not support the promote action (optional)
* Your agent does not support master/slave (optional)
Tests failed: /usr/lib/ocf/resource.d/heartbeat/FailOverScript failed 1 tests
Oops. Seems we need xmllint. On Ubuntu, install it:
apt-get install libxml2-utils
Test again and you'll see it pass:
Beginning tests for /usr/lib/ocf/resource.d/heartbeat/FailOverScript...
* Your agent does not support the notify action (optional)
* Your agent does not support the demote action (optional)
* Your agent does not support the promote action (optional)
* Your agent does not support master/slave (optional)
/usr/lib/ocf/resource.d/heartbeat/FailOverScript passed all tests
As an extra test, to see if the script you've created is correctly executed, you can do a test start of the resource:
export OCF_ROOT=/usr/lib/ocf
bash -x /usr/lib/ocf/resource.d/heartbeat/FailOverScript start
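If you want to go one step further, the manual test can walk through the whole life cycle and print the exit code of each action. The sketch below assumes OCF_ROOT is exported as shown above; run_action and exercise_agent are made-up helper names. In OCF terms, exit code 0 means success and 7 means not running.

```shell
#!/bin/sh
# Sketch: run an OCF agent by hand through start, monitor and stop and
# print each exit code. run_action and exercise_agent are hypothetical
# helper names. OCF exit codes: 0 = success (running), 7 = not running.
run_action() {
    agent="$1"
    action="$2"
    rc=0
    "$agent" "$action" || rc=$?
    echo "$action rc=$rc"
}

exercise_agent() {
    for action in start monitor stop monitor; do
        run_action "$1" "$action"
    done
}

# Usage:
# export OCF_ROOT=/usr/lib/ocf
# exercise_agent /usr/lib/ocf/resource.d/heartbeat/FailOverScript
```

A healthy agent should report 0 for start, monitor and stop, and 7 for the final monitor. Keep in mind that the start action also runs your failover script, so any alerts it sends will go out during this test.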
To use this resource, add it like so:
crm configure primitive script ocf:heartbeat:FailOverScript op monitor interval="30"
If you want to test it, you can for example let the script send you an email. Put a node in standby and see if you get an email.
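The standby test could look roughly like this with crmsh. It is a sketch: the DRY_RUN switch, the helper names and the node name node1 are assumptions made for this example. With DRY_RUN=1, which is the default here, the commands are only printed.

```shell
#!/bin/sh
# Sketch: move resources off a node and bring it back with crmsh.
# DRY_RUN, run and failover_test are assumptions made for this example;
# with DRY_RUN=1 (the default here) the commands are only printed.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

failover_test() {
    node="$1"
    run crm node standby "$node"   # resources move to another node
    run crm_mon -1                 # one-shot cluster status
    run crm node online "$node"    # let the node host resources again
}

# Usage (node1 is a placeholder for one of your node names):
# DRY_RUN=0 failover_test node1
```

Run it on the node that currently holds the resource, so the script fires on the node that takes over.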
Tags: cluster , corosync , crm , heartbeat , high-availability , network , pacemaker , tutorials