Question

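I am trying to configure Pacemaker cluster event notification through an external agent so that I receive a notification when a failover happens. I searched and found the links below: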

https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/Configuring_the_Red_Hat_High_Availability_Add-On_with_Pacemaker/s1-eventnotification-HAAR.html

http://floriancrouzat.net/2013/01/monitor-a-pacemaker-cluster-with-ocfpacemakerclustermon-andor-external-agent/

But I don't understand how this is done. Could you explain it step by step?

Thank you,
Ranjan.

Answer 1

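The RedHat documentation is terse, but Florian's blog post is very detailed, and the references at the end are helpful.

The question as asked is a bit vague, so I am answering what I think you are asking.

Briefly, to summarize Florian's post: ocf:pacemaker:ClusterMon is a resource agent that runs crm_mon behind the scenes.

The documentation for the resource on my system (SLES 11 SP3) says: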

# crm ra info ocf:pacemaker:ClusterMon
Runs crm_mon in the background, recording the cluster status to an HTML file (ocf:pacemaker:ClusterMon)

This is a ClusterMon Resource Agent.
It outputs current cluster status to the html.

Parameters (* denotes required, [] the default):

user (string, [root]): 
    The user we want to run crm_mon as

update (integer, [15]): Update interval
    How frequently should we update the cluster status

extra_options (string): Extra options
    Additional options to pass to crm_mon.  Eg. -n -r

pidfile (string, [/tmp/ClusterMon_undef.pid]): PID file
    PID file location to ensure only one instance is running

htmlfile (string, [/tmp/ClusterMon_undef.html]): HTML output
    Location to write HTML output to.

Operations' defaults (advisory minimum):

    start         timeout=20
    stop          timeout=20
    monitor       timeout=20 interval=10

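However, the real power is in extra_options, because it lets you have the resource agent tell crm_mon what to do with the results. Specifically, extra_options is passed verbatim to crm_mon as command-line options.

As Florian mentions, more recent versions of crm_mon (which is what actually does the work) do not come with built-in SMTP (email) or SNMP support. crm_mon does, however, still support an external agent (via the -E switch).

So, to understand what extra_options does, you should consult man crm_mon.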

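In the RedHat documentation you linked to, the first "extra_options" value, -T pacemaker@example.com -F pacemaker@nodeX.example.com -P PACEMAKER -H mail.example.com, tells crm_mon to send email to pacemaker@example.com, from pacemaker@nodeX.example.com, with the subject prefix PACEMAKER, via the mail host (SMTP server) mail.example.com.

The second "extra_options" example value in that documentation, -S snmphost.example.com -C public, tells crm_mon to send SNMP traps to snmphost.example.com using the community named public.

The third "extra_options" example value is -E /usr/local/bin/example.sh -e 192.168.12.1. This tells crm_mon to run the external program /usr/local/bin/example.sh. It also specifies the "external recipient", which is really just placed into an environment variable, CRM_notify_recipient, that is exported before the script is spawned.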

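When running an external agent, crm_mon calls the supplied script for every cluster event (including successful monitor operations!). The script inherits a set of environment variables that tell you what happened.

From http://clusterlabs.org/doc/en-US/Pacemaker/1.1/html/Pacemaker_Explained/s-notification-external.html the environment variables that are set are: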

CRM_notify_recipient    The static external-recipient from the resource definition.
CRM_notify_node The node on which the status change happened.
CRM_notify_rsc  The name of the resource that changed the status.
CRM_notify_task The operation that caused the status change.
CRM_notify_desc The textual output relevant error code of the operation (if any) that caused the status change.
CRM_notify_rc   The return code of the operation.
CRM_notify_target_rc    The expected return code of the operation.
CRM_notify_status   The numerical representation of the status of the operation.
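
To see what these variables contain on your own cluster, a small logging agent can help. The sketch below is not taken from Florian's post or the Pacemaker documentation: the function name format_event and the log path /var/log/crm_notify.log are assumptions chosen for illustration. It formats the variables listed above into a single line and uses empty defaults so that any variable crm_mon leaves unset simply prints as blank.

```shell
#!/bin/bash
# Hypothetical helper: print one log line built from the CRM_notify_*
# variables that crm_mon exports before running the external agent.
format_event() {
    printf 'node=%s rsc=%s task=%s rc=%s target_rc=%s status=%s desc=%s recipient=%s\n' \
        "${CRM_notify_node:-}" "${CRM_notify_rsc:-}" "${CRM_notify_task:-}" \
        "${CRM_notify_rc:-}" "${CRM_notify_target_rc:-}" "${CRM_notify_status:-}" \
        "${CRM_notify_desc:-}" "${CRM_notify_recipient:-}"
}

# When used as the -E script, append each event to a log file.
# The path is an assumption; pick one the crm_mon user can write to:
# format_event >> /var/log/crm_notify.log
```

Because the agent only reads its environment, you can also exercise it without a cluster by exporting a few of these variables by hand in a shell and calling format_event.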

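Your script's job is to consume these environment variables and do something sensible with them. What counts as "sensible" depends on your environment.

The SNMP trap example in Florian's blog assumes you are familiar with SNMP traps. If you are not, that is a whole different question and outside the scope of the resource agent.

The SNMP trap example does provide a good conditional statement that picks out unsuccessful monitor events, plus any event that is not a monitor event.

A scaffold for a monitoring script that does something with the available information is really just a stripped-down version of the SNMP trap shell script referenced in Florian's blog post. It looks like: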

#!/bin/bash

# if [[ unsuccessful monitor operation ]] or [[ not monitor op ]]
if [[ ${CRM_notify_rc} != 0 && ${CRM_notify_task} == "monitor" ]] || \
   [[ ${CRM_notify_task} != "monitor" ]] ; then

    # Do whatever you want with the information available in the
    # environment variables mentioned above that will do something
    # meaningful for you.

    # EG: Fire off an email attempting to be human readable
    # SUBJ="${CRM_notify_task} ${CRM_notify_desc} for ${CRM_notify_rsc} "
    # SUBJ="$SUBJ on ${CRM_notify_node}"
    # MSG="The ${CRM_notify_task} operation for ${CRM_notify_rsc} on "
    # MSG="$MSG ${CRM_notify_node} exited with status ${CRM_notify_rc} "
    # MSG="$MSG (${CRM_notify_desc}) and we expected ${CRM_notify_target_rc}"
    # echo "$MSG" | mail -s "$SUBJ" you@host.com


fi
exit 0
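
The same filter can be pulled into a function so it is easy to test outside the cluster. This is a minimal sketch, and the name should_notify is hypothetical. It mirrors the condition in the scaffold above: it compares the return code against 0, not against CRM_notify_target_rc.

```shell
#!/bin/bash
# Hypothetical helper: return 0 (true) when the event deserves a
# notification, i.e. any operation other than "monitor", or a
# "monitor" whose return code is not 0.
should_notify() {
    if [ "${CRM_notify_task:-}" != "monitor" ]; then
        return 0    # start, stop, promote, ... always notify
    fi
    # monitor operation: notify only on a non-zero return code
    [ "${CRM_notify_rc:-}" != "0" ]
}
```

The body of the scaffold would then sit behind `if should_notify; then ... fi`, and the function can be checked by hand by exporting CRM_notify_task and CRM_notify_rc with different values.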

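However, if you follow Florian's advice and clone the resource, the script will run on every node. For SNMP traps that is perfectly fine. But if the script does something like sending email, you may not want to clone it.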
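
If you do want to keep the clone, one possible way to cut down on duplicate mail is to have each node act only on events attributed to itself. The sketch below is an assumption on my part, not something from Florian's post or the RedHat documentation: the name is_local_event is hypothetical, and it assumes the cluster node name matches the output of uname -n. Events attributed to a node that is no longer running would not be reported by any node under this scheme, so treat it as a starting point only.

```shell
#!/bin/bash
# Hypothetical guard: succeed only when the event's node matches this
# node's name. $1 overrides the local name (handy for testing); by
# default it comes from `uname -n`, which is assumed to match the
# cluster node name.
is_local_event() {
    local_name="${1:-$(uname -n)}"
    [ -n "${CRM_notify_node:-}" ] && [ "${CRM_notify_node}" = "$local_name" ]
}
```

In an agent script this would wrap the mail command, for example `if is_local_event; then ... fi`.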

Answer 2

On both nodes:

cat << 'EOL'>/usr/local/bin/crm_e-mail.sh
#!/bin/bash
echo "Please check your installation @ http://domain.com.com:2224 & http://domain.com.com/clustermon.html" | mail -s "Cluster Change Detected" sysalert@domain.com
EOL
chmod 700 /usr/local/bin/crm_e-mail.sh
chown root:root /usr/local/bin/crm_e-mail.sh

On one node:

pcs resource create ClusterMon-SMTP ClusterMon user=root \
update=10 extra_options="-E /usr/local/bin/crm_e-mail.sh --watch-fencing" \
pidfile=/var/run/crm_mon-smtp.pid clone
