When you first experiment with Heartbeat, you may run into problems that are hard to diagnose. Several utilities let you take a closer look at Heartbeat's internal processes.
To check the current state of your cluster, use the program crm_mon. This displays the current DC as well as all of the nodes and resources that are known to the current node.
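For scripted checks, it can help to capture the status once rather than watch it interactively. The following is a minimal sketch, not a definitive recipe. It assumes that crm_mon accepts -1 to print a single snapshot and that its output contains a line beginning with "Current DC:". Both may differ between Heartbeat versions. The name current_dc is a hypothetical helper, not part of Heartbeat.

```shell
# Hypothetical helper: read crm_mon text on stdin and print what follows
# "Current DC:". Assumes a line of the form "Current DC: <node> ...".
current_dc() {
    sed -n 's/^Current DC:[[:space:]]*//p'
}

# Usage on a cluster node (assumes crm_mon supports -1 for one-shot output):
#   crm_mon -1 | current_dc
```

If the helper prints nothing, either the output format differs from the assumption or no DC is listed in the output.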
If nodes are missing, the connection between your nodes is probably broken. Most often, this is the result of a badly configured firewall. A broken connection may also be the reason for a split brain condition, where the cluster is partitioned.
Use the command crm_resource -L to learn about your current resources.
Try to run the resource agent manually. With LSB, just run scriptname start and scriptname stop. To check an OCF script, set the needed environment variables first. For example, when testing the IPaddr OCF script, you have to set the value for the variable ip by setting an environment variable that prefixes the name of the variable with OCF_RESKEY_. For this example, run the command:
export OCF_RESKEY_ip=<your_ip_address>
/usr/lib/ocf/resource.d/heartbeat/IPaddr validate-all
/usr/lib/ocf/resource.d/heartbeat/IPaddr start
/usr/lib/ocf/resource.d/heartbeat/IPaddr stop
If this fails, it is very likely that you missed some mandatory variable or just mistyped a parameter.
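The manual test described above can be wrapped in a small helper that sets the OCF_RESKEY_ variables and stops at the first failing action. This is a sketch under assumptions: run_ocf_actions is a hypothetical name, and the agent is assumed to return exit code 0 on success. Some agents may need further environment settings that are not covered here.

```shell
# Hypothetical helper: run an OCF resource agent through validate-all,
# start, and stop, and stop at the first action that fails.
# Arguments: <agent_path> [name=value ...]
run_ocf_actions() {
    agent=$1
    shift
    (
        # Run in a subshell so the exported variables do not leak out.
        # Each name=value pair becomes OCF_RESKEY_<name>=<value>.
        for pair in "$@"; do
            name=${pair%%=*}
            value=${pair#*=}
            export "OCF_RESKEY_${name}=${value}"
        done
        for action in validate-all start stop; do
            rc=0
            "$agent" "$action" || rc=$?
            echo "$action: exit code $rc"
            # Assumption: the agent exits with 0 on success.
            [ "$rc" -eq 0 ] || exit "$rc"
        done
    )
}

# Usage, with the IPaddr example from the text:
#   run_ocf_actions /usr/lib/ocf/resource.d/heartbeat/IPaddr ip=<your_ip_address>
```

The helper returns the exit code of the first failing action, which makes it easier to see whether the problem is in parameter validation or in the start or stop step.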
You may always add the -V parameter to your commands. If you do that multiple times, the debug output becomes very verbose.
If you know the IDs of your resources, which you can get with crm_resource -L, clean up a specific resource with crm_resource -C -r <resource_id>.
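When several resources need the same treatment, the command can be applied in a loop. The sketch below assumes only the crm_resource -C -r invocation given above. cleanup_resources is a hypothetical helper name, and the resource IDs are passed in by hand rather than parsed from crm_resource -L, because the format of that listing is not described here.

```shell
# Hypothetical helper: run "crm_resource -C -r <id>" for each ID given.
# Returns 1 if any call failed, 0 otherwise.
cleanup_resources() {
    failed=0
    for id in "$@"; do
        if crm_resource -C -r "$id"; then
            echo "cleaned up: $id"
        else
            echo "cleanup failed: $id" >&2
            failed=1
        fi
    done
    return "$failed"
}

# Usage (the IDs are placeholders):
#   cleanup_resources <resource_id_1> <resource_id_2>
```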
For additional information about high availability on Linux and Heartbeat, including configuring cluster resources and managing and customizing a Heartbeat cluster, see http://www.linux-ha.org.