Continuing to explore Consul for service discovery, I added some health checks to the Redis cluster from the previous post (World's quickest demo of consul).

As you can see below, if a health check is critical, the failing node is removed from the DNS listing for the service.

If a health check is only at warning, the node continues to be included in the DNS listing.

[Image: health-check]

The demo shows two nodes advertised under the hostname slave.redis.service.consul. When one of them fails a health check critically, it is removed from the DNS listing.
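To watch the DNS listing change yourself, something like the following can be used. It is only a sketch: consul_lookup is a hypothetical helper name, and it assumes a local agent answering DNS on Consul's default port 8600 and that dig is installed.

```shell
#!/bin/bash
# Hypothetical helper: print the addresses Consul currently returns for a name.
# Assumes a Consul agent serving DNS on 127.0.0.1:8600 (the default DNS port).
consul_lookup() {
  local name=$1
  local agent=${2:-127.0.0.1}
  local port=${3:-8600}
  dig +short @"$agent" -p "$port" "$name"
}
```

Running consul_lookup slave.redis.service.consul before and after a node goes critical should show that node's address disappearing from the answer.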

In the example above and the configuration below, there is a single check on the persistent disk:

  • disk at 98%+ fail critically
  • disk at 50%+ fail with warning
  • otherwise ok

Consul requires each script health check to report its state through its exit code: success (exit 0), warning (exit 1) or critical (any other exit code, such as 2).
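The exit code mapping can be written down as a small function. This is an illustrative sketch only; consul_state is a hypothetical name and is not part of Consul or of the demo.

```shell
#!/bin/bash
# Hypothetical helper: translate a check script's exit code into the
# state Consul would record for it.
consul_state() {
  case $1 in
    0) echo "passing" ;;
    1) echo "warning" ;;
    *) echo "critical" ;;   # any other exit code counts as critical
  esac
}
```

For example, after running a check script by hand, consul_state $? prints the state that exit code corresponds to.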

Here's the health check file used in the demo above:

{
  "check": {
    "id": "redis-disk-level",
    "name": "Persistent disk check",
    "notes": "Persistent disk 98%+ critical; 50%+ warning",
    "script": "/path/to/health_check disk",
    "interval": "20s"
  }
}
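Once the agent has loaded a check like this, its state can be inspected over the agent's HTTP API. The sketch below assumes the API is reachable on the default port 8500 and that curl is installed; checks_in_state is a hypothetical helper name.

```shell
#!/bin/bash
# Hypothetical helper: list the checks that are currently in a given state.
# Assumes a Consul agent with its HTTP API on 127.0.0.1:8500 (the default).
checks_in_state() {
  local state=${1:-critical}              # passing, warning, critical or any
  local agent=${2:-http://127.0.0.1:8500}
  curl -s "$agent/v1/health/state/$state"
}
```

Calling checks_in_state warning returns a JSON list of checks; the redis-disk-level check should appear there once the disk passes the warning threshold.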

The health check command (health_check disk) lives in its own bash script rather than being compacted into a hard-to-read one-liner:

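#!/bin/bash

# Consul reads the exit code: 0 = passing, 1 = warning, anything else = critical.
case $1 in
  disk)
    # Use% column for the persistent disk, with the trailing % stripped.
    # -P keeps each filesystem on a single line so the column position is stable.
    persistent_disk_level=$(df -P | grep /var/vcap/store | awk '{ print $5 }' | sed -e 's/%//')
    echo "Disk level $persistent_disk_level%"
    if [[ $persistent_disk_level -ge 98 ]]; then
      exit 2
    fi
    if [[ $persistent_disk_level -ge 50 ]]; then
      exit 1
    fi
    ;;
  *)
    echo "Usage: health_check {disk}"
    # An unknown check name must not be reported as passing.
    exit 2
    ;;
esac
exit 0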
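To try the thresholds without actually filling a disk, the comparison can be pulled out into a function that takes the percentage as an argument. This is a sketch and not part of the script above; disk_exit_code is a hypothetical name.

```shell
#!/bin/bash
# Hypothetical helper: print the exit code the disk check would use
# for a given usage percentage (98%+ critical, 50%+ warning, otherwise ok).
disk_exit_code() {
  local level=$1
  if [[ $level -ge 98 ]]; then
    echo 2
  elif [[ $level -ge 50 ]]; then
    echo 1
  else
    echo 0
  fi
}
```

The real script can be checked the same way by running /path/to/health_check disk by hand and then inspecting $? in the shell.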

BOSH for redis updated

The redis BOSH release now includes the health checks above and fixes the provisioning of a persistent disk for each node.

See the README for deploying redis on BOSH with consul enabled.