2011-11-02

Monitoring Faye shards with GOD

Faye is the kind of thing you want to configure, start and forget about, apart from the occasional update to the latest version.

That's what we did on our project: we keep developing business logic while Node runs pretty reliably. Still, Node can crash sometimes, for example: https://github.com/jcoglan/faye/issues/106 or https://github.com/jcoglan/faye/issues/37. Yes, those issues are fixed, but new ones can appear (after an update, for example). And what if we want to restart Faye when it uses more than X megabytes of memory? Or restart all shards at once?

So, to keep Faye working 24/7 I've set up God, a very nice tool for monitoring processes.
Note: I considered using monit, but chose God because of its Ruby syntax and nicer way of configuring everything.


This post is mostly about monitoring Faye shards distributed across many servers/ports, but you can also use the God configs (pasted below) to monitor a single instance.



I've talked a lot about sharding in my previous posts, but I completely forgot to say how to run many Faye instances on 1 server on different ports.

The author of Faye (James Coglan) wrote an example of how to run Faye on Node.js. Basically, it's all about writing a file with the following contents:
  var http = require('http'),
      faye = require('faye');
  var bayeux = new faye.NodeAdapter({mount: '/faye', timeout: 45});
  bayeux.listen(8000);

But this script always listens on a fixed port. That's easy to fix. First, install the optimist module (util is built into Node.js, so it needs no installation):
  npm install optimist
and modify your script:

  var http = require('http'),
      faye = require('faye'),
      argv = require("optimist").argv,
      util = require("util");
  var bayeux = new faye.NodeAdapter({mount: '/faye', timeout: 45});
  bayeux.listen(argv.port);
To start Faye, use the following command (let's name our script file faye.js):
  node faye.js --port 8000



Now you can start a few Faye shards on 1 server on different ports. But it's not very convenient to start/stop/restart many instances of the Node server manually. God can do it for you. All we need is 1 simple config file for it.
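
To make the manual approach concrete, here is a minimal Ruby sketch that prints the command you would have to run for each shard. The helper name faye_start_commands and the port list are assumptions made for illustration; only the "node faye.js --port ..." form comes from the script above.

```ruby
# Hypothetical helper: builds the shell command for each Faye shard.
# `script` and `ports` are illustrative values, not a real deployment.
def faye_start_commands(script, ports)
  ports.map { |port| "node #{script} --port #{port}" }
end

commands = faye_start_commands("faye.js", [8000, 8001, 8002])

# One command per shard; each would need its own terminal or background job.
puts commands
```

Every one of these processes would also have to be watched and restarted by hand, which is the part God takes over.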

Installing God
God is a gem, so it's very simple to install:
  gem install god
Check God's homepage for additional details, if needed.

Configuration (if you use the faye_shards gem and have a faye.yml config)
If you use the faye_shards gem and have faye.yml configured, all you need to do is adjust a few paths. Or you can create the same yml file just for God.

require 'yaml'

rails_env   = ENV['RAILS_ENV']  || "development"
rails_root  = ENV['RAILS_ROOT'] || "/var/.../app/current"
faye_script = ENV['FAYE_SCRIPT']  || "/var/.../faye/faye.js"

shards = YAML.load(File.read(rails_root.to_s + "/config/faye.yml"))[rails_env]['shards'] || []
# hostname of this server, compared against each shard's run_on below
hostname = `hostname`.strip

shards.reject!{ |shard_config| shard_config['run_on'] and shard_config['run_on'] != hostname }

shards.each_with_index do |shard, id|

  God.watch do |w|
    w.dir      = "#{rails_root}"
    w.name     = "node-#{id}"
    w.group    = 'node'
    w.interval = 5.seconds
    w.start    = "node #{faye_script} --port #{shard['port']}"
    w.log      = "#{rails_root}/log/god_node.log"

#    w.uid = 'user'
#    w.gid = 'user'

    # restart if memory usage is > 500mb
    w.transition(:up, :restart) do |on|
      on.condition(:memory_usage) do |c|
        c.above = 500.megabytes
        c.times = 2
      end
    end

    # determine the state on startup
    w.transition(:init, { true => :up, false => :start }) do |on|
      on.condition(:process_running) do |c|
        c.running = true
      end
    end

    # determine when process has finished starting
    w.transition([:start, :restart], :up) do |on|
      on.condition(:process_running) do |c|
        c.running = true
        c.interval = 10.seconds
      end

      # failsafe
      on.condition(:tries) do |c|
        c.times = 5
        c.transition = :start
        c.interval = 10.seconds
      end
    end

    # start if process is not running
    w.transition(:up, :start) do |on|
      on.condition(:process_running) do |c|
        c.running = false
      end
    end

  end
end

There is one more option in the yml file which is optional and not used by the faye_shards gem, but is used here: run_on. It's the hostname of the server where the shard should run. Let's say you have 2 servers running 2 instances of Faye each; the yml will look like:

production:
  shards:
    -
      local_host: 10.1.1.1
      port: 42000
      host: server1.myapp.com
      run_on: server1
    -
      local_host: 10.1.1.1
      port: 42001
      host: server1.myapp.com
      run_on: server1
    -
      local_host: 10.1.1.2
      port: 42000
      host: server2.myapp.com
      run_on: server2
    -
      local_host: 10.1.1.2
      port: 42001
      host: server2.myapp.com
      run_on: server2

NOTE: to find out the hostname of your server, just run the hostname command.
If you start God with this config on all your servers, each server will start only the shards configured for it. The run_on option can be omitted; a shard without it is started on every server where God runs with this config.
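
The run_on filtering can be tried outside God with a small Ruby sketch. The helper shards_for and the trimmed sample config (including a shard on port 42002 with no run_on) are hypothetical; the reject logic mirrors the God config above.

```ruby
require 'yaml'

# Sample config in the same layout as faye.yml, trimmed to the keys used here.
# The last shard has no run_on, so it should be kept on every host.
SAMPLE_YML = <<~YAML
  production:
    shards:
      - port: 42000
        run_on: server1
      - port: 42001
        run_on: server1
      - port: 42000
        run_on: server2
      - port: 42002
YAML

# Hypothetical helper: returns the shards that should run on `hostname`.
def shards_for(config, env, hostname)
  shards = config[env]['shards'] || []
  shards.reject { |shard| shard['run_on'] && shard['run_on'] != hostname }
end

config = YAML.load(SAMPLE_YML)
local  = shards_for(config, 'production', 'server2')

# Watch names are numbered after filtering, so every host starts at node-0.
watches = local.each_with_index.map { |shard, id| "node-#{id} -> port #{shard['port']}" }
puts watches
```

For server2 this prints "node-0 -> port 42000" and "node-1 -> port 42002". Because the index is taken after filtering, watch names such as node-0 refer to different shards on different servers.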

Configuration (if you don't use the faye_shards gem)
If your deployment setup is less flexible, you can, for example, hard-code the ports in the God script:

require 'yaml'

rails_env   = ENV['RAILS_ENV']  || "development"
rails_root  = ENV['RAILS_ROOT'] || "/var/.../app/current"
faye_script = ENV['FAYE_SCRIPT']  || "/var/.../faye/faye.js"

ports = [8000, 8001, 8002]

ports.each_with_index do |port, id|

  God.watch do |w|
    w.dir      = "#{rails_root}"
    w.name     = "node-#{id}"
    w.group    = 'node'
    w.interval = 5.seconds
    w.start    = "node #{faye_script} --port #{port}"
    w.log      = "#{rails_root}/log/god_node.log"

#    w.uid = 'user'
#    w.gid = 'user'

    # restart if memory usage is > 500mb
    w.transition(:up, :restart) do |on|
      on.condition(:memory_usage) do |c|
        c.above = 500.megabytes
        c.times = 2
      end
    end

    # determine the state on startup
    w.transition(:init, { true => :up, false => :start }) do |on|
      on.condition(:process_running) do |c|
        c.running = true
      end
    end

    # determine when process has finished starting
    w.transition([:start, :restart], :up) do |on|
      on.condition(:process_running) do |c|
        c.running = true
        c.interval = 10.seconds
      end

      # failsafe
      on.condition(:tries) do |c|
        c.times = 5
        c.transition = :start
        c.interval = 10.seconds
      end
    end

    # start if process is not running
    w.transition(:up, :start) do |on|
      on.condition(:process_running) do |c|
        c.running = false
      end
    end

  end
end
If you want to run Faye as a different user rather than the current one, uncomment the lines

#    w.uid = 'user'
#    w.gid = 'user'

but in this case you will have to start god with sudo.

Running god
So, we have a config which can start/stop Faye and make sure it's not using more than 500 MB of memory. Let's save it in a file called node.god. How to run it? Very simple:
  god -c node.god
This starts god, which will automatically start all the needed Node servers.
Note: passing the -D option runs god in the foreground (not daemonized), which can be helpful for seeing errors or warnings while you are tuning the config.
Note: don't forget to run it with nohup on the server.


Start/Stop God and Node
If the God daemon is running, you can send it some useful commands:

  • god stop node - stop all nodes
  • god stop node-0 - stop the first node
  • god start node - start all nodes
  • god restart node - restart all nodes
  • god terminate - kill god and all processes started by it
Type 'god -h' to see more...

That's it. Now you can sleep while your Faye is monitored by God.
