Statsd is a Node.js daemon from the folks at Etsy that makes stats aggregation easy. If you haven't come across statsd, I suggest you read the blog post from the Etsy devops team here.

I like the fact that they chose UDP to aggregate stats, since it's blazing fast: the client fires a packet and moves on without waiting for a reply. Graphite is the backend that stores the tracked data. Graphite receives data and creates a data point every minute.
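To make the UDP point concrete, here is a minimal sketch of what a client does to record a metric: build one small text datagram and send it without waiting for an answer. The helper names, the key names, and the 127.0.0.1:8125 default are assumptions for illustration; the `key:value|c` and `key:value|ms` layouts follow the statsd line format.

```python
import socket


def format_counter(key, delta=1):
    """Build a statsd counter datagram such as b'site.logins:1|c'."""
    return f"{key}:{delta}|c".encode("ascii")


def format_timer(key, millis):
    """Build a statsd timing datagram such as b'page.render:320|ms'."""
    return f"{key}:{millis}|ms".encode("ascii")


def send_metric(payload, host="127.0.0.1", port=8125):
    """Fire-and-forget: one UDP datagram, no connection and no reply expected."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        sock.sendto(payload, (host, port))
    finally:
        sock.close()
```

Calling `send_metric(format_counter("site.logins"))` returns as soon as the datagram is handed to the OS, which is why instrumenting a hot code path this way costs so little.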

Statsd accepts UDP packets, stores the data in memory, and flushes it to Graphite once every X seconds. This flush interval can be configured via the config file.
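The accept, aggregate, and flush cycle can be sketched in a few lines of Python. This is a toy model, not how the real daemon is written: the class name, the 10-second default, and the `stats.` prefix are assumptions, only counters are handled, and sample rates are ignored. The flushed lines use Graphite's plaintext format of `path value timestamp`.

```python
import time
from collections import defaultdict


class TinyAggregator:
    """Toy in-memory counter store that mimics the accept/aggregate/flush cycle."""

    def __init__(self, flush_interval=10):
        # Seconds between flushes (hypothetical default for this sketch).
        self.flush_interval = flush_interval
        self.counters = defaultdict(int)

    def handle_packet(self, packet):
        """Parse a datagram like b'site.logins:1|c' and add counters in memory."""
        key, rest = packet.decode("ascii").split(":", 1)
        value, metric_type = rest.split("|")[:2]
        if metric_type == "c":  # other metric types are skipped in this sketch
            self.counters[key] += int(value)

    def flush(self, now=None):
        """Return Graphite plaintext lines with per-second rates, then reset."""
        ts = int(time.time() if now is None else now)
        lines = [
            f"stats.{key} {count / self.flush_interval} {ts}"
            for key, count in sorted(self.counters.items())
        ]
        self.counters.clear()
        return lines
```

Because everything between flushes lives in a dictionary, Graphite sees one write per key per interval no matter how many packets arrived.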

I suggest you install node.js version 0.3.8, as other versions threw some errors and warnings regarding vm.

    $ node stats.js config.js

Once the statsd daemon is up and running, install Graphite and its dependencies. Installing Graphite is quite some work, as it depends on Django, pycairo and several other modules. Grig has written an excellent post, “Installing and configuring Graphite”. Instead of using Apache to serve Django, I chose gunicorn, which is pretty easy to set up.

    # run this command using runit or daemontools
    cd /opt/graphite/webapp/graphite && gunicorn_django -b

After you install Graphite, make sure to run syncdb and start the carbon-cache server so it begins collecting data.

Write to Statsd with Python

Now comes the interesting part: writing data into statsd. I used py-statsd, a Python statsd client available on GitHub.

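    from pystatsd import Client
    sc = Client("", 8125)
    # "site.logins" is an illustrative key name, not from the original setup
    sc.increment("site.logins")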

This code creates a connection object sc to statsd and increments a key. You can track quite a lot with the statsd package:

  • Track logins on the site
  • Track 500 rate on the site
  • Track user invite rate
  • Track emails sent out and many more
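As one way to wire up the items in this list, the sketch below counts failures with a decorator. It assumes only that the client object exposes `increment(key)`, as the py-statsd `Client` does; `track_errors`, `RecordingClient`, and the key names are hypothetical, and the stand-in client exists so the example runs without a daemon.

```python
import functools


def track_errors(client, key):
    """Decorator: bump `key` on `client` whenever the wrapped call raises."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            try:
                return func(*args, **kwargs)
            except Exception:
                client.increment(key)
                raise  # re-raise so normal error handling still runs
        return wrapper
    return decorator


class RecordingClient:
    """Stand-in with an increment(key) method, used instead of a real client."""

    def __init__(self):
        self.counts = {}

    def increment(self, key):
        self.counts[key] = self.counts.get(key, 0) + 1


client = RecordingClient()


@track_errors(client, "site.errors.500")
def render_page():
    raise RuntimeError("simulated server error")
```

Logins, invites, and sent emails need even less: a single `client.increment("site.logins")` at the point where the event happens.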

py-statsd also comes with a server which collects data just like statsd, but I use it only for development purposes. We run all our Graphite + statsd packages on an EC2 small instance. Pretty sweet.

Make sure you configure Graphite's retention so that it keeps about 6 months to 1 year of history.
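A quick back-of-the-envelope calculation shows what that retention costs. The sketch assumes one data point per minute and Whisper's 12 bytes per point (a 4-byte timestamp plus an 8-byte value), and it ignores file headers. The function names are hypothetical, and the `seconds_per_point:points` string is the legacy form of a `retentions` value in carbon's storage-schemas.conf.

```python
SECONDS_PER_POINT = 60  # one data point per minute
BYTES_PER_POINT = 12    # 4-byte timestamp + 8-byte value in a Whisper file


def retention_points(days, seconds_per_point=SECONDS_PER_POINT):
    """Number of data points needed to cover `days` of history."""
    return days * 24 * 60 * 60 // seconds_per_point


def legacy_retention(days, seconds_per_point=SECONDS_PER_POINT):
    """Retention string in the 'seconds_per_point:points' form."""
    return f"{seconds_per_point}:{retention_points(days, seconds_per_point)}"


def approx_bytes(days, seconds_per_point=SECONDS_PER_POINT):
    """Rough on-disk size per metric, ignoring Whisper headers."""
    return retention_points(days, seconds_per_point) * BYTES_PER_POINT
```

For one year at one-minute resolution this gives 525600 points, or roughly 6 MB per metric, so the number of keys you track matters far more than the retention length.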