Tracking with statsd
I like the fact that they chose UDP to send stats, since it's blazing fast. Graphite is used as the backend to store the tracked data. Graphite receives the data and creates a data point every minute.
Statsd is a node.js daemon that accepts UDP packets, stores the data in memory, and flushes it to graphite every X seconds. This flush interval (flushInterval) can be set in the config file.
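To make the aggregate-then-flush idea concrete, here is a toy sketch in Python. It is not statsd's actual code (statsd is written in node.js); the class name `ToyAggregator` and its methods are made up for illustration, and it only understands counter packets of the form `key:value|c`.

```python
from collections import defaultdict


class ToyAggregator:
    """Toy model of aggregate-then-flush. Hypothetical, not statsd's implementation."""

    def __init__(self):
        # Running totals per key, held only in memory between flushes.
        self.counters = defaultdict(float)

    def handle_packet(self, packet: bytes) -> None:
        # A counter packet looks like b"some.key:1|c".
        key, rest = packet.decode("ascii").split(":", 1)
        value, metric_type = rest.split("|")[:2]
        # Only counters are handled; sample rates are ignored in this toy.
        if metric_type == "c":
            self.counters[key] += float(value)

    def flush(self) -> dict:
        # Hand back the totals and start the next interval from zero.
        snapshot = dict(self.counters)
        self.counters.clear()
        return snapshot


agg = ToyAggregator()
for packet in (b"stats.logins:1|c", b"stats.logins:1|c", b"stats.users.count:-1|c"):
    agg.handle_packet(packet)
print(agg.flush())  # {'stats.logins': 2.0, 'stats.users.count': -1.0}
```

In the real daemon a timer would call something like `flush()` once per flush interval and forward the totals to graphite.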
I suggest installing node.js version 0.3.8, as other versions threw errors and warnings regarding vm.
$ node stats.js config.js
Once the statsd daemon is up and running, install graphite and its dependencies. Installing graphite is quite some work, as it depends on django, pycairo and several other modules. Grig has written an excellent post on “Installing and configuring Graphite”. Instead of using Apache to serve django, I chose gunicorn, which is pretty easy to set up.
# run this command using runit or daemontools
cd /opt/graphite/webapp/graphite && gunicorn_django -b 0.0.0.0:8000
After you install graphite, make sure to run syncdb and start the carbon-cache server so that it begins collecting data.
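One way to check that carbon-cache is really collecting data is to push a single test point at it by hand. The sketch below assumes carbon's plaintext listener is on its usual default port 2003 on the local machine (your carbon.conf may differ); the function names and the metric path `test.carbon.check` are made up for illustration.

```python
import socket
import time


def graphite_line(path, value, timestamp=None):
    # Carbon's plaintext protocol: "<metric path> <value> <unix timestamp>\n"
    ts = int(time.time()) if timestamp is None else int(timestamp)
    return f"{path} {value} {ts}\n"


def send_to_carbon(path, value, host="127.0.0.1", port=2003, timestamp=None):
    # host and port are assumptions; adjust them to match your carbon.conf.
    line = graphite_line(path, value, timestamp)
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall(line.encode("ascii"))
    return line
```

Calling `send_to_carbon("test.carbon.check", 1)` and then looking for that metric in the graphite web UI should show whether the whole pipeline is wired up.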
Write to Statsd with python
Now comes the interesting part: writing data into statsd. I used py-statsd, a python statsd client available on github.
from pystatsd import Client

sc = Client("graphite_host.yashh.com", 8125)  # host and UDP port of the statsd daemon
sc.increment("stats.blog_post.track-with-statsd")  # add 1 to this counter
sc.decrement("stats.users.count")  # subtract 1 from this counter
This code creates a client object sc that talks to statsd, increments one key, and decrements another. You can do quite a lot of tracking with the statsd package:
- Track logins on the site
- Track 500 rate on the site
- Track user invite rate
- Track emails sent out and many more
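Under the hood, a counter update is just a tiny UDP datagram of the form `key:delta|c`, so the items above can be sketched without any client library. This is an illustration, not py-statsd's code: the key names in `EVENT_KEYS`, the helper names, and the default host and port are all assumptions.

```python
import socket

# Hypothetical key names, one per item in the list above.
EVENT_KEYS = {
    "login": "stats.site.logins",
    "http_500": "stats.site.http_500",
    "invite": "stats.invites.sent",
    "email": "stats.emails.sent",
}


def format_counter(key, delta=1):
    # statsd counter wire format: "<key>:<delta>|c"
    return f"{key}:{delta}|c".encode("ascii")


def send_counter(key, delta=1, host="127.0.0.1", port=8125):
    # Fire-and-forget: UDP gives no delivery guarantee and no reply.
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        sock.sendto(format_counter(key, delta), (host, port))
    finally:
        sock.close()


def track(event, host="127.0.0.1", port=8125):
    # Bump the counter that corresponds to a named site event.
    send_counter(EVENT_KEYS[event], 1, host, port)
```

A login view would then call `track("login")`, and a 500 handler `track("http_500")`. Because nothing waits for a reply, the extra cost per request is small.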
py-statsd also comes with a server that collects data just like statsd, but I use it only for development purposes. We run all our graphite + statsd packages on an EC2 small instance. Pretty sweet.
Make sure you configure graphite to retain data for about 6 months to 1 year.
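Retention is set in carbon's storage-schemas.conf, and the exact syntax depends on your graphite version. As a rough sizing sketch, the arithmetic below works out how many one-minute points those retention periods need. The 12 bytes per point figure is my understanding of whisper's on-disk format (a 4-byte timestamp plus an 8-byte value), and it ignores the small file header, so treat the sizes as estimates.

```python
SECONDS_PER_DAY = 24 * 60 * 60
POINT_SIZE_BYTES = 12  # assumed whisper point size: 4-byte timestamp + 8-byte double


def points_needed(seconds_per_point, retention_days):
    # Number of data points needed to cover the retention period.
    return retention_days * SECONDS_PER_DAY // seconds_per_point


def approx_bytes(points):
    # Approximate disk space per metric, ignoring the file header.
    return points * POINT_SIZE_BYTES


six_months = points_needed(60, 180)  # 259200 points
one_year = points_needed(60, 365)    # 525600 points
print(six_months, approx_bytes(six_months))  # 259200 3110400
print(one_year, approx_bytes(one_year))      # 525600 6307200
```

So a year of one-minute data comes to roughly 6 MB per metric, which is worth keeping in mind when you track many keys on a small instance.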