2.11 Graph Your Data: RRDtool

Many kinds of data in the world around you can be easily measured over time, for example changes in temperature, or the amount of data sent or received by your computer's network interface. RRDtool can help you store and visualize such data in detailed and customizable graphs.

RRDtool is available for most UNIX platforms and Linux distributions. openSUSE® ships RRDtool as well. Install it either with YaST or by entering

zypper install rrdtool on the command line as root.

HINT: There are Perl, Python, Ruby, and PHP bindings available for RRDtool, so that you can write your own monitoring scripts in your preferred scripting language.

2.11.1 How RRDtool Works

RRDtool is short for Round Robin Database tool. Round Robin is a method for working with a constant amount of data. It uses the principle of a circular buffer, where the data row being read has neither an end nor a beginning: when the last slot has been written, the next value overwrites the oldest one. RRDtool uses Round Robin Databases to store and read its data.
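The circular buffer principle can be illustrated with a few lines of shell. This is only a sketch of the idea, not of how RRDtool stores its data internally; the buffer size of 3 and the sample values 10 to 50 are made up for the demonstration.

```shell
# Sketch of a circular buffer with SIZE slots (hypothetical demo values).
SIZE=3      # number of slots in the buffer
count=0     # number of values written so far

for value in 10 20 30 40 50; do
    slot=$(( count % SIZE ))     # wrap around after the last slot
    eval "slot_$slot=\$value"    # overwrite whatever the slot held before
    count=$(( count + 1 ))
done

# The buffer never grows: 40 and 50 have replaced 10 and 20.
echo "$slot_0 $slot_1 $slot_2"   # prints: 40 50 30
```

The amount of stored data stays constant no matter how many values are written, which is the property a Round Robin Database relies on.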

As mentioned above, RRDtool is designed to work with data that change over time. The ideal case is a sensor which repeatedly reads measured data (like temperature or speed) at constant intervals, and then exports them in a given format. Such data are perfectly ready for RRDtool, and it is easy to process them and create the desired output.

Sometimes it is not possible to obtain the data automatically and regularly. In that case, the data need to be pre-processed before they are supplied to RRDtool, and often you even need to run RRDtool manually.

The following is a simple example of basic RRDtool usage. It illustrates all three important phases of the usual RRDtool workflow: creating a database, updating measured values, and viewing the output.

2.11.2 Simple Real Life Example

Suppose we want to collect and view information about the memory usage in the Linux system as it changes over time. To make the example more vivid, we measure the currently free memory for a period of 40 seconds in 4-second intervals. During the measurement, three memory-hungry applications were started and closed: the Firefox Web browser, the Evolution e-mail client, and the Eclipse development framework.

Collecting Data

RRDtool is very often used to measure and visualize network traffic. In such a case, the Simple Network Management Protocol (SNMP) is used. This protocol can query network devices for relevant values of their internal counters. These are exactly the values to be stored with RRDtool. For more information on SNMP, see http://www.net-snmp.org/.

Our situation is different - we need to obtain the data manually. A helper script free_mem.sh repeatedly reads the current state of free memory and writes it to the standard output.

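tux@mercury:~> cat free_mem.sh
# Print one "rrdtool update" command every INTERVAL seconds, ten times.
INTERVAL=4
for steps in $(seq 1 10)
do
    # Current time as Unix time (seconds since 1970-01-01 UTC)
    DATE=$(date +%s)
    # Fourth field of the "Mem" line of free -b: free memory in bytes
    FREEMEM=$(free -b | grep "Mem" | awk '{ print $4 }')
    sleep $INTERVAL
    echo "rrdtool update free_mem.rrd $DATE:$FREEMEM"
done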

Points to Notice

  • The time interval is set to 4 seconds, and is implemented with the sleep command.

  • RRDtool accepts time information in a special format - so-called Unix time. It is defined as the number of seconds since midnight of January 1, 1970 (UTC). For example, 1272907114 represents 2010-05-03 17:18:34 (UTC).

  • The free memory information is reported in bytes with free -b. Prefer to supply basic units (bytes) instead of multiple units (like kilobytes).

  • The line with the echo ... command prints a complete rrdtool update command line. It contains the future name of the database file (free_mem.rrd) followed by the time stamp and the measured value, separated by a colon.
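The two building blocks of the script can be tried in isolation. The following sketch assumes GNU date (for the -d @... syntax); the sample "Mem:" line is hypothetical and only mimics the column layout that the script expects from free -b (label, total, used, free).

```shell
# Convert the Unix time from the example back to a readable date.
# Assumption: GNU date, where -d @N reads N as Unix time and -u prints UTC.
HUMAN=$(date -u -d @1272907114 '+%Y-%m-%d %H:%M:%S')
echo "$HUMAN"          # prints: 2010-05-03 17:18:34

# Extract the fourth field from a made-up "Mem:" line, using the same
# grep and awk pipeline as free_mem.sh.
SAMPLE='Mem: 2000000000 800000000 1200000000'
FREE_BYTES=$(echo "$SAMPLE" | grep "Mem" | awk '{ print $4 }')
echo "$FREE_BYTES"     # prints: 1200000000
```

On a real system, check that the fourth column of your free version is indeed the free memory before relying on the field number.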

After running free_mem.sh, you see an output similar to this:

tux@mercury:~> sh free_mem.sh
rrdtool update free_mem.rrd 1272974835:1182994432
rrdtool update free_mem.rrd 1272974839:1162817536
rrdtool update free_mem.rrd 1272974843:1096269824
rrdtool update free_mem.rrd 1272974847:1034219520
rrdtool update free_mem.rrd 1272974851:909438976
rrdtool update free_mem.rrd 1272974855:832454656
rrdtool update free_mem.rrd 1272974859:829120512
rrdtool update free_mem.rrd 1272974863:1180377088
rrdtool update free_mem.rrd 1272974867:1179369472
rrdtool update free_mem.rrd 1272974871:1181806592

It is convenient to redirect the command's output to a file with

sh free_mem.sh > free_mem_updates.log

to ease its future execution.

Creating Database

Create the initial Round Robin database for our example with the following command:

rrdtool create free_mem.rrd --start 1272974834 --step=4 \
DS:memory:GAUGE:600:U:U RRA:AVERAGE:0.5:1:24

Points to Notice

  • This command creates a file called free_mem.rrd for storing our measured values in a Round Robin type database.

  • The --start option specifies the time (in Unix time) when the first value will be added to the database. In this example, it is one less than the first time value of the free_mem.sh output (1272974835).

  • The --step option specifies the time interval in seconds at which the measured data will be supplied to the database.

  • The DS:memory:GAUGE:600:U:U part introduces a new data source for the database. It is called memory, its type is gauge, the maximum time between two updates (the heartbeat) is 600 seconds, and the minimal and maximal values in the measured range are unknown (U).

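  • RRA:AVERAGE:0.5:1:24 creates a Round Robin archive (RRA) whose stored data are processed with the consolidation function (CF) that calculates the average of data points. Three arguments of the consolidation function are appended to the end of the line: the fraction of a consolidation interval that may consist of unknown data (0.5), the number of primary data points that are consolidated into one stored value (1), and the number of values kept in the archive (24).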
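As a quick sanity check, the time span that this archive can hold follows from the step and the last two RRA arguments. The variable names below are illustrative only; the numbers are taken from the rrdtool create command above.

```shell
# Values from the rrdtool create command line.
STEP=4            # --step: seconds per primary data point
STEPS_PER_ROW=1   # RRA: primary data points consolidated into one row
ROWS=24           # RRA: number of rows kept in the archive

# Each row covers STEP * STEPS_PER_ROW seconds, and there are ROWS rows.
SPAN=$(( STEP * STEPS_PER_ROW * ROWS ))
echo "archive covers $SPAN seconds"   # prints: archive covers 96 seconds
```

The 40 seconds measured in this example therefore fit into the archive with room to spare. Once more than 96 seconds of data have been stored, the oldest rows are overwritten.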

If no error message is displayed, then the free_mem.rrd database is created in the current directory:

tux@mercury:~> ls -l free_mem.rrd
-rw-r--r-- 1 tux users 776 May  5 12:50 free_mem.rrd

Updating Database Values

After the database is created, you need to fill it with the measured data. In Collecting Data, we already prepared the file free_mem_updates.log which consists of rrdtool update commands. These commands do the update of database values for us.

tux@mercury:~> sh free_mem_updates.log; ls -l free_mem.rrd
-rw-r--r--  1 tux users  776 May  5 13:29 free_mem.rrd

As you can see, the size of free_mem.rrd remained the same even after updating its data, because a Round Robin database reserves all of its storage when it is created.

Viewing Measured Values

We have already measured the values, created the database, and stored the measured values in it. Now we can play with the database, and retrieve or view its values.

To retrieve all the values from our database, enter the following on the command line:

tux@mercury:~> rrdtool fetch free_mem.rrd AVERAGE --start 1272974830 \
--end 1272974871
          memory
1272974832: nan
1272974836: 1.1729059840e+09
1272974840: 1.1461806080e+09
1272974844: 1.0807572480e+09
1272974848: 1.0030243840e+09
1272974852: 8.9019289600e+08
1272974856: 8.3162112000e+08
1272974860: 9.1693465600e+08
1272974864: 1.1801251840e+09
1272974868: 1.1799787520e+09
1272974872: nan

Points to Notice

  • AVERAGE fetches average value points from the database. Only one archive is defined (Creating Database), and it uses AVERAGE processing, so no other function is available.

  • The first line of the output prints the name of the data source as defined in Creating Database.

  • The left results column represents individual points in time, while the right one represents corresponding measured average values in scientific notation.

  • The nan in the first and last lines stands for not a number: no value is stored for these points in time.
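The fetched values differ from the raw readings because the updates arrived at times that are not multiples of the 4-second step. The numbers are consistent with a time-weighted average over each step: a reading supplied at a given time is taken to apply to the interval since the previous update. The following sketch reproduces two rows of the output from the raw readings in free_mem_updates.log; the variable names are illustrative.

```shell
# Raw readings in bytes, from the rrdtool update lines shown earlier.
B=1162817536   # supplied at 1272974839, applies to 1272974835-1272974839
C=1096269824   # supplied at 1272974843, applies to 1272974839-1272974843
D=1034219520   # supplied at 1272974847, applies to 1272974843-1272974847

# Row 1272974840 covers 1272974836-1272974840: 3 seconds of B, 1 of C.
ROW_40=$(( (3 * B + 1 * C) / 4 ))
# Row 1272974844 covers 1272974840-1272974844: 3 seconds of C, 1 of D.
ROW_44=$(( (3 * C + 1 * D) / 4 ))

echo "$ROW_40"   # prints: 1146180608 (1.1461806080e+09 in the fetch output)
echo "$ROW_44"   # prints: 1080757248 (1.0807572480e+09 in the fetch output)
```

If the updates had been supplied exactly on the step boundaries, the stored values would equal the raw readings.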

Now a graph representing the values stored in the database is drawn:

tux@mercury:~> rrdtool graph free_mem.png \
--start 1272974830 \
--end 1272974871 \
--step=4 \
DEF:free_memory=free_mem.rrd:memory:AVERAGE \
LINE2:free_memory#FF0000 \
--vertical-label "GB" \
--title "Free System Memory in Time" \
--zoom 1.5 \
--x-grid SECOND:1:SECOND:4:SECOND:10:0:%X

Points to Notice

  • free_mem.png is the file name of the graph to be created.

  • --start and --end limit the time range within which the graph will be drawn.

  • --step specifies the time resolution (in seconds) of the graph.

  • The DEF:... part is a data definition called free_memory. Its data are read from the free_mem.rrd database and its data source called memory. The average value points are calculated, because no others were defined in Creating Database.

  • The LINE... part specifies properties of the line to be drawn into the graph. It is 2 pixels wide, its data come from the free_memory definition, and its color is red.

  • --vertical-label sets the label to be printed along the y axis, and --title sets the main label for the whole graph.

  • --zoom specifies the zoom factor for the graph. This value must be greater than zero.

  • --x-grid specifies how to draw grid lines and their labels into the graph. Our example places minor grid lines every second and major grid lines every 4 seconds. Labels are placed every 10 seconds under the major grid lines.

Figure 2-2 Example Graph Created with RRDtool

2.11.3 For More Information

RRDtool is a very complex tool with a lot of sub-commands and command line options. Some of them are easy to understand, but you have to really study RRDtool to make it produce the results you want and fine-tune them according to your liking.

Apart from RRDtool's man page (man 1 rrdtool), which gives you only basic information, you should have a look at the RRDtool home page. It provides detailed documentation of the rrdtool command and all its sub-commands, as well as several tutorials to help you understand the common RRDtool workflow.

If you are interested in monitoring network traffic, have a look at MRTG. It stands for Multi Router Traffic Grapher and can graph the activity of all sorts of network devices. It can easily make use of RRDtool.