Profiling & Benchmarking

Why profile?

Naturally, you should aim for efficiency by design and by writing tight code. Profiling locates the sections of code that are called most often or consume the most CPU time, so that optimisation effort can be focused on those routines for the best return on the time invested. If you have never profiled code before, it is worth getting acquainted with gprof, a typical userspace command-line profiler.

Just as it is poor practice to rely on a debugger to catch bad code, it is poor practice to rely on a profiler to catch inefficient code.

Basic profiling

The kernel has some built-in profiling functionality. If you add profile=1 to your kernel command line arguments, the kernel provides a file, /proc/profile, which readprofile can use to print profiling information to standard output. Example output from readprofile:

$ readprofile
     2  stext                   0.0500
514867  default_idle        12871.6750
     1  copy_thread             0.0071
     1  restore_sigcontext      0.0030
     2  system_call             0.0312
     2  handle_IRQ_event        0.0227
    13  do_page_fault           0.0111
     1  schedule                0.0012
     1  wake_up_process         0.0132
     1  copy_mm                 0.0014
-------8---Snip---8-----------------

The first column contains the number of clock ticks spent in the function named in the second column, while the third column gives the normalised load: the number of clock ticks divided by the length of the function.
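As a worked check of the third column, the sketch below recomputes the normalised load for default_idle from the sample output. The function length of 40 is an assumption, inferred by dividing the tick count by the reported load; normalised_load is a hypothetical helper name and not part of readprofile.

```shell
# Hypothetical helper: normalised load = clock ticks / function length.
# Usage: normalised_load TICKS LENGTH
normalised_load() {
    awk -v ticks="$1" -v len="$2" 'BEGIN { printf "%.4f\n", ticks / len }'
}

# default_idle from the sample output; the length of 40 is inferred,
# not reported by readprofile itself.
normalised_load 514867 40    # prints 12871.6750
```

The same arithmetic explains why a short function with few ticks (stext, 2 ticks) can show a higher load than a longer function with more ticks (do_page_fault, 13 ticks).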

See man readprofile for details and useful examples. readprofile is usually part of the util-linux package. Sometimes readprofile is kept in /usr/sbin, so you may need to add that directory to your PATH or symlink it into a directory that is in your PATH, e.g. /usr/bin.
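One pattern that tends to be useful is sorting the output so that the functions with the most clock ticks appear first. The following is a minimal sketch; top_functions is a hypothetical helper name, and the default of 10 lines is an arbitrary choice.

```shell
# Hypothetical helper: read readprofile-style output on stdin and print
# the N lines with the highest tick counts (default 10, chosen arbitrarily).
top_functions() {
    # sort -n compares the leading tick count numerically; -r puts the
    # largest counts first.
    sort -nr | head -n "${1:-10}"
}
```

For example, readprofile | top_functions 5 would list the five busiest entries. On a mostly idle system, default_idle is likely to dominate the listing, as it does in the sample output above.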

Note that if you want to carry out profiling on a remote machine, you will need to copy System.map and vmlinux from the top-level source directory where the running kernel was compiled across to /usr/src/linux on that machine.
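If the map file ends up somewhere other than /usr/src/linux, readprofile can be pointed at it with its -m option. The sketch below wraps that in a hypothetical helper, profile_with_map, which falls back to /usr/src/linux/System.map and only checks that the file is readable before running readprofile. It does not verify that the map actually matches the running kernel.

```shell
# Hypothetical helper: run readprofile against an explicit System.map.
# Usage: profile_with_map [PATH_TO_SYSTEM_MAP]
profile_with_map() {
    # Fall back to the location mentioned in the text when no path is given.
    map="${1:-/usr/src/linux/System.map}"

    if [ ! -r "$map" ]; then
        echo "cannot read map file: $map" >&2
        return 1
    fi

    # -m tells readprofile which map file to use for symbol names.
    readprofile -m "$map"
}
```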

For more information on benchmarking, see the benchmarking-HOWTO.

Your turn

Using the information in this chapter, profile your system while it carries out an intensive task such as:

  • Running find on the root directory to locate a particular file. Compare with running updatedb followed by locate.

  • Opening multiple heavyweight applications, e.g. mozilla, OpenOffice and nautilus. Use top to make sure you have driven your system into swapping out to disk.

  • Playing a high resolution mpeg video using your favourite media player.

  • Compiling a small application.

  • Compiling a kernel.

  • If possible, simulating a heavy network load.

The idea here is to get an understanding of the places that the kernel spends most of its time during everyday operations.
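One possible way to run each of the tasks above is sketched below: reset the profiling counters, run the workload, then list the busiest entries. profile_task is a hypothetical helper name, and the cut-off of 20 lines is arbitrary. Resetting the counters with readprofile -r normally needs root.

```shell
# Hypothetical helper: profile one workload with readprofile.
# Usage: profile_task COMMAND [ARGS...]
profile_task() {
    # Reset the kernel's profiling counters so that earlier activity is
    # not counted; this normally needs root.
    readprofile -r || return 1

    # Run the workload; report a failure but still show the profile.
    "$@" || echo "workload exited with status $?" >&2

    # Print the 20 entries with the most clock ticks (20 is arbitrary).
    readprofile | sort -nr | head -n 20
}
```

For the first task this could be invoked as profile_task find / -name some-file, where some-file is a placeholder for whatever file you are looking for.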

Now, using lxr (or some other source code navigation tool), locate the most referenced functions, get familiar with them and try to understand what each function does.