Introduction to Dynamic Tracing


Introduction

Having recently completed some work in the DTrace and ProbeVue environments, I was feeling fairly high on the subject. You know, the kind of high where you come to think of it as a part of the family, and if only your real family members could live up to its wonderful goodness. So it came as a bit of a surprise to have a senior systems administrator ask me why he should take the time to learn and use these tools. I was floored that such a brilliant system administrator[1] would ask such a question.

This paper was written as a response to him and to other system administrators out there who question why these tools are growing in popularity and why we all need to learn them. Not long ago, I was one of those administrators.

Before Dynamic Tracing

To answer the question at hand it is best to first describe the world before dynamic tracing. In that world one had to rely upon a number of different system tools that reported statistics maintained for various kernel subsystems, drivers, and events. Each pool of information had its own tool, with its own options and semantics. To move to another set of statistics, or sometimes even to dig deeper in the same area of analysis, one had to change tools to get the new information. Finding statistical relationships between two or more data points typically required running two tools simultaneously and then integrating the results in a spreadsheet or by some similar method.

When the administrator had tools that could dig deeper, they tended to be significantly more painful in terms of performance impact because they lacked precision. As tools gathered more detailed, lower-level information from the system, the amount of data they returned was usually orders of magnitude larger. This meant that the tool generated more system load as it collected more data, and the administrator had to sift through a needle stack[2] to find the needle they were looking for.

Diving deep into a system problem or application performance issue was frequently a task that was delegated to another group. The OS vendor had to work through low-level system problems, and the application vendor / group had to work through performance profiling or system problems. These were the darker corners of the administrator's job. The relationship of the systems admin with these groups was frequently an organizationally weak one, even though they directly interfaced at a programmatic level.

Enter Dynamic Tracing

With the advent of dynamic tracing, a single mechanism has been introduced that allows one tool to draw on multiple sources of statistics, and that provides the means to describe exactly (and only) what data to gather, how to manipulate it, and how to present it to the user. Now the administrator can define how wide to cast his net for data, and then how to represent that data.

Additionally, the administrator can dig deeper into the areas that were once relegated to other groups. The admin equipped with a dynamic tracing tool can talk to the developer in terms of APIs rather than just CPU, memory, and I/O. This requires a more knowledgeable systems person, but it is ultimately more fitting of the word "systems", as the administrator is the representative of the entire Unix system to other groups in the IT organization.

Now, armed with a dynamic tracing tool the administrator can communicate to the developer where the application is misbehaving or performing poorly. As a representative of the system, the administrator can now speak to the developer in a language that is more on their territory. For example, the administrator can tell the developer that excessive calls to fork() / exec() are hindering performance rather than simply talking about the resulting system load and poor performance that it causes.

Understanding the Dynamic Tracing Facility

A dynamic tracing environment is best defined[3] by two measures: the number of providers, and the features in the language to parse the data from those providers.

Providers are groupings of probe points around a subsystem or common type of probes. Providers define numerous probe points that are potential events for the tool to watch. Once a probe has been defined for a probe point the dynamic tracing tool will fire the probe whenever that event happens in the system. For example, the syscall provider defines event probe points for the entry into a kernel system call and another for the exit from that call. The dynamic tracing environment allows the administrator to collect information from these events and then present that data however they wish.
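The relationship between providers, probe points, and enabled probes can be modelled in a few lines. The sketch below is plain Python, not D or Vue, and every name in it (Tracer, publish, enable, fire, the "read:entry" naming) is a hypothetical stand-in chosen for illustration. It only shows the idea that a provider publishes many potential probe points and that nothing happens at a point until a probe has been defined for it.

```python
from collections import defaultdict


class Tracer:
    """Toy registry: providers publish probe points; enabled probes run on fire()."""

    def __init__(self):
        self.points = defaultdict(set)    # provider -> published probe point names
        self.enabled = defaultdict(list)  # (provider, point) -> handlers

    def publish(self, provider, point):
        # A provider advertises a location or event that *could* be watched.
        self.points[provider].add(point)

    def enable(self, provider, point, handler):
        # Defining a probe is only legal for a published probe point.
        if point not in self.points[provider]:
            raise KeyError(f"no probe point {provider}:{point}")
        self.enabled[(provider, point)].append(handler)

    def fire(self, provider, point, **data):
        # Called when the event happens; points with no probe cost almost nothing.
        for handler in self.enabled.get((provider, point), []):
            handler(data)


tracer = Tracer()
for name in ("read", "write"):
    for phase in ("entry", "return"):
        tracer.publish("syscall", f"{name}:{phase}")

seen = []
tracer.enable("syscall", "read:entry", seen.append)

tracer.fire("syscall", "read:entry", pid=42)   # enabled: handler runs
tracer.fire("syscall", "write:entry", pid=42)  # published but not enabled: ignored

print(len(tracer.points["syscall"]), seen)
```

In this model the syscall provider offers four probe points, but only the one that was explicitly enabled collects anything.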

When a syscall probe fires some of the relevant information that can be gathered is the input values to the system call, the return value, the errno value, what PID / UID made the system call, how long the call took, or simply how many times the call was made.
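To make that list concrete, here is a rough Python sketch of what a tool does with entry and return events: it pairs them up per process and per call, then derives the call count, the elapsed time, and the number of failed calls. The event records and their field layout are invented for the example; a real tracing tool gets this data from the kernel, not from a list.

```python
from collections import defaultdict

# Hypothetical records: (timestamp_ns, pid, syscall, phase, return_value)
EVENTS = [
    (1000, 101, "read", "entry", None),
    (1400, 101, "read", "return", 512),
    (2000, 202, "open", "entry", None),
    (2100, 202, "open", "return", -1),
    (3000, 101, "read", "entry", None),
    (3900, 101, "read", "return", 0),
]


def summarize(events):
    """Pair entry/return events per (pid, syscall); total calls, time, failures."""
    started = {}                  # (pid, syscall) -> entry timestamp
    calls = defaultdict(int)      # (pid, syscall) -> completed calls
    elapsed = defaultdict(int)    # (pid, syscall) -> total nanoseconds in the call
    failures = defaultdict(int)   # (pid, syscall) -> calls with a negative return
    for ts, pid, name, phase, ret in events:
        key = (pid, name)
        if phase == "entry":
            started[key] = ts
        elif key in started:      # ignore a return whose entry was never seen
            calls[key] += 1
            elapsed[key] += ts - started.pop(key)
            if ret is not None and ret < 0:
                failures[key] += 1
    return dict(calls), dict(elapsed), dict(failures)


calls, elapsed, failures = summarize(EVENTS)
for key in sorted(calls):
    print(key, calls[key], elapsed[key], failures.get(key, 0))
```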

The language of the tracing environment allows you to explicitly define probes for each probe point that can fire when those events happen. I use the word "explicitly" for a reason: when we are looking at a system that is firing thousands of system calls a second, we only want to observe a specific few of those calls. In short, we want to capture just our needles in the needle stack.
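The filtering idea can be sketched as a condition attached to a probe, so that the action runs only for the events we care about. This is again a Python model rather than tracing-language code, and make_clause, the event dictionaries, and the UID value are all hypothetical.

```python
def make_clause(predicate, action):
    """Return a handler that runs action only when predicate(event) is true."""
    def fire(event):
        if predicate(event):
            action(event)
            return True
        return False
    return fire


captured = []

# Hypothetical clause: watch only open() calls made by UID 1001.
clause = make_clause(
    lambda e: e["syscall"] == "open" and e["uid"] == 1001,
    captured.append,
)

stream = [
    {"syscall": "read", "uid": 1001, "pid": 10},
    {"syscall": "open", "uid": 0, "pid": 11},
    {"syscall": "open", "uid": 1001, "pid": 12},
    {"syscall": "write", "uid": 1001, "pid": 10},
]

fired = sum(clause(e) for e in stream)
print(fired, [e["pid"] for e in captured])
```

Four events pass by, but only the one that satisfies the condition is recorded; the rest are discarded at the point of collection rather than sifted out afterwards.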

As the events happen and the defined probe fires, a small piece of user-defined code can act on information related to the probe. The power of the dynamic tracing language defines what you can do with the data you collect. It may be as simple as counting the events or as complex as doing a statistical analysis of all the events.
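The two ends of that range, a simple count and a distribution, can be illustrated with a short Python sketch. The latency values are made up, and pow2_bucket is a hypothetical helper that groups values into power-of-two buckets, one common way of summarizing a large number of events without keeping every one of them.

```python
from collections import Counter


def pow2_bucket(value):
    """Largest power of two that is <= value (0 for values below 1)."""
    if value < 1:
        return 0
    return 1 << (int(value).bit_length() - 1)


# Hypothetical per-call latencies in microseconds.
latencies_us = [3, 5, 9, 17, 18, 130, 4, 6]

count = len(latencies_us)                                   # simplest summary
histogram = Counter(pow2_bucket(v) for v in latencies_us)   # distribution

print("calls:", count)
for bucket in sorted(histogram):
    print(f"{bucket:>4} | {'#' * histogram[bucket]}")
```

The count says how often the event happened; the histogram shows at a glance that most calls were quick and that one outlier took far longer than the rest.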

The first and most obvious thing about the tracing languages is that while they look a lot like C, they typically represent a subset of the features of the C language. Instead of defining a procedural flow of an application like C, the tracing languages define blocks of limited, yet very C-like code that execute based upon events that are watched outside the dynamic tracing environment.

Philosophical Approaches to the Tracing Environment

When talking about various Unices the traditional method of categorizing the flavors is either "BSD" or "Sys V". Years of Unix inbreeding and the rising popularity of GNU/Linux systems have blurred many of the finer distinctions between these camps. I tend to classify Unices by the philosophical approach of the administrative interface.

Some Unices tend to rely more on collections of smaller tools with the expectation that the user will be required to pull these tools together to provide higher level interfaces. These systems tend to have very open file formats, and even open source code. Into this camp fall Solaris, BSD, and some of the Linux distributions.

The next group of Unices tends to provide a rich set of management tools that present a more singular interface into the system. Behind this interface hides much of the complexity of these Unix systems. The administrator is not expected to make extensive modifications to these environments, and those who do so are rare.

Solaris

The dynamic tracing environment was first introduced to the Unix world on Solaris where administrators are quite comfortable modifying the administrative environment. The dynamic tracing environment represents a powerful extension of a behavior that many Solaris administrators already exhibit. For this reason, the Solaris dynamic tracing tool DTrace quickly became popular in the Solaris community.

Another design consideration of the DTrace environment is its central focus in the Solaris environment. It was backed by a strong marketing effort by Sun in Solaris 10 (along with ZFS and Zones) that launched it into the view of administrators everywhere. Because it was so important and so high-profile, it achieved a critical mass that promoted it even more. DTrace is now commonly used amongst the Sun, reseller, and end-user communities. The common usage only serves to make its adoption stronger.

Sun has taken a philosophical approach to D that lets it be used for all metrics. The idea is that many of the existing / traditional tools can be re-written in D.[4] While this seems kind of odd on the surface, the extremely positive side effect is that this intended use requires lots of providers and probe points.

AIX

The AIX response (to the Solaris 10 feature set) was to provide some very similar tools that would present the same capabilities. AIX has had a rich set of tools that do low level tracing for years, but in AIX 6.1 IBM provided an equivalent dynamic tracing environment called ProbeVue based upon a clean-room implementation of DTrace.

Those who have used both DTrace and ProbeVue will share the obvious question: "Why didn't IBM just license DTrace?" The answer eludes me, but it may have something to do with the recent legal experience IBM had in tangling with open source. My guess is that they have erected stringent legal boundaries that keep these things pure. As D reflects its engineering and marketing lineage, so Vue reflects its legal and marketing lineage.

The good news is that IBM has a strong development effort behind Vue and has a roadmap of rich features that should put Vue on a nearly level playing field with D before AIX 7(.1) ships.

Linux

I will not speak at length about the Linux dynamic tracing tool called SystemTap as I have only a limited familiarity with it. SystemTap is most similar to the other tools in concept, and the least similar in implementation. The language of SystemTap retains not only the C flavor but also many of the C constructs, such as functions and flow control, that neither D nor Vue provides.

Ironically, SystemTap is the only tool that provides the capability to compile scripts into binaries. The end result, and where the irony enters, is that the normally open Linux environment is the only system that allows you to obfuscate your code in an "unreadable" format.

In the next section I (briefly) question the use of complex flow control structures in SystemTap. I am not a "fan" of this design, but the beauty and strength of open source design is that if something does not work, then it will be abandoned and replaced with something that does. The free market of engineering solutions will tell us if the "full" language implementation of SystemTap is a good idea.

Directly comparing the products

Because of Sun's jump on IBM in this area, the DTrace tool is significantly more mature than that from IBM. Sun's holistic approach of D-for-everything has only served to accelerate the adoption of D both internally and externally. The result is a commanding lead for Sun in the race between the two tools.

In this respect, Solaris represents true innovation in the Unix space, and this innovation is providing the lion's share of the value that Sun provides in their systems today.

Neither Vue nor D has a commanding lead over the other in the context of the language. They tend to be more similar than different. DTrace has associative arrays and a strong set of aggregating functions that Vue does not have. This does represent a lead for D, but not one that will last, as IBM has committed to equivalent functionality in later TLs for 6.1. SystemTap represents a greater departure from the group in language features, but the wisdom of placing complex programming structures in an intentionally lightweight tool is yet to be seen (by me). Both Vue and D have minor language / implementation annoyances, but the community has provided methods and examples to overcome most of these issues.

Generally, DTrace is a much more effective tool than ProbeVue (and SystemTap) because of the immense number of providers (and probe points) it has. With a fairly level playing field in language, the sheer volume of data available from the Solaris providers places DTrace in a class of its own.

The current design of DTrace allows the administrator to introduce dynamic tracing into the problem determination process early and continue to use it to dive deep in the system. IBM's ProbeVue, on the other hand, should be considered in the more traditional "one tool amongst many" sense, as IBM has a long pre-existing environment of these tools, where ProbeVue is just a recent addition.

When to Use Tracing Tools

The more mature versions of these tools allow you to introduce them early into the problem determination process and then continue to use them as you dig deeper into the system. The flexibility of the tools allows you to find one specific statistic that is relevant, then add new probes that dig deeper while looking for correlations between the statistics. As the process takes you deeper you can use the same tool to refine the location of the pain and simultaneously reduce the amount of data you collect.

The next approach to this question is to consider how deep you wish to dig into a problem. Many will be happy to determine a performance problem with an application and then "kick it to the application guys". Others of us may not have the luxury of pushing a performance problem to another group, or it may be necessary to come prepared with more information before you try to "kick it to the application guys".

Where to go from here

If you have never used a dynamic tracing tool but know C, I recommend reading some of the many documents on the implementation for your platform. There are plenty of resources on the Internet and virtually all of them are free. Sun has DTrace documentation as part of the Solaris 10 documentation set and lots of white papers and sample code on BigAdmin. IBM has equivalent resources in the Redbooks and developerWorks. Additionally, I have written documentation for both DTrace and ProbeVue and published it on tablespace.net. The most notable of these documents are the QuickSheets for both D and Vue.

If you are new to the C language and the Unix system call interface I recommend a good book on C and the Stevens book(s) on Unix system programming. A basic understanding of what the system is doing is important not only to understanding what you are seeing in your dynamic tracing tool, but also to understanding what the system is about in the first place.

Footnotes

1. I know he will eventually read this.
2. Everyone knows that finding a needle in a haystack is trivial work for those armed with a magnet. This task is more like finding a specific needle in a stack of needles.
3. Differentiated might be another term. These two items simultaneously define what a dynamic tracing environment can do, and the key differences between the tools available on popular operating systems today.
4. I do not know this to be fact; I just heard it third-hand. I do believe it to be a fact as it is clearly apparent in their implementation.
*. I would like to thank Amtrak for returning the time to my busy schedule to allow me to write this paper.

By: William Favorite <wfavorite@tablespace.net>
Version: 0.2.0
Date: 2/24/9