Tag: gdb (subscribe)

Hi there. Long time no write!

On Tuesday, February 23, 2021, I made an announcement at debian-devel-announce about a new service that I configured for Debian: a debuginfod server.

This post serves two purposes: pay the promise I made to Jonathan Carter that I would write a blog post about the service, and go into a bit more detail about it.

What's debuginfod?

From the announcement above:

debuginfod is a new-ish project whose purpose is to serve
ELF/DWARF/source-code information over HTTP.  It is developed under the
elfutils umbrella.  You can find more information about it here:

  https://sourceware.org/elfutils/Debuginfod.html

In a nutshell, by using a debuginfod service you will not need to
install debuginfo (a.k.a. dbgsym) files anymore; the symbols will be
served to GDB (or any other debuginfo consumer that supports debuginfod)
over the network.  Ultimately, this makes the debugging experience much
smoother (I myself never remember the full URL of our debuginfo
repository when I need it).

Perhaps not everybody knows this, but until last year I was a Debugger Engineer (a.k.a. GDB hacker) at Red Hat. I was not involved with the creation of debuginfod directly, but I witnessed discussions about "having way to serve debug symbols over the internet" multiple times during my tenure at the company. So this is not a new idea, and it's not even the first implementation, but it's the first time that some engineers actually got their hands dirty enough to have something concrete to show.

The idea to set up a debuginfod server for Debian started to brew after 2019's GNU Tools Cauldron, but as usual several things happened in $LIFE (including a global pandemic and leaving Red Hat and starting a completely different job at Canonical) which had the effect of shuffling my TODO list "a little".

Benefits for Debian

Debian unfortunately is lagging behind when it comes to offer its users a good debugging experience. Before the advent of our debuginfod server, if you wanted to debug a package in Debian you would need to:

  1. Add the debian-debug apt repository to your /etc/apt/sources.list.

  2. Install the dbgsym package that contains the debug symbols for the package you are debugging. Note that the version of the dbgsym package needs to be exactly the same as the version of the package you want to debug.

  3. Figure out which shared libraries your package uses and install the dbgsym packages for all of them. Arguably, this step is optional but recommended if you would like to perform a more in-depth debugging.

  4. Download the package source, possibly using apt source or some equivalent command.

  5. Open GDB, and make sure you adjust the source paths properly (more below). This can be non-trivial.

  6. Finally, debug the program.

Now, with the new service, you will be able to start from step 4, without having to mess with sources.list, dbgsym packages and version mismatches.

The package source

It is important to mention an existing (but perhaps not well-known) limitation of our debugging experience in Debian: the need to manually download the source packages and adjust GDB to properly find them (see step 4 above). debuginfod is able to serve source code as well, but our Debian instance is not doing that at the moment.

Debian does not provide a patched source tree that is ready to be consumed by GDB or debuginfod (for a good example of a distribution that does this, see Fedora's debugsource packages). Let me show you an example of debugging GDB itself (using debuginfod) on Debian:

$ HOME=/tmp DEBUGINFOD_URLS=https://debuginfod.debian.net gdb -q gdb
Reading symbols from gdb...
Downloading separate debug info for /tmp/gdb...
Reading symbols from /tmp/.cache/debuginfod_client/02046bac4352940d19d9164bab73b2f5cefc8c73/debuginfo...
(gdb) start
Temporary breakpoint 1 at 0xd18e0: file /build/gdb-Nav6Es/gdb-10.1/gdb/gdb.c, line 28.
Starting program: /usr/bin/gdb 
Downloading separate debug info for /lib/x86_64-linux-gnu/libreadline.so.8...
Downloading separate debug info for /lib/x86_64-linux-gnu/libz.so.1...
Downloading separate debug info for /lib/x86_64-linux-gnu/libncursesw.so.6...
Downloading separate debug info for /lib/x86_64-linux-gnu/libtinfo.so.6...
Downloading separate debug info for /tmp/.cache/debuginfod_client/d6920dbdd057f44edaf4c1fbce191b5854dfd9e6/debuginfo...
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Downloading separate debug info for /lib/x86_64-linux-gnu/libexpat.so.1...
Downloading separate debug info for /lib/x86_64-linux-gnu/liblzma.so.5...
Downloading separate debug info for /usr/lib/x86_64-linux-gnu/libbabeltrace.so.1...
Downloading separate debug info for /usr/lib/x86_64-linux-gnu/libbabeltrace-ctf.so.1...
Downloading separate debug info for /usr/lib/x86_64-linux-gnu/libipt.so.2...
Downloading separate debug info for /usr/lib/x86_64-linux-gnu/libmpfr.so.6...
Downloading separate debug info for /usr/lib/x86_64-linux-gnu/libsource-highlight.so.4...
Downloading separate debug info for /usr/lib/x86_64-linux-gnu/libxxhash.so.0...
Downloading separate debug info for /usr/lib/x86_64-linux-gnu/libdebuginfod.so.1...
Downloading separate debug info for /usr/lib/x86_64-linux-gnu/libstdc++.so.6...
Downloading separate debug info for /lib/x86_64-linux-gnu/libgcc_s.so.1...
Downloading separate debug info for /usr/lib/x86_64-linux-gnu/libglib-2.0.so.0...
Downloading separate debug info for /tmp/.cache/debuginfod_client/dbfea245d26065975b4084f4e9cd2d83c65973ee/debuginfo...
Downloading separate debug info for /usr/lib/x86_64-linux-gnu/libdw.so.1...
Downloading separate debug info for /usr/lib/x86_64-linux-gnu/libelf.so.1...
Downloading separate debug info for /usr/lib/x86_64-linux-gnu/libuuid.so.1...
Downloading separate debug info for /usr/lib/x86_64-linux-gnu/libgmp.so.10...
Downloading separate debug info for /usr/lib/x86_64-linux-gnu/libboost_regex.so.1.74.0...
Downloading separate debug info for /usr/lib/x86_64-linux-gnu/libcurl-gnutls.so.4...
Downloading separate debug info for /lib/x86_64-linux-gnu/libbz2.so.1.0...
Downloading separate debug info for /usr/lib/x86_64-linux-gnu/libicui18n.so.67...
Downloading separate debug info for /tmp/.cache/debuginfod_client/acaa831dbbc8aa70bb2131134e0c83206a0701f9/debuginfo...
Downloading separate debug info for /usr/lib/x86_64-linux-gnu/libicuuc.so.67...
Downloading separate debug info for /usr/lib/x86_64-linux-gnu/libnghttp2.so.14...
Downloading separate debug info for /usr/lib/x86_64-linux-gnu/libidn2.so.0...
Downloading separate debug info for /usr/lib/x86_64-linux-gnu/librtmp.so.1...
Downloading separate debug info for /usr/lib/x86_64-linux-gnu/libssh2.so.1...
Downloading separate debug info for /usr/lib/x86_64-linux-gnu/libpsl.so.5...
Downloading separate debug info for /usr/lib/x86_64-linux-gnu/libnettle.so.8...
Downloading separate debug info for /usr/lib/x86_64-linux-gnu/libgnutls.so.30...
Downloading separate debug info for /usr/lib/x86_64-linux-gnu/libldap_r-2.4.so.2...
Downloading separate debug info for /usr/lib/x86_64-linux-gnu/liblber-2.4.so.2...
Downloading separate debug info for /usr/lib/x86_64-linux-gnu/libbrotlidec.so.1...
Downloading separate debug info for /tmp/.cache/debuginfod_client/39739740c2f8a033de95c1c0b1eb8be445610b31/debuginfo...
Downloading separate debug info for /usr/lib/x86_64-linux-gnu/libunistring.so.2...
Downloading separate debug info for /usr/lib/x86_64-linux-gnu/libhogweed.so.6...
Downloading separate debug info for /usr/lib/x86_64-linux-gnu/libgcrypt.so.20...
Downloading separate debug info for /usr/lib/x86_64-linux-gnu/libp11-kit.so.0...
Downloading separate debug info for /usr/lib/x86_64-linux-gnu/libtasn1.so.6...
Downloading separate debug info for /lib/x86_64-linux-gnu/libcom_err.so.2...
Downloading separate debug info for /usr/lib/x86_64-linux-gnu/libsasl2.so.2...
Downloading separate debug info for /usr/lib/x86_64-linux-gnu/libbrotlicommon.so.1...
Downloading separate debug info for /lib/x86_64-linux-gnu/libgpg-error.so.0...
Downloading separate debug info for /usr/lib/x86_64-linux-gnu/libffi.so.7...
Downloading separate debug info for /lib/x86_64-linux-gnu/libkeyutils.so.1...

Temporary breakpoint 1, main (argc=1, argv=0x7fffffffebf8) at /build/gdb-Nav6Es/gdb-10.1/gdb/gdb.c:28
28      /build/gdb-Nav6Es/gdb-10.1/gdb/gdb.c: Directory not empty.
(gdb) list
23      in /build/gdb-Nav6Es/gdb-10.1/gdb/gdb.c
(gdb) 

(See all those Downloading separate debug info for... lines? Nice!)

As you can see, when we try to list the contents of the file we're in, nothing shows up. This happens because GDB doesn't know where the file is. So you have to tell it. In this case, it's relatively easy: you see that the GDB package's build directory is /build/gdb-Nav6Es/gdb-10.1/. When you apt source gdb, you will end up with a directory called $PWD/gdb-10.1/ containing the full source of the package. Notice that the last directory's name in both paths is the same, so in this case we can use GDB's set substitute-path command do the job for us (in this example $PWD is /tmp/):

$ HOME=/tmp DEBUGINFOD_URLS=https://debuginfod.debian.net gdb -q gdb
Reading symbols from gdb...
Reading symbols from /tmp/.cache/debuginfod_client/02046bac4352940d19d9164bab73b2f5cefc8c73/debuginfo...
(gdb) set substitute-path /build/gdb-Nav6Es/ /tmp/
(gdb) start
Temporary breakpoint 1 at 0xd18e0: file /build/gdb-Nav6Es/gdb-10.1/gdb/gdb.c, line 28.
Starting program: /usr/bin/gdb 
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".

Temporary breakpoint 1, main (argc=1, argv=0x7fffffffebf8) at /build/gdb-Nav6Es/gdb-10.1/gdb/gdb.c:28
warning: Source file is more recent than executable.
28        memset (&args, 0, sizeof args);
(gdb) list
23      int
24      main (int argc, char **argv)
25      {
26        struct captured_main_args args;
27
28        memset (&args, 0, sizeof args);
29        args.argc = argc;
30        args.argv = argv;
31        args.interpreter_p = INTERP_CONSOLE;
32        return gdb_main (&args);
(gdb)

Much better, huh? The problem is that this process is manual, and changes depending on how the package you're debugging was built.

What can we do to improve this? What I personally would like to see is something similar to what the Fedora project already does: create a new debug package which will contain the full, patched source package. This would mean changing our building infrastructure and possibly other somewhat complex things.

Using the service (by default)

At the time of this writing, I am working on an elfutils Merge Request whose purpose is to implement a debconf question to ask the user whether she wants to use our service by default.

If you would like to start using the service right now, all you have to do is set the following environment variable in your shell:

DEBUGINFOD_URLS="https://debuginfod.debian.net"

More information

You can find more information about our debuginfod service here. Try to keep an eye on the page as it's being constantly updated.

If you'd like to get in touch with me, my email is my domain at debian dot org.

I sincerely believe that this service is a step in the right direction, and hope that it can be useful to you :-).


Back in 2016, when life was simpler, a Fedora GDB user reported a bug (or a feature request, depending on how you interpret it) saying that GDB's gcore command did not respect the COREFILTER_ELF_HEADERS flag, which instructs it to dump memory pages containing ELF headers. As you may or may not remember, I have already written about the broader topic of revamping GDB's internal corefile dump algorithm; it's an interesting read and I recommend it if you don't know how Linux (or GDB) decides which mappings to dump to a corefile.

Anyway, even though the bug was interesting and had to do with a work I'd done before, I couldn't really work on it at the time, so I decided to put it in the TODO list. Of course, the "TODO list" is actually a crack where most things fall through and are usually never seen again, so I was blissfully ignoring this request because I had other major priorities to deal with. That is, until a seemingly unrelated problem forced me to face this once and for all!

What? A regression? Since when?

As the Fedora GDB maintainer, I'm routinely preparing new releases for Fedora Rawhide distribution, and sometimes for the stable versions of the distro as well. And I try to be very careful when dealing with new releases, because a regression introduced now can come and bite us (i.e., the Red Hat GDB team) back many years in the future, when it's sometimes too late or too difficult to fix things. So, a mandatory part of every release preparation is to actually run a regression test against the previous release, and make sure that everything is working correctly.

One of these days, some weeks ago, I had finished running the regression check for the release I was preparing when I noticed something strange: a specific, Fedora-only corefile test was FAILing. That's a no-no, so I started investigating and found that the underlying reason was that, when the corefile was being generated, the build-id note from the executable was not being copied over. Fedora GDB has a local patch whose job is to, given a corefile with a build-id note, locate the corresponding binary that generated it. Without the build-id note, no binary was being located.

Coincidentally or not, at the same I started noticing some users reporting very similar build-id issues on the freenode's #gdb channel, and I thought that this bug had a potential to become a big headache for us if nothing was done to fix it right now.

I asked for some help from the team, and we managed to discover that the problem was also happening with upstream gcore, and that it was probably something that binutils was doing, and not GDB. Hmm...

Ah, so it's ld's fault. Or is it?

So there I went, trying to confirm that it was binutils's fault, and not GDB's. Of course, if I could confirm this, then I could also tell the binutils guys to fix it, which meant less work for us :-).

With a lot of help from Keith Seitz, I was able to bisect the problem and found that it started with the following commit:

commit f6aec96dce1ddbd8961a3aa8a2925db2021719bb
Author: H.J. Lu <hjl.tools@gmail.com>
Date:   Tue Feb 27 11:34:20 2018 -0800

    ld: Add --enable-separate-code

This is a commit that touches the linker, which is part of binutils. So that means this is not GDB's problem, right?!? Hmm. No, unfortunately not.

What the commit above does is to simply enable the use of --enable-separate-code (or -z separate-code) by default when linking an ELF program on x86_64 (more on that later). On a first glance, this change should not impact the corefile generation, and indeed, if you tell the Linux kernel to generate a corefile (for example, by doing sleep 60 & and then hitting C-\), you will notice that the build-id note is included into it! So GDB was still a suspect here. The investigation needed to continue.

What's with -z separate-code?

The -z separate-code option makes the code segment in the ELF file to put in a completely separated segment than data segment. This was done to increase the security of generated binaries. Before it, everything (code and data) was put together in the same memory region. What this means in practice is that, before, you would see something like this when you examined /proc/PID/smaps:

00400000-00401000 r-xp 00000000 fc:01 798593                             /file
Size:                  4 kB
KernelPageSize:        4 kB
MMUPageSize:           4 kB
Rss:                   4 kB
Pss:                   4 kB
Shared_Clean:          0 kB
Shared_Dirty:          0 kB
Private_Clean:         0 kB
Private_Dirty:         4 kB
Referenced:            4 kB
Anonymous:             4 kB
LazyFree:              0 kB
AnonHugePages:         0 kB
ShmemPmdMapped:        0 kB
Shared_Hugetlb:        0 kB
Private_Hugetlb:       0 kB
Swap:                  0 kB
SwapPss:               0 kB
Locked:                0 kB
THPeligible:    0
VmFlags: rd ex mr mw me dw sd

And now, you will see two memory regions instead, like this:

00400000-00401000 r--p 00000000 fc:01 799548                             /file
Size:                  4 kB
KernelPageSize:        4 kB
MMUPageSize:           4 kB
Rss:                   4 kB
Pss:                   4 kB
Shared_Clean:          0 kB
Shared_Dirty:          0 kB
Private_Clean:         4 kB
Private_Dirty:         0 kB
Referenced:            4 kB
Anonymous:             0 kB
LazyFree:              0 kB
AnonHugePages:         0 kB
ShmemPmdMapped:        0 kB
Shared_Hugetlb:        0 kB
Private_Hugetlb:       0 kB
Swap:                  0 kB
SwapPss:               0 kB
Locked:                0 kB
THPeligible:    0
VmFlags: rd mr mw me dw sd
00401000-00402000 r-xp 00001000 fc:01 799548                             /file
Size:                  4 kB
KernelPageSize:        4 kB
MMUPageSize:           4 kB
Rss:                   4 kB
Pss:                   4 kB
Shared_Clean:          0 kB
Shared_Dirty:          0 kB
Private_Clean:         0 kB
Private_Dirty:         4 kB
Referenced:            4 kB
Anonymous:             4 kB
LazyFree:              0 kB
AnonHugePages:         0 kB
ShmemPmdMapped:        0 kB
Shared_Hugetlb:        0 kB
Private_Hugetlb:       0 kB
Swap:                  0 kB
SwapPss:               0 kB
Locked:                0 kB
THPeligible:    0
VmFlags: rd ex mr mw me dw sd

A few minor things have changed, but the most important of them is the fact that, before, the whole memory region had anonymous data in it, which means that it was considered an anonymous private mapping (anonymous because of the non-zero Anonymous amount of data; private because of the p in the r-xp permission bits). After -z separate-code was made default, the first memory mapping does not have Anonymous contents anymore, which means that it is now considered to be a file-backed private mapping instead.

GDB, corefile, and coredump_filter

It is important to mention that, unlike the Linux kernel, GDB doesn't have all of the necessary information readily available to decide the exact type of a memory mapping, so when I revamped this code back in 2015 I had to create some heuristics to try and determine this information. If you're curious, take a look at the linux-tdep.c file on GDB's source tree, specifically at the functions dump_mapping_p and linux_find_memory_regions_full.

When GDB is deciding which memory regions should be dumped into the corefile, it respects the value found at the /proc/PID/coredump_filter file. The default value for this file is 0x33, which, according to core(5), means:

Dump memory pages that are either anonymous private, anonymous
shared, ELF headers or HugeTLB.

GDB had the support implemented to dump almost all of these pages, except for the ELF headers variety. And, as you can probably infer, this means that, before the -z separate-code change, the very first memory mapping of the executable was being dumped, because it was marked as anonymous private. However, after the change, the first mapping (which contains only data, no code) wasn't being dumped anymore, because it was now considered by GDB to be a file-backed private mapping!

Finally, that is the reason for the difference between corefiles generated by GDB and Linux, and also the reason why the build-id note was not being included in the corefile anymore! You see, the first memory mapping contains not only the program's data, but also its ELF headers, which in turn contain the build-id information.

gcore, meet ELF headers

The solution was "simple": I needed to improve the current heuristics and teach GDB how to determine if a mapping contains an ELF header or not. For that, I chose to follow the Linux kernel's algorithm, which basically checks the first 4 bytes of the mapping and compares them against \177ELF, which is ELF's magic number. If the comparison succeeds, then we just assume we're dealing with a mapping that contains an ELF header and dump it.

In all fairness, Linux just dumps the first page (4K) of the mapping, in order to save space. It would be possible to make GDB do the same, but I chose the faster way and just dumped the whole mapping, which, in most scenarios, shouldn't be a big problem.

It's also interesting to mention that GDB will just perform this check if:

  • The heuristic has decided not to dump the mapping so far, and;
  • The mapping is private, and;
  • The mapping's offset is zero, and;
  • There is a request to dump mappings with ELF headers (i.e., coredump_filter).

Linux also makes these checks, by the way.

The patch, finally

I submitted the patch to the mailing list, and it was approved fairly quickly (with a few minor nits).

The reason I'm writing this blog post is because I'm very happy and proud with the whole process. It wasn't an easy task to investigate the underlying reason for the build-id failures, and it was interesting to come up with a solution that extended the work I did a few years ago. I was also able to close a few bug reports upstream, as well as the one reported against Fedora GDB.

The patch has been pushed, and is also present at the latest version of Fedora GDB for Rawhide. It wasn't possible to write a self-contained testcase for this problem, so I had to resort to using an external tool (eu-unstrip) in order to guarantee that the build-id note is correctly present in the corefile. But that's a small detail, of course.

Anyway, I hope this was an interesting (albeit large) read!


After spending the last weeks struggling with this, I decided to write a blog post. First, what is “this” that you are talking about? The answer is: Linux kernel's concept of memory mapping. I found it utterly confused, beyond my expectations, and so I believe that a blog post is the write way to (a) preserve and (b) share this knowledge. So, let's do it!

First things first

First, I cannot begin this post without a few acknowledgements and “thank you's”. The first goes to Oleg Nesterov (sorry, I could not find his website), a Linux kernel guru who really helped me a lot through the whole task. Another “thank you” goes to Jan Kratochvil, who also provided valuable feedback by commenting my GDB patch. Now, back to the point.

The task

The task was requested here: GDB needed to respect the /proc/<PID>/coredump_filter file when generating a coredump (i.e., when you use the gcore command).

Currently, GDB has his own coredump mechanism implemented which, despite its limitations and bugs, has been around for quite some time. However, and maybe you don't know that, but the Linux kernel has its own algorithm for generating the corefile of a process. And unfortunately, GDB and Linux were not really following the same standards here...

So, in the end, the task was about synchronizing GDB and Linux. To do that, I first had to decipher the contents of the /proc/<PID>/smaps file.

The /proc/<PID>/smaps file

This special file, generated by the Linux kernel when you read it, contains detailed information about each memory mapping of a certain process. Some of the fields on this file are documented in the proc(5) manpage, but others are missing there (asking for a patch!). Here is an explanation of everything I needed:

  • The first line of each memory mapping has the following format:

    The fields here are:

    a) address is the address range, in the process' address space, that the mapping occupies. This part was already treated by GDB, so I did not have to worry about it.

    b) perms is a set of permissions (r ead, w rite, e x ecute, s hared, p rivate [COW -- copy-on-write]) applied to the memory mapping. GDB was already dealing with rwx permissions, but I needed to include the p flag as well. I also made GDB ignore the mappings that did not have the r flag active, because it does not make sense to dump something that you cannot read.

    c) offset is the offset into the applied to the file, if the mapping is file-backed (see below). GDB already handled this correctly.

    d) dev is the device (major:minor) related to the file, if there is one. GDB already handled this correctly, though I was using this field for more things (continue reading).

    e) inode is the inode on the device above. The value of zero means that no inode is associated with the memory mapping. Nothing to do here.

    f) pathname is the file associate with this mapping, if there is one. This is one of the most important fields that I had to use, and one of the most complicated to understand completely. GDB now uses this to heuristically identify whether the mapping is anonymous or not.

  • GDB is now also interested in Anonymous: and AnonHugePages: fields from the smaps file. Those fields represent the content of anonymous data on the mapping; if GDB finds that this content is greater than zero, this means that the mapping is anonymous.

  • The last, but perhaps most important field, is the VmFlags: field. It contains a series of two-letter flags that provide very useful information about the mapping. A description of the fields is: a) sh: the mapping is shared (VM_SHARED) b) dd: this mapping should not be dumped in a corefile (VM_DONTDUMP) c) ht: this is HugeTLB mapping

With that in hands, the following task was to be able to determine whether a memory mapping is anonymous or file-backed, private or shared.

Types of memory mappings

There can be four types of memory mappings:

  1. Anonymous private mapping
  2. Anonymous shared mapping
  3. File-backed private mapping
  4. File-backed shared mapping

It should be possible to uniquely identify each mapping based on the information provided by the smaps file; however, you will see that this is not always the case. Below, I will explain how to determine each of the four characteristics that define a mapping.

Anonymous

A mapping is anonymous if one of these conditions apply:

  1. The pathname associated with it is either /dev/zero (deleted), /SYSV%08x (deleted), or <filename> (deleted) (see below).
  2. There is content in the Anonymous: or in the AnonHugePages: fields of the mapping in the smaps file.

A special explanation is needed for the <filename> (deleted) case. It is not always guaranteed that it identifies an anonymous mapping; in fact, it is possible to have the (deleted) part for file-backed mappings as well (say, when you are running a program that uses shared libraries, and those shared libraries have been removed because of an update, for example). However, we are trying to mimic the behavior of the Linux kernel here, which checks to see if a file has no hard links associated with it (and therefore is truly deleted).

Although it may be possible for the userspace to do an extensive check (by stat ing the file, for example), the Linux kernel certainly could give more information about this.

File-backed

A mapping is file-backed (i.e., not anonymous) if:

  1. The pathname associated with it contains a <filename>, without the (deleted) part.

As has been explained above, a mapping whose pathname contains the (deleted) string could still be file-backed, but we decide to consider it anonymous.

It is also worth mentioning that a mapping can be simultaneously anonymous and file-backed: this happens when the mapping contains a valid pathname (without the (deleted) part), but also contains Anonymous: or AnonHugePages: contents.

Private

A mapping is considered to be private (i.e., not shared) if:

  1. In the absence of the VmFlags field (in the smaps file), its permission field has the flag p.
  2. If the VmFlags field is present, then the mapping is private if we do not find the sh flag there.

Shared

A mapping is shared (i.e., not private) if:

  1. In the absence of VmFlags in the smaps file, the permission field of the mapping does not have the p flag. Not having this flag actually means VM_MAYSHARE and not necessarily VM_SHARED (which is what we want), but it is the best approximation we have.
  2. If the VmFlags field is present, then the mapping is shared if we find the sh flag there.

The patch

With all that in mind, I hacked GDB to improve the coredump mechanism for GNU/Linux operating systems. The main function which decides the memory mappings that will or will not be dumped on GNU/Linux is linux_find_memory_regions_full; the Linux kernel obviously uses its own function, vma_dump_size, to do the same thing.

Linux has one advantage: it is a kernel, and therefore has much more knowledge about processes' internals than a userspace program. For example, inside Linux it is trivial to check if a file marked as "(deleted)" in the output of the smaps file has no hard links associated with it (and therefore is not really deleted); the same operation on userspace, however, would require root access to inspect the contents of the /proc/<PID>/map_files/ directory.

The case described above, if you remember, is something that impacts the ability to tell whether a mapping is anonymous or not. I am talking to the Linux kernel guys to see if it is possible to export this information directly via the smaps file, instead of having to do the current heuristic.

While doing this work, some strange behaviors were found in the Linux kernel. Oleg is working on them, along with other Linux hackers. From our side, there is still room for improvement on this code. The first thing I can think of is to improve the heuristics for finding anonymous mappings. Another relatively easy thing to do would be to let the user specify a value for coredump_filter on the command line, without editing the /proc file. And of course, keep this code always updated with its counterpart in the Linux kernel.

Upstream discussions and commit

If you are interested, you can see the discussions that happened upstream by going to this link. This is the fourth (and final) submission of the patch; you should be able to find the other submissions in the archive.

The final commit can be found in the official repository.


It is really nice to see something you did in a project influence in future features and developments. I always feel happy and proud when I notice such scenarios happening, and this time was no different. Gary Benson, a colleague at Red Hat who works in the GDB team as well, has implemented a way of improving the interface between the linker and the debugger, and one of the things he used to achieve this is the GDB <-> SystemTap integration that I implemented with Tom Tromey 2 years ago. Neat!

The problem

You can read a detailed description of the problem in the message Gary sent to the gdb-patches mailing list, but to summarize: GDB needs to interface with the linker in order to identify which shared libraries were loaded during the inferior's (i.e., program being debugged) life.

Nowadays, what GDB does is to put a breakpoint in _dl_debug_state, which is an empty function called by the linker every time a shared library is loaded (the linker calls it twice, once before modifying the list of loaded shlibs, and once after). But GDB has no way to know what has changed in the list of loaded shlibs, and therefore it needs to load the entire list every time something happens. You can imagine how bad this is for performance...

The solution

What Gary did was to put SDT probes strategically on the linker, so that GDB could make use of them when examining for changes in the list of loaded shlibs. It improves performance a lot, because now GDB doesn't need to stop twice every time a shlib is loaded (it just needs to do that when stop-on-solib-events is set); it just needs to stop at the right probe, which will inform the address of the link-map entry of the first newly added library. It means GDB also won't need to walk through the list of shlibs and identify what has changed: you get that for free by examining the probe's argument.

Gary also mentions a discrepancy that happened on Solaris libc, which has also been solved by his patch.

And now, the most impressing thing: the numbers! Take a look at this table, which displays the huge improvement in the performance when using lots of shlibs (the time is in seconds):

Number of shlibs 128 256 512 1024 2048 4096
Old interface > 0 > 1 > 4 > 12 > 47 > 185
New interface > 0 > 0 > 2 > 4 > 10 > 36

Impressive, isn't it?

Conclusion

This is one the things I like most in Free Software projects: the possibility of extending and improving things by using what others did before. When I hacked GDB to implement the integration between itself and SystemTap, I had absolutely no idea that this could be used for improving the interface between the linker and the debugger (though I am almost sure that Tom was already thinking ahead!). And I can say it is a pleasure and I feel proud when I see such things happening. It just makes me feel more and more certain that Free Software is the way to go :-).


Nesta última sexta-feira, dia 30/11/2012, estive presente na sétima edição do SoLiSC 2012, em Florianópolis, para apresentar uma palestra introdutória sobre o GDB. Este é um relato sobre minha particição no evento :-).

Impressões sobre o evento

Foi a primeira vez que fui ao SoLiSC. Já tive vontade de ir em anos anteriores, mas infelizmente sempre havia algo para atrapalhar. No entanto, nesse ano felizmente tudo correu bem, e inclusive tive uma palestra aceita! Ou seja, um ótimo motivo para visitar Floripa e rever o mar :-D.

Peguei um vôo saindo às 6h de Campinas, e cheguei lá às 7h10min. Estava bastante cansado, pois não havia dormido de quinta pra sexta, só que a ansiedade estava conseguindo me deixar ligado :-).

O evento aconteceu Universidade Estácio de Sá, que fica em São José. Cheguei por lá às 8h, e fui bem recebido pelo pessoal do evento. Já tentei me enturmar, e conheci algumas pessoas que também iam palestrar no evento. Como minha palestra estava marcada para começar às 14h, resolvi ficar batendo papo e de olho na grade de palestras.

Por coincidência (ou não!), acabei ficando na sala onde aconteceria o primeiro LibreOffice Hack Day no Brasil. Acabei ficando na sala o dia todo, ajudando o pessoal a resolver alguns problemas chatos com o firewall da Universidade, e depois com git. Foi uma experiência muito legal, nunca tinha participado de um Hack Day antes, e foi uma honra poder presenciar e ajudar no primeiro evento do tipo que o pessoal do LibreOffice fez no Brasil :-). Aliás, foi muito interessante conhecer um pouco mais sobre um projeto tão grande e complexo quanto o LibreOffice, e inclusive fiz um "jabá" sobre o GDB para eles :-).

No final, também conheci algumas pessoas muito interessadas em contribuir com projetos de software livre, o que é sempre bom! Isso me ajuda a ter mais motivação para continuar a fazer esse trabalho de divulgação. Você pode ler uma descrição mais detalhada sobre o LibreOffice Hack Day (inclusive com fotos) aqui.

Apresentação "GDB Crash Course"

Eu já estava esperando pouca gente na palestra, até porque falar sobre o GDB está ficando cada vez mais complicado... As pessoas em geral não sabem (e nem se interessam) pelo software, então é normal ficar meio "de escanteio" nesses eventos :-). Quem sabe um dia eu não escreva um post sobre isso?

Bem, mas mesmo com pouco público, creio que palestra correu bem. Dessa vez, meu amigo Edjunior não foi, então levei a palestra sozinho :-). Existem vantagens e desvantagens nisso, mas de modo geral acho que a palestra ficou um pouco mais rápida.

Adicionei alguns slides extras para falar sobre a Red Hat, e sobre o que estamos fazendo pelas comunidades de software livre por aí -- não só na do GDB, mas também em muitas outras. Essa parte da apresentação realmente foi bacana, porque o orgulho de se trabalhar nessa empresa é grande!

Depois que terminei minha palestra e voltei à sala do LibreOffice Hack Day, alguns desenvolvedores que estavam por lá me perguntaram como foi, e disseram que tinham se arrependido de não ter ido... Sabe como é, preferiram ficar fazendo patches, então eu entendo :-P. Bem, pra não deixar ninguém insatisfeito, acabei fazendo uma segunda rodada da palestra dentro do Hack Day, e também foi muito bacana :-).

Várias pessoas me pediram os slides, então aqui estão eles:

Conclusão

Gostaria de agradecer especialmente à Eliane Domingos, ao David Jourdain e ao Olivier Hallot, todos membros da TDF e contribuidores do LibreOffice, pelos momentos prazerosos e pelas conversas divertidas que tivemos durante todo o evento!

Também gostaria de agradecer à organização do SoLiSC pela oportunidade de participar de um evento tão bacana! O Klaibson Ribeiro foi a pessoa com quem troquei alguns e-mails antes do evento, então um "muito obrigado" a ele também :-).

Nos vemos no próximo SoLiSC!


Hi everybody :-).

I finally got some time to finish this series of posts, and I hope you like the overall result. For those of you who are reading this blog for the first time, you can access the first post here, and the second here.

My goal with this third post is to talk a little bit about how you can use the SDT probes with tracepoints inside GDB. Maybe this particular feature will not be so helpful to you, but I recommend reading the post either way. I will also give a brief explanation about how the SDT probes are laid out inside the binary. So, let's start!

Complementary information

In my last post, I forgot to mention that the SDT probe support present on older versions of Fedora GDB is not exactly as the way I described here. This is because Fedora GDB adopted this feature much earlier than upstream GDB itself, so while this has a great positive aspect in terms of how the distro's philosophy works (i.e., Fedora contains leading-edge features, so if you want to know how to FLOSS community will be in a few months, use it!), it also has the downside of delivering older/different versions of features in older Fedoras. But of course, this SDT feature will be fully available on Fedora 18, to be announced soon.

My suggestion is that if you use a not-so-recent Fedora (like Fedora 16, 15, etc), please upgrade it to the last version, or compile your own version of GDB yourself (it's not that hard, I will make a post about it in the next days/weeks!).

With that said, let's move on to our main topic here.

SDT Probes and Tracepoint

Before anything else, let me explain what a tracepoint is. Think of it as a breakpoint which doesn't stop the program's execution when it hits. In fact, it's a bit more than that: you can define actions associated with a tracepoint, and those actions will be performed when the tracepoint is hit. Neat, huh? :-)

There is a nice description of what a tracepoint in the GDB documentation, I recommend you give it a reading to understand the concept.

Ok, so now we have to learn how to put tracepoints in our code, and how to define actions for them. But before that, let's remember our example program:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
#include <sys/sdt.h>

int
main (int argc, char *argv[])
{
    int a = 10;

    STAP_PROBE1 (test_program, my_probe, a);

    return 0;
}

Very simple, isn't it? Ok, to the tracepoints now, my friends.

Using tracepoints inside GDB

In order to properly use tracepoints inside GDB, you will need to use gdbserver, a tiny version of GDB suitable for debugging programs remotely, over the net or serial line. In short, this is because GDB cannot put tracepoints on a program running directly under it, so we have to run it inside gdbserver and then connect GDB to it.

Running our program inside gdbserver

In our case, we will just start gdbserver in our machine, order it to listen to some high port, and connect to it through localhost, so there will be no need to have access to another computer or device.

First of all, make sure you have gdbserver installed. If you use Fedora, the package name you will have to install is gdb-gdbserver. If you have it installed, you can do:

1
2
3
$ gdbserver :3001 ./test_program
Process ./test_program created; pid = 17793
Listening on port 3001

The second argument passed to gdbserver instructs it to listen on the port 3001 of your loopback interface, a.k.a. localhost.

You will notice that gdbserver will stay there indefinitely, waiting for new connections to arrive. Don't worry, we will connect to it soon!

Connecting an instance of GDB to gdbserver

Now, go to another terminal and start GDB with our program:

1
2
3
4
5
6
7
$ gdb ./test_program
...
(gdb) target remote :3001
Remote debugging using :3001
Reading symbols from /lib64/ld-linux-x86-64.so.2...(no debugging symbols found)...done.
Loaded symbols for /lib64/ld-linux-x86-64.so.2
0x0000003d60401530 in _start () from /lib64/ld-linux-x86-64.so.2

The command you have to use inside GDB is target remote. It takes as an argument the host and the port to which you want to connect. In our case, we just want it to connect to localhost, port 3001. If you saw an output like the above, great, things are working for you (don't pay attention to the messages about glibc debug information). If you didn't see it, please check to see if you're connecting to the right port, and if no other service is using it.

Ok, so now it is time to start our trace experiment!

Creating the tracepoints

Every command should be issued on GDB, not on gdbserver!

In your GDB prompt, put a tracepoint in the probe named my_probe:

1
2
(gdb) trace -probe-stap my_probe
Tracepoint 1 at 0x4005a9

As you can see, the trace command takes exactly the same arguments as the break command. Thus, you need to use the -probe-stap modified in order to instruct GDB to put the tracepoint in the probe.

And now, let's define the actions associated with this tracepoint. To do that, we use the actions command, which is an interactive command inside GDB. It takes some specific keywords, and if you want to learn more about it, please take a look at this link. For this example, we will use only the collect keyword, which tells GDB to... hm... collect something :-). In our case, it will collect the probe's first argument, or $_probe_arg0, as you may remember.

1
2
3
4
5
6
(gdb) actions 
Enter actions for tracepoint 1, one per line.
End with a line saying just "end".
>collect $_probe_arg0
>end
(gdb)

Simple as that. Finally, we have to define a breakpoint in the last instruction of our program, because it is necessary to keep it running on gdbserver in order to examine the tracepoints later. If we didn't put this breakpoint, our program would finish and gdbserver would not be able to provide information about what happened with our tracepoints. In our case, we will simply put a breakpoint on line 10, i.e., on the return 0;:

Running the trace experiment

Ok, time to run our trace experiment. First, we must issue a tstart to tell GDB to start monitoring the tracepoints. And then, we can continue our program normally.

1
2
3
4
5
6
7
8
(gdb) tstart 
(gdb) continue
Continuing.

Breakpoint 1, main (argc=1, argv=0x7fffffffde88) at /tmp/test_program.c:10
10        return 0;
(gdb) tstop
(gdb)

Remember, GDB is not going to stop your program, because tracepoints are designed to not interfere with the execution of it. Also notice that we have also stopped the trace experiment after the breakpoint hit, by using the tstop command.

Now, we will be able to examine what the tracepoint has collected. First, we will the tfind command to make sure the tracepoint has hit, and then we can inspect what we ordered it to collect:

1
2
3
4
5
(gdb) tfind start
Found trace frame 0, tracepoint 1
8         STAP_PROBE1 (test_program, my_probe, a);
(gdb) p $_probe_arg0
$1 = 10

And it works! Notice that we are printing the probe argument using the same notation as with breakpoints, even though we are not exactly executing the STAP_PROBE1 instruction. What does it mean? Well, with the tfind start command we tell GDB to actually use the trace frame collected during the program's execution, which, in this case, is the probe argument. If you know GDB, think of it as if we were using the frame command to jump back to a specific frame, where we would have access to its state.

This is a very simple example of how to use the SDT probe support in GDB with tracepoints. There is much more you can do, but I hope I could explain the basics so that you can start playing with this feature.

How the SDT probe is laid out in the binary

You might be interested in learning how the probes are created inside the binary. Other than reading the source code of /usr/include/sys/sdt.h, which is the heart of the whole feature, I also recommend this page, which explains in detail what's going on under the hood. I also recommend that you study a little about how the ELF format works, specifically about notes in the ELF file.

Conclusion

After this series of blog posts, I expect that you will now be able to use the not-so-new feature of SDT probe support on GDB. Of course, if you find some bug while using this, please feel free to report it using our bugzilla. And if you have some question, use the comment system below and I will answer ASAP :-).

See ya, and thanks for reading!


I tell you this: it is depressing when you realize that you spent more time struggling with blog engines than writing posts on your blog!

It's been a long time since I wrote the first post about this subject, and since then the patches have been accepted upstream, and GDB 7.5 now has official support for userspace SystemTap probes :-). Yay!

Well, but enough of cheap talk, let's get to the business!

Errata for my last post

Frank Ch. Eigler, one of SystemTap's maintainers, kindly mentioned something that I should say about SystemTap userspace probes.

Basically, it should be clear that SDT probes are not the only kind of userspace probing one can do with SystemTap. There is yet another kind of probe (maybe even more powerful, depending on the goals): DWARF-based function/statement probes. SystemTap supports this kind of probing mechanism for quite a while now.

It is not the goal of this post to explain it in detail, but you might want to give it a try by compiling your binary with debuginfo support (use the -g flag on GCC), and do something like:

1
2
$ stap -e 'probe process("/bin/foo").function("name") { log($$parms) }' -c /bin/foo
$ stap -e 'probe process("/bin/foo").statement("*@file.c:443") { log($$vars) }' -c /bin/foo

And that's it. You can read SystemTap's documentation, or this guide to learn how to add userspace probes.

Using GDB with SystemTap SDT Probes

Well, now let's get to the interesting part. It is time to make GDB work with the SDT probe that we have put in our example code. Let's remember it:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
#include <sys/sdt.h>

int
main (int argc, char *argv[])
{
  int a = 10;

  STAP_PROBE1 (test_program, my_probe, a);

  return 0;
}

It is a very simple example, and we will have to extend it later in order to show more features. But for now, it will do.

The first thing to do is to open GDB (with SystemTap support, of course!), and check to see if it can actually see probe inserted in our example.

1
2
3
4
5
6
7
8
$ gdb ./test_program
GNU gdb (GDB) 7.5.50.20121014-cvs
Copyright (C) 2012 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
...
(gdb) info probes
Provider     Name     Where              Semaphore Object
test_program my_probe 0x00000000004004ae           /home/sergio/work/src/git/build/gdb/test_program

Wow, it actually works! :-)

If you have seen something like the above, it means your GDB is correctly recognizing SDT probes. If you see an error, or if your GDB doesn't have the info probes command, then you'd better make sure you have a recent version of GDB otherwise you won't be able to use the SDT support.

Putting breakpoints in the code

Anyway, now it is time to start using this support. The first thing I want to show you is how to put a breakpoint in a probe.

1
2
(gdb) break -probe-stap my_probe
Breakpoint 1 at 0x4004ae

That's all! We have chosen to extend the break command in order to support the new -probe-stap parameter. If you're wondering ... why the -probe prefix?, it is because I was asked to implement a complete abstraction layer inside GDB in order to allow more types of probes to be added in the future. So, for example, if someone implements support for an hypothetical type of probe called xyz, you would have break -probe-xyz. It took me a little more time to implement this layer, but it is worth the effort.

Anyway, as you have see above, GDB recognize the probe's name and correctly put a breakpoint in it. You can also confirm that it has done the right thing by matching the address reported by info probes with the one reported by break: they should be the same.

Ok, so now, with our breakpoint in place, let's run the program and see what happens.

1
2
3
4
5
(gdb) run
Starting program: /home/sergio/work/src/git/build/gdb/test_program

Breakpoint 1, main (argc=1, argv=0x7fffffffdf68) at /tmp/example-stap.c:8
8  STAP_PROBE1 (test_program, my_probe, a);

As you can see, GDB stopped at the exact location of the probe. Therefore, you are now able to put marks (i.e., probes) in your source code which are location-independent. It means that it doesn't really matter where in the source code your probe is, and it also doesn't matter if you change the code around it, changing the line numbers, or even moving it to another file. GDB will always find your probe, and always stop at the right location. Neat!

Examining probes' arguments

But wait, there's more! Remember when I told you that you could also inspect the probe's arguments? Yes, let's do it now!

Just remember that, in SDT's parlance, the current probe's argument is a. So let's print its value.

1
2
3
4
(gdb) p $_probe_arg0
$1 = 10
(gdb) p a
$2 = 10

"Hey, captain, it seems the boat really floats!"

Check the source code above, and convince yourself that a's value is 10 :-). As you might have seen, I have used a fairly strange way of printing it. It is because the probe's arguments are available inside GDB by means of convenience variables. You can see a list of them here.

Since SDT probes can have up to 12 arguments (i.e., you can use STAP_PROBE1 ... STAP_PROBE12), we have created inside GDB 12 convenience variables, named $_probe_arg0 until $_probe_arg11. I know, it is not an easy name to remember, and even the relation between SDT naming and GDB naming is not direct (i.e., you have to subtract 1 from the SDT probe number). If you are not satisfied with this, please open a bug in our bugzilla and I promise we will discuss other options.

I would like to emphasize something here: just as you don't need debuginfo support for dealing with probes inside GDB, you also don't need debuginfo support for dealing with their arguments as well. It means that you can actually compile your code without debuginfo support, but still have access to some important variables/expressions when debugging it. Depending on how GCC optimizes your code, you may experience some difficulties with argument printing, but so far I haven't heard of anything like that.

More to come

Ok, now we have covered more things about the SDT probe support inside GDB, and I hope you understood all the concepts. It is not hard to get things going with this, specially because you don't need extra libraries to make it work.

In the next post, I intend to finish this series by explaining how to use tracepoints with SDT probes. Also, as I said in the previous post of this series, maybe I will talk a little bit about how the SDT probes are organized within the binary.

See you soon!


Conforme eu havia comentado no post anterior, segue o relato sobre as apresentações que fiz na Semana da Computação da UNESP de Rio Claro.

TL;DR: Gostei de ter tido a oportunidade de dar as apresentações, e principalmente de ter feito minha primeira palestra como Embaixador do Projeto Fedora no Brasil. Sobre a palestra a respeito do GDB, também gostei do jeito que ela foi conduzida. Notei algumas falhas que precisam ser corrigidas, mas no geral a experiência foi muito boa.

Apresentação "O Projeto Fedora"

Foi a primeira apresentação da noite, de acordo com a grade de programação. Começou meia hora atrasada, pois a organização pediu para esperarmos mais pessoas chegarem (estava chovendo bastante no momento, o que dificultou a locomoção).

Comecei a palestra falando um pouco sobre o Projeto Fedora. Acabei passando rapidamente pelas origens do projeto, uma falha que pretendo corrigir em próximas ocasiões. Dei muita ênfase na definição de comunidade e no que isso significa quando lidamos com software livre. Confesso que fiz algumas comparações com o Ubuntu, o que talvez não tenha sido uma boa idéia (de acordo com os guidelines do Projeto Fedora para Embaixadores). De qualquer modo, a mensagem foi passada e notei que algumas pessoas se interessaram em conhecer mais a respeito do projeto e da filosofia.

Pontos positivos: Creio ter conseguido informar as pessoas a respeito do projeto, com a ajuda dos ótimos slides do Paul W. Frields. É sempre gratificante dar palestras, mesmo que apenas uma ou duas pessoas no final acabem se interessando de verdade. Além disso, me senti bem por estar divulgando um projeto que respeita as liberdades dos usuários (ou pelo menos tenta fazer isso ao máximo), e que eu realmente uso e gosto.

Pontos a serem melhorados: Fazer uma palestra um pouco menos "pessoal". É muito difícil conseguir isso, mas tenho a forte impressão de que minha orientação totalmente pró-software-livre acaba (às vezes) afastando algumas pessoas, que vêem no entusiasta por software livre uma pessoa "radical" e "xiita". Preciso pensar um pouco a respeito do assunto...

A conclusão é que fiquei bastante satisfeito com o resultado da palestra. Percebi que, depois dela, algumas pessoas vieram comentar que estavam utilizando Fedora, ou que já andavam pensando em trocar de distribuição, que agora o Fedora era uma opção. O objetivo foi cumprido :-).

Apresentação "GDB Crash Course"

Creio que essa já é a quarta vez que apresento essa palestra, e a terceira vez junto com meu amigo Edjunior. Sempre que ela termina, fico(amos) com a impressão de que ainda não acertamos no ponto, e dessa vez não foi diferente.

A palestra começou em ponto, às 21h, e decidimos tentar uma abordagem um pouco diferente. A última vez que apresentamos a palestra foi no evento da Semana Integrada da PUC Campinas. Naquela ocasião, tínhamos optado por começar falando mais sobre os comandos do GDB, e depois mostrarmos como a coisa funciona, estilo hands-on. Dessa vez, resolvemos ir mostrando a prática junto com a teoria. Ficou melhor, e acho que a apresentação ficou mais fluida, mas ainda assim esbarramos no velho problema da interdependência dos comandos: quando íamos falar sobre breakpoints, precisávamos ter mostrado algum outro comando que só iria ser explicado mais à frente, que por sua vez iria precisar de outro comando, que iria precisar de breakpoints, etc. Enfim, no final acabamos sendo obrigados a pular alguns comandos, e a adiantar a explicação de outros, quebrando um pouco o fluxo dos slides.

Notei que algumas pessoas estavam bastante interessadas no GDB, talvez por já programarem há algum tempo. As outras, aparentemente, ainda não conseguiam ver muita utilidade para um depurador, mas mesmo assim tentavam aprender algo que talvez fosse lhes servir no futuro.

Já era de se esperar, mas mesmo assim não deixo de me surpreender quando vejo que uma palestra técnica consegue atrair muito mais atenção do que uma palestra "filosófica", como foi a do Projeto Fedora. Talvez seja reflexo da sociedade em que vivemos, ou talvez seja apenas uma impressão errônea da minha parte.

A conclusão, finalmente, é que a palestra parece ter sido útil para algumas pessoas (mesmo que poucas), e isso nos dá ainda mais fôlego pra continuarmos tentando divulgar esse projeto pouco conhecido (mas muito útil) que é o GDB.

Agradecimentos

Não poderia deixar de agradecer primeiramente à organização da SECCOMP da UNESP de Rio Claro pelo ótimo evento. Fiquei surpreso com a infra-estrutura e, principalmente, com a receptividade das pessoas. Gostei muito do ambiente descontraído, e espero não ter decepcionado muita gente por lá com meus comentários informais e caipiras durante as palestras :-).

Também agradeço ao meu amigo Edjunior por ter me acompanhado até sua alma matter para me ajudar na realização da palestra sobre o GDB.

Até a próxima!


Hoje, dia 23/10/2012, estarei na UNESP de Rio Claro para dar duas apresentações na Semana da Computação.

A primeira palestra será sobre o Projeto Fedora. Vai ser a primeira vez que falarei sobre o projeto depois de ter me tornado Embaixador do Fedora no Brasil. Confesso que estou um pouco apreensivo, mas escolhi slides muito bons feitos pelo Paul W. Frields, ex-líder do Projeto e bastante competente em suas apresentações. Pretendo fazer um relato sobre a palestra na quarta-feira.

A segunda apresentação será sobre o GDB. Essa apresentação vai ser mais um crash course sobre como utilizar a ferramenta, e os slides estão disponíveis em https://github.com/sergiodj/gdb-unicamp2011.

Espero que ambas as palestras sejam bem recebidas pelo público! Volto depois pra contar como foi :-).

Abraços.


After a long time, here we are again :-).

With this post I will start to talk about the integration between GDB and SystemTap. This is something that Tom Tromey and I did during the last year. The patch is being reviewed as I write this post, and I expect to see it checked-in in the next few days/weeks. But let's get our hands dirty...

SystemTap Userspace Probes

You probably use (or have at least heard of) SystemTap, and maybe you think the tool is only useful for kernel inspections. If that's your case, I have a good news: you're wrong! You can actually use SystemTap to inspect userspace applications too, by using what we call SDT probes, or Static Defined Tracing probes. This is a very cheap and easy way to include probes in your application, and you can even specify arguments to those probes.

In order to use the probes (see an example below), you must include the <sys/sdt.h> header file in your source code. If you are using Fedora systems, you can obtain this header file by installing the package systemtap-sdt-devel, version equal or greater than 1.4.

Here's a simple example of an application with a one-argument probe:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
#include <sys/sdt.h>

int
main (int argc, char *argv[])
{
  int a = 10;

  STAP_PROBE1 (test_program, my_probe, a);

  return 0;
}

As you can see, this is a very simple program with one probe, which contains one argument. You can now compile the program:

1
$ gcc test_program.c -o test_program

Now you must be thinking: "Wait, wait... Didn't you just forget to link this program against some SystemTap-specific library or something?" And my answer is no. One of the spetacular things about this <sys/sdt.h> header is that it does not have any dependencies at all! As Tom said in his blog post, this is "a virtuoso display of ELF and GCC asm wizardy".

If you want to make sure your probe was inserted in the binary, you can use readelf command:

1
2
3
4
5
6
7
8
$ readelf -x .note.stapsdt ./test_program

Hex dump of section '.note.stapsdt':
  0x00000000 08000000 3a000000 03000000 73746170 ....:.......stap
  0x00000010 73647400 86044000 00000000 88054000 sdt...@.......@.
  0x00000020 00000000 00000000 00000000 74657374 ............test
  0x00000030 5f70726f 6772616d 006d795f 70726f62 _program.my_prob
  0x00000040 65002d34 402d3428 25726270 29000000 e.-4@-4(%rbp)...

(I will think about writing an explanation on how the probes are laid out on the binary, but for now you just have to care if you actually see an output from this readelf command.)

You can also use SystemTap to perform this verification:

1
2
$ stap -L 'process("./test_program").mark("*")'
process("./test_program").mark("my_probe") $arg1:long

So far, so good. If you see an output like the one above, it means your probe is correctly inserted. You could obviously use SystemTap to inspect this probe, but I won't do this right now because this is not the purpose of this post.

For now, we have learned how to:

  1. Include an SDT probe in our source code, and compile it;
  2. Verify if the probe was correctly inserted.

In the next post, I will talk about the GDB support that allows you to inspect, print arguments, and gather other information about SDT probes. I hope you like it!