Tag: free-software

These past couple of months I have been working to bring debuginfod to Ubuntu. I thought it would be a good idea to make this post and explain a little bit about what the service is and how I'm planning to deploy it.

A quick recap: what's debuginfod?

Here's a good summary of what debuginfod is:

debuginfod is a new-ish project whose purpose is to serve
ELF/DWARF/source-code information over HTTP.  It is developed under the
elfutils umbrella.  You can find more information about it here:

  https://sourceware.org/elfutils/Debuginfod.html

In a nutshell, by using a debuginfod service you will not need to
install debuginfo (a.k.a. dbgsym) files anymore; the symbols will be
served to GDB (or any other debuginfo consumer that supports debuginfod)
over the network.  Ultimately, this makes the debugging experience much
smoother (I myself never remember the full URL of our debuginfo
repository when I need it).

If you follow the Debian project, you might know that I run their debuginfod service. In fact, the excerpt above was taken from the announcement I made last year, letting the Debian community know that the service was available.

First stage

With more and more GNU/Linux distributions offering a debuginfod service to their users, I strongly believe that Ubuntu cannot afford to stay out of this "party" anymore. Fortunately, I have a manager who not only agrees with me but also turned the right knobs in order to make this project one of my priorities for this development cycle.

The service will be deployed in stages. The first one, whose results are due to be announced in the upcoming weeks, encompasses indexing and serving all of the available debug symbols from the official Ubuntu repository. In other words, the service will serve everything from main, universe and multiverse, for every supported Ubuntu release.

This initial (a.k.a. "alpha") stage will also allow us to have an estimate of how much the service is used, so that we can better determine the resources allocated to it.

More down the road

This is just the beginning. In the following cycles, I will be working on a few interesting projects to expand the scope of the service and make it even more useful for the broader Ubuntu community. To give you an idea, here is what is on my plate:

  • Working on the problem of indexing and serving source code as well. This is an interesting problem and I already have some ideas, but it's also challenging and may unfold into more sub-projects. The good news is that a solution for this problem will also be beneficial to Debian.

  • Working with the snap developers to come up with a way to index and serve debug symbols for snaps as well.

  • Improving the integration of the service into Ubuntu. In fact, I have already started working on this by making elfutils (actually, libdebuginfod) install a customized shell snippet to automatically set up access to Ubuntu's debuginfod instance.

As you can see, there's a lot to do. I am happy to be working on this project, and I hope it will be helpful and useful for the Ubuntu community.


Hi there. Long time no write!

On Tuesday, February 23, 2021, I made an announcement at debian-devel-announce about a new service that I configured for Debian: a debuginfod server.

This post serves two purposes: to fulfill the promise I made to Jonathan Carter that I would write a blog post about the service, and to go into a bit more detail about it.

What's debuginfod?

From the announcement above:

debuginfod is a new-ish project whose purpose is to serve
ELF/DWARF/source-code information over HTTP.  It is developed under the
elfutils umbrella.  You can find more information about it here:

  https://sourceware.org/elfutils/Debuginfod.html

In a nutshell, by using a debuginfod service you will not need to
install debuginfo (a.k.a. dbgsym) files anymore; the symbols will be
served to GDB (or any other debuginfo consumer that supports debuginfod)
over the network.  Ultimately, this makes the debugging experience much
smoother (I myself never remember the full URL of our debuginfo
repository when I need it).
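To make the "over HTTP" part a bit more concrete: a debuginfod client identifies a binary by its build-id and asks the server for an artifact under a /buildid/ path. Below is a minimal sketch of how such request URLs are put together, assuming the path layout described in the debuginfod documentation linked above. The function name debuginfod_url is hypothetical and exists only for this illustration; the build-id is the one that shows up in the GDB session later in this post.

```python
from urllib.parse import quote


def debuginfod_url(server, build_id, artifact="debuginfo", source_path=None):
    """Build the URL a debuginfod client would request (illustrative helper)."""
    if artifact not in ("debuginfo", "executable", "source"):
        raise ValueError(f"unknown artifact type: {artifact}")
    url = f"{server.rstrip('/')}/buildid/{build_id}/{artifact}"
    if artifact == "source":
        if not source_path or not source_path.startswith("/"):
            raise ValueError("source requests need an absolute source path")
        # The absolute path of the source file is appended after "source".
        url += quote(source_path)
    return url


print(debuginfod_url("https://debuginfod.debian.net",
                     "02046bac4352940d19d9164bab73b2f5cefc8c73"))
```

Real clients (GDB, elfutils' debuginfod-find) do this for you; the sketch is only meant to show that there is nothing magical about the protocol.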

Perhaps not everybody knows this, but until last year I was a Debugger Engineer (a.k.a. GDB hacker) at Red Hat. I was not involved with the creation of debuginfod directly, but I witnessed discussions about "having a way to serve debug symbols over the internet" multiple times during my tenure at the company. So this is not a new idea, and it's not even the first implementation, but it's the first time that some engineers actually got their hands dirty enough to have something concrete to show.

The idea to set up a debuginfod server for Debian started to brew after 2019's GNU Tools Cauldron, but as usual several things happened in $LIFE (including a global pandemic and leaving Red Hat and starting a completely different job at Canonical) which had the effect of shuffling my TODO list "a little".

Benefits for Debian

Debian is unfortunately lagging behind when it comes to offering its users a good debugging experience. Before the advent of our debuginfod server, if you wanted to debug a package in Debian you would need to:

  1. Add the debian-debug apt repository to your /etc/apt/sources.list.

  2. Install the dbgsym package that contains the debug symbols for the package you are debugging. Note that the version of the dbgsym package needs to be exactly the same as the version of the package you want to debug.

  3. Figure out which shared libraries your package uses and install the dbgsym packages for all of them. Arguably, this step is optional, but it is recommended if you would like to do more in-depth debugging.

  4. Download the package source, possibly using apt source or some equivalent command.

  5. Open GDB, and make sure you adjust the source paths properly (more below). This can be non-trivial.

  6. Finally, debug the program.

Now, with the new service, you will be able to start from step 4, without having to mess with sources.list, dbgsym packages and version mismatches.

The package source

It is important to mention an existing (but perhaps not well-known) limitation of our debugging experience in Debian: the need to manually download the source packages and adjust GDB to properly find them (see steps 4 and 5 above). debuginfod is able to serve source code as well, but our Debian instance is not doing that at the moment.

Debian does not provide a patched source tree that is ready to be consumed by GDB or debuginfod (for a good example of a distribution that does this, see Fedora's debugsource packages). Let me show you an example of debugging GDB itself (using debuginfod) on Debian:

$ HOME=/tmp DEBUGINFOD_URLS=https://debuginfod.debian.net gdb -q gdb
Reading symbols from gdb...
Downloading separate debug info for /tmp/gdb...
Reading symbols from /tmp/.cache/debuginfod_client/02046bac4352940d19d9164bab73b2f5cefc8c73/debuginfo...
(gdb) start
Temporary breakpoint 1 at 0xd18e0: file /build/gdb-Nav6Es/gdb-10.1/gdb/gdb.c, line 28.
Starting program: /usr/bin/gdb 
Downloading separate debug info for /lib/x86_64-linux-gnu/libreadline.so.8...
Downloading separate debug info for /lib/x86_64-linux-gnu/libz.so.1...
Downloading separate debug info for /lib/x86_64-linux-gnu/libncursesw.so.6...
Downloading separate debug info for /lib/x86_64-linux-gnu/libtinfo.so.6...
Downloading separate debug info for /tmp/.cache/debuginfod_client/d6920dbdd057f44edaf4c1fbce191b5854dfd9e6/debuginfo...
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Downloading separate debug info for /lib/x86_64-linux-gnu/libexpat.so.1...
Downloading separate debug info for /lib/x86_64-linux-gnu/liblzma.so.5...
Downloading separate debug info for /usr/lib/x86_64-linux-gnu/libbabeltrace.so.1...
Downloading separate debug info for /usr/lib/x86_64-linux-gnu/libbabeltrace-ctf.so.1...
Downloading separate debug info for /usr/lib/x86_64-linux-gnu/libipt.so.2...
Downloading separate debug info for /usr/lib/x86_64-linux-gnu/libmpfr.so.6...
Downloading separate debug info for /usr/lib/x86_64-linux-gnu/libsource-highlight.so.4...
Downloading separate debug info for /usr/lib/x86_64-linux-gnu/libxxhash.so.0...
Downloading separate debug info for /usr/lib/x86_64-linux-gnu/libdebuginfod.so.1...
Downloading separate debug info for /usr/lib/x86_64-linux-gnu/libstdc++.so.6...
Downloading separate debug info for /lib/x86_64-linux-gnu/libgcc_s.so.1...
Downloading separate debug info for /usr/lib/x86_64-linux-gnu/libglib-2.0.so.0...
Downloading separate debug info for /tmp/.cache/debuginfod_client/dbfea245d26065975b4084f4e9cd2d83c65973ee/debuginfo...
Downloading separate debug info for /usr/lib/x86_64-linux-gnu/libdw.so.1...
Downloading separate debug info for /usr/lib/x86_64-linux-gnu/libelf.so.1...
Downloading separate debug info for /usr/lib/x86_64-linux-gnu/libuuid.so.1...
Downloading separate debug info for /usr/lib/x86_64-linux-gnu/libgmp.so.10...
Downloading separate debug info for /usr/lib/x86_64-linux-gnu/libboost_regex.so.1.74.0...
Downloading separate debug info for /usr/lib/x86_64-linux-gnu/libcurl-gnutls.so.4...
Downloading separate debug info for /lib/x86_64-linux-gnu/libbz2.so.1.0...
Downloading separate debug info for /usr/lib/x86_64-linux-gnu/libicui18n.so.67...
Downloading separate debug info for /tmp/.cache/debuginfod_client/acaa831dbbc8aa70bb2131134e0c83206a0701f9/debuginfo...
Downloading separate debug info for /usr/lib/x86_64-linux-gnu/libicuuc.so.67...
Downloading separate debug info for /usr/lib/x86_64-linux-gnu/libnghttp2.so.14...
Downloading separate debug info for /usr/lib/x86_64-linux-gnu/libidn2.so.0...
Downloading separate debug info for /usr/lib/x86_64-linux-gnu/librtmp.so.1...
Downloading separate debug info for /usr/lib/x86_64-linux-gnu/libssh2.so.1...
Downloading separate debug info for /usr/lib/x86_64-linux-gnu/libpsl.so.5...
Downloading separate debug info for /usr/lib/x86_64-linux-gnu/libnettle.so.8...
Downloading separate debug info for /usr/lib/x86_64-linux-gnu/libgnutls.so.30...
Downloading separate debug info for /usr/lib/x86_64-linux-gnu/libldap_r-2.4.so.2...
Downloading separate debug info for /usr/lib/x86_64-linux-gnu/liblber-2.4.so.2...
Downloading separate debug info for /usr/lib/x86_64-linux-gnu/libbrotlidec.so.1...
Downloading separate debug info for /tmp/.cache/debuginfod_client/39739740c2f8a033de95c1c0b1eb8be445610b31/debuginfo...
Downloading separate debug info for /usr/lib/x86_64-linux-gnu/libunistring.so.2...
Downloading separate debug info for /usr/lib/x86_64-linux-gnu/libhogweed.so.6...
Downloading separate debug info for /usr/lib/x86_64-linux-gnu/libgcrypt.so.20...
Downloading separate debug info for /usr/lib/x86_64-linux-gnu/libp11-kit.so.0...
Downloading separate debug info for /usr/lib/x86_64-linux-gnu/libtasn1.so.6...
Downloading separate debug info for /lib/x86_64-linux-gnu/libcom_err.so.2...
Downloading separate debug info for /usr/lib/x86_64-linux-gnu/libsasl2.so.2...
Downloading separate debug info for /usr/lib/x86_64-linux-gnu/libbrotlicommon.so.1...
Downloading separate debug info for /lib/x86_64-linux-gnu/libgpg-error.so.0...
Downloading separate debug info for /usr/lib/x86_64-linux-gnu/libffi.so.7...
Downloading separate debug info for /lib/x86_64-linux-gnu/libkeyutils.so.1...

Temporary breakpoint 1, main (argc=1, argv=0x7fffffffebf8) at /build/gdb-Nav6Es/gdb-10.1/gdb/gdb.c:28
28      /build/gdb-Nav6Es/gdb-10.1/gdb/gdb.c: Directory not empty.
(gdb) list
23      in /build/gdb-Nav6Es/gdb-10.1/gdb/gdb.c
(gdb) 

(See all those Downloading separate debug info for... lines? Nice!)

As you can see, when we try to list the contents of the file we're in, nothing shows up. This happens because GDB doesn't know where the file is. So you have to tell it. In this case, it's relatively easy: you see that the GDB package's build directory is /build/gdb-Nav6Es/gdb-10.1/. When you apt source gdb, you will end up with a directory called $PWD/gdb-10.1/ containing the full source of the package. Notice that the last directory's name in both paths is the same, so in this case we can use GDB's set substitute-path command to do the job for us (in this example $PWD is /tmp/):

$ HOME=/tmp DEBUGINFOD_URLS=https://debuginfod.debian.net gdb -q gdb
Reading symbols from gdb...
Reading symbols from /tmp/.cache/debuginfod_client/02046bac4352940d19d9164bab73b2f5cefc8c73/debuginfo...
(gdb) set substitute-path /build/gdb-Nav6Es/ /tmp/
(gdb) start
Temporary breakpoint 1 at 0xd18e0: file /build/gdb-Nav6Es/gdb-10.1/gdb/gdb.c, line 28.
Starting program: /usr/bin/gdb 
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".

Temporary breakpoint 1, main (argc=1, argv=0x7fffffffebf8) at /build/gdb-Nav6Es/gdb-10.1/gdb/gdb.c:28
warning: Source file is more recent than executable.
28        memset (&args, 0, sizeof args);
(gdb) list
23      int
24      main (int argc, char **argv)
25      {
26        struct captured_main_args args;
27
28        memset (&args, 0, sizeof args);
29        args.argc = argc;
30        args.argv = argv;
31        args.interpreter_p = INTERP_CONSOLE;
32        return gdb_main (&args);
(gdb)

Much better, huh? The problem is that this process is manual, and changes depending on how the package you're debugging was built.
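If you are wondering what the substitution rule actually does to a path, here is a minimal sketch of the idea. It is not GDB's implementation: the function substitute_path is a hypothetical stand-in, and it only models the documented behaviour that a rule applies when its "from" part matches a whole leading directory component of the recorded path.

```python
def substitute_path(path, rules):
    """Rewrite a recorded source path using (from, to) prefix rules.

    Illustrative only: the first rule whose "from" part matches a complete
    leading directory of the path wins; other paths are left untouched.
    """
    for old, new in rules:
        old = old.rstrip("/")
        new = new.rstrip("/")
        # Match on a directory boundary, so /build/foo does not match /build/foobar.
        if path == old or path.startswith(old + "/"):
            return new + path[len(old):]
    return path


rules = [("/build/gdb-Nav6Es/", "/tmp/")]
print(substitute_path("/build/gdb-Nav6Es/gdb-10.1/gdb/gdb.c", rules))
```

The catch is visible right in the rule: the random-looking build directory (gdb-Nav6Es here) changes from build to build, so the rule has to be rediscovered for every package you debug.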

What can we do to improve this? What I personally would like to see is something similar to what the Fedora project already does: create a new debug package which will contain the full, patched source package. This would mean changing our building infrastructure and possibly other somewhat complex things.

Using the service (by default)

At the time of this writing, I am working on an elfutils Merge Request whose purpose is to implement a debconf question to ask the user whether she wants to use our service by default.

If you would like to start using the service right now, all you have to do is set the following environment variable in your shell:

DEBUGINFOD_URLS="https://debuginfod.debian.net"
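Note that the variable is plural: DEBUGINFOD_URLS can hold several servers, separated by spaces, and clients query them as needed. A small sketch of how a tool could read it follows; the helper name debuginfod_servers is made up for this example, and https://example.org is just a placeholder for a second server.

```python
import os


def debuginfod_servers(environ=None):
    """Return the servers listed in DEBUGINFOD_URLS (hypothetical helper)."""
    if environ is None:
        environ = os.environ
    # The variable holds a space-separated list; unset or empty means "no servers".
    return environ.get("DEBUGINFOD_URLS", "").split()


print(debuginfod_servers({"DEBUGINFOD_URLS": "https://debuginfod.debian.net"}))
print(debuginfod_servers(
    {"DEBUGINFOD_URLS": "https://debuginfod.debian.net https://example.org"}))
```

If the variable is unset or empty, the list is empty and a debuginfod-aware tool simply behaves as it did before the service existed.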

More information

You can find more information about our debuginfod service here. Try to keep an eye on the page as it's being constantly updated.

If you'd like to get in touch with me, my email is my domain at debian dot org.

I sincerely believe that this service is a step in the right direction, and hope that it can be useful to you :-).


Back in September, we had the GNU Tools Cauldron in the gorgeous city of Montréal (perhaps I should write a post specifically about it...). One of the sessions we had was the GDB BoF, where we discussed, among other things, how to improve our patch review system.

I have my own personal opinions about the current review system we use (mailing list-based, in a nutshell), and I didn't feel confident enough to express them during the discussion. Anyway, the outcome was that at least 3 global maintainers have used or are currently using the Gerrit Code Review system for other projects, are happy with it, and that we should give it a try. Then, when it was time to decide who wanted to configure and set things up for the community, I volunteered. Hey, I'm already running the Buildbot master for GDB, so what's the problem with managing yet another service? Oh, well.

Before we dive into the details involved in configuring and running gerrit on a machine, let me first say that I don't totally support the idea of migrating from mailing list to gerrit. I volunteered to set things up because I felt the community (or at least its most active members) wanted to try it out. I don't necessarily agree with the choice.

Ah, and I'm writing this post mostly because I want to be able to close the 300+ tabs I had to open in Firefox during these last weeks, when I was searching for how to solve the myriad of problems I faced during the setup!

The initial plan

My very initial plan after I left the session room was to talk to the sourceware.org folks and ask them if it would be possible to host our gerrit there. Surprisingly, they already have a gerrit instance up and running. It was set up back in 2016, it's running an old version of gerrit, and is pretty much abandoned. Actually, saying that it has been configured is an overstatement: it doesn't support authentication, user registration, barely supports projects, etc. It's basically what you get from a pristine installation of the gerrit RPM package in RHEL 6.

I won't go into details here, but after some discussion it was clear to me that the instance on sourceware would not be able to meet our needs (or at least what I had in mind for us), and that it would be really hard to bring it to the quality level I wanted. I decided to go look for other options.

The OSCI folks

Have I mentioned the OSCI project before? They are absolutely awesome. I really love working with them, because so far they've been able to meet every request I made! So, kudos to them! They're the folks that host our GDB Buildbot master. Their infrastructure is quite reliable (I never had a single problem), and Marc Dequénes (Duck) is very helpful, friendly and quick when replying to my questions :-).

So, it shouldn't come as a surprise that when I decided to look for another place to host gerrit, they were my first choice. And again, they delivered :-).

Now, it was time to start thinking about the gerrit set up.

User registration?

Over the course of these past 4 weeks, I had the opportunity to learn a bit more about how gerrit does things. One of the first things that negatively impressed me was the fact that gerrit doesn't handle user registration by itself. It is possible to have a very rudimentary user registration "system", but it relies on the site administrator manually registering the users (via htpasswd) and managing everything by him/herself.

It was quite obvious to me that we would need some kind of access control (we're talking about a GNU project, with a copyright assignment requirement in place, after all), and the best way to implement it is by having registered users. And so my quest for the best user registration system began...

Gerrit supports some user authentication schemes, such as OpenID (not OpenID Connect!), OAuth2 (via plugin) and LDAP. I remembered hearing about FreeIPA a long time ago, and thought it made sense to use it. Unfortunately, the project's community told me that installing FreeIPA on a Debian system is really hard, and since our VM is running Debian, it quickly became obvious that I should look somewhere else. I felt a bit sad at the beginning, because I thought FreeIPA would really be our silver bullet here, but then I noticed that it doesn't really offer self-service user registration.

After exchanging a few emails with Marc, he told me about Keycloak. It's a full-fledged identity and access management tool that supports OAuth2 and LDAP and provides a self-service user registration system, which is exactly what we needed! However, upon reading the description of the project, I noticed that it is written in Java (JBOSS, to be more specific), and I was afraid that it was going to be very demanding on our system (after all, gerrit is also a Java program). So I decided to put it on hold and take a look at using LDAP...

Oh, man. Where do I start? Actually, I think it's enough to say that I just tried installing OpenLDAP, but gave up because it was too cumbersome to configure. Have you ever heard that LDAP is really complicated? I'm afraid this is true. I just didn't feel like wasting a lot of time trying to understand how it works, only to have to solve the "user registration" problem later (because of course, OpenLDAP is just an LDAP server).

OK, so what now? Back to Keycloak it is. I decided that instead of thinking that it was too big, I should actually install it and check it for real. Best decision, by the way!

Setting up Keycloak

It's pretty easy to set Keycloak up. The official website provides a .tar.gz file which contains the whole directory tree for the project, along with helper scripts, .jar files, configuration, etc. From there, you just need to follow the documentation, edit the configuration, and voilà.

For our specific setup I chose to use PostgreSQL instead of the built-in database. This is a bit more complicated to configure, because you need to download the JDBC driver, and install it in a strange way (at least for me, who is used to just editing a configuration file). I won't go into details on how to do this here, because it's easy to find on the internet. Bear in mind, though, that the official documentation is really incomplete when covering this topic! This is one of the guides I used, along with this other one (which covers MariaDB, but can be adapted to PostgreSQL as well).

Another interesting thing to notice is that Keycloak expects to be running on its own virtual domain, and not under a subdirectory (e.g., https://example.org instead of https://example.org/keycloak). For that reason, I chose to run our instance on another port. It is supposedly possible to configure Keycloak to run under a subdirectory, but it involves editing a lot of files, and I confess I couldn't make it fully work.

A last thing worth mentioning: the official documentation says that Keycloak needs Java 8 to run, but I've been using OpenJDK 11 without problems so far.

Setting up Gerrit

The fun begins now!

The gerrit project also offers a .war file ready to be deployed. After you download it, you can execute it and initialize a gerrit project (or application, as it's called). Gerrit will create a directory full of interesting stuff; the most important for us is the etc/ subdirectory, which contains all of the configuration files for the application.

After initializing everything, you can try starting gerrit to see if it works. This is where I had my first trouble. Gerrit also requires Java 8, but unlike Keycloak, it doesn't work out of the box with OpenJDK 11. I had to make a small but important addition in the file etc/gerrit.config:

[container]
    ...
    javaOptions = "--add-opens=jdk.management/com.sun.management.internal=ALL-UNNAMED"
    ...

After that, I was able to start gerrit. And then I started trying to set it up for OAuth2 authentication using Keycloak. This took a very long time, unfortunately. I was having several problems with Gerrit, and I wasn't sure how to solve them. I tried asking for help on the official mailing list, and was able to make some progress, but in the end I figured out what was missing: I had forgotten to add the AllowEncodedSlashes On directive to the Apache configuration file! This was causing a very strange error on Gerrit (as you can see, a java.lang.StringIndexOutOfBoundsException!), which didn't make sense. In the end, my Apache config file looks like this:

<VirtualHost *:80>
    ServerName gnutoolchain-gerrit.osci.io

    RedirectPermanent / https://gnutoolchain-gerrit.osci.io/r/
</VirtualHost>

<VirtualHost *:443>
    ServerName gnutoolchain-gerrit.osci.io

    RedirectPermanent / /r/

    SSLEngine On
    SSLCertificateFile /path/to/cert.pem
    SSLCertificateKeyFile /path/to/privkey.pem
    SSLCertificateChainFile /path/to/chain.pem

    # Good practices for SSL
    # taken from: <https://mozilla.github.io/server-side-tls/ssl-config-generator/>

    # intermediate configuration, tweak to your needs
    SSLProtocol             all -SSLv3
    SSLCipherSuite          ECDHE-ECDSA-CHACHA20-POLY1305:ECDHE-RSA-CHACHA20-POLY1305:ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-GCM-SHA256:ECDHE-ECDSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-GCM-SHA384:DHE-RSA-AES128-GCM-SHA256:DHE-RSA-AES256-GCM-SHA384:ECDHE-ECDSA-AES128-SHA256:ECDHE-RSA-AES128-SHA256:ECDHE-ECDSA-AES128-SHA:ECDHE-RSA-AES256-SHA384:ECDHE-RSA-AES128-SHA:ECDHE-ECDSA-AES256-SHA384:ECDHE-ECDSA-AES256-SHA:ECDHE-RSA-AES256-SHA:DHE-RSA-AES128-SHA256:DHE-RSA-AES128-SHA:DHE-RSA-AES256-SHA256:DHE-RSA-AES256-SHA:ECDHE-ECDSA-DES-CBC3-SHA:ECDHE-RSA-DES-CBC3-SHA:EDH-RSA-DES-CBC3-SHA:AES128-GCM-SHA256:AES256-GCM-SHA384:AES128-SHA256:AES256-SHA256:AES128-SHA:AES256-SHA:DES-CBC3-SHA:!DSS
    SSLHonorCipherOrder     on
    SSLCompression          off
    SSLSessionTickets       off

    # OCSP Stapling, only in httpd 2.3.3 and later
    #SSLUseStapling          on
    #SSLStaplingResponderTimeout 5
    #SSLStaplingReturnResponderErrors off
    #SSLStaplingCache        shmcb:/var/run/ocsp(128000)

    # HSTS (mod_headers is required) (15768000 seconds = 6 months)
    Header always set Strict-Transport-Security "max-age=15768000"

    ProxyRequests Off
    ProxyVia Off
    ProxyPreserveHost On
    <Proxy *>
        Require all granted
    </Proxy>

    AllowEncodedSlashes On
    ProxyPass /r/ http://127.0.0.1:8081/ nocanon
    #ProxyPassReverse /r/ http://127.0.0.1:8081/r/
</VirtualHost>

I confess I had almost given up on Keycloak when I finally found the problem...

Anyway, after that things went more smoothly. I was finally able to make the user authentication work, then I made sure Keycloak's user registration feature also worked OK...

Ah, one interesting thing: the user logout wasn't really working as expected. The user was able to log out of gerrit, but not of Keycloak, so when the user clicked on "Sign in", Keycloak would tell gerrit that the user was already logged in, and gerrit would automatically log the user in again! I was able to solve this by redirecting the user to Keycloak's logout page, like this:

[auth]
    ...
    logoutUrl = https://keycloak-url:port/auth/realms/REALM/protocol/openid-connect/logout?redirect_uri=https://gerrit-url/
    ...
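For the curious, the logoutUrl value is just Keycloak's logout endpoint with a redirect_uri parameter pointing back at gerrit. A minimal sketch of how the string is assembled is below; the endpoint path is copied from the configuration above (it reflects the Keycloak version I was using and may differ in others), while the function name and the example.org hostnames are placeholders made up for the illustration. Unlike the hand-written value above, the sketch percent-encodes the redirect target, which is the safer choice when the target URL contains characters of its own.

```python
from urllib.parse import urlencode


def keycloak_logout_url(keycloak_base, realm, redirect_to):
    """Assemble a logout URL in the shape used in gerrit.config (illustrative)."""
    # urlencode() percent-encodes the redirect target so it survives as one parameter.
    query = urlencode({"redirect_uri": redirect_to})
    return (f"{keycloak_base.rstrip('/')}/auth/realms/{realm}"
            f"/protocol/openid-connect/logout?{query}")


print(keycloak_logout_url("https://keycloak.example.org:8443", "REALM",
                          "https://gerrit.example.org/"))
```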

After that, it was already possible to start worrying about configuring gerrit itself. I don't know if I'll write a post about that, but let me know if you want me to.

Conclusion

If you ask me if I'm totally comfortable with the way things are set up now, I can't say that I am 100%. I mean, the setup seems robust enough that it won't cause problems in the long run, but what bothers me is the fact that I'm using technologies that are alien to me. I'm used to setting up things written in Python, C, C++, with very simple yet powerful configuration mechanisms, and an easy way to discover what's wrong when something bad happens.

I am reasonably satisfied with the way Keycloak logs things, but Gerrit leaves a lot to be desired in that area. And both projects are written in languages/frameworks that I am absolutely not comfortable with. Like, it's really tough to debug something when you don't even know where the code is or how to modify it!

All in all, I'm happy that this whole adventure has come to an end, and now all that's left is to maintain it. I hope that the GDB community can make good use of this new service, and I hope that we can see a positive impact in the quality of the whole patch review process.

My final take is that this is all worth it as long as Free Software and user freedom are the ones that benefit.

P.S.: Before I forget, our gerrit instance is running at https://gnutoolchain-gerrit.osci.io.


Back in 2016, when life was simpler, a Fedora GDB user reported a bug (or a feature request, depending on how you interpret it) saying that GDB's gcore command did not respect the COREFILTER_ELF_HEADERS flag, which instructs it to dump memory pages containing ELF headers. As you may or may not remember, I have already written about the broader topic of revamping GDB's internal corefile dump algorithm; it's an interesting read and I recommend it if you don't know how Linux (or GDB) decides which mappings to dump to a corefile.

Anyway, even though the bug was interesting and had to do with work I'd done before, I couldn't really work on it at the time, so I decided to put it in the TODO list. Of course, the "TODO list" is actually a crack where most things fall through and are usually never seen again, so I was blissfully ignoring this request because I had other major priorities to deal with. That is, until a seemingly unrelated problem forced me to face this once and for all!

What? A regression? Since when?

As the Fedora GDB maintainer, I'm routinely preparing new releases for the Fedora Rawhide distribution, and sometimes for the stable versions of the distro as well. And I try to be very careful when dealing with new releases, because a regression introduced now can come back and bite us (i.e., the Red Hat GDB team) many years in the future, when it's sometimes too late or too difficult to fix things. So, a mandatory part of every release preparation is to actually run a regression test against the previous release, and make sure that everything is working correctly.

One of these days, some weeks ago, I had finished running the regression check for the release I was preparing when I noticed something strange: a specific, Fedora-only corefile test was FAILing. That's a no-no, so I started investigating and found that the underlying reason was that, when the corefile was being generated, the build-id note from the executable was not being copied over. Fedora GDB has a local patch whose job is to, given a corefile with a build-id note, locate the corresponding binary that generated it. Without the build-id note, no binary was being located.

Coincidentally or not, at the same time I started noticing some users reporting very similar build-id issues on freenode's #gdb channel, and I thought that this bug had the potential to become a big headache for us if nothing was done to fix it right away.

I asked for some help from the team, and we managed to discover that the problem was also happening with upstream gcore, and that it was probably something that binutils was doing, and not GDB. Hmm...

Ah, so it's ld's fault. Or is it?

So there I went, trying to confirm that it was binutils's fault, and not GDB's. Of course, if I could confirm this, then I could also tell the binutils guys to fix it, which meant less work for us :-).

With a lot of help from Keith Seitz, I was able to bisect the problem and found that it started with the following commit:

commit f6aec96dce1ddbd8961a3aa8a2925db2021719bb
Author: H.J. Lu <hjl.tools@gmail.com>
Date:   Tue Feb 27 11:34:20 2018 -0800

    ld: Add --enable-separate-code

This is a commit that touches the linker, which is part of binutils. So that means this is not GDB's problem, right?!? Hmm. No, unfortunately not.

What the commit above does is simply enable the use of --enable-separate-code (or -z separate-code) by default when linking an ELF program on x86_64 (more on that later). At first glance, this change should not impact the corefile generation, and indeed, if you tell the Linux kernel to generate a corefile (for example, by doing sleep 60 & and then hitting C-\), you will notice that the build-id note is included in it! So GDB was still a suspect here. The investigation needed to continue.

What's with -z separate-code?

The -z separate-code option makes the linker put the code of the ELF file in a segment that is completely separate from the data segment. This was done to increase the security of generated binaries. Before it, everything (code and data) was put together in the same memory region. What this means in practice is that, before, you would see something like this when you examined /proc/PID/smaps:

00400000-00401000 r-xp 00000000 fc:01 798593                             /file
Size:                  4 kB
KernelPageSize:        4 kB
MMUPageSize:           4 kB
Rss:                   4 kB
Pss:                   4 kB
Shared_Clean:          0 kB
Shared_Dirty:          0 kB
Private_Clean:         0 kB
Private_Dirty:         4 kB
Referenced:            4 kB
Anonymous:             4 kB
LazyFree:              0 kB
AnonHugePages:         0 kB
ShmemPmdMapped:        0 kB
Shared_Hugetlb:        0 kB
Private_Hugetlb:       0 kB
Swap:                  0 kB
SwapPss:               0 kB
Locked:                0 kB
THPeligible:    0
VmFlags: rd ex mr mw me dw sd

And now, you will see two memory regions instead, like this:

00400000-00401000 r--p 00000000 fc:01 799548                             /file
Size:                  4 kB
KernelPageSize:        4 kB
MMUPageSize:           4 kB
Rss:                   4 kB
Pss:                   4 kB
Shared_Clean:          0 kB
Shared_Dirty:          0 kB
Private_Clean:         4 kB
Private_Dirty:         0 kB
Referenced:            4 kB
Anonymous:             0 kB
LazyFree:              0 kB
AnonHugePages:         0 kB
ShmemPmdMapped:        0 kB
Shared_Hugetlb:        0 kB
Private_Hugetlb:       0 kB
Swap:                  0 kB
SwapPss:               0 kB
Locked:                0 kB
THPeligible:    0
VmFlags: rd mr mw me dw sd
00401000-00402000 r-xp 00001000 fc:01 799548                             /file
Size:                  4 kB
KernelPageSize:        4 kB
MMUPageSize:           4 kB
Rss:                   4 kB
Pss:                   4 kB
Shared_Clean:          0 kB
Shared_Dirty:          0 kB
Private_Clean:         0 kB
Private_Dirty:         4 kB
Referenced:            4 kB
Anonymous:             4 kB
LazyFree:              0 kB
AnonHugePages:         0 kB
ShmemPmdMapped:        0 kB
Shared_Hugetlb:        0 kB
Private_Hugetlb:       0 kB
Swap:                  0 kB
SwapPss:               0 kB
Locked:                0 kB
THPeligible:    0
VmFlags: rd ex mr mw me dw sd

A few minor things have changed, but the most important of them is the fact that, before, the whole memory region had anonymous data in it, which means that it was considered an anonymous private mapping (anonymous because of the non-zero Anonymous amount of data; private because of the p in the r-xp permission bits). After -z separate-code was made default, the first memory mapping does not have Anonymous contents anymore, which means that it is now considered to be a file-backed private mapping instead.
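The distinction described above can be checked mechanically. The following is a minimal sketch (it is not GDB's code) that collects the "Name: N kB" lines of one smaps record and labels the mapping from its Anonymous: value and from the p permission bit. The function names parse_kb_fields and describe_mapping, and the shortened sample records, are assumptions made for illustration.

```python
def parse_kb_fields(record):
    """Collect the 'Name:   N kB' lines of one smaps record as {name: kB}."""
    fields = {}
    for line in record.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        # Keep only lines shaped like "Anonymous:   4 kB". The header line,
        # THPeligible and VmFlags do not have this shape and are skipped.
        if sep and len(parts) == 2 and parts[0].isdigit() and parts[1] == "kB":
            fields[name.strip()] = int(parts[0])
    return fields


def describe_mapping(perms, fields):
    """Label a mapping using only the Anonymous: amount and the 'p' bit."""
    kind = "anonymous" if fields.get("Anonymous", 0) > 0 else "file-backed"
    sharing = "private" if "p" in perms else "shared"
    return f"{kind} {sharing}"


# Shortened versions of the records shown above (hypothetical sample data).
before = "00400000-00401000 r-xp 00000000 fc:01 798593 /file\nAnonymous:   4 kB\n"
after = "00400000-00401000 r--p 00000000 fc:01 799548 /file\nAnonymous:   0 kB\n"

print(describe_mapping("r-xp", parse_kb_fields(before)))  # anonymous private
print(describe_mapping("r--p", parse_kb_fields(after)))   # file-backed private
```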

GDB, corefile, and coredump_filter

It is important to mention that, unlike the Linux kernel, GDB doesn't have all of the necessary information readily available to decide the exact type of a memory mapping, so when I revamped this code back in 2015 I had to create some heuristics to try and determine this information. If you're curious, take a look at the linux-tdep.c file in GDB's source tree, specifically at the functions dump_mapping_p and linux_find_memory_regions_full.

When GDB is deciding which memory regions should be dumped into the corefile, it respects the value found at the /proc/PID/coredump_filter file. The default value for this file is 0x33, which, according to core(5), means:

Dump memory pages that are either anonymous private, anonymous
shared, ELF headers or HugeTLB.

GDB had the support implemented to dump almost all of these pages, except for the ELF headers variety. And, as you can probably infer, this means that, before the -z separate-code change, the very first memory mapping of the executable was being dumped, because it was marked as anonymous private. However, after the change, the first mapping (which contains only data, no code) wasn't being dumped anymore, because it was now considered by GDB to be a file-backed private mapping!

Finally, that is the reason for the difference between corefiles generated by GDB and Linux, and also the reason why the build-id note was not being included in the corefile anymore! You see, the first memory mapping contains not only the program's data, but also its ELF headers, which in turn contain the build-id information.
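As an aside on the coredump_filter value discussed in this section, here is a minimal sketch that decodes the bitmask using the bit positions listed in core(5). The label strings and the function names are my own wording, not something taken from GDB or the kernel.

```python
# Bit positions as documented in core(5); the label strings are illustrative.
COREDUMP_FILTER_BITS = {
    0: "anonymous private",
    1: "anonymous shared",
    2: "file-backed private",
    3: "file-backed shared",
    4: "ELF headers",
    5: "private huge pages",
    6: "shared huge pages",
    7: "private DAX pages",
    8: "shared DAX pages",
}


def decode_coredump_filter(value):
    """Return the labels of the bits that are set in a coredump_filter value."""
    return [label for bit, label in sorted(COREDUMP_FILTER_BITS.items())
            if value & (1 << bit)]


def read_coredump_filter(pid="self"):
    """Read /proc/PID/coredump_filter, which the kernel prints in hexadecimal."""
    with open(f"/proc/{pid}/coredump_filter") as f:
        return int(f.read().strip(), 16)


print(decode_coredump_filter(0x33))
# ['anonymous private', 'anonymous shared', 'ELF headers', 'private huge pages']
```

On a GNU/Linux system, decode_coredump_filter(read_coredump_filter()) shows what the current process would dump.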

gcore, meet ELF headers

The solution was "simple": I needed to improve the current heuristics and teach GDB how to determine if a mapping contains an ELF header or not. For that, I chose to follow the Linux kernel's algorithm, which basically checks the first 4 bytes of the mapping and compares them against \177ELF, which is ELF's magic number. If the comparison succeeds, then we just assume we're dealing with a mapping that contains an ELF header and dump it.

In all fairness, Linux just dumps the first page (4K) of the mapping, in order to save space. It would be possible to make GDB do the same, but I chose the faster way and just dumped the whole mapping, which, in most scenarios, shouldn't be a big problem.
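The magic-number comparison itself is tiny. The sketch below is only an illustration: it reads the first 4 bytes of a file on disk, whereas GDB has to read them from the memory of the process being debugged. The name starts_with_elf_magic is an assumption.

```python
import sys

# "\177ELF" in octal notation: 0o177 == 0x7f.
ELF_MAGIC = b"\x7fELF"


def starts_with_elf_magic(path):
    """Return True if the first 4 bytes of the file are the ELF magic number."""
    with open(path, "rb") as f:
        return f.read(4) == ELF_MAGIC


if __name__ == "__main__":
    # On GNU/Linux the Python interpreter itself is normally an ELF file.
    print(starts_with_elf_magic(sys.executable))
```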

It's also interesting to mention that GDB will only perform this check if:

  • The heuristic has decided not to dump the mapping so far, and;
  • The mapping is private, and;
  • The mapping's offset is zero, and;
  • There is a request to dump mappings with ELF headers (i.e., coredump_filter).

Linux also makes these checks, by the way.
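Put together, the conditions above can be sketched as a single predicate. This is an assumption-laden illustration, not the code that went into GDB: the Mapping class, its field names and final_dump_decision are all hypothetical, and the filter bit is bit 4 as listed in core(5).

```python
from dataclasses import dataclass

COREDUMP_FILTER_ELF_HEADERS = 1 << 4  # bit 4 in core(5)
ELF_MAGIC = b"\x7fELF"


@dataclass
class Mapping:
    private: bool       # the mapping is private (not shared)
    offset: int         # offset of the mapping into its file
    first_bytes: bytes  # the first bytes of the mapping's contents


def final_dump_decision(mapping, dump_so_far, filter_value):
    """Apply the ELF-header rule on top of an earlier dump/no-dump decision."""
    if dump_so_far:
        return True   # already selected for dumping, nothing to check
    if not mapping.private:
        return False
    if mapping.offset != 0:
        return False
    if not filter_value & COREDUMP_FILTER_ELF_HEADERS:
        return False  # nobody asked for mappings with ELF headers
    return mapping.first_bytes[:4] == ELF_MAGIC


first_mapping = Mapping(private=True, offset=0, first_bytes=b"\x7fELF\x02\x01")
print(final_dump_decision(first_mapping, dump_so_far=False, filter_value=0x33))  # True
```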

The patch, finally

I submitted the patch to the mailing list, and it was approved fairly quickly (with a few minor nits).

The reason I'm writing this blog post is that I'm very happy with, and proud of, the whole process. It wasn't an easy task to investigate the underlying reason for the build-id failures, and it was interesting to come up with a solution that extended the work I did a few years ago. I was also able to close a few bug reports upstream, as well as the one reported against Fedora GDB.

The patch has been pushed, and is also present in the latest version of Fedora GDB for Rawhide. It wasn't possible to write a self-contained testcase for this problem, so I had to resort to using an external tool (eu-unstrip) in order to guarantee that the build-id note is correctly present in the corefile. But that's a small detail, of course.

Anyway, I hope this was an interesting (albeit large) read!


Heya!

This past Saturday, April 27th, 2019, Samuel Vale, Alex Volkov and I organized the Toronto Bug Squashing Party here in the city. I was very happy with the outcome, especially the fact that we had more than 10 people attending, including a bunch of folks that came from Montréal!

The start

It was a cold day in Toronto, and we met at the Mozilla Toronto office at 9 in the morning. Right there at the door I met anarcat, who had just arrived from Montréal. Together with Alex, we waited for Will to arrive and open the door for us. Then, some more folks started showing up, and we waited until 10:30h to start the first presentation of the day.

Packaging 101

Anarcat kindly gave us his famous "Packaging 101" presentation, in which he explains the basics of Debian packaging. Here's a picture of the presentation:

anarcat presenting Packaging 101, side

And another one:

anarcat presenting Packaging 101, front

The presentation was great, and Alex recorded it! You can watch it here (sorry, youtube link...).

During the day, we also taught a few tricks about the BTS, in order to help people file bugs, add/remove tags, comment on bugs, etc.

Then, we moved on to the actual hacking.

Bug fixing

This part took most of the day, as was expected. We started by looking at the RC bugs currently filed against Buster, and deciding which ones would be interesting for us. I won't go into details here, but I think we made great progress, considering this was the first BSP for many of us there (myself included).

You can look at the bugs we worked on, and you will see that we have actually fixed 6 of them! I even fixed a JavaScript bug, which is something totally out of my area of expertise ;-).

I also noticed something interesting. The way we look at bugs can vary wildly between one DD and another. I mean, this is something I always knew, especially when I was more involved with the debian-mentors effort, but it's really amazing to feel this in person. I tend to be more picky when it comes to defining what to do when I start to work on a bug; I try really hard to reproduce it (and spend a lot of time doing so), and will really dive deep into the code trying to understand why some test is failing. Other developers may be less "pedantic", and choose to (e.g.) disable a certain test that is failing. In the end, I think everything is a balance and I tried to learn from this experience.

Anyway, given that we looked at 12 bugs and solved 6, I think we did great! And this also helped me to get my head "back in the Debian game"; I was too involved with GDB these past months (there's a post about one of the things I did which is coming soon, stay tuned).

Look at us hacking:

Everybody hacking

Wrap up

At 19h (or 7 p.m.), we had to wrap up and prepare to go. Because we had a sizeable number of Brazilians in the group (5!), the logical thing to do was to go to a pub and resume the conversation there :-). If I said it was one of the first times I went to a pub to drink with newly made friends in Toronto, you probably wouldn't believe me, so I won't say anything...

I know one thing for sure: we want to do this again, and soon! In fact, my idea is to do another one after Buster is released (and after the summer is gone, of course), so maybe October. We'll see.

Acknowledgements

I would like to thank Mozilla Toronto for hosting us; it was awesome to finally visit their office and enjoy their hospitality, personified by Will Hawkins. It is impossible not to thank anarcat, who came all the way from Montréal to give us his Debian Packaging 101 talk. Speaking of the French-Canadian (and Brazilian), it was super awesome meeting Tiago Vaz and Tássia Camões, and it was great seeing Valessio Brito again.

Let me also thank the "locals" who attended the party; it was great seeing everybody there! Hope I can see everybody again when we make the second edition of our BSP :-).


Fighting


Sometimes, you have to fight. You have to say that the other person is wrong, that he is talking nonsense about a subject he does not know (and does not want to know). You have to say what is ethical, what is right. You have to discern everything that is wrong and unethical, immoral, and harmful. You have to fight hatred, often with love, at other times with strength and integrity.

You have to tell that ignorant person that he does not know what Free Software is. You have to say that Free Software is much bigger than GNU, much bigger than one person or that person's statements. You have to say that the ignorant person has become a troll. You have to say that he does not know what he is talking about, and that he should be quiet. You have to let him live his troubled and at times mediocre adolescence, while taking care that this does not influence other ignorant people to become trolls as well. That troll needs to leave Twitter, leave BR-[GNU/]Linux, leave the forums powered by proprietary things; or perhaps he needs to stay there, distilling his hatred, venom and ignorance for his peers.

You have to fight the sham liberalism that is a vehicle for hatred. You have to fight hatred. You have to fight ignorance, again. You have to fight the reactionary attitude disguised as “free market”, you have to fight the lack of common sense that shows up when a political party is generalized by one behavior, you have to fight the behavior, you have to always make social progress, you have to stop caring so much about those who do not care.

You have to fight the ignorant pastor. You have to fight ignorance, a third time. You have to fight the “trolling” of the pastor, of the faithful and of their sympathizers. You have to fight the wave of “conservative radicalism” that afflicts everyone. You have to fight the lack of love for one's neighbor and the excess of arrogance. You have to fight the false divine words, the false wishes of an entity, the false public gatherings around a mistake.

You have to fight the idiotic, ignorant and presumptuous TV host. You have to fight the hatred that is distilled in that country, because not everyone has a serum against the venom of a full-grown snake. You have to fight ignorance, again, because it is the easiest path to hatred, and hatred feeds back into ignorance in a cycle that is hard to break. You have to teach how to learn, and learn how to teach. You have to fight laziness, that excuse so used and repeated that it makes one too lazy to fight it. You have to get off the couch, but not to go to Twitter or Facebook; you have to get off the couch and be critical enough to know what should be done, because I am not the one who is going to tell you.


After spending the last weeks struggling with this, I decided to write a blog post. First, what is the “this” that I am talking about? The answer is: the Linux kernel's concept of memory mapping. I found it utterly confusing, beyond my expectations, and so I believe that a blog post is the right way to (a) preserve and (b) share this knowledge. So, let's do it!

First things first

First, I cannot begin this post without a few acknowledgements and “thank you's”. The first goes to Oleg Nesterov (sorry, I could not find his website), a Linux kernel guru who really helped me a lot through the whole task. Another “thank you” goes to Jan Kratochvil, who also provided valuable feedback by commenting on my GDB patch. Now, back to the point.

The task

The task was requested here: GDB needed to respect the /proc/<PID>/coredump_filter file when generating a coredump (i.e., when you use the gcore command).

Currently, GDB has its own coredump mechanism implemented which, despite its limitations and bugs, has been around for quite some time. However, and maybe you did not know this, the Linux kernel has its own algorithm for generating the corefile of a process. And unfortunately, GDB and Linux were not really following the same standards here...

So, in the end, the task was about synchronizing GDB and Linux. To do that, I first had to decipher the contents of the /proc/<PID>/smaps file.

The /proc/<PID>/smaps file

This special file, generated by the Linux kernel when you read it, contains detailed information about each memory mapping of a certain process. Some of the fields on this file are documented in the proc(5) manpage, but others are missing there (asking for a patch!). Here is an explanation of everything I needed:

  • The first line of each memory mapping has the following format:

        address perms offset dev inode pathname

    The fields here are:

    a) address is the address range, in the process' address space, that the mapping occupies. This part was already treated by GDB, so I did not have to worry about it.

    b) perms is a set of permissions (r: read, w: write, x: execute, s: shared, p: private [COW, copy-on-write]) applied to the memory mapping. GDB was already dealing with rwx permissions, but I needed to include the p flag as well. I also made GDB ignore the mappings that did not have the r flag active, because it does not make sense to dump something that you cannot read.

    c) offset is the offset into the file, if the mapping is file-backed (see below). GDB already handled this correctly.

    d) dev is the device (major:minor) related to the file, if there is one. GDB already handled this correctly, though I was using this field for more things (continue reading).

    e) inode is the inode on the device above. The value of zero means that no inode is associated with the memory mapping. Nothing to do here.

    f) pathname is the file associated with this mapping, if there is one. This is one of the most important fields that I had to use, and one of the most complicated to understand completely. GDB now uses this to heuristically identify whether the mapping is anonymous or not.

  • GDB is now also interested in the Anonymous: and AnonHugePages: fields from the smaps file. Those fields represent the amount of anonymous data in the mapping; if GDB finds that this amount is greater than zero, this means that the mapping is anonymous.

  • The last, but perhaps most important field, is the VmFlags: field. It contains a series of two-letter flags that provide very useful information about the mapping. The flags that matter here are:

    a) sh: the mapping is shared (VM_SHARED)

    b) dd: this mapping should not be dumped in a corefile (VM_DONTDUMP)

    c) ht: this is a HugeTLB mapping

With that in hand, the next task was to determine whether a memory mapping is anonymous or file-backed, private or shared.
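To make the smaps fields from this section concrete, here is a minimal parsing sketch. It is not the code from GDB; the MappingHeader type and the function names are assumptions, and it handles only the header line and the VmFlags: line of a record.

```python
from typing import NamedTuple


class MappingHeader(NamedTuple):
    start: int
    end: int
    perms: str
    offset: int
    dev: str
    inode: int
    pathname: str  # empty string when the mapping has no pathname


def parse_header(line):
    """Parse the first line of an smaps record."""
    # Split at most 5 times so a pathname containing spaces, such as
    # "/dev/zero (deleted)", stays in one piece.
    parts = line.strip().split(maxsplit=5)
    addr, perms, offset, dev, inode = parts[:5]
    pathname = parts[5] if len(parts) > 5 else ""
    start, end = (int(x, 16) for x in addr.split("-"))
    return MappingHeader(start, end, perms, int(offset, 16), dev, int(inode), pathname)


def parse_vmflags(line):
    """Return the two-letter flags of a 'VmFlags:' line as a set."""
    name, _, rest = line.partition(":")
    if name.strip() != "VmFlags":
        raise ValueError("not a VmFlags line")
    return set(rest.split())


header = parse_header("00400000-00401000 r-xp 00000000 fc:01 798593      /file")
flags = parse_vmflags("VmFlags: rd ex mr mw me dw sd")
print(header.perms, header.pathname, "dd" in flags)  # r-xp /file False
```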

Types of memory mappings

There can be four types of memory mappings:

  1. Anonymous private mapping
  2. Anonymous shared mapping
  3. File-backed private mapping
  4. File-backed shared mapping

It should be possible to uniquely identify each mapping based on the information provided by the smaps file; however, you will see that this is not always the case. Below, I will explain how to determine each of the four characteristics that define a mapping.

Anonymous

A mapping is anonymous if one of these conditions applies:

  1. The pathname associated with it is either /dev/zero (deleted), /SYSV%08x (deleted), or <filename> (deleted) (see below).
  2. There is content in the Anonymous: or in the AnonHugePages: fields of the mapping in the smaps file.

A special explanation is needed for the <filename> (deleted) case. It is not always guaranteed that it identifies an anonymous mapping; in fact, it is possible to have the (deleted) part for file-backed mappings as well (say, when you are running a program that uses shared libraries, and those shared libraries have been removed because of an update, for example). However, we are trying to mimic the behavior of the Linux kernel here, which checks to see if a file has no hard links associated with it (and therefore is truly deleted).

Although it may be possible for userspace to do an extensive check (by calling stat on the file, for example), the Linux kernel certainly could give more information about this.

File-backed

A mapping is file-backed (i.e., not anonymous) if:

  1. The pathname associated with it contains a <filename>, without the (deleted) part.

As has been explained above, a mapping whose pathname contains the (deleted) string could still be file-backed, but we decide to consider it anonymous.

It is also worth mentioning that a mapping can be simultaneously anonymous and file-backed: this happens when the mapping contains a valid pathname (without the (deleted) part), but also contains Anonymous: or AnonHugePages: contents.
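The anonymous and file-backed rules above can be sketched as two small predicates. This is an illustration of the heuristic as described here, not GDB's implementation; the function names are assumptions, and pseudo-paths such as [heap] are not treated specially.

```python
DELETED_SUFFIX = " (deleted)"


def is_anonymous(pathname, anonymous_kb=0, anon_huge_pages_kb=0):
    """Heuristic: a '(deleted)' pathname, or any anonymous content."""
    # "/dev/zero (deleted)", "/SYSV%08x (deleted)" and "<filename> (deleted)"
    # all end with the same suffix, so one check covers the three cases.
    if pathname.endswith(DELETED_SUFFIX):
        return True
    return anonymous_kb > 0 or anon_huge_pages_kb > 0


def is_file_backed(pathname):
    """Heuristic: there is a pathname and it lacks the '(deleted)' part."""
    return bool(pathname) and not pathname.endswith(DELETED_SUFFIX)


# A mapping can be both at once: a valid pathname plus Anonymous: content.
print(is_anonymous("/file", anonymous_kb=4), is_file_backed("/file"))  # True True
```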

Private

A mapping is considered to be private (i.e., not shared) if:

  1. In the absence of the VmFlags field (in the smaps file), its permission field has the flag p.
  2. If the VmFlags field is present, then the mapping is private if we do not find the sh flag there.

Shared

A mapping is shared (i.e., not private) if:

  1. In the absence of VmFlags in the smaps file, the permission field of the mapping does not have the p flag. Not having this flag actually means VM_MAYSHARE and not necessarily VM_SHARED (which is what we want), but it is the best approximation we have.
  2. If the VmFlags field is present, then the mapping is shared if we find the sh flag there.
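The private and shared rules can be written down in a few lines. In this sketch (function names are assumptions), vmflags is the set of two-letter flags from the VmFlags: line, or None when the kernel does not provide that field.

```python
from typing import Optional, Set


def is_private(perms: str, vmflags: Optional[Set[str]] = None) -> bool:
    """Private if VmFlags lacks 'sh'; without VmFlags, fall back to 'p'."""
    if vmflags is not None:
        return "sh" not in vmflags
    # Fallback: a missing 'p' really means VM_MAYSHARE, not VM_SHARED,
    # so this is only an approximation.
    return "p" in perms


def is_shared(perms: str, vmflags: Optional[Set[str]] = None) -> bool:
    return not is_private(perms, vmflags)


print(is_private("r-xp"))                      # True
print(is_shared("rw-s", {"rd", "wr", "sh"}))   # True
```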

The patch

With all that in mind, I hacked GDB to improve the coredump mechanism for GNU/Linux operating systems. The main function which decides the memory mappings that will or will not be dumped on GNU/Linux is linux_find_memory_regions_full; the Linux kernel obviously uses its own function, vma_dump_size, to do the same thing.

Linux has one advantage: it is a kernel, and therefore has much more knowledge about processes' internals than a userspace program. For example, inside Linux it is trivial to check if a file marked as "(deleted)" in the output of the smaps file has no hard links associated with it (and therefore is really deleted); the same operation in userspace, however, would require root access to inspect the contents of the /proc/<PID>/map_files/ directory.

The case described above, if you remember, is something that impacts the ability to tell whether a mapping is anonymous or not. I am talking to the Linux kernel guys to see if it is possible to export this information directly via the smaps file, instead of having to do the current heuristic.

While doing this work, some strange behaviors were found in the Linux kernel. Oleg is working on them, along with other Linux hackers. From our side, there is still room for improvement on this code. The first thing I can think of is to improve the heuristics for finding anonymous mappings. Another relatively easy thing to do would be to let the user specify a value for coredump_filter on the command line, without editing the /proc file. And of course, keep this code always updated with its counterpart in the Linux kernel.

Upstream discussions and commit

If you are interested, you can see the discussions that happened upstream by going to this link. This is the fourth (and final) submission of the patch; you should be able to find the other submissions in the archive.

The final commit can be found in the official repository.


Making a Difference


I missed writing in Portuguese :-). And I also missed writing a more “philosophical” post.

I cannot say why, but sometimes I have a silly habit: I like to go “looking for trouble”. In other words, I go looking for things that make me feel bad, even knowing that I will feel bad after seeing them.

I have no explanation for this behavior. It is part self-sabotage, part suffering, part... I don't know. Sometimes, when I find myself in this vicious cycle again, I manage to stop. However, most of the time, I enter a strange state: it is as if I were observing myself, studying what consequences that act brings to me. I keep wondering whether I am the only person in this world who does this...

I think a good example of this kind of behavior is what I have been doing lately. Sometimes, for some reason that is unknown to me, I read bad things written by extremely unreasonable people. And, perhaps for the same mysterious reason, I feel bad about what I read, even knowing that, when weighing what these people do against what I do, the difference is gigantic. So why on earth do I feel bad when I read the nonsense that is practically vomited by these people?

Maybe some people (myself included) have a radar for strong feelings. For example, a gesture of altruism is something that manages to touch the bottom of the soul, and deserves to be savored like a rare wine. But, on the other hand, an expression of anger, contempt or incomprehension also captures one's attention in an almost inevitable way. The mystery that this gesture, often incoherent, hides is something that leaves me almost hooked, as if I were reading a book and did not want to stop before reaching the end. Why does a person put themselves in a sometimes ridiculous role, just because of an opinion? Why does this person, in their eagerness to criticize a behavior, a thought, or an ideology, often display exactly the same characteristics they repudiate? What makes a human being, full of flaws and limitations, climb onto an (often false) pedestal and forget that they were once down below?

Fortunately, the questions above, however intriguing they may be, have not held me for long. I think that, in this learning process we call “life”, I am at a point where I clearly perceive the chaos that reigns in these people's heads, and I try to stay away from it. But, more importantly, I think I realize that you can choose to be the change you want to see in the world (Gandhi), or keep barking while the caravan passes... And I definitely do not want to waste my time comparing code to say who is better.


The GNU Radical


A friend of mine, Blaise, once told me not to introduce myself as “... what you would call a radical...”. He had listened to me talking to a person who was questioning what a Free Software activist does. My friend's rationale, with which I totally agree, is that you must let the other person decide whether she thinks you are a “radical” or not. In other words, if you say you are a “radical” from the beginning, it will probably lead the other person to prejudge you, which is good neither for you nor for her.

As I said, I agree with him. But I am going through a lot of situations in my life that are constantly reminding me that, maybe, I am that “radical” after all. I do not know whether this is good or bad, and I can say I have been questioning myself for a while now. This post, by the way, is going to be a lot about self-questioning.

Maybe the problem is that I am expecting too much from those that have the same beliefs that I do. Or maybe the cause is that I do not know what to expect from them in certain situations, and I am disappointed when I see that they do not follow what I think is best sometimes. On the other hand, when I look at myself in the mirror, I do not know whether I am totally following what I think is best; and if I am not, then how can I even consider telling others to do that? And even if I am following my own advice, how can I be sure that it is good enough for others?

One good example of this is my opinion about FSF's use of Twitter. The opinion is public, and has been criticized by many people already, including Free Software supporters. Shortly after I wrote the post, I mentioned it to Richard Stallman, and he told me he was not going to read it because he considered it “too emotional”. I felt deeply sad because of his reaction, especially because it came from someone who often appeals to emotions in order to teach what he has to say. But I also started questioning myself about the topic.

Is it really bad to use Twitter? This is what I ask myself sometimes. I see so many people using it, including those who defend Free Software as I do (like Matt Lee), or those who stand against privacy abuses (like Jacob Appelbaum), or who are worried about social causes, or... Yeah, you get the point. I refuse to believe that they did not think about Twitter's issues, or about how they would be endorsing its use by using it themselves. Yet, they are there, and a lot of people are following their posts and discussing their opinions and ideas for a better world. As much as I try to understand their motivation for using Twitter (or even Facebook), I cannot convince myself that what they are doing is good for their goals. Am I being too narrow-minded? Am I missing something?

Another example is my thoughts about Free Software programs that support (and sometimes even promote) unethical services. They (the thoughts) are also public. And it seems that this opinion, which is about something I called “Respectful Software”, is too strong (or “radical”?) for the majority of developers, even Free Software developers. I saw very good arguments on why Free Software should support unethical services, and it is hard to disagree with them. I guess the best of those arguments is that when you support unethical services like Facebook, you are offering a Free Software option for those who want or need to use the service. In other words, you are helping them to slowly get rid of the digital handcuffs.

It seems like all those arguments (about Twitter, about implementing support for proprietary systems on Free Software, and others) are ultimately about reaching users that would otherwise remain ignorant of the Free Software philosophy. And how can someone have counter-arguments for this? It is impossible to argue that we do not need to take the Free Software message to everybody, because when someone does not use Free Software, she is doing harm to her community (thus, we want more people using Free Software, of course). When the Free Software Foundation makes use of Twitter to bring more people to the movement, and when I see that despite talking to people all around me I can hardly convince them to try GNU/Linux, who am I to criticize the FSF?

So, I have been thinking to myself whether it is time to change. What I am realizing more and more is that my fight for coherence perhaps is flawed. We are incoherent by nature. And the truth is that, no matter what we do, people change according to their own time, their own will, and their own beliefs (or to the lack of them). I remembered something that I once heard: changing is not binary, changing is a process. So, after all, maybe it is time to stop being a “GNU radical” (in the sense that I am radical even for the GNU project), and become a new type of activist.


To what extent should Free Software respect its users?

The question, strange as it may sound, is not only valid but also becoming more and more important these days. If you think that the four freedoms are enough to guarantee that the Free Software will respect the user, you are probably being oversimplistic. The four freedoms are essential, but they are not sufficient. You need more. I need more. And this is why I think the Free Software movement should have been called the Respectful Software movement.

I know I will probably hear that I am too radical. And I know I will hear it even from those who defend Free Software the way I do. But I need to express this feeling I have, even though I may be wrong about it.

It all began as an innocent comment. I make lots of presentations and talks about Free Software, and, knowing that the word “Free” is ambiguous in English, I started joking that Richard Stallman should have named the movement “Respectful Software”, instead of “Free Software”. If you think about it just a little, you will see that “respect” is a word that brings different interpretations to different people, just as “free” does. It is a subjective word. However, at least it does not have the problem of referring to completely unrelated things such as “price” and “freedom”. Respect is respect, and everybody knows it. What can change (and often does) is what a person considers respectful or not.

(I am obviously not considering the possible ambiguity that may exist in another language with the word “respect”.)

So, back to the software world. I want you to imagine a piece of Free Software. For example, let's consider one that is used to connect to so-called “social networks” like GNU Social or pump.io. I do not want to use a specific example here; I am more interested in the consequences of a certain decision. Which decision? Keep reading :-).

Now, let's imagine that this Free Software is just beginning its life, probably in some code repository under the control of its developer(s), but most likely using some proprietary service like GitHub (which is an issue by itself). And probably the developer is thinking: “Which social network should my software support first?”. This is an extremely valid and important question, but sometimes the developer comes up with an answer that may not be satisfactory to its users. This is where the “respect” comes into play.

In our case, this bad answer would be “Facebook”, “Twitter”, “Linkedin”, or any other unethical social network. However, those are exactly the easiest answers for many Free Software developers, either because those “vampiric” services are popular among users, or because the developer him/herself uses them!! By now, you should be able to see what I am getting at. My point, in a simple question, is: “How far should we, Free Software developers, allow users to go and harm themselves and the community?”. Yes, this is not just a matter of self-inflicted restrictions, as when the user chooses to use non-free software to edit a text file, for example. It is, in most cases, a matter of harming the community too. (I wrote a post related to this issue a while ago, called “Privacy as a Collective Good”.)

It should be easy to see that it does not matter if you are using Facebook through a shiny Free Software application on your computer or cellphone. What really matters is that, when doing so, you are basically supporting the use of those unethical social networks, to the point that perhaps some of your friends are also using them because of you. What does it matter if they are using Free Software to access them or not? Is the benefit offered by the Free Software big enough to eliminate (or even soften) the problems that exist when the user uses an unethical service like Linkedin?

I wonder, though, what limit we should respect. Where should we draw the line and say “I will not pass beyond this point”? Should we just “abandon” the users of those unethical services and social networks, while we lock ourselves in our not-very-safe world? After all, we need to communicate with them in order to bring them to our cause, but it is hard doing so without getting our hands dirty. But that is a discussion for another post, I believe.

Meanwhile, I could give plenty of examples of existing Free Software programs that are doing a disservice to the community by allowing (and even promoting) unethical services or solutions for their users. They are disrespecting their users, sometimes exploiting the fact that many users are not fully aware of the privacy issues that come as a “gift” when you use those services, without making any effort to teach the users. However, I do not want this post to become a flamewar, so I will not mention any software explicitly. I think it should be quite easy for the reader to find examples out there.

Perhaps this post does not have a conclusion. I myself have not made up my mind completely about the subject, though I am obviously leaning towards what most people would call the “radical” solution. But it is definitely not an easy topic to discuss, or to argue about. Nonetheless, we are closing our eyes to it, and we should not do so. The future of Free Software depends also on what kinds of services we promote, and what kinds of services we actually warn the users against. This is my definition of respect, and this is why I think we should develop Free and Respectful Software.