
Back in September, we had the GNU Tools Cauldron in the gorgeous city of Montréal (perhaps I should write a post specifically about it...). One of the sessions we had was the GDB BoF, where we discussed, among other things, how to improve our patch review system.

I have my own personal opinions about the current review system we use (mailing list-based, in a nutshell), but I didn't feel confident enough to express them during the discussion. Anyway, the outcome was that at least 3 global maintainers have used or are currently using the Gerrit Code Review system for other projects and are happy with it, so we agreed to give it a try. Then, when it was time to decide who would configure and set things up for the community, I volunteered. Hey, I'm already running the Buildbot master for GDB; what's one more service to manage? Oh, well.

Before we dive into the details involved in configuring and running gerrit on a machine, let me first say that I don't fully support the idea of migrating from the mailing list to gerrit. I volunteered to set things up because I felt that the community (or at least its most active members) wanted to try it out. I don't necessarily agree with the choice.

Ah, and I'm writing this post mostly because I want to be able to close the 300+ tabs I've had open in Firefox over these last weeks, accumulated while searching for solutions to the myriad of problems I faced during the setup!

The initial plan

My very first plan after I left the session room was to talk to the sourceware.org folks and ask them whether it would be possible to host our gerrit there. Surprisingly, they already have a gerrit instance up and running. It was set up back in 2016, runs an old version of gerrit, and is pretty much abandoned. Actually, saying that it has been configured is an overstatement: it doesn't support authentication or user registration, barely supports projects, etc. It's basically what you get from a pristine installation of the gerrit RPM package on RHEL 6.

I won't go into details here, but after some discussion it was clear to me that the instance on sourceware would not be able to meet our needs (or at least what I had in mind for us), and that it would be really hard to bring it to the quality level I wanted. I decided to go look for other options.

The OSCI folks

Have I mentioned the OSCI project before? They are absolutely awesome. I really love working with them, because so far they've been able to meet every request I made! So, kudos to them! They're the folks that host our GDB Buildbot master. Their infrastructure is quite reliable (I never had a single problem), and Marc Dequénes (Duck) is very helpful, friendly and quick when replying to my questions :-).

So it shouldn't come as a surprise that, when I decided to look for another place to host gerrit, they were my first choice. And again, they delivered :-).

Now, it was time to start thinking about the gerrit set up.

User registration?

Over the course of these past 4 weeks, I had the opportunity to learn a bit more about how gerrit does things. One of the first things that negatively impressed me was the fact that gerrit doesn't handle user registration by itself. It is possible to have a very rudimentary user registration "system", but it relies on the site administrator manually registering the users (via htpasswd) and managing everything by him/herself.
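
Just to illustrate that manual flow, this is roughly what it looks like (the htpasswd file location is hypothetical; gerrit would then be configured to authenticate against it over HTTP):

htpasswd -c /home/gerrit/etc/htpasswd first.user    # -c creates the file
htpasswd /home/gerrit/etc/htpasswd another.user     # add subsequent users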

It was quite obvious to me that we would need some kind of access control (we're talking about a GNU project, with a copyright assignment requirement in place, after all), and the best way to implement it is by having registered users. And so my quest for the best user registration system began...

Gerrit supports some user authentication schemes, such as OpenID (not OpenID Connect!), OAuth2 (via plugin) and LDAP. I remembered hearing about FreeIPA a long time ago, and thought it made sense to use it. Unfortunately, the project's community told me that installing FreeIPA on a Debian system is really hard, and since our VM runs Debian, it quickly became obvious that I should look somewhere else. I felt a bit sad at first, because I thought FreeIPA would really be our silver bullet here, but then I noticed that it doesn't really offer self-service user registration anyway.

After exchanging a few emails with Marc, he told me about Keycloak. It's a full-fledged Identity and Access Management solution, supports OAuth2 and LDAP, and provides a self-service user registration system, which is exactly what we needed! However, upon reading the description of the project, I noticed that it is written in Java (running on top of JBoss, to be more specific), and I was afraid it was going to be very demanding on our system (after all, gerrit is also a Java program). So I decided to put it on hold and take a look at using LDAP...

Oh, man. Where do I start? Actually, I think it's enough to say that I tried installing OpenLDAP, but gave up because it was too cumbersome to configure. Have you ever heard that LDAP is really complicated? I'm afraid it's true. I just didn't feel like wasting a lot of time trying to understand how it works, only to have to solve the "user registration" problem later (because, of course, OpenLDAP is just an LDAP server).

OK, so what now? Back to Keycloak it is. I decided that instead of thinking that it was too big, I should actually install it and check it for real. Best decision, by the way!

Setting up Keycloak

It's pretty easy to set Keycloak up. The official website provides a .tar.gz file which contains the whole directory tree for the project, along with helper scripts, .jar files, configuration, etc. From there, you just need to follow the documentation, edit the configuration, and voilà.

For our specific setup I chose to use PostgreSQL instead of the built-in database. This is a bit more complicated to configure, because you need to download the JDBC driver and install it in a strange way (at least for me, who is used to just editing a configuration file). I won't go into details on how to do this here, because it's easy to find on the internet. Bear in mind, though, that the official documentation is really incomplete when covering this topic! This is one of the guides I used, along with this other one (which covers MariaDB, but can be adapted to PostgreSQL as well).
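
The gist of what those guides describe is registering the driver as an application server module with jboss-cli and then pointing Keycloak's datasource at it. A rough sketch, with paths and version numbers that are only illustrative:

cd /opt/keycloak
./bin/jboss-cli.sh --command="module add --name=org.postgresql \
    --resources=/tmp/postgresql-42.2.8.jar \
    --dependencies=javax.api,javax.transaction.api"

After that, the KeycloakDS datasource definition in standalone.xml has to be adjusted to use the new driver.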

Another interesting thing to note is that Keycloak expects to be running on its own virtual domain, and not under a subdirectory (e.g., https://example.org instead of https://example.org/keycloak). For that reason, I chose to run our instance on another port. It is supposedly possible to configure Keycloak to run under a subdirectory, but it involves editing a lot of files, and I confess I couldn't make it fully work.
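
Since Keycloak runs on top of JBoss/WildFly, one easy way to do that is to start it with a socket binding port offset, which shifts all the default ports by a fixed amount (the offset below is just an example; it moves HTTP from 8080 to 8180):

./bin/standalone.sh -Djboss.socket.binding.port-offset=100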

One last thing worth mentioning: the official documentation says that Keycloak needs Java 8 to run, but I've been using OpenJDK 11 without problems so far.

Setting up Gerrit

The fun begins now!

The gerrit project also offers a .war file ready to be deployed. After you download it, you can execute it to initialize a gerrit site (or application, as it's called). Gerrit will create a directory full of interesting stuff; the most important for us is the etc/ subdirectory, which contains all of the configuration files for the application.
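
For reference, initializing a new site looks roughly like this (the .war version and the target directory are illustrative):

java -jar gerrit-3.0.3.war init -d /home/gerrit/review_site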

After initializing everything, you can try starting gerrit to see if it works. This is where I ran into my first problem. Gerrit also requires Java 8, but unlike Keycloak, it doesn't work out of the box with OpenJDK 11. I had to make a small but important addition to the file etc/gerrit.config:

[container]
    ...
    javaOptions = "--add-opens=jdk.management/com.sun.management.internal=ALL-UNNAMED"
    ...

After that, I was able to start gerrit. Then I started trying to set it up for OAuth2 authentication using Keycloak. This took a very long time, unfortunately. I was having several problems with gerrit, and I wasn't sure how to solve them. I tried asking for help on the official mailing list, and was able to make some progress, but in the end I figured out what was missing: I had forgotten to add AllowEncodedSlashes On in the Apache configuration file! This was causing a very strange error on gerrit's side (a java.lang.StringIndexOutOfBoundsException, of all things!), which didn't make any sense. In the end, my Apache config file looks like this:

<VirtualHost *:80>
    ServerName gnutoolchain-gerrit.osci.io

    RedirectPermanent / https://gnutoolchain-gerrit.osci.io/r/
</VirtualHost>

<VirtualHost *:443>
    ServerName gnutoolchain-gerrit.osci.io

    RedirectPermanent / /r/

    SSLEngine On
    SSLCertificateFile /path/to/cert.pem
    SSLCertificateKeyFile /path/to/privkey.pem
    SSLCertificateChainFile /path/to/chain.pem

    # Good practices for SSL
    # taken from: <https://mozilla.github.io/server-side-tls/ssl-config-generator/>

    # intermediate configuration, tweak to your needs
    SSLProtocol             all -SSLv3
    SSLCipherSuite          ECDHE-ECDSA-CHACHA20-POLY1305:ECDHE-RSA-CHACHA20-POLY1305:ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-GCM-SHA256:ECDHE-ECDSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-GCM-SHA384:DHE-RSA-AES128-GCM-SHA256:DHE-RSA-AES256-GCM-SHA384:ECDHE-ECDSA-AES128-SHA256:ECDHE-RSA-AES128-SHA256:ECDHE-ECDSA-AES128-SHA:ECDHE-RSA-AES256-SHA384:ECDHE-RSA-AES128-SHA:ECDHE-ECDSA-AES256-SHA384:ECDHE-ECDSA-AES256-SHA:ECDHE-RSA-AES256-SHA:DHE-RSA-AES128-SHA256:DHE-RSA-AES128-SHA:DHE-RSA-AES256-SHA256:DHE-RSA-AES256-SHA:ECDHE-ECDSA-DES-CBC3-SHA:ECDHE-RSA-DES-CBC3-SHA:EDH-RSA-DES-CBC3-SHA:AES128-GCM-SHA256:AES256-GCM-SHA384:AES128-SHA256:AES256-SHA256:AES128-SHA:AES256-SHA:DES-CBC3-SHA:!DSS
    SSLHonorCipherOrder     on
    SSLCompression          off
    SSLSessionTickets       off

    # OCSP Stapling, only in httpd 2.3.3 and later
    #SSLUseStapling          on
    #SSLStaplingResponderTimeout 5
    #SSLStaplingReturnResponderErrors off
    #SSLStaplingCache        shmcb:/var/run/ocsp(128000)

    # HSTS (mod_headers is required) (15768000 seconds = 6 months)
    Header always set Strict-Transport-Security "max-age=15768000"

    ProxyRequests Off
    ProxyVia Off
    ProxyPreserveHost On
    <Proxy *>
        Require all granted
    </Proxy>

    AllowEncodedSlashes On
    ProxyPass /r/ http://127.0.0.1:8081/ nocanon
    #ProxyPassReverse /r/ http://127.0.0.1:8081/r/
</VirtualHost>

I confess I was almost giving up on Keycloak when I finally found the problem...

Anyway, after that things went more smoothly. I was finally able to make the user authentication work, then I made sure Keycloak's user registration feature also worked OK...
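
For reference, the authentication bits of my etc/gerrit.config ended up looking roughly like the sketch below. I am assuming the gerrit-oauth-provider plugin here, and all the values are placeholders:

[auth]
    type = OAUTH
    gitBasicAuthPolicy = HTTP

[plugin "gerrit-oauth-provider-keycloak-oauth"]
    root-url = https://keycloak-url:port
    realm = REALM
    client-id = gerrit
    client-secret = SECRET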

Ah, one interesting thing: logout wasn't really working as expected. The user was able to log out from gerrit, but not from Keycloak, so when the user clicked on "Sign in", Keycloak would tell gerrit that the user was already logged in, and gerrit would automatically log the user in again! I was able to solve this by redirecting the user to Keycloak's logout page, like this:

[auth]
    ...
    logoutUrl = https://keycloak-url:port/auth/realms/REALM/protocol/openid-connect/logout?redirect_uri=https://gerrit-url/
    ...

After that, it was possible to start worrying about configuring gerrit itself. I don't know if I'll write a post about that, but let me know if you'd like me to.

Conclusion

If you ask me whether I'm totally comfortable with the way things are set up now, I can't say that I am 100%. I mean, the setup seems robust enough that it won't cause problems in the long run, but what bothers me is that I'm using technologies that are alien to me. I'm used to setting up things written in Python, C and C++, with very simple yet powerful configuration mechanisms, and an easy way to discover what's wrong when something bad happens.

I am reasonably satisfied with the way Keycloak handles logging, but gerrit leaves a lot to be desired in that area. And both projects are written in languages/frameworks that I am absolutely not comfortable with. It's really tough to debug something when you don't even know where the code is or how to modify it!

All in all, I'm happy that this whole adventure has come to an end, and now all that's left is to maintain it. I hope that the GDB community can make good use of this new service, and I hope that we can see a positive impact in the quality of the whole patch review process.

My final take is that this is all worth it, as long as Free Software and user freedom are the ones who benefit.

P.S.: Before I forget, our gerrit instance is running at https://gnutoolchain-gerrit.osci.io.


Back in 2016, when life was simpler, a Fedora GDB user reported a bug (or a feature request, depending on how you interpret it) saying that GDB's gcore command did not respect the COREFILTER_ELF_HEADERS flag, which instructs it to dump memory pages containing ELF headers. As you may or may not remember, I have already written about the broader topic of revamping GDB's internal corefile dump algorithm; it's an interesting read and I recommend it if you don't know how Linux (or GDB) decides which mappings to dump to a corefile.

Anyway, even though the bug was interesting and had to do with a work I'd done before, I couldn't really work on it at the time, so I decided to put it in the TODO list. Of course, the "TODO list" is actually a crack where most things fall through and are usually never seen again, so I was blissfully ignoring this request because I had other major priorities to deal with. That is, until a seemingly unrelated problem forced me to face this once and for all!

What? A regression? Since when?

As the Fedora GDB maintainer, I routinely prepare new releases for the Fedora Rawhide distribution, and sometimes for the stable versions of the distro as well. And I try to be very careful when dealing with new releases, because a regression introduced now can come back and bite us (i.e., the Red Hat GDB team) many years in the future, when it's sometimes too late or too difficult to fix things. So a mandatory part of every release preparation is to actually run a regression test against the previous release, and make sure that everything is working correctly.

One of these days, some weeks ago, I had finished running the regression check for the release I was preparing when I noticed something strange: a specific, Fedora-only corefile test was FAILing. That's a no-no, so I started investigating and found that the underlying reason was that, when the corefile was being generated, the build-id note from the executable was not being copied over. Fedora GDB carries a local patch whose job is, given a corefile with a build-id note, to locate the corresponding binary that generated it. Without the build-id note, no binary was being located.

Coincidentally or not, at the same time I started noticing some users reporting very similar build-id issues on freenode's #gdb channel, and I thought this bug had the potential to become a big headache for us if nothing was done to fix it right away.

I asked for some help from the team, and we managed to discover that the problem was also happening with upstream gcore, and that it was probably something that binutils was doing, and not GDB. Hmm...

Ah, so it's ld's fault. Or is it?

So there I went, trying to confirm that it was binutils's fault, and not GDB's. Of course, if I could confirm this, then I could also tell the binutils guys to fix it, which meant less work for us :-).

With a lot of help from Keith Seitz, I was able to bisect the problem and found that it started with the following commit:

commit f6aec96dce1ddbd8961a3aa8a2925db2021719bb
Author: H.J. Lu <hjl.tools@gmail.com>
Date:   Tue Feb 27 11:34:20 2018 -0800

    ld: Add --enable-separate-code

This is a commit that touches the linker, which is part of binutils. So that means this is not GDB's problem, right?!? Hmm. No, unfortunately not.

What the commit above does is simply enable the use of -z separate-code (via the new --enable-separate-code configure option) by default when linking an ELF program on x86_64 (more on that later). At first glance, this change should not impact corefile generation, and indeed, if you tell the Linux kernel to generate a corefile (for example, by running sleep 60 and hitting C-\ to send it a SIGQUIT), you will notice that the build-id note is included in it! So GDB was still a suspect here. The investigation needed to continue.
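
If you want to reproduce this at home, a quick recipe (a sketch; the name and location of the generated core file depend on your core_pattern setting) is:

ulimit -c unlimited
sleep 60 &
kill -QUIT %1                 # SIGQUIT's default action is to dump core
eu-unstrip -n --core=core     # lists the build-id notes found in the core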

What's with -z separate-code?

The -z separate-code option makes the linker put the code in a segment completely separate from the data segment in the ELF file. This was done to increase the security of generated binaries. Before it, everything (code and data) was put together in the same memory region. What this means in practice is that, before, you would see something like this when you examined /proc/PID/smaps:

00400000-00401000 r-xp 00000000 fc:01 798593                             /file
Size:                  4 kB
KernelPageSize:        4 kB
MMUPageSize:           4 kB
Rss:                   4 kB
Pss:                   4 kB
Shared_Clean:          0 kB
Shared_Dirty:          0 kB
Private_Clean:         0 kB
Private_Dirty:         4 kB
Referenced:            4 kB
Anonymous:             4 kB
LazyFree:              0 kB
AnonHugePages:         0 kB
ShmemPmdMapped:        0 kB
Shared_Hugetlb:        0 kB
Private_Hugetlb:       0 kB
Swap:                  0 kB
SwapPss:               0 kB
Locked:                0 kB
THPeligible:    0
VmFlags: rd ex mr mw me dw sd

And now, you will see two memory regions instead, like this:

00400000-00401000 r--p 00000000 fc:01 799548                             /file
Size:                  4 kB
KernelPageSize:        4 kB
MMUPageSize:           4 kB
Rss:                   4 kB
Pss:                   4 kB
Shared_Clean:          0 kB
Shared_Dirty:          0 kB
Private_Clean:         4 kB
Private_Dirty:         0 kB
Referenced:            4 kB
Anonymous:             0 kB
LazyFree:              0 kB
AnonHugePages:         0 kB
ShmemPmdMapped:        0 kB
Shared_Hugetlb:        0 kB
Private_Hugetlb:       0 kB
Swap:                  0 kB
SwapPss:               0 kB
Locked:                0 kB
THPeligible:    0
VmFlags: rd mr mw me dw sd
00401000-00402000 r-xp 00001000 fc:01 799548                             /file
Size:                  4 kB
KernelPageSize:        4 kB
MMUPageSize:           4 kB
Rss:                   4 kB
Pss:                   4 kB
Shared_Clean:          0 kB
Shared_Dirty:          0 kB
Private_Clean:         0 kB
Private_Dirty:         4 kB
Referenced:            4 kB
Anonymous:             4 kB
LazyFree:              0 kB
AnonHugePages:         0 kB
ShmemPmdMapped:        0 kB
Shared_Hugetlb:        0 kB
Private_Hugetlb:       0 kB
Swap:                  0 kB
SwapPss:               0 kB
Locked:                0 kB
THPeligible:    0
VmFlags: rd ex mr mw me dw sd

A few minor things have changed, but the most important one is that, before, the whole memory region had anonymous data in it, which means it was considered an anonymous private mapping (anonymous because of the non-zero Anonymous amount of data; private because of the p in the r-xp permission bits). After -z separate-code became the default, the first memory mapping no longer has Anonymous contents, which means it is now considered to be a file-backed private mapping instead.

GDB, corefile, and coredump_filter

It is important to mention that, unlike the Linux kernel, GDB doesn't have all of the necessary information readily available to decide the exact type of a memory mapping, so when I revamped this code back in 2015 I had to create some heuristics to try and determine this information. If you're curious, take a look at the linux-tdep.c file on GDB's source tree, specifically at the functions dump_mapping_p and linux_find_memory_regions_full.

When GDB is deciding which memory regions should be dumped into the corefile, it respects the value found at the /proc/PID/coredump_filter file. The default value for this file is 0x33, which, according to core(5), means:

Dump memory pages that are either anonymous private, anonymous
shared, ELF headers or HugeTLB.
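
You can inspect and change the filter of a running process directly (the PID is illustrative; bit 4, i.e. 0x10, is the one that requests the dumping of mappings containing ELF headers):

cat /proc/1234/coredump_filter
echo 0x33 > /proc/1234/coredump_filter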

GDB had support implemented for dumping almost all of these page types, except for the ELF headers variety. And, as you can probably infer, this means that, before the -z separate-code change, the very first memory mapping of the executable was being dumped, because it was marked as anonymous private. However, after the change, the first mapping (which contains only data, no code) wasn't being dumped anymore, because it was now considered by GDB to be a file-backed private mapping!

Finally, that is the reason for the difference between corefiles generated by GDB and Linux, and also the reason why the build-id note was not being included in the corefile anymore! You see, the first memory mapping contains not only the program's data, but also its ELF headers, which in turn contain the build-id information.

gcore, meet ELF headers

The solution was "simple": I needed to improve the current heuristics and teach GDB how to determine if a mapping contains an ELF header or not. For that, I chose to follow the Linux kernel's algorithm, which basically checks the first 4 bytes of the mapping and compares them against \177ELF, which is ELF's magic number. If the comparison succeeds, then we just assume we're dealing with a mapping that contains an ELF header and dump it.
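
By the way, you can approximate that check from userspace. This sketch reads the first 4 bytes of a mapping through /proc/PID/mem (the PID and start address are illustrative, and you need ptrace permission over the process):

pid=1234; start=0x400000
dd if=/proc/$pid/mem bs=1 skip=$((start)) count=4 2>/dev/null | od -An -c
# for a mapping containing an ELF header, this prints: 177   E   L   F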

In all fairness, Linux just dumps the first page (4K) of the mapping, in order to save space. It would be possible to make GDB do the same, but I chose the faster way and just dumped the whole mapping, which, in most scenarios, shouldn't be a big problem.

It's also interesting to mention that GDB will only perform this check if:

  • The heuristic has decided not to dump the mapping so far, and;
  • The mapping is private, and;
  • The mapping's offset is zero, and;
  • There is a request to dump mappings with ELF headers (i.e., coredump_filter).

Linux also makes these checks, by the way.

The patch, finally

I submitted the patch to the mailing list, and it was approved fairly quickly (with a few minor nits).

The reason I'm writing this blog post is that I'm very happy and proud of the whole process. It wasn't easy to investigate the underlying reason for the build-id failures, and it was interesting to come up with a solution that extended the work I did a few years ago. I was also able to close a few bug reports upstream, as well as the one reported against Fedora GDB.

The patch has been pushed, and is also present at the latest version of Fedora GDB for Rawhide. It wasn't possible to write a self-contained testcase for this problem, so I had to resort to using an external tool (eu-unstrip) in order to guarantee that the build-id note is correctly present in the corefile. But that's a small detail, of course.

Anyway, I hope this was an interesting (albeit long) read!


Heya!

This past Saturday, April 27th, 2019, Samuel Vale, Alex Volkov and I organized the Toronto Bug Squashing Party here in the city. I was very happy with the outcome, especially the fact that we had more than 10 people attending, including a bunch of folks that came from Montréal!

The start

It was a cold day in Toronto, and we met at the Mozilla Toronto office at 9 in the morning. Right there at the door I met anarcat, who had just arrived from Montréal. Together with Alex, we waited for Will to arrive and open the door for us. Then, some more folks started showing up, and we waited until 10:30h to start the first presentation of the day.

Packaging 101

Anarcat kindly gave us his famous "Packaging 101" presentation, in which he explains the basics of Debian packaging. Here's a picture of the presentation:

[Photo: anarcat presenting Packaging 101, seen from the side]

And another one:

[Photo: anarcat presenting Packaging 101, seen from the front]

The presentation was great, and Alex recorded it! You can watch it here (sorry, youtube link...).

During the day, we also taught a few tricks about the BTS, to help people file bugs, add/remove tags, comment on bugs, etc.
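
For instance, with the bts tool from the devscripts package (the bug number below is made up):

bts tags 912345 + patch pending   # add tags to a bug
bts tags 912345 - moreinfo        # remove a tag
bts severity 912345 serious       # change a bug's severity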

Then, we moved on to the actual hacking.

Bug fixing

This part took most of the day, as was expected. We started by looking at the RC bugs currently filed against Buster, and deciding which ones would be interesting for us. I won't go into details here, but I think we made great progress, considering this was the first BSP for many of us there (myself included).

You can look at the bugs we worked on, and you will see that we have actually fixed 6 of them! I even fixed a JavaScript bug, which is something totally out of my area of expertise ;-).

I also noticed something interesting. The way we look at bugs can vary wildly from one DD to another. I mean, this is something I always knew, especially when I was more involved with the debian-mentors effort, but it's really amazing to experience it in person. I tend to be more picky when it comes to deciding what to do when I start working on a bug; I try really hard to reproduce it (and spend a lot of time doing so), and will dive deep into the code trying to understand why some test is failing. Other developers may be less "pedantic", and choose to (e.g.) disable a certain test that is failing. In the end, I think everything is a balance, and I tried to learn from this experience.

Anyway, given that we looked at 12 bugs and solved 6, I think we did great! And this also helped me get my head "back in the Debian game"; I had been too involved with GDB these past months (there's a post coming soon about one of the things I did, stay tuned).

Look at us hacking:

[Photo: everybody hacking]

Wrap up

At 19h (or 7 p.m.), we had to wrap up and prepare to go. Because we had a sizeable number of Brazilians in the group (5!), the logical thing to do was to go to a pub and resume the conversation there :-). If I said it was one of the first times I had gone to a pub to drink with newly made friends in Toronto, you probably wouldn't believe me, so I won't say anything...

I know one thing for sure: we want to do this again, and soon! In fact, my idea is to hold another one after Buster is released (and after the summer is gone, of course), so maybe in October. We'll see.

Acknowledgements

I would like to thank Mozilla Toronto for hosting us; it was awesome to finally visit their office and enjoy their hospitality, personified by Will Hawkins. It is impossible not to thank anarcat, who came all the way from Montréal to give us his Debian Packaging 101 talk. Speaking of French-Canadians (and Brazilians), it was super awesome meeting Tiago Vaz and Tássia Camões, and it was great seeing Valessio Brito again.

Let me also thank the "locals" who attended the party; it was great seeing everybody there! Hope I can see everybody again when we make the second edition of our BSP :-).


Don't come here



If you're Brazilian, don't come here. If you voted for the president-elect, don't come here. If you think it's better to have a dead son than a gay son, don't come here. If you think it's OK to kill first and ask later (or perhaps not even ask), don't come here. If you would like to say the things he said, don't come here. If you think he didn't really mean what he said, don't come here. If you can't understand why the things he said are horrible, don't come here. If you think he is a myth, don't come here. If you think he is a saviour, don't come here.

If you don't understand the absurdity of living in Canada and voting for him, don't come here (and please go away if you do). If you don't understand the hypocrisy of calling a homosexual names while living in a country where homosexual marriage has been legal for more than a decade, don't come here. If you think everybody should have a gun, don't come here.

If you are inhuman, don't come here. If you like authoritarians, don't come here. If you can't see he is an authoritarian, don't come here. If you are absolutely sure of everything and don't question yourself, don't come here. If you use (anti)social networks to inform yourself, don't come here. If you think torture is not a big deal, don't come here. If you praise a torturer, don't come here. If you think closing the Congress is a good idea, don't come here. If you think the economy is what really matters, don't come here. If you can't feel empathy for another human being who is not related to you, don't come here. If you try to rationalize your vote for him, trying to justify to yourself that he is not as bad as he himself says he is, don't come here. If you are not concerned about more deaths, don't come here.

If you think killing more people is going to improve things, don't come here. If you think the police should be allowed to kill without suffering any consequences, don't come here. If you see a dead black person and think "(s)he must have done something wrong, so it's justified that (s)he was killed", don't come here. If you think slavery should return, don't come here. If you think indigenous people are lazy and should be killed or have no rights, don't come here. If you have no regard for our forests and our environment, don't come here. If you think climate change is a hoax, don't come here. If you are just interested in the money, don't come here.

If you blame PT for everything bad that's happened to Brazil, don't come here. If you think people who don't blame PT for everything bad that's happened to Brazil are communists, don't come here. If you think everyone who disagrees with you is a communist, don't come here. If you think Francis Fukuyama is a communist, don't come here. If you think you know what communism is, but only read about it on (anti)social networks, don't come here.

If you are already here, then I have everything against you, and I think you should leave.


Hello, Planet Debian



Hey, there. This is long overdue: my entry into Planet Debian! I'm creating this post because, until now, I didn't have a debian tag on my blog! Well, not anymore.

Stay tuned!


Dreaming...



Back then, I still wanted to write something. Back then, life was different, and I had another view of myself and of others. Back then, my house of cards was still standing, giving the impression that it was safe and sound, that its foundation was solid, and that nothing would shake it. But that was back then.

Right now, I have lost my will and my power to concentrate, to focus on what really matters, because what really matters is still undefined. Right now, things don't seem to fit as they once did; the vision blurs and I am not so sure what it is that I should be doing but am not. Right now, my self has become another one. Someone that doesn't remind me of anybody in particular.

Struggling, defining, living and knowing. These are constant words, constant feelings and actions that live with me. Who am I? What do I like? What don't I like? Am I good at what I thought I was good at? Am I feeling what I think I'm feeling?

This is more than the impostor syndrome. But it is less than the Stockholm syndrome. It's somewhere in between, or maybe nowhere. When I woke up and decided to keep going, I knew it was a temporary decision. It still is. I still have to find what I missed, or what I have never found. What to do? Too hard of a question to answer right now. Here's hoping that time will help me with this hard, but long-wanted task.


After spending the last weeks struggling with this, I decided to write a blog post. First of all, what is "this" that I am talking about? The answer is: the Linux kernel's concept of memory mapping. I found it utterly confusing, beyond my expectations, and so I believe that a blog post is the write way to (a) preserve and (b) share this knowledge. So, let's do it!

First things first

First, I cannot begin this post without a few acknowledgements and "thank you's". The first goes to Oleg Nesterov (sorry, I could not find his website), a Linux kernel guru who really helped me a lot through the whole task. Another "thank you" goes to Jan Kratochvil, who also provided valuable feedback by commenting on my GDB patch. Now, back to the point.

The task

The task was requested here: GDB needed to respect the /proc/<PID>/coredump_filter file when generating a coredump (i.e., when you use the gcore command).
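
In case you have never used it, gcore is run from inside GDB and writes a corefile of the process being debugged. A quick example (the PID is illustrative):

$ gdb -p 1234
(gdb) gcore /tmp/core.1234
Saved corefile /tmp/core.1234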

Currently, GDB has its own coredump mechanism implemented which, despite its limitations and bugs, has been around for quite some time. However, and maybe you don't know this, the Linux kernel also has its own algorithm for generating the corefile of a process. And unfortunately, GDB and Linux were not really following the same standards here...

So, in the end, the task was about synchronizing GDB and Linux. To do that, I first had to decipher the contents of the /proc/<PID>/smaps file.

The /proc/<PID>/smaps file

This special file, generated by the Linux kernel when you read it, contains detailed information about each memory mapping of a certain process. Some of the fields in this file are documented in the proc(5) manpage, but others are missing there (asking for a patch!). Here is an explanation of everything I needed:

  • The first line of each memory mapping has the following format:

    address           perms offset  dev   inode   pathname

    The fields here are:

    a) address is the address range, in the process' address space, that the mapping occupies. This part was already treated by GDB, so I did not have to worry about it.

    b) perms is a set of permissions ((r)ead, (w)rite, e(x)ecute, (s)hared, (p)rivate [COW -- copy-on-write]) applied to the memory mapping. GDB was already dealing with the rwx permissions, but I needed to include the p flag as well. I also made GDB ignore mappings that do not have the r flag active, because it does not make sense to dump something you cannot read.

    c) offset is the offset into the file, if the mapping is file-backed (see below). GDB already handled this correctly.

    d) dev is the device (major:minor) related to the file, if there is one. GDB already handled this correctly, though I ended up using this field for more things (continue reading).

    e) inode is the inode on the device above. The value of zero means that no inode is associated with the memory mapping. Nothing to do here.

    f) pathname is the file associated with this mapping, if there is one. This is one of the most important fields I had to use, and one of the most complicated to understand completely. GDB now uses it to heuristically identify whether the mapping is anonymous or not.

  • GDB is now also interested in the Anonymous: and AnonHugePages: fields from the smaps file. Those fields represent the amount of anonymous content in the mapping; if GDB finds that this content is greater than zero, it means that the mapping is anonymous.

  • The last, but perhaps most important, field is VmFlags:. It contains a series of two-letter flags that provide very useful information about the mapping. The flags I needed are:

    a) sh: the mapping is shared (VM_SHARED)

    b) dd: the mapping should not be dumped in a corefile (VM_DONTDUMP)

    c) ht: this is a HugeTLB mapping (VM_HUGETLB)

With that in hand, the next task was to be able to determine whether a memory mapping is anonymous or file-backed, private or shared.
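
Before getting to the rules, it helps to look at the raw inputs. A one-liner like this (run here against your own shell) extracts, for each mapping, the header line plus the fields the heuristics rely on:

grep -E '^([0-9a-f]+-[0-9a-f]+ |Anonymous:|AnonHugePages:|VmFlags:)' /proc/self/smaps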

Types of memory mappings

There can be four types of memory mappings:

  1. Anonymous private mapping
  2. Anonymous shared mapping
  3. File-backed private mapping
  4. File-backed shared mapping

It should be possible to uniquely identify each mapping based on the information provided by the smaps file; however, you will see that this is not always the case. Below, I will explain how to determine each of the four characteristics that define a mapping.

Anonymous

A mapping is anonymous if one of these conditions apply:

  1. The pathname associated with it is either /dev/zero (deleted), /SYSV%08x (deleted), or <filename> (deleted) (see below).
  2. There is content in the Anonymous: or in the AnonHugePages: fields of the mapping in the smaps file.

A special explanation is needed for the <filename> (deleted) case. It is not always guaranteed that it identifies an anonymous mapping; in fact, it is possible to have the (deleted) part for file-backed mappings as well (say, when you are running a program that uses shared libraries, and those shared libraries have been removed because of an update, for example). However, we are trying to mimic the behavior of the Linux kernel here, which checks to see if a file has no hard links associated with it (and therefore is truly deleted).

Although it may be possible for userspace to do an extensive check (by stat-ing the file, for example), the Linux kernel could certainly give more information about this.

File-backed

A mapping is file-backed (i.e., not anonymous) if:

  1. The pathname associated with it contains a <filename>, without the (deleted) part.

As has been explained above, a mapping whose pathname contains the (deleted) string could still be file-backed, but we decide to consider it anonymous.

It is also worth mentioning that a mapping can be simultaneously anonymous and file-backed: this happens when the mapping contains a valid pathname (without the (deleted) part), but also contains Anonymous: or AnonHugePages: contents.

Private

A mapping is considered to be private (i.e., not shared) if:

  1. In the absence of the VmFlags field (in the smaps file), its permission field has the flag p.
  2. If the VmFlags field is present, then the mapping is private if we do not find the sh flag there.

Shared

A mapping is shared (i.e., not private) if:

  1. In the absence of VmFlags in the smaps file, the permission field of the mapping does not have the p flag. Not having this flag actually means VM_MAYSHARE and not necessarily VM_SHARED (which is what we want), but it is the best approximation we have.
  2. If the VmFlags field is present, then the mapping is shared if we find the sh flag there.
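
Putting the four definitions together, here is a rough awk rendition of the classification. It is a simplified sketch: it assumes the VmFlags field is present and ignores the "(deleted)" special case discussed above:

awk '
/^[0-9a-f]+-[0-9a-f]+ / { range = $1; path = $6; anon = 0 }
/^(Anonymous|AnonHugePages):/ { anon += $2 }
/^VmFlags:/ {
    shared = / sh( |$)/ ? "shared" : "private"
    backed = (anon > 0 || path == "") ? "anonymous" : "file-backed"
    printf "%-30s %-7s %-11s %s\n", range, shared, backed, path
}' /proc/self/smaps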

The patch

With all that in mind, I hacked GDB to improve the coredump mechanism for GNU/Linux operating systems. The main function which decides the memory mappings that will or will not be dumped on GNU/Linux is linux_find_memory_regions_full; the Linux kernel obviously uses its own function, vma_dump_size, to do the same thing.

Linux has one advantage: it is a kernel, and therefore has much more knowledge about processes' internals than a userspace program. For example, inside Linux it is trivial to check if a file marked as "(deleted)" in the output of the smaps file has no hard links associated with it (and therefore is not really deleted); the same operation on userspace, however, would require root access to inspect the contents of the /proc/<PID>/map_files/ directory.
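
For illustration, this is roughly what that privileged check looks like from a shell. Each entry in map_files is a symlink named after a mapping's address range (the PID and range below are made up):

sudo stat -L -c '%h hard link(s)' /proc/1234/map_files/400000-401000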

The case described above, if you remember, is something that impacts the ability to tell whether a mapping is anonymous or not. I am talking to the Linux kernel guys to see if it is possible to export this information directly via the smaps file, instead of having to do the current heuristic.

While doing this work, some strange behaviors were found in the Linux kernel. Oleg is working on them, along with other Linux hackers. From our side, there is still room for improvement on this code. The first thing I can think of is to improve the heuristics for finding anonymous mappings. Another relatively easy thing to do would be to let the user specify a value for coredump_filter on the command line, without editing the /proc file. And of course, keep this code always updated with its counterpart in the Linux kernel.

Upstream discussions and commit

If you are interested, you can see the discussions that happened upstream by going to this link. This is the fourth (and final) submission of the patch; you should be able to find the other submissions in the archive.

The final commit can be found in the official repository.


The GNU Radical



A friend of mine, Blaise, once told me not to introduce myself as "... what you would call a radical...". He had listened to me talking to a person who was questioning what a Free Software activist does. My friend's rationale, with which I totally agree, is that you must let the other person decide whether she thinks you are a "radical" or not. In other words, if you say you are a "radical" from the beginning, it will probably induce the other person to pre-judge you, which is not good for you or for her.

As I said, I agree with him. But I am going through a lot of situations in my life that are constantly reminding me that, maybe, I am that “radical” after all. I do not know whether this is good or bad, and I can say I have been questioning myself for a while now. This post, by the way, is going to be a lot about self-questioning.

Maybe the problem is that I am expecting too much from those that have the same beliefs that I do. Or maybe the cause is that I do not know what to expect from them in certain situations, and I am disappointed when I see that they do not follow what I think is best sometimes. On the other hand, when I look myself in the mirror, I do not know whether I am totally following what I think is best; and if I am not, then how can I even consider telling others to do that? And even if I am following my own advices, how can I be sure that they are good enough for others?

One good example of this is my opinion about FSF's use of Twitter. The opinion is public, and has been criticized by many people already, including Free Software supporters. Shortly after I wrote the post, I mentioned it to Richard Stallman, and he told me he was not going to read it because he considered it “too emotional”. I felt deeply sad because of his reaction, especially because it came from someone who often appeals to emotions in order to teach what he has to say. But I also started questioning myself about the topic.

Is it really bad to use Twitter? This is what I ask myself sometimes. I see so many people using it, including those who defend Free Software as I do (like Matt Lee), or those who stand against privacy abuses (like Jacob Appelbaum), or who are worried about social causes, or... Yeah, you got the point. I refuse to believe that they did not think about Twitter's issues, or about how they would be endorsing its use by using it themselves. Yet, they are there, and a lot of people are following their posts and discussing their opinions and ideas for a better world. As much as I try to understand their motivation for using Twitter (or even Facebook), I cannot convince myself that what they are doing is good for their goals. Am I being too narrow-minded? Am I missing something?

Another example is my thoughts about Free Software programs that support (and sometimes even promote) unethical services. They (the thoughts) are also public. And it seems that this opinion, which is about something I called "Respectful Software", is too strong (or "radical"?) for the majority of developers, even Free Software developers. I have seen very good arguments for why Free Software should support unethical services, and it is hard to disagree with them. I guess the best of those arguments is that when you support unethical services like Facebook, you are offering a Free Software option for those who want or need to use the service. In other words, you are helping them to slowly get rid of their digital handcuffs.

It seems like all those arguments (about Twitter, about implementing support for proprietary systems on Free Software, and others) are ultimately about reaching users that would otherwise remain ignorant of the Free Software philosophy. And how can someone have counter-arguments for this? It is impossible to argue that we do not need to take the Free Software message to everybody, because when someone does not use Free Software, she is doing harm to her community (thus, we want more people using Free Software, of course). When the Free Software Foundation makes use of Twitter to bring more people to the movement, and when I see that despite talking to people all around me I can hardly convince them to try GNU/Linux, who am I to criticize the FSF?

So, I have been thinking to myself whether it is time to change. What I am realizing more and more is that my fight for coherence perhaps is flawed. We are incoherent by nature. And the truth is that, no matter what we do, people change according to their own time, their own will, and their own beliefs (or to the lack of them). I remembered something that I once heard: changing is not binary, changing is a process. So, after all, maybe it is time to stop being a “GNU radical” (in the sense that I am radical even for the GNU project), and become a new type of activist.


To what extent should Free Software respect its users?

The question, strange as it may sound, is not only valid but also becoming more and more important these days. If you think that the four freedoms are enough to guarantee that Free Software will respect the user, you are probably oversimplifying. The four freedoms are essential, but they are not sufficient. You need more. I need more. And this is why I think the Free Software movement should have been called the Respectful Software movement.

I know I will probably hear that I am too radical. And I know I will hear it even from those who defend Free Software the way I do. But I need to express this feeling I have, even though I may be wrong about it.

It all began as an innocent comment. I make lots of presentations and talks about Free Software, and, knowing that the word “Free” is ambiguous in English, I started joking that Richard Stallman should have named the movement “Respectful Software”, instead of “Free Software”. If you think about it just a little, you will see that “respect” is a word that brings different interpretations to different people, just as “free” does. It is a subjective word. However, at least it does not have the problem of referring to completely unrelated things such as “price” and “freedom”. Respect is respect, and everybody knows it. What can change (and often does) is what a person considers respectful or not.

(I am obviously not considering the possible ambiguity that may exist in another language with the word “respect”.)

So, back to the software world. I want you to imagine a Free Software. For example, let's consider one that is used to connect to so-called “social networks” like GNU Social or pump.io. I do not want to use a specific example here; I am more interested in the consequences of a certain decision. Which decision? Keep reading :-).

Now, let's imagine that this Free Software is just beginning its life, probably in some code repository under the control of its developer(s), but most likely using some proprietary service like GitHub (which is an issue by itself). And probably the developer is thinking: “Which social network should my software support first?”. This is an extremely valid and important question, but sometimes the developer comes up with an answer that may not be satisfactory to its users. This is where the “respect” comes into play.

In our case, this bad answer would be “Facebook”, “Twitter”, “Linkedin”, or any other unethical social network. However, those are exactly the easiest answers for many and many Free Software developers, either because those “vampiric” services are popular among users, or because the developer him/herself uses them!! By now, you should be able to see where I am getting at. My point, in a simple question, is: “How far should we, Free Software developers, allow users to go and harm themselves and the community?”. Yes, this is not just a matter of self-inflicted restrictions, as when the user chooses to use a non-free software to edit a text file, for example. It is, in most cases, a matter of harming the community too. (I have written a post related to this issue a while ago, called “Privacy as a Collective Good”.)

It should be easy to see that it does not matter whether I am using Facebook through a shiny Free Software application on my computer or cellphone. What really matters is that, when doing so, I am basically supporting the use of those unethical social networks, to the point that perhaps some of my friends will also use them because of me. What does it matter if they use Free Software to access them or not? Is the benefit offered by the Free Software big enough to eliminate (or even soften) the problems that come with using an unethical service like Linkedin?

I wonder, though, what is the limit that we should obey. Where should we draw the line and say “I will not pass beyond this point”? Should we just “abandon” the users of those unethical services and social networks, while we lock ourselves in our not-very-safe world? After all, we need to communicate with them in order to bring them to our cause, but it is hard doing so without getting our hands dirty. But that is a discussion to another post, I believe.

Meanwhile, I could give plenty of examples of existing Free Software that is doing a disservice to the community by allowing (and even promoting) unethical services or solutions for its users. Such software disrespects its users, sometimes exploiting the fact that many of them are not fully aware of the privacy issues that come as a "gift" when they use those services, without spending any kind of effort to teach them. However, I do not want this post to become a flamewar, so I will not mention any software explicitly. I think it should be quite easy for the reader to find examples out there.

Perhaps this post does not have a conclusion. I myself have not made my mind completely about the subject, though I am obviously leaning towards what most people would call the “radical” solution. But it is definitely not an easy topic to discuss, or to argument about. Nonetheless, we are closing our eyes to it, and we should not do so. The future of Free Software depends also on what kinds of services we promote, and what kinds of services we actually warn the users against. This is my definition of respect, and this is why I think we should develop Free and Respectful Software.


Yes, you read it correctly: I decided to buy a freaking Chromebook. I really needed a lightweight notebook to carry with me for my daily hacking while waiting at the subway station, and this one seemed to be the best option available when comparing models and prices. To be fair, and before you start throwing rocks at me, I had been visiting the LibreBoot X60's website for some time, because I was strongly considering buying one (even considering its weight); however, they did not have it in stock, and I did not want to wait anymore, so...

Anyway, as one might expect, configuring GNU/Linux on notebooks is becoming harder as time goes by, either because of the infamous Secure Boot (anti-)feature, or because they come with more and more devices that demand proprietary crap to be loaded. But fortunately, it is still possible to overcome most of those problems and get a GNU/Linux distro running.

References

For main reference, I used the following websites:

I also used other references for small problems that I had during the configuration, and I will list them when needed.

Backing up ChromeOS

The first thing you will probably want to do is to make a recovery image of the ChromeOS that comes pre-installed on the machine, in case things go wrong. Unfortunately, to do that you need a Google account, otherwise the system will fail to record the image. So, if you want to let Google know that you bought a Chromebook, log into the system, open Chrome, and go to the special URL chrome://imageburner. You will need a 4 GiB pendrive/sdcard. It should be pretty straightforward to do the recording from there.

Screw the screw

Now comes the hard part. This notebook comes with a write-protect screw. You might be thinking: what is the purpose of this screw?

Well, the thing is: Chromebooks come with their own boot scheme, which unfortunately cannot boot Linux. However, newer models also offer a "legacy boot" option (SeaBIOS), and that can boot Linux. So far, so good, but...

When you switch to SeaBIOS (details below), the system will complain that it cannot find ChromeOS, and will ask if you want to reinstall the system. This will happen every time you boot the machine, because the system is still entering the default BIOS. In order to activate SeaBIOS, you have to press CTRL-L (Control + L) every time you boot! And this is where the screw comes into play.

If you remove the write-protect screw, you will be able to make the system use SeaBIOS by default, and therefore will not need to worry about pressing CTRL-L every time. Sounds good? Maybe not so much...

The first thing to consider is that you will void your warranty the moment you open the notebook case. As I was not very concerned about that, I decided to try to remove the screw, and guess what happened? I stripped the screw! I am still not sure why that happened, because I was using the correct screwdriver for the job, but when I tried to remove the screw, it behaved like butter and started to "decompose"!

Anyway, after spending many hours trying to figure out a way to remove the screw, I gave up. My intention is to always suspend the system, so I rarely need to press CTRL-L anyway...

Well, that's all I have to say about this screwed screw. If you decide to try removing it, keep in mind that I cannot help you in any way, and that you are entirely responsible for what happens.

Now, let's install the system :-).

Enable Developer Mode

You need to enable the Developer Mode in order to be able to enable SeaBIOS. To do that, follow these steps from the Arch[GNU/]Linux wiki page.

I don't remember whether this step works if you haven't activated ChromeOS (i.e., if you don't have a Google account associated with the device). In my case, I just created a fake account to be able to proceed.

Accessing the superuser shell inside ChromeOS

Now, you will need to access the superuser (root) shell inside ChromeOS, to enable SeaBIOS. Follow the steps described in the Arch[GNU/]Linux wiki page. For this specific step, you don't need to login, which is good.

Enabling SeaBIOS

We're almost there! The last step before you boot your Fedora LiveUSB is to actually enable SeaBIOS. Just go inside your superuser shell (from the previous step) and type:

> crossystem dev_boot_usb=1 dev_boot_legacy=1

And that's it!

If you managed to successfully remove the write-protect screw, you may also want to enable booting into SeaBIOS by default. To do that, there is a guide, again on the Arch[GNU/]Linux wiki. DO NOT DO THAT IF YOU DID NOT REMOVE THE WRITE-PROTECT SCREW!!!!

Booting Fedora

Now we should finally be able to boot Fedora! Remember: if you have not removed the write-protect screw, you will have to press CTRL-L after you reboot, otherwise the system will just complain and not boot into SeaBIOS. So, press CTRL-L, choose the boot order (you will probably want to boot from USB first, if your Fedora is on a USB stick), choose to boot the live Fedora image, and... boom!! You will probably see a message complaining that there was not enough memory to boot (the message is "Not enough memory to load specified image").

You can solve that by passing the mem parameter to Linux. So, when GRUB complains that it was unable to load the specified image, it will give you a command prompt (boot:), and you just need to type:

boot: linux mem=1980M

And that's it, things should work.

Installing the system

I won't guide you through the installation process; I just want to remind you that you have a 32 GiB SSD, so think carefully about how you want to set up the partitions. What I did was reserve 1 GB for swap and give all the rest to the root partition (i.e., I did not create a separate /home partition).

You will also notice that the touchpad does not work (neither does the touchscreen). So you will have to do the installation using a USB mouse for now.

Getting the touchpad to work

I strongly recommend you to read this Fedora bug, which is mostly about the touchpad/touchscreen support, but also covers other interesting topics as well.

Anyway, the bug is still being constantly updated, because the proposed patches to make the touchpad/touchscreen work were not fully integrated into Linux yet. So, depending on the version of Linux that you are running, you will probably need to run a different version of the scripts that are being kindly provided in the bug.

As of this writing, I am running Linux 3.16.2-201.fc20, and the script that does the job for me is this one. If you are like me, you will never run a script without looking at what it does, so go there and do it, I will wait :-).

OK, now that you are confident, run the script (as root, of course), and confirm that it actually installs the necessary drivers to make the devices work. In my case, I only got the touchpad working, even though the touchscreen is also covered by this script. However, since I don't want the touchscreen, I did not investigate this further.

After the installation, reboot your system and at least your touchpad should be working :-). Or kind of...

What happened to me was that I was getting strange behaviors with the touchpad. Sometimes (randomly), its sensitivity became weird, and it was very hard to move the pointer or to click on things. Fortunately, I found the solution in the same bug, in this comment by Yannick Defais. After creating this X11 configuration file, everything worked fine.

Getting suspend to work

Now comes the hard part. My next challenge was to get suspend to work, because (as I said above) I don't want to poweroff/poweron every time.

My first obvious attempt was to try to suspend using the current configuration that came with Fedora. The notebook actually suspended, but then it resumed 1 second later, and the system froze (i.e., I had to force the shutdown by holding the power button for a few seconds). Hmm, it smelled like this would take some effort, and my nose was right.

After a lot of search (and asking in the bug), I found out about a few Linux flags that I could provide in boot time. To save you time, this is what I have now in my /etc/default/grub file:

GRUB_CMDLINE_LINUX="tpm_tis.force=1 tpm_tis.interrupts=0 ..."

The final ... means that you should keep whatever was there before you included those parameters, of course. Also, after you edit this file, you need to regenerate the GRUB configuration file on /boot. Run the following command as root:

> grub2-mkconfig -o /boot/grub2/grub.cfg

Then, after I rebooted the system, I found that adding those flags alone was still not enough. I saw a bunch of errors on dmesg, which showed me that there was some problem with EHCI and xHCI. After a bit more research, I found this comment on an Arch[GNU/]Linux forum. Just follow the steps there (i.e., create the necessary files, especially /usr/lib/systemd/system-sleep/cros-sound-suspend.sh), and things should start to get better. But not yet...

Now you will see that suspend/resume works OK, but when you suspend, the system will still resume after 1 second or so. Basically, this happens because the system uses the touchpad and the touchscreen as wakeup sources to decide whether it should resume from suspend. So what you have to do is disable those sources of events:

echo TPAD > /proc/acpi/wakeup
echo TSCR > /proc/acpi/wakeup

And voilà! Now everything should work as expected :-). You might want to issue those commands every time you boot the system, in order to get suspend to work every time, of course. To do that, you can create a /etc/rc.d/rc.local, which gets executed when the system starts:

> cat /etc/rc.d/rc.local
#!/bin/bash

suspend_tricks()
{
  echo TPAD > /proc/acpi/wakeup
  echo TSCR > /proc/acpi/wakeup
}

suspend_tricks

exit 0

Don't forget to make this file executable:

> chmod +x /etc/rc.d/rc.local

Conclusion

Overall, I am happy with the machine. I still haven't tried installing Linux-libre on it, so I am not sure whether it can work without binary blobs and other proprietary crap.

I find the keyboard comfortable and the touchpad OK. The only other issue I had was with the Canadian/French/whatever keyboard that comes with it, because it lacks some keys that are useful to me, like Page Up/Down, Insert, and a few others. So far, I am working around this issue with xbindkeys and xvkbd.
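
In case it helps someone, this is the kind of thing I mean: a sketch of an ~/.xbindkeysrc (the key combinations here are my own choice, adapt to taste):

# Make Super+Up / Super+Down act as Page Up / Page Down, using xvkbd
# to synthesize the key events.
"xvkbd -xsendevent -text '\[Prior]'"
  Mod4 + Up
"xvkbd -xsendevent -text '\[Next]'"
  Mod4 + Down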

I do not recommend this machine if you are not tech-savvy enough to follow the steps listed in this post. If that is the case, then consider buying a machine that easily runs GNU/Linux, because you will feel much more comfortable configuring it!