Who’s NGI: Thorsten Leemhuis with Linux-Kernel regression tracking bot

The ‘Linux-Kernel regression tracking bot’ project develops software to automate regression tracking for the Linux-kernel with one goal: to help ensure that future versions of this crucial building block of modern IT work as well as today’s. Thorsten Leemhuis explains more.

What is this project about?

The Linux kernel is a free and open-source operating system (OS) kernel that manages computer hardware and software resources and provides common services for computer programs. Like every OS, it receives upgrades, and you are dealing with a ‘regression’ if something that worked with an older version of the Linux kernel does not work with a newer one.

Thorsten Leemhuis
Thorsten Leemhuis with Tux the penguin, mascot of Linux

The project creates software and procedures to track regressions in the Linux-kernel to ensure that any rule requiring them to be fixed is applied thoroughly. In doing so, the project hopes to improve trust in the Linux-kernel development model and to motivate users, admins, and vendors to regularly update to newer Linux versions, as they bring security fixes, improved hardening techniques, and support for new internet standards. Getting these improvements out into the wild is important to make the internet more secure, as the Linux-kernel is the central software in most devices that make up the internet or connect to it.

Sadly, quite a lot of those devices currently run outdated Linux versions with known security vulnerabilities. That’s mainly because people are afraid of updating to newer Linux versions, as they fear something might break or might work worse than before. The kernel developers are well aware of this fear and have been fighting it for a long time: they have a pretty strict “no regressions” rule, which makes them remove a change once they are aware it’s causing a regression, even if that change contains important improvements.

But there is a catch: despite this rule, quite a few regressions remain unresolved, as some simply fall through the cracks. That happens because Linux-kernel development uses a slightly unusual development model: everything is done via email and with the help of more than a hundred mailing lists. Even regressions need to be reported by mail, as a central issue tracker does not really exist (there is one, but most developers ignore it and cannot be forced to use it). Because of these factors, regression reports from users and automated test systems often don’t reach the right people, don’t get acted upon appropriately, or sometimes simply get forgotten.

The project tries to solve this situation by creating software that will track reported regressions and compile an overview webpage and a weekly summary. This will ensure that reports are not forgotten and give everyone insight into the state of things. The latter is something the Linux creator and lead developer Linus Torvalds has wanted for ages, as it makes it easy for him to step in when the responsible developers don’t adhere to the “no regressions” rule appropriately.

This software is called “regzbot” (short for “regression [tracking] robot”) and is specifically tailored to the needs of Linux-kernel development. That’s crucial for the success of this effort, as history has shown that many core Linux-kernel contributors avoid solutions that create overhead or distractions, even if those solutions work great for many other projects.

That’s why regzbot, in the ideal case, doesn’t create any overhead for the developers. Reporters, on the other hand, get an additional burden, but it’s made easy to fulfill: to make regzbot track a report, they just need to add a tag like `#regzbot introduced 1234567890ab` or `#regzbot introduced v5.13..v5.14`. Regzbot will then monitor the mailing list thread with the report for further activity and notice if other mails or commit messages refer to it, which allows regzbot to automatically mark the issue as resolved when the fix lands, as fixes are supposed to refer to the report anyway.
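To make the tagging scheme above more concrete, here is a minimal Python sketch of how a tool could pick such a tag out of a mail body. It is not regzbot’s actual code: the names `Introduced` and `parse_introduced`, the regular expression, and the 7 to 40 hex digit rule for commit ids are all assumptions made for illustration. Only the two tag forms quoted above are taken from the interview.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Assumed tag shape: "#regzbot introduced <spec>" alone on a line.
# This pattern is an illustration, not regzbot's real grammar.
_TAG_RE = re.compile(r"^#regzbot\s+introduced\s+(?P<spec>\S+)\s*$", re.MULTILINE)

# Assumption: a commit id is 7 to 40 lowercase hex digits.
_COMMIT_RE = re.compile(r"[0-9a-f]{7,40}")


@dataclass
class Introduced:
    """Hypothetical record of where a regression was introduced."""
    commit: Optional[str] = None       # single culprit commit, if known
    range_start: Optional[str] = None  # last known good version
    range_end: Optional[str] = None    # first known bad version


def parse_introduced(mail_body: str) -> Optional[Introduced]:
    """Return the first '#regzbot introduced' tag found, or None."""
    match = _TAG_RE.search(mail_body)
    if match is None:
        return None
    spec = match.group("spec")
    if ".." in spec:
        # Range form such as "v5.13..v5.14".
        start, end = spec.split("..", 1)
        if not start or not end:
            return None
        return Introduced(range_start=start, range_end=end)
    if _COMMIT_RE.fullmatch(spec):
        # Single-commit form such as "1234567890ab".
        return Introduced(commit=spec)
    return None


if __name__ == "__main__":
    print(parse_introduced("Hi,\n\n#regzbot introduced v5.13..v5.14\n"))
    print(parse_introduced("#regzbot introduced 1234567890ab"))
```

A real tracker would additionally have to follow the mail thread and scan commit messages for links back to the report, which this sketch leaves out.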

What led you to improving Linux, and where did your passion come from?

In my early days on my parents’ dairy farm I learned to do things properly, to not work on something over and over again or reinvent the wheel multiple times. I was also a member of local youth and sports communities, where I noticed how much work is involved in running them and keeping them alive and enjoyable. Over time, I started to help with the legwork and noticed I enjoyed doing it for free, as long as others benefited from it, even if they occasionally forgot to say “thanks”: doing it “for the greater good” was enough of a reward for me. And since getting my first computer at the age of about eleven, I have always been interested in how both hardware and software work.

Linux brings all of this together: as an operating system kernel, it’s the software that sits mostly invisibly in the middle between the software that fills your screen and the hardware it runs on. Contributing to open-source software like Linux also means that others will benefit from my work. Open source also makes people and companies work together, which allows developers to invent new things instead of reinventing the wheel over and over again.

How did you come up with the idea for your project?

In the 30-year history of the Linux-kernel, two people temporarily took care of regression tracking manually and compiled weekly reports. I’m the second one and revived the effort in 2017 after a multi-year hiatus. But just like my predecessor, I had to give up at some point. That mainly happened because it’s pretty boring, cumbersome, thankless and laborious work, and a task that software is able to do if you write something that suits the exact needs and provide some hand-holding.

That became clear to me after doing manual regression tracking in my spare time for a while. But spare time was scarce due to a demanding day job and other real-life duties. These didn’t leave enough time to write the software I had in mind, and nobody else wrote it either, even though many Linux-kernel developers and outsiders agreed that regression tracking is important and that having software to do most of the hard work would be great.

NGI Pointer allowed me to finally get back to this area of work and write the software to automate things. I’m deeply grateful for this opportunity and quite sure that many Linux-kernel developers applaud this effort.

Will you be taking the idea further now that the support from NGI is over? 

The plan is to make regzbot stable, reliable, and capable enough so that it can continue to do its work without much effort by users, developers, or me.

The bot and the people who use it will nevertheless sometimes need some hand-holding. I plan to provide that after the project’s end, even if I get a job again that has nothing to do with Linux-kernel development. But I really hope my work on this project gets noticed and valued enough by Linux developers and their employers that someone might hire me to continue in this space: there simply is so much more that can and should be done to make it more attractive for users, admins and vendors to regularly switch to fresher Linux-kernel versions. A lot is happening there already with constant testing by CI systems. But compared to other open-source projects, there is nearly nothing ongoing to build and maintain a community of users who help with regularly testing the Linux-kernel. Such a community is needed due to the multitude of hardware and configurations used in the wild, and it is also required to ensure that new versions work as well as today’s and are risk-free to update to.

Project’s website: https://linux-regtracking.leemhuis.info/

Thorsten on Twitter: https://twitter.com/kernellogger