Notes from OSS/ELC Europe 2020

The OSS/ELC Europe 2020 conference took place online from 26th to 28th October. There was one BoF session and one talk about KernelCI, followed by an impromptu video call. The notes below were gathered based on these events.

BoF: Lessons Learned

Guillaume Tucker, Collabora

A lot has happened since KernelCI was announced as a new Linux Foundation project at ELC-E 2019 in Lyon. One year on, what have we learnt?

See the full Event description for slides and more details. Below is a list of Q&A gathered from the session.

Q: I wonder if you plan to add any subsystem-specific CI? Are there any plans/ideas? e.g. for scsi drivers

There are already subsystem-specific tests being run, and subsystem branches can be monitored. Then results can be sent to subsystem mailing lists. For example, this is the case with v4l2: kernelci.org runs v4l2-compliance on a number of platforms for several branches including the media tree, mainline, stable and linux-next, and sends reports with regressions.

There should not be any subsystem-specific infrastructure needed on kernelci.org, but rather different tests and maybe different parameters to adjust to the workflows according to maintainers’ needs.

Q: Some time ago there was a way to search for test runs in a specific lab. I mean on the dashboard. But it seems this feature is gone now. Was that intended? Is it coming back? Can we help and contribute here? 🙂

The web frontend was scaled down to accommodate functional testing rather than boot testing. This was because all the boot testing search pages were tailor-made, which doesn’t scale very well and is very hard to maintain.

We’re now looking into a fresh web dashboard design with flexible search features to be able to do that. As a first step, we are collecting user stories. If you have any, such as “I want to find out all the test results for the devices in my lab”, feel free to reply to this thread:
https://groups.io/g/kernelci/topic/rfc_dashboards/77367531
“RFC: dashboards, visualization and analytics for test results”

Q: What is the relationship between KernelCI project and LAVA project? Does KernelCI have non-upstream changes to LAVA? Do LAVA people participate in KernelCI?

LAVA is used in many test labs that provide results to KernelCI, but KernelCI doesn’t run any labs itself. Some people do contribute to both, as KernelCI is one of the biggest public use-cases of LAVA, but they really are independent projects. The core KernelCI tools are designed to facilitate working with LAVA labs, but this is not a requirement and other test lab frameworks are also used.

Q: Is there any documentation on how to write those “custom” tests and to integrate it with KernelCI? (e.g. the SCSI drivers/storage devices you just mention before)

See Khouloud Touil’s talk Let’s Test with KernelCI for some hands-on examples.

There is also the user guide as part of the KernelCI documentation:
https://github.com/kernelci/kernelci-core/blob/master/doc/kci_testsuite.md

Each test is a bit different as they all have their own dependencies and are written in various languages. Typically, they will require a user-space image with all the required packages installed to be able to run as well as the latest versions of some test suites built from source. This is the case with v4l-utils, igt-gpu-tools or LTP. Some are plain scripts and don’t depend on anything in particular, such as bootrr.

When prototyping some new tests to run in LAVA, the easiest approach is to use nfsroot with the plain Debian Buster image provided by KernelCI and install extra packages at runtime, before starting the tests. Then when this is working well, dependencies and any data files can be baked into a fixed rootfs image for performance and reproducibility.

Q: How to properly deal with boards which are able to boot only from a mass-storage device and prevent them from being stuck with a non-working image?

In order to be useful with KernelCI, it’s required to at least be able to dynamically load the kernel image as well as any modules and device tree with a ramdisk for the tests that fit in a small enough image. If this can’t be done, then the kernel and user-space images need to be written to the persistent storage before each job. It might also be possible to load the kernel over TFTP and then extract the image onto the persistent storage and use it as a chroot. Ultimately this is the lab’s responsibility and it will depend on many things. If the kernel and the user-space can’t be changed at all, or if there is a possibility of bricking the device, then it’s basically not practical to do any CI on such a platform.

Let’s Test with KernelCI

Khouloud Touil, Baylibre

A growing number of Linux developers want to use KernelCI to run their test suites, but there’s a bit of a learning curve for how to make test suites work with KernelCI. “Let’s Test with KernelCI” will give an overview of the ways to integrate test suites and/or test results into the KernelCI modular pipeline.

See the full Event description for more details. Below is a list of Q&A gathered from the session.

Q: Is there also support for custom YP/OE distros or is it currently limited to the usage of predefined kernels and file systems?

The kernels are all built with regular “make”; no packaging or Yocto recipe is supported right now. However, that could be done with a bit of plumbing. As for user-space, KernelCI only really tests the kernel: the Buildroot and Debian images are only there to be able to run kernel tests. If you create your own KernelCI instance, you can run tests with your own user-space built using Yocto and extend testing to cover some user-space if you want.

Q: Is there some kind of test config to require a certain kernel flag active? I am basically thinking about running some pre-defined test base, based on my own kernel config and then report back the results with something like “ran test X, which requires kernel config flag Y, on architecture/platform Z on kernel version V”.

Yes, there are a couple of ways to adjust the kernel config on kernelci.org. One way is with a special syntax like defconfig+CONFIG_SOMETHING=y. Another way is to define a config fragment. Each KernelCI test result will have the information you mentioned as meta-data.
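
As a rough illustration of how the two approaches relate, the sketch below splits a string in the defconfig+CONFIG_SOMETHING=y form into its base configuration and extra options, then renders those options as the text of a config fragment. The function names are made up for this example and this is not the parser used by the KernelCI tools; it only handles the CONFIG_NAME=value form shown above.

```python
def split_config_spec(spec):
    """Split 'defconfig+CONFIG_FOO=y+CONFIG_BAR=n' into (base, options)."""
    base, *extras = spec.split("+")
    options = {}
    for extra in extras:
        name, sep, value = extra.partition("=")
        # Assumption: only CONFIG_NAME=value items are accepted in this sketch.
        if not sep or not name.startswith("CONFIG_"):
            raise ValueError(f"not a CONFIG_NAME=value item: {extra!r}")
        options[name] = value
    return base, options


def to_fragment(options):
    """Render the options as the text of a kernel config fragment."""
    return "".join(f"{name}={value}\n" for name, value in options.items())


base, options = split_config_spec("defconfig+CONFIG_SOMETHING=y")
print(base)                  # defconfig
print(to_fragment(options))  # CONFIG_SOMETHING=y
```

Keeping the options in a dictionary also makes it easy to record them alongside a test result, so that a report can state which config flags a given test ran with.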

Q: Which firewall streams must be permitted in order for KernelCI to use a custom Lab? I mean if we want to contribute a lab (with associated boards) to KernelCI.org?

LAVA exposes a REST API over HTTPS. It’s also possible to host the LAVA server publicly and use LAVA dispatchers in a private network, which connect to the server as clients with no incoming connections.

When not using LAVA, you can also periodically poll storage.kernelci.org for new kernel builds, download and test them, then send the results to api.kernelci.org. In this case, no incoming connections are required either.
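
The polling approach could be sketched roughly as follows. The two callables are hypothetical and left to the lab: list_builds would return the identifiers of the builds currently published, and handle_build would download one build, test it and submit the results. No real KernelCI API is used or implied here; the sketch only shows the "remember what was already seen" logic, using outgoing requests only.

```python
import time


def poll_new_builds(list_builds, handle_build, interval=300, max_polls=None,
                    sleep=time.sleep):
    """Call handle_build() once for every build not seen before.

    list_builds and handle_build are hypothetical callables supplied by
    the lab. max_polls=None means poll forever.
    """
    seen = set()
    polls = 0
    while max_polls is None or polls < max_polls:
        for build in list_builds():
            if build not in seen:
                seen.add(build)
                handle_build(build)
        polls += 1
        # Only wait if another poll is going to happen.
        if max_polls is None or polls < max_polls:
            sleep(interval)
    return seen


# Demo with a fake listing that gains one build on the second poll.
listings = iter([["next-20200923"], ["next-20200923", "next-20200924"]])
handled = []
poll_new_builds(lambda: next(listings), handled.append,
                max_polls=2, sleep=lambda seconds: None)
print(handled)  # ['next-20200923', 'next-20200924']
```

A real lab would also want to persist the set of seen builds across restarts, which this sketch does not do.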

Q: In real life how are tests that need to check hardware I/O done? For example in your audio playback case it’s probably not enough to run the play command but we want to check that something was actually played e.g. by capturing the output.

For audio (and video), some hardware has loopback devices which can be used to compare the captured signal against the expected output. For more advanced setups, labs can have external capture equipment as well. But this ends up being lab-specific since there are many ways to do it.
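
To give an idea of what such a comparison might look like, here is a minimal sketch that scores a captured signal against the expected one using cosine similarity. It assumes both signals are already available as equally long, time-aligned lists of samples, which a real loopback capture would not guarantee; the tone generator, sample rate and 0.9 threshold are arbitrary choices for the example.

```python
import math


def similarity(expected, captured):
    """Cosine similarity between two equally long sample sequences."""
    if len(expected) != len(captured):
        raise ValueError("sequences must have the same length")
    dot = sum(e * c for e, c in zip(expected, captured))
    norm = (math.sqrt(sum(e * e for e in expected))
            * math.sqrt(sum(c * c for c in captured)))
    # Silence on either side gives a zero norm: treat it as no match.
    return dot / norm if norm else 0.0


def tone(freq_hz, rate_hz=8000, n_samples=800, gain=1.0):
    """Generate a sine tone as a list of float samples."""
    return [gain * math.sin(2 * math.pi * freq_hz * n / rate_hz)
            for n in range(n_samples)]


played = tone(440)
print(similarity(played, tone(440, gain=0.5)) > 0.9)  # True: quieter copy of the same tone
print(similarity(played, [0.0] * 800) > 0.9)          # False: only silence was captured
```

Because the score is normalised, a change in volume between playback and capture does not affect the result, whereas silence or a different tone does.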

Follow-up impromptu video discussion

As we neared the end of the time slot for the “Let’s Test with KernelCI” talk, we decided to start a public video call with anyone who was interested and attended the talk. We discussed various general things about the project, and a few notes and Q&A were captured:

Q: How can a test lab get added to kernelci.org?

This is something that would require better documentation. We can distinguish 3 different “levels” of integration for labs:

  1. LAVA-style: fully integrated into the pipeline
    If you have a LAVA lab, it’s the easiest way to contribute test results to KernelCI. It also enables automated bisection and is the most efficient way of getting tests run.
  2. Asynchronous test lab
    If you have a test lab with no way to receive requests to run tests, you can look for kernelci.org builds to appear and submit results with kci_data. A typical example is Labgrid. One way to improve this is to implement some notification protocol so these labs could avoid polling and get requests to run tests like the LAVA labs.
  3. Autonomous CI system: KCIDB
    With options 1 and 2, tests use kernel builds from kernelci.org and report results to the same backend. These are called the “native” KernelCI tests. Option 3 is for full CI systems creating their own kernel builds and running their own set of tests. The results are sent to the common reporting database using the KCIDB tools.

Q: Where can we find the source and definition of tests visible on kernelci.org frontend?

This is also something that would require better documentation, with a directory of all the test plans and how they are created. Functional tests are fairly recent on kernelci.org, which is why we don’t have that yet.

All the tests are normally defined in the kernelci-core repository. This includes building some test suites from source and including them in user-space rootfs images, and defining how to run the tests.

User story: Checking results for devices in “my” lab across all the branches and revisions.

KernelCI Notes from Plumbers 2020

The Linux Plumbers Conference 2020 was held as a virtual event this year. The online platform provided a really good experience, with talks and live discussions using Big Blue Button for the video and Rocket Chat for text-based discussions. KernelCI was mentioned many times in several micro-conferences, with two talks in Testing & Fuzzing which are now available on YouTube.

The notes below were gathered publicly from a number of attendees and give a good insight into what was discussed. In short, while there is still a lot to be done, the KernelCI project is healthy and growing well in its role of a central CI system for the upstream Linux kernel.

Real-Time Linux

We’ve been making great progress with running LAVA jobs using the test-definitions repository from Linaro, thanks to Daniel Wagner’s help in particular. This was prompted by the discussions in the real-time micro-conference.

The next step from a KernelCI infrastructure point of view is to be able to detect performance regressions, as these are different to binary pass/fail results. KernelCI can already handle measurements, but cannot yet compare them to detect regressions. Real-time getting merged upstream means it is becoming increasingly important to be able to support this.
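
One simple way to compare measurements, shown here purely as an illustration and not as anything KernelCI implements, is to flag a new value when it is worse than the mean of earlier results by more than a few standard deviations. The function name, the threshold of three standard deviations and the sample latencies below are all assumptions made for the example.

```python
from statistics import mean, stdev


def is_regression(baseline, value, sigmas=3.0, higher_is_worse=True):
    """Return True if value is worse than the baseline mean by more than
    `sigmas` sample standard deviations."""
    if len(baseline) < 2:
        raise ValueError("need at least two baseline measurements")
    centre = mean(baseline)
    spread = stdev(baseline)
    # Positive delta means "worse" in whichever direction applies.
    delta = value - centre if higher_is_worse else centre - value
    return delta > sigmas * spread


# Hypothetical worst-case latencies in microseconds from earlier revisions.
baseline = [41.0, 39.0, 40.0, 42.0, 38.0]
print(is_regression(baseline, 43.0))  # False
print(is_regression(baseline, 60.0))  # True
```

Unlike a pass/fail result, the outcome depends on the history of the measurement, so the baseline needs to be kept per test, per platform and per branch for the comparison to be meaningful.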

There was also an interesting talk about determining the scheduler latency when using PREEMPT_RT and the introduction of a new tool “rtsl” to trace real-time latency. This might be an interesting area to investigate and potentially run automated tests with.

Static Analysis

The topic of static analysis and CI systems came up during the Kernel Dependability MC. In particular, participants were looking for a place to do “common reporting” in order to collect results for the various types of static analysis and checkers available. We pointed them to the KernelCI common reporting talks/BoFs.

Some static analysis can also be done by KernelCI “native” tests using the kernelci.org Cloud infrastructure via Kubernetes, which is currently only used to build kernels. This is probably where KUnit and devicetree validation will be run, but the rest still needs to be defined.

KCIDB

Fuego

Tim Bird, the main developer of Fuego at SONY, joined the KCIDB BoF and we had a good discussion. Unfortunately, he did not have enough time to get through to an actual submission. We got about a quarter of the way through converting his mock data to KCIDB.

Gentoo Kernel CI

Alice Ferrazzi, maintainer of GKernelCI at Gentoo, had more time available for the KCIDB BoF and we talked through getting the data out of her system. A mockup of her data was made and successfully submitted to the KCIDB playground database setup.

Intel

Tim Orling, Yocto project architect at Intel, expressed keen interest in KCIDB. He said he would experiment at home and push Intel internally to participate.

LLVM/Clang

The recently added support for “LLVM=1” upstream means we can now have better support for making Clang builds. In particular, this means we’re now using all the LLVM binaries and not just clang. It also solved the issue with merge_config.sh and the default CC=gcc in the top-level Makefile.

This was enabled in kernelci.org shortly after LPC.

kselftest

The first kselftest results were produced on staging.kernelci.org during Plumbers as a collective effort. We have now started enabling them in production, so stay tuned as they should soon start appearing on kernelci.org.

Initial set of results: https://kernelci.org/test/job/next/branch/master/kernel/next-20200923/plan/kselftest/

AutoFDO

AutoFDO will hopefully get merged upstream. Once it is, it might be useful for CI systems to share profiling data, from benchmarking runs in particular.

Building randconfig

The TuxML project carries out some research around Linux kernel builds: determining the build time, what can be optimised, which configurations are not valid… The project could benefit from the kernelci.org Cloud infrastructure to extend its build capacity while also providing more build coverage to kernelci.org. This could be done by detecting kernel configurations that don’t build or lead to problems that can’t be found with the regular defconfigs.

Using tuxmake

The goal of tuxmake is to provide a way to reproduce Linux kernel builds in a controlled environment. This is used primarily by LKFT, but it should be generic enough to cover any use-case related to building kernels. KernelCI uses its kci_build tool to generate kernel configurations and produce kernel builds with some associated meta-data. It could reuse tuxmake to avoid some duplication of effort and only implement the KernelCI-specific aspects.

Introducing Common Reporting

During the past year we’ve been working on a way to bring reports from all the separate kernel testing systems together. Our aim is to send a single report email to kernel developers when testing results are ready, and to provide a single go-to place to view them.

KernelCI Community Survey Report

We are thrilled to share with you the results of our first KernelCI Community Survey. It has been a very interesting experience, with just under 100 responses from people who all provided quality feedback. We are really thankful for every single one of them. It was also a great way to engage more widely with the community. The full results are available for everyone to see in a shared spreadsheet. Individual comments are not shared publicly although they are very valuable and will be taken into account.
