Notes from OSS/ELC Europe 2020 - KernelCI Foundation

The OSS/ELC Europe 2020 conference took place online from 26th to 28th October. There was one BoF session and one talk about KernelCI, followed by an impromptu video call. The notes below were gathered based on these events.

BoF: Lessons Learned

Guillaume Tucker, Collabora

A lot has happened since KernelCI was announced as a new Linux Foundation project at ELC-E 2019 in Lyon. One year on, what have we learnt?

See the full Event description for slides and more details. Below are a list of Q&A gathered from the session.

Q: I wonder if you plan to add any subsystem-specific CI? Are there any plans/ideas? e.g. for scsi drivers

There are already subsystem-specific tests being run, and subsystem branches can be monitored. Then results can be sent to subsystem mailing lists. For example, this is the case with v4l2: kernelci.org runs v4l2-compliance on a number of platforms for several branches including the media tree, mainline, stable and linux-next, and sends reports with regressions.

There should not be any subsystem-specific infrastructure needed on kernelci.org, but rather different tests and maybe different parameters to adjust to the workflows according to maintainers’ needs.

Q: Some time ago there was a way to search for test runs in a specific lab. I mean on the dashboard. But it seems this feature is gone now. Was that intended? Is it coming back? Can we help and contribute here? 🙂

The web frontend was scaled down to accommodate for functional testing rather than boot testing. This was because all the boot testing search pages were tailor-made, which doesn’t scale very well and is very hard to maintain.

We’re now looking into a fresh web dashboard design with flexible search features to be able to do that. As a first step, we are collecting user stories. If you have any, such as “I want to find out all the test results for the devices in my lab”, feel free to reply to this thread:
https://groups.io/g/kernelci/topic/rfc_dashboards/77367531
“RFC: dashboards, visualization and analytics for test results”

Q: What is the relationship between KernelCI project and LAVA project? Does KernelCI have non-upstream changes to LAVA? Do LAVA people participate in KernelCI?

LAVA is used in many test labs that provide results to KernelCI, but KernelCI doesn’t run any labs itself. Some people do contribute to both, as KernelCI is one of the biggest public use-cases of LAVA, but they really are independent projects. The core KernelCI tools are designed to facilitate working with LAVA labs, but this is not a requirement and other test lab frameworks are also used.

Q: Is there any documentation on how to write those “custom” tests and to integrate it with KernelCI? (e.g. the SCSI drivers/storage devices you just mention before)

See Khouloud Touils’ talk Let’s Test with KernelCI with some hands-on examples.

There is also the user guide as part of the KernelCI documentation:
https://github.com/kernelci/kernelci-core/blob/master/doc/kci_testsuite.md

Each test is a bit different as they all have their own dependencies and are written in various languages. Typically, they will require a user-space image with all the required packages installed to be able to run as well as the latest versions of some test suites built from source. This is the case with v4l-utils, igt-gpu-tools or LTP. Some are plain scripts and don’t depend on anything in particular, such as bootrr.

When prototyping some new tests to run in LAVA, the easiest approach is to use nfsroot with the plain Debian Buster image provided by KernelCI and install extra packages at runtime, before starting the tests. Then when this is working well, dependencies and any data files can be baked into a fixed rootfs image for performance and reproducibility.

Q: How to properly deal with boards which are able to boot only from a mass-storage device and prevent them from being stuck with a non-working image?

In order to be useful with KernelCI, it’s required to at least be able to dynamically load the kernel image as well as any modules and device tree with a ramdisk for the tests that fit in a small enough image. If this can’t be done, then the kernel and user-space images need to be written to the persistent storage before each job. It might also be possible to load the kernel over TFTP and then extract the image onto the persistent storage and use it as a chroot. Ultimately this is the lab’s responsibility and it will depend on many things. If the kernel and the user-space can’t be changed at all, or if there is a possibility of bricking the device, then it’s basically not practical to do any CI on such a platform.

Let’s Test with KernelCI

Khouloud Touil, Baylibre

A growing number of Linux developers want to use KernelCI to run their test suites, but there’s a bit of a learning curve for how to make test suites work with KernelCI. “Let’s Test with KernelCI” will give an overview of the ways to integrate test suites and/or test results into the KernelCI modular pipeline.

See the full Event description for more details. Below are a list of Q&A gathered from the session.

Q: Is there also support for custom YP/OE distros or is it currently limited to the usage of predefined kernels and file systems?

The kernels are all built with regular “make”, not any packaging or yocto recipe is supported right now. However, that could be done with a bit of plumbing. Then for user-space, kernelci only really tests the kernel: the buildroot and debian images are only there to be able to run kernel tests. If you create your own KernelCI instance, you can run tests with your own user-space built using Yocto and extend testing to cover some user-space if you want.

Q: Is there some kind of test config to require a certain kernel flag active? I am basically thinking about running some pre-defined test base, based on my own kernel config and then report back the results with something like “ran test X, which requires kernel config flag Y, on architecture/platform Z on kernel version V”.

Yes, there are a couple of ways to adjust the kernel config on kernelci.org. One way is with a special syntax like defconfig+CONFIG_SOMETHING=y. Another way is to define a config fragment. Each KernelCI test result will have the information you mentioned as meta-data.

Q: Which firewall streams must be permitted in order for KernelCI to use a custom Lab? I mean if we want to contribute a lab (with associated boards) to KernelCI.org?

LAVA exposes a REST API over HTTPS. It’s also possible to have the LAVA server hosted publicly and using LAVA dispatchers in a private network which will be connecting to the server as clients, with no incoming connections.

When not using LAVA, you can also periodically poll storage.kernelci.org for new kernel builds to appear, and download them to test them then send results to api.kernelci.org. In this case, no incoming connections are required either.

Q: In real life how are tests that need to check hardware I/O done? For example in your audio playback case it’s probably not enough to run the play command but we want to check that something was actually played e.g. by capturing the output.

For audio (and video), some hardware has loopback devices which can be used to compare against expected output. For more advanced setup, labs can have external capture equipment as well. But this ends up to be lab-specific since there are many ways to do it.

Follow-up impromptu video discussion

As we neared the end of the time slot for the “Let’s Test with KernelCI” talk, we decided to start a public video call with anyone who was interested and attended the talk. We discussed various general things about the project, and a few notes and Q&A were captured:

Q: How can a test lab get added to kernelci.org?

This is something that would require better documentation. We can distinguish 3 different “levels” of integration for labs:

LAVA-style: fully integrated into the pipeline
If you have a LAVA lab, it’s the easiest way to contribute test results to KernelCI. It also enables automated bisection and is the most efficient way of getting tests run.
Asynchronous test lab
If you have a test lab with no way to receive requests to run tests, you can look for kernelci.org builds to appear and submit results with kci_data. A typical example is Labgrid. One way to improve this is to implement some notification protocol so these labs could avoid polling and get requests to run tests like the LAVA labs.
Autonomous CI system: KCIDB
With options 1 and 2, tests use kernel builds from kernelci.org and report results to the same backend. This is called the “native” KernelCI tests. Option 3 is for full CI systems creating their own kernel builds and running their own set of tests. The results are sent to the common reporting database using the KCIDB tools.

Q: Where can we find the source and definition of tests visible on kernelci.org frontend?

This is also something that would require better documentation, with a directory of all the test plans and how they are created. Functional tests are fairly recent on kernelci.org, which is why we don’t have that yet.

All the tests are normally defined in the kernelci-core repository. This includes building some test suites from source and including them in user-space rootfs images, and defining how to run the tests.

User story: Checking results for devices in “my” lab across all the branches and revisions.