Introducing Common Reporting

By August 21, 2020August 24th, 2020Blog

During the past year we’ve been working on a way to bring reports from all the separate kernel testing systems together. Our aim is to send a single report email to kernel developers when testing results are ready, and to provide a single go-to place to view them.

For that purpose we needed a unified schema and a submission protocol which all the testing systems could use to submit their reports, including the existing kernelci.org service itself. Below is an introduction to the third version of the schema and the protocol, based on our close cooperation with Red Hat’s CKI Project, and discussions with maintainers of other testing systems, such as 0-Day and syzbot. It is currently used to aggregate results from both kernelci.org and CKI, present them on the prototype dashboard, and send proof-of-concept notifications to a staging maillist. The development is ongoing in a GitHub repository, and we would like to invite requirements from more testing systems, so we can accommodate their data too.

The Schema

We chose JSON as our data format, and created a single schema to hold any type of the accepted report data, in any combination. The only required data item is the schema version, so the minimum (no-effect) submission would be:

{
    "version": {
        "major": 3,
        "minor": 0
    }
}

Otherwise the report can include “revisions”, “builds”, and “tests” objects, in any amount or combination, stored in arrays under correspondingly-named keys:

Each of the objects has only the minimal number of structural fields required:

{
    "revisions": [
        {
            "id": "84780c5438efd96cfd27fc0d7722aee3b3fe44e6",
            "origin": "kernelci"
        }
    ],
    "builds": [
        {
            "id": "kernelci:staging.kernelci.org:5ee01d0d9625b0a76b8c6d9a",
            "origin": "kernelci",
            "revision_id": "84780c5438efd96cfd27fc0d7722aee3b3fe44e6"
        },
        {
            "id": "kernelci:staging.kernelci.org:5ee01f18d90ac282ea8c6d6a",
            "origin": "kernelci",
            "revision_id": "84780c5438efd96cfd27fc0d7722aee3b3fe44e6"
        }
    ],
    "tests": [
        {
            "id": "kernelci:staging.kernelci.org:5ee01da0f17c85a0878c6d6c",
            "origin": "kernelci",
            "build_id": "kernelci:staging.kernelci.org:5ee01d0d9625b0a76b8c6d9a"
        },
        {
            "id": "kernelci:staging.kernelci.org:5ee01da0f17c85a0878c6d6e",
            "origin": "kernelci",
            "build_id": "kernelci:staging.kernelci.org:5ee01d0d9625b0a76b8c6d9a"
        }
    ],
    "version": {
        "major": 3,
        "minor": 0
    }
}

The minimal number of required fields makes it easier for the submitters to start sending their reports, since they don’t need to extract much data from their systems, and lets them increase their involvement and contribution gradually.

The object “id” fields (highlighted above) are used to link builds to revisions and tests to builds, and are generated by the submitter, who is responsible for ensuring their IDs are unique across the objects they submit. Database-wide uniqueness is ensured by prepending the submitter’s name (“origin”) to the IDs, with the exception of revisions, which IDs are used to correlate results for the same revisions across submitters.

Linked objects don’t have to appear in the same submissions and can be sent separately. This allows for gradual reporting of long-running tests. E.g. first revisions, when they’re discovered and building is started, then builds as they finish, and then tests as they complete. However, the order could be reversed as well. The data is only considered for reporting when it makes sense and is sufficient.

Finally, the “origin” field allows identifying the testing system which submitted the report to help with debugging, and reporting testing bugs.

Revision

A “revision” object represents a version of the kernel code and describes a way it could be obtained. A commit in a git repo:

could be detected by Red Hat’s CKI (referred to in “origin”) and reported as such:

{
    "revisions": [
        {
            "id": "11a48a5a18c63fd7621bb050228cebf13566e4d8",
            "origin": "redhat",
            "git_repository_url": "https://git.kernel.org/pub/scm/linux/.../rdma.git",
            "git_repository_branch": "for-rc",
            "git_commit_hash": "11a48a5a18c63fd7621bb050228cebf13566e4d8",
            "discovery_time": "2020-02-18T15:50:56.526000+00:00",            
            "valid": true
        }
    ],
    "version": {
        "major": 3,
        "minor": 0
    }
}

Commit revisions are identified by their commit hash. This one describes a commit in the “for-rc” branch of the “rdma” tree. The “discovery_time” field contains the time at which the submitter noticed it.

A patchset:

Could be reported like this:

{
    "revisions": [
        {
            "id": "391e437eedc0dab0a9f2c26997e68e040ae04ea3+88b9f263362...e871d8195d777c",
            "origin": "0day",
            "discovery_time": "2020-07-08T07:57:24+03:00",
            "git_repository_url": "https://git.kernel.org/pub/.../atorgue/stm32.git",
            "git_commit_hash": "391e437eedc0dab0a9f2c26997e68e040ae04ea3",
            "git_repository_branch": "stm32-next",
            "patch_mboxes": [
                {
                    "name": "0001-irqchip-stm32-exti-map-direct-event-to-irq-parent.patch",
                    "url": "https://github.com/0day-ci/linux/commit/3f47dd3217f24e.patch"
                }
            ],
            "message_id": "20200706081106.25125-1-alexandre.torgue@st.com",
            "contacts": [
                "Alexandre Torgue <alexandre.torgue@st.com>",
                "kbuild-all <kbuild-all@lists.01.org>",
                "Marc Zyngier <maz@kernel.org>",
                "Thomas Gleixner <tglx@linutronix.de>",
                "Jason Cooper <jason@lakedaemon.net>",
                "LKML <linux-kernel@vger.kernel.org>"
            ],
            "valid": true
        }
    ],
    "version": {
        "major": 3,
        "minor": 0
    }
}

This one describes a single-patch patchset on top of a commit in the “stm32-next” branch of the “stm32” tree. This is a theoretical submission corresponding to a report from Intel’s 0-day system. The “patch_mboxes” field contains links to patchset’s patches in the application order. The “message_id” field is the Message-ID header of the maillist message representing the patchset (e.g. the cover message, or the message with the last patch). The “contacts” array lists email addresses of contacts concerned with the revision, e.g. all the recipients from the patchset messages.

Finally, the “valid” field could be “false” in case the patches didn’t apply, which could be reported as a reply to the message in “message_id” with recipients from “contacts”, for example.

Patchset revisions are identified by a string containing the base commit hash, a “+” character, and a sha256 hash over sha256 hashes of patches in order, e.g. produced by “sha256sum *.patch | cut -c-64 | sha256sum | cut -c-64”.

Builds

A “build” object represents an (attempted) build of the code described by the linked “revision” object. As mentioned above, only the absolute minimum structural fields are required for all objects, but here’s a (somewhat abbreviated) example of a successful build submitted by KernelCI:

{
    "builds": [
        {
            "id": "kernelci:staging.kernelci.org:5ef1b0a2cf22a40215533477",
            "origin": "kernelci",
            "revision_id": "dd0d718152e4c65b173070d48ea9dfc06894c3e5",
            "architecture": "x86_64",
            "compiler": "gcc (Debian 8.3.0-6) 8.3.0",
            "config_name": "x86_64_defconfig",
            "config_url": "https://storage.staging.kernelci.org/.../gcc-8/kernel.config",
            "description": "v5.8-rc2-23-gdd0d718152e4",
            "duration": 822.14,
            "log_url": "https://storage.staging.kernelci.org/.../gcc-8/build.log",
            "output_files": [
                {
                    "name": "System.map",
                    "url": "https://storage.staging.kernelci.org/.../gcc-8/System.map"
                },
                {
                    "name": "kernel_image",
                    "url": "https://storage.staging.kernelci.org/.../gcc-8/bzImage"
                },
                {
                    "name": "modules",
                    "url": "https://storage.staging.kernelci.org/.../gcc-8/modules.tar.xz"
                }
            ],
            "start_time": "2020-06-23T07:34:58.704000+00:00",
            "valid": true
        },
    ],
    "version": {
        "major": 3,
        "minor": 0
    }
}

This build report has the architecture, compiler, and the kernel configuration specified (both the name and the file), it also says when it was started and how long it took to complete, has the output log attached, and finally has several output files including the kernel image, the modules and the “System.map” file.

Here’s how a failed build could be reported by CKI:

{
    "builds": [
        {
            "id": "redhat:721904",
            "origin": "redhat",
            "revision_id": "e9842f9e58e1597ad62a7c899e7460bb861d9485",
            "architecture": "aarch64",
            "command": "make -j30 INSTALL_MOD_STRIP=1 targz-pkg",
            "compiler": "aarch64-linux-gnu-gcc (GCC) 9.2.1 20190827 (Red Hat Cross 9.2.1-1)",
            "config_name": "fedora",
            "log_url": "https://cki-artifacts.s3.amazonaws.com/.../497202/build_aarch64.log",
            "start_time": "2020-03-20T10:51:39.425000+00:00",
            "valid": false
        }
    ],  
    "version": {
        "major": 3,
        "minor": 0
    }
}

This has less data compared to the kernelci.org one, in part because it failed (so has no output), and in part because CKI records less build information. However, it specifies the build command, and of course marks the build as failed with “valid” being set to “false”.

Tests

A “test” object describes a test executed in a certain environment on the “build” it links to. Here’s an example of a successful test run submitted by Red Hat’s CKI:

{
    "tests": [
        {
            "id": "redhat:113555631",
            "origin": "redhat",
            "build_id": "redhat:942598",
            "path": "ltp",
            "description": "LTP",
            "start_time": "2020-07-23T04:06:52+00:00",
            "duration": 6827.0,
            "output_files": [
                {
                    "name": "8598136_x86_64_4_console.log",
                    "url": "https://s3.us-east-2.amazonaws.com/.../8598136_x86_64_4_console.log"
                },
                {
                    "name": "8598136_x86_64_4_resultoutputfile.log",
                    "url": "https://s3.us-east-2.amazonaws.com/...x86_64_4_resultoutputfile.log"
                },
                {
                    "name": "8598136_x86_64_4_dmesg.log",
                    "url": "https://s3.us-east-2.amazonaws.com/.../8598136_x86_64_4_dmesg.log"
                }
            ],
            "status": "PASS",
            "waived": false
        }
    ],
    "version": {
        "major": 3,
        "minor": 0
    }
}

This is an execution of the LTP test suite, and the report includes the time and the duration of the test, as well as links to output files. It also includes the resulting test status as “PASS”. There are four more statuses accepted: “FAIL”, “ERROR”, “DONE”, and “SKIP”. The report also includes the “waived” flag set to “false”, which says the report should not be ignored when summarizing the testing results for the build. Setting “waived” to “true” is useful when you’re just introducing the test, or fixing an issue with it, and cannot trust it, but want it to still produce results, so that you could see how it’s doing and accumulate statistics.

Finally, the report includes the “path” field, which is used to identify the particular test or test suite, and optionally parts of it. We support both submitters who can only report results for a whole test suite, and those who can report each of its parts. The whole LTP test suite could be reported as “ltp”, and its “signal06” test case could be reported as “ltp.signal06”. We maintain a catalogue of agreed-upon top-level names. A submitter is not required to use those, or use any at all. However, using agreed-upon names is beneficial to both kernel developers and submitters, as it allows correlating results of the same test coming from different CI systems.

We are still working on how to best describe “environments” the tests are executed in, so that test results could be correlated across similar hardware, and this report is not including any environment information. However, below is a failed test report from kernelci.org which provides a text summary of the environment:

{
    "tests": [
        {
            "id": "kernelci:staging.kernelci.org:5f169b9d55fe5e789f6383d7",
            "origin": "kernelci",
            "build_id": "kernelci:staging.kernelci.org:5f169ac2bb300e54bb638380",
            "environment": {
                "description": "mt8173-elm-hana in lab-collabora"
            },
            "output_files": [
                {
                    "name": "txt",
                    "url": "https://storage.staging.kernelci.org/...mt8173-elm-hana.txt"
                },
                {
                    "name": "html",
                    "url": "https://storage.staging.kernelci.org/...mt8173-elm-hana.html"
                },
                {
                    "name": "lava_json",
                    "url": "https://storage.staging.kernelci.org/.../lava-mt8173-elm-hana.json"
                }
            ],
            "path": "v4l2-compliance-uvc.device-presence",
            "start_time": "2020-07-21T07:39:09.064000+00:00",
            "status": "FAIL",
            "waived": false
        }
    ],
    "version": {
        "major": 3,
        "minor": 0
    }
}

More Data

Our schema is constantly evolving, some fields are added, renamed, or removed. To help with deciding what changes we need, and to help submitters reach developers with their specific data before we can fully and properly accommodate it, each major object in the schema accepts a special field called “misc”, containing arbitrary objects.

The submitters can use these fields to add data they want to see in the database, and to give developers sooner. For example, see a build and a test with extra data added by kernelci.org. The “misc” fields are also useful for debugging. E.g. see how Red Hat’s CKI is using them to link a revision, a build, and a test run to internal objects.

The Protocol

If your report JSON complies with the schema, then all you need to do is send it over to us. There’s no handshake, synchronization, or object ID allocation. To send it you need to specify an authentication key and the destination, both of which we supply you with. Contact us at kernelci@groups.io, or the #kernelci channel on freenode.net to get yours. You can use either the command-line tool, or the Python 3 library to submit your reports.

In practice, a command-line submission could look like this:

$ export GOOGLE_APPLICATION_CREDENTIALS=~/.kernelci-production-ci-cki.json
$ kcidb-submit -p kernelci-production -t kernelci_new < report.json

A Python 3 submission code could look like this (using “GOOGLE_APPLICATION_CREDENTIALS” from the environment):

from kcidb import Client
client = Client(project_id="kernelci-production", topic_name="kernelci_new")
client.submit(report)

Where “report” would be the standard Python representation of the JSON data.

After you’ve sent the data, you only need to make sure the URLs you submitted inside reports stay valid for an acceptable time, and that if you submit further linked objects, they use the IDs you sent.

What’s Next

We’re working on correlating results between different submitters: identifying and joining revisions, builds, test environments, and test runs. We’re improving the dashboard, as far as possible with Grafana, and looking for another implementation beyond that. Finally, we’re working on the emails notifying the developers of the results we collect, beyond the POC we already have.

How to help

Send us your kernel testing reports, and volunteer to receive email notifications, join the development, share your ideas and report bugs! Write to kernelci@groups.io if you want to help!