Path for stabilizing libtest's json output?

January 9, 2024, 8:03pm 1

I proposed to T-testing-devex that our first order of business should be to stabilize --format json. Think of this as a Pre-eRFC. I'm unsure what route the more official proposal will take for being approved. What I'm looking for is input on what we should consider when vetting the format to see if it'll be sufficient. See the list below that I've collected so far.

Rust 1.70 fixed libtest so that --format json requires a nightly toolchain, and the reaction highlighted how important the format is to people.

--format json could also help us improve cargo test, including

Wanting to run test binaries in parallel, like cargo nextest
Lack of summary across all binaries
Noisy test output (see also #5089)
Confusing command-line interactions (see also #8903, #10392)
Poor messaging when a filter doesn't match
Smarter test execution order (see also #8685, #10673)
JUnit output is incorrect when running multiple test binaries
Lack of failure when test binaries exit unexpectedly

Most of that involves shifting responsibilities from the test harness to the test runner which has the side effects of:

Allowing more powerful experiments with custom test runners (e.g. cargo nextest) as they'll have more information to operate on
Lowering the barrier for custom test harnesses (like libtest-mimic) as UI responsibilities are shifted to the test runner (cargo test)
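To make the shift of responsibilities concrete, here is a minimal sketch of a runner-side summary across all test binaries, including treating a binary that exits unexpectedly as a failure. Every name in it (`Outcome`, `BinaryReport`, `Summary`, `summarize`) is hypothetical and made up for illustration; nothing here is an existing cargo or libtest API, and a real runner would build these records from whatever the harness emits.

```rust
/// Hypothetical per-test outcome, as a runner might decode it from a harness.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Outcome {
    Passed,
    Failed,
    Ignored,
}

/// Hypothetical record of one test binary's run.
struct BinaryReport {
    binary: String,
    outcomes: Vec<Outcome>,
    /// False if the process ended before reporting that it was done.
    exited_cleanly: bool,
}

/// Totals across every binary, owned by the runner rather than the harness.
#[derive(Debug, Default, PartialEq)]
struct Summary {
    passed: usize,
    failed: usize,
    ignored: usize,
    crashed: Vec<String>,
}

impl Summary {
    /// A crashed binary counts as a failure even if no test reported one.
    fn is_success(&self) -> bool {
        self.failed == 0 && self.crashed.is_empty()
    }
}

fn summarize(reports: &[BinaryReport]) -> Summary {
    let mut summary = Summary::default();
    for report in reports {
        for outcome in &report.outcomes {
            match outcome {
                Outcome::Passed => summary.passed += 1,
                Outcome::Failed => summary.failed += 1,
                Outcome::Ignored => summary.ignored += 1,
            }
        }
        if !report.exited_cleanly {
            summary.crashed.push(report.binary.clone());
        }
    }
    summary
}

fn main() {
    let reports = vec![
        BinaryReport {
            binary: "unit".to_string(),
            outcomes: vec![Outcome::Passed, Outcome::Ignored],
            exited_cleanly: true,
        },
        BinaryReport {
            binary: "integration".to_string(),
            outcomes: vec![Outcome::Passed, Outcome::Failed],
            exited_cleanly: false,
        },
    ];
    let summary = summarize(&reports);
    println!(
        "{} passed; {} failed; {} ignored; {} binaries exited unexpectedly",
        summary.passed,
        summary.failed,
        summary.ignored,
        summary.crashed.len()
    );
    println!("overall: {}", if summary.is_success() { "ok" } else { "FAILED" });
}
```

The point of the sketch is only that the totals and the exit-status check live in one place once the runner, not each harness, owns the reporting.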

Proposed Plan

While having a plan for evolution takes some burden off of the format, we should still do some due diligence in ensuring the format works well for our intended uses.

My rough idea for a plan is

Create an experimental test harness that uses a serde structure for passing information from its core to different --format modes, emulating what libtest and cargo's relationship will be like on a smaller scale for faster iteration
Transition libtest to this proposed interface
Add experimental support for cargo to interact with test binaries through json
Create a stabilization report for json for T-libs-api and a cargo RFC for custom test harnesses to opt into this new protocol
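As a rough illustration of the first step, the sketch below has a harness core that only produces events, with each --format mode implemented as a function over the same event type. The `Event` variants, field names, and JSON keys are all invented for this example and are not the current or proposed libtest format. It also hand-rolls the JSON with std only to stay self-contained; the actual experiment would presumably derive serde's `Serialize` instead.

```rust
/// Hypothetical event emitted by the harness core. Not libtest's format.
#[derive(Debug, Clone, PartialEq)]
enum Event {
    Discovered { name: String },
    Finished { name: String, passed: bool, message: Option<String> },
}

/// Minimal JSON string escaping (quotes, backslashes, control characters).
fn escape(s: &str) -> String {
    let mut out = String::with_capacity(s.len());
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out
}

/// Machine-readable mode: one JSON object per line.
fn render_json(event: &Event) -> String {
    match event {
        Event::Discovered { name } => {
            format!(r#"{{"event":"discovered","name":"{}"}}"#, escape(name))
        }
        Event::Finished { name, passed, message } => {
            let status = if *passed { "ok" } else { "failed" };
            let mut line = format!(
                r#"{{"event":"finished","name":"{}","status":"{}""#,
                escape(name),
                status
            );
            if let Some(message) = message {
                line.push_str(&format!(r#","message":"{}""#, escape(message)));
            }
            line.push('}');
            line
        }
    }
}

/// Human-readable mode built on the same events.
fn render_pretty(event: &Event) -> String {
    match event {
        Event::Discovered { name } => format!("found {name}"),
        Event::Finished { name, passed: true, .. } => format!("test {name} ... ok"),
        Event::Finished { name, passed: false, message } => match message {
            Some(m) => format!("test {name} ... FAILED: {m}"),
            None => format!("test {name} ... FAILED"),
        },
    }
}

fn main() {
    let events = vec![
        Event::Discovered { name: "parse::empty".to_string() },
        Event::Finished {
            name: "parse::empty".to_string(),
            passed: false,
            message: Some("expected \"[]\"".to_string()),
        },
    ];
    for event in &events {
        println!("{}", render_json(event));
        println!("{}", render_pretty(event));
    }
}
```

Because both renderers consume the same `Event` values, a runner reading the JSON lines could in principle reproduce the pretty output itself, which is the relationship between libtest and cargo that the experiment is meant to emulate.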

Potential considerations when running the experiment to vet the format:

Plan for future evolution
Ability to implement different format modes on top
- Both test running and --list mode
Ability to run test harnesses in parallel
Tests with multiple failures
Bench support
Static and dynamic parameterized tests / test fixtures
Static and dynamic test skipping
Test markers
doctests
Test location (for IDEs)
Collect metrics related to tests
- Elapsed time
- Temp dir sizes
- RNG seed

Warning: This doesn't mean they'll all be supported in the initial stabilization, just that we feel confident the format will be able to support them.

If you are interested in helping stabilize the json output, contributing to the experimental harness (when we get there) will be a good area for first-time contributors to the Rust project, as most of the code will be new and small, with no stability / correctness pressure from an existing user base.

Misc

Comments made on libtest's format

Existing formats

junit
subunit
TAP

See also

It's funny you opened this just now, I posted about something somewhat related on Zulip. I'm planning on writing a proposal outlining:

A mechanism for opting into nightly-only human/machine-readable formats on the stable channel that isn't RUSTC_BOOTSTRAP=1.
A stability/evolution policy for said unstable formats (such as a versioning scheme) and a mechanism for tooling authors to opt into notifications for changes.

One of the motivating cases was libtest's JSON output, with the hope that the barrier to feedback is reduced and implementers don't feel hamstrung by notions like "perfect is the enemy of good". How would you feel about something along those lines for libtest?

epage January 9, 2024, 9:07pm 3

While I think it would generally be helpful to think in terms of making it easier to experiment with unstable features, I don't think it would help in our case. We have a path forward that lets us learn a lot, quickly, outside of the rust-lang/rust tree, more so than I expect a process like this to allow. It might help get what's there today into people's hands, but there isn't even a definition of what exists today for us to version, just inference on behavior. I don't expect any useful feedback from doing so because I honestly expect to throw the existing format out. That also means I don't expect much benefit from investing in the current format.

Ah, my bad! If you can go faster out-of-tree, then that's good to hear. I assumed that you'd want to partially add/stabilize support for additional features based on this comment:

At least for things like test locations, benches or doctests, I (naively) imagined that it might not end up in the initial stabilization.

epage January 9, 2024, 9:45pm 5

They won't. I had assumed you were talking more generally for the stabilization process. Sounds like you are instead referring to the item "Plan for future evolution". That will be something we focus on through the effort and would appreciate input on then.

epage January 19, 2024, 7:40pm 6

infogulch January 20, 2024, 12:29am 7

Have you considered how benchmarking harnesses like iai-callgrind could work with the new apis?

epage January 20, 2024, 1:26am 8

Benchmarks are listed in the RFC as one of the areas we will explore. In the next T-testing-devex meeting, we're going to be talking with one of the creators of divan about this (who also happens to be on the team).

system Closed April 19, 2024, 1:27am 9

This topic was automatically closed 90 days after the last reply. New replies are no longer allowed.