Proposal: parameters and programmatic API
Aleksey Shipilev aleksey.shipilev at oracle.com
Mon May 13 08:55:01 PDT 2013
Hi,
Please see the proposal for introducing parameters and a programmatic API to JMH. These two things come together, and the choices in one area mandate appropriate choices in the other. Hence we would like to resolve both improvements at the same time.
We would appreciate reviews, comments, and general feedback on this.
I. PROBLEM STATEMENT
a. Parameters. Many of the workloads we have here explore some configuration space. To iterate that space, we end up doing something like this:
@State
class B {
    private int size;
    private List<Integer> list;

    @Setup
    public void setup() {
        size = Integer.getInteger("size");
        list = new ArrayList<>(size);
        init(list, size);
    }

    @GenerateMicroBenchmark
    public void test() {
        for (int i = 0; i < size; i++) { ... list.get(i) ... }
    }
}
...which requires the external script to iterate with -Dsize=...
b. Programmatic API. It turns out some of the use cases for JMH include embedding JMH as part of a bigger experiment. This also subsumes the need for an explicit scenarios language, as building out the scenario with a pure Java API seems less error-prone than inventing yet another explicit DSL.
However, the API is heavily tied to the parameters, because the API should be able to set the parameters declared in the benchmarks. Vice versa, the parameters should be able to pick up some of the environment settings from the JMH launcher.
II. APPROACH
+++ a. Parametrized @Env-s
Since most of the parameters are required to initialize @State-s, it seems beneficial to start from there. While it is tempting to parametrize @State-s directly, for various reasons it is better to have another class which bears the settings. The proposed syntax follows:
@State
class B {
    @Env
    class Settings {
        @Param("N")
        int size;
    }

    private List<Integer> list;

    @Setup
    public void setup(Settings e) {
        list = new ArrayList<>(e.size);
        init(list, e.size);
    }

    @GenerateMicroBenchmark
    public void test(Settings e) {
        for (int i = 0; i < e.size; i++) { ... list.get(i) ... }
    }
}
Here, we inject the field, with @Param describing the label used to set it externally. The convention is that JMH should set all parameters before calling any of the fixture methods, which allows the state to be initialized naturally depending on the parameter value. It should be a semantic error to modify the parameter field from the microbenchmark.
Naturally, for many usages, we may want to simplify this to:
class B { // @Env is implicit
    @Param("N")
    int size;

    private List<Integer> list;

    @Setup
    public void setup() {
        list = new ArrayList<>(size);
        init(list, size);
    }

    @GenerateMicroBenchmark
    public void test() {
        for (int i = 0; i < size; i++) { ... list.get(i) ... }
    }
}
Note that @Env on the benchmark class is implicit, and no intervention is needed to use the parameter field, either in fixtures or in the benchmark method.
This annotation will also allow us to set default values for the parameters: @Param(name = "N", value = 100)
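For illustration, a parameter with a default then reads naturally in the benchmark class; this sketch just restates the proposed notation and is not settled API:

class B {
    @Param(name = "N", value = 100) // used when nothing is set externally
    int size;

    @GenerateMicroBenchmark
    public void test() {
        // use 'size' as before
    }
}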
Also, it would be nice to allow JMH to produce results for each parameter configuration, where the configurations are defined as the Cartesian product of all the parameter domains. E.g., if we have the ability to set the domains:
class B {
    @Param(name = "N", values = {1, 10, 100, 1000, 10000})
    int size;

    @Param(name = "stride", values = {1, 2, 3, 4})
    int stride;

    private List<Integer> list;

    @Setup
    public void setup() {
        list = new ArrayList<>(size);
        init(list, size);
    }

    @GenerateMicroBenchmark
    public void test() {
        for (int i = 0; i < size; i += stride) { ... list.get(i) ... }
    }
}
...should treat test() with (N=1, stride=1), (N=1, stride=2), ..., (N=10000, stride=4) as separate benchmarks, and (by default) run them all.
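To make that enumeration concrete: the configuration space is just the Cartesian product of the two domains, 20 configurations in this case. A plain-Java sketch of the enumeration JMH would perform:

// Plain-Java sketch of the Cartesian-product enumeration described above.
int[] ns      = {1, 10, 100, 1000, 10000};
int[] strides = {1, 2, 3, 4};
for (int n : ns) {
    for (int s : strides) {
        System.out.printf("benchmark test() with (N=%d, stride=%d)%n", n, s);
        // ... JMH would run a full measurement for each such configuration
    }
}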
Caveat: since Java annotations are not generic, we will end up with type-specific annotations: @IntParam, @LongParam, @DoubleParam, @StringParam.
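To make the caveat concrete, the type-specific variants would presumably be declared roughly like this; the names and members below are only a sketch, not settled API:

import java.lang.annotation.*;

// Sketch only: possible shape of the type-specific parameter annotations,
// since a single generic @Param cannot carry values of arbitrary types.
@Target(ElementType.FIELD)
@Retention(RetentionPolicy.RUNTIME)
@interface IntParam {
    String name();
    int[] values() default {}; // domain to iterate; a single element acts as the default
}

@Target(ElementType.FIELD)
@Retention(RetentionPolicy.RUNTIME)
@interface StringParam {
    String name();
    String[] values() default {};
}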
+++ b. Parametric Java API
This parameter notation opens up the way to execute the benchmark from the Java API. The proposed syntax is as follows:
Result execute(Class benchmarkClass, String test, Settings settings);
...where Settings is a special class holding the parameters (and also the environment parameters, see below). Giving Settings a proper builder will allow us to execute the example benchmark from the previous section as follows:
Result r = execute(B.class, "test",
                   Settings.set("N", 1000)
                           .set("stride", 10));
// process r, get the metrics, etc.
Fixing only one of the parameters will still allow JMH to traverse the projection of the configuration space, e.g. traverse the strides with N fixed, etc. We will amend this API as we unfold the other parts of the parameter story.
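For instance, fixing N while leaving stride unset might look like this; this is only a sketch against the proposed API, and whether a projection yields a single Result or a collection of them is still an open question:

// Sketch: fix N at 1000, let JMH sweep 'stride' over its declared domain {1, 2, 3, 4}.
Result r = execute(B.class, "test", Settings.set("N", 1000));
// process r, aggregate metrics across the stride projection, etc.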
+++ c. Command line parameters
We would need a good way to map these parameters to appropriate command-line options. Our general line of thinking is that command-line launchers should ultimately use the Java API to invoke JMH. That means the mapping of command-line parameters to API calls is localized in a specific CLI module, which is currently TBD.
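To illustrate that line of thinking, such a CLI module would do little more than translate its arguments into the same API calls; the "-P name=value" flag syntax and the helper names below are purely hypothetical placeholders, not a decided format:

// Hypothetical CLI front-end sketch: it only builds Settings and delegates
// to the proposed execute() API. The "-P name=value" syntax is an assumption.
public class Main {
    public static void main(String[] args) {
        Settings settings = Settings.empty(); // hypothetical empty-builder entry point
        for (String arg : args) {
            if (arg.startsWith("-P")) {
                String[] kv = arg.substring(2).split("=", 2);
                settings = settings.set(kv[0], Integer.parseInt(kv[1]));
            }
        }
        // benchmark class and method selection would also come from the command line
        Result r = execute(B.class, "test", settings); // execute() is the proposed API method
        // ... report r
    }
}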
+++ d. Environment parameters
At times you need to get the current running modes from JMH to drive your initialization, or even the benchmark itself. We already have the doorway in the other direction: some annotations, like @Threads, @Fork, @OperationsPerInvocation, etc., allow requesting a specific running mode from JMH. It would be tempting to use the same doorway the other way around.
For example:
class B {
    @OperationsPerInvocation(42) // each operation is 42 times larger
    @GenerateMicroBenchmark
    public void test() {
        // do something
    }
}
This can be modified to:
class B {
    @OperationsPerInvocation(42) // each operation is 42 times larger
    private int ops;

    @GenerateMicroBenchmark
    public void test() {
        // do something
    }
}
Notice the symmetry between these two cases: if we don't need the environmental value, it is perfectly fine to omit the field and place the relevant annotation on the method as usual. There is a difference, though: the annotation on a field means the environment is shared across all the @GMB methods:
class B {
    @OperationsPerInvocation(42)
    private int ops; // shared for both test1 and test2

    @GenerateMicroBenchmark
    public void test1() {
        // do something
    }

    @GenerateMicroBenchmark
    public void test2() {
        // do something
    }
}
Luckily, we can still isolate the environments for different methods, as with States:
class B {
    @Env
    class E1 {
        @OperationsPerInvocation(42)
        private int ops;
    }

    @Env
    class E2 {
        @OperationsPerInvocation(84)
        private int ops;
    }

    @GenerateMicroBenchmark
    public void test1(E1 e) {
        // do something
    }

    @GenerateMicroBenchmark
    public void test2(E2 e) {
        // do something
    }
}
Of course, an even better way to do this is to allow these annotations to accept a range of values:
class B {
    @OperationsPerInvocation({42, 43})
    private int ops;

    @GenerateMicroBenchmark
    public void test1() {
        // do something
    }
}
We will see why splitting the environment sometimes is a good idea in the next section.
The programmatic API invocation for this test looks as follows. We would like to be refactoring-resistant and use class literals for the environment and the relevant annotation when referencing the parameter we want to set. This also helps to set an environment parameter even when there is no field it is bound to:
execute(B.class, "test1", Settings.set(B.class, OperationsPerInvocation.class, 42);
+++ e. Asymmetric benchmarks
The real problem with the programmatic API is embracing asymmetric benchmarks. There, we sometimes need to set the environment settings for distinct thread types in isolation. This can be achieved by splitting the environments:
class B {
    @Env
    class RE {
        @Threads(4) // four readers in the group
        int readers;
    }

    @Env
    class WE {
        @Threads(1) // single writer in the group
        int writers;
    }

    @State
    class G {
        Target t = ...;
    }

    @Groups(1) // single group by default
    int groups;

    @GenerateMicroBenchmark
    @Group("asymmetric")
    public void doReads(RE e, G g) {
        g.t.read();
    }

    @GenerateMicroBenchmark
    @Group("asymmetric")
    public void doWrites(WE e, G g) {
        g.t.write();
    }
}
@Env-s naturally provide the namespaces for the environment parameters. Hence, we can execute this via the programmatic API with:

execute(B.class, "asymmetric",
        Settings.set(RE.class, Threads.class, 1)
                .set(WE.class, Threads.class, 3)
                .set(B.class, Groups.class, 4));
Unfortunately, the example above does not solve the most frequent case: what if I need both readers and writers to initialize G? This can be solved by allowing the fixture methods in States to accept environments:
class B {
    @Env
    class RE {
        @Threads(4) // four readers in the group
        int readers;
    }

    @Env
    class WE {
        @Threads(1) // single writer in the group
        int writers;
    }

    @State
    class G {
        Target t = ...;

        @Setup
        public void init(RE r, WE w) {
            t = new Target(r.readers + w.writers);
        }
    }

    @Groups(1) // single group by default
    int groups;

    @GenerateMicroBenchmark
    @Group("asymmetric")
    public void doReads(RE e, G g) {
        g.t.read();
    }

    @GenerateMicroBenchmark
    @Group("asymmetric")
    public void doWrites(WE e, G g) {
        g.t.write();
    }
}
Thus, we will provide customizable parameters with default values and domain iteration, feedback from JMH back to the microbenchmark, and fold it all into the Java API. Please voice your concerns about this line of thinking before we start implementing it.
Thanks, Aleksey.