Using the Job Specification Language (original) (raw)

The Job Specification Language (JSL) enables you to define the steps in a job and their execution order using an XML file. The following example shows how to define a simple job that contains one chunk step and one task step:

<job id="loganalysis" xmlns="http://xmlns.jcp.org/xml/ns/javaee"
                      version="1.0">
  <properties>
    <property name="input_file" value="input1.txt"/>
    <property name="output_file" value="output2.txt"/>
  </properties>

  <step id="logprocessor" next="cleanup">
    <chunk checkpoint-policy="item" item-count="10">
      <reader ref="com.example.pkg.LogItemReader"></reader>
      <processor ref="com.example.pkg.LogItemProcessor"></processor>
      <writer ref="com.example.pkg.LogItemWriter"></writer>
    </chunk>
  </step>

  <step id="cleanup">
    <batchlet ref="com.example.pkg.CleanUp"></batchlet>
    <end on="COMPLETED"/>
  </step>
</job>

This example defines the loganalysis batch job, which consists of thelogprocessor chunk step and the cleanup task step. Thelogprocessor step transitions to the cleanup step, which terminates the job when completed.

The job element defines two properties, input_file andoutput_file. Specifying properties in this manner enables you to run a batch job with different configuration parameters without having to recompile its Java batch artifacts. The batch artifacts can access these properties using the context objects from the batch runtime.

The logprocessor step is a chunk step that specifies batch artifacts for the reader (LogItemReader), the processor (LogItemProcessor), and the writer (LogItemWriter). This step creates a checkpoint for every ten items processed.

The cleanup step is a task step that specifies the CleanUp class as its batch artifact. The job terminates when this step completes.

The following sections describe the elements of the Job Specification Language (JSL) in more detail and show the most common attributes and child elements.

The job Element

The job element is always the top-level element in a job definition file. Its main attributes are id and restartable. The job element can contain one properties element and zero or more of each of the following elements: listener, step, flow, and split. For example:

<job id="jobname" restartable="true">
  <listeners>
    <listener ref="com.example.pkg.ListenerBatchArtifact"/>
  </listeners>
  <properties>
    <property name="propertyName1" value="propertyValue1"/>
    <property name="propertyName2" value="propertyValue2"/>
  </properties>
  <step ...> ... </step>
  <step ...> ... </step>
  <decision ...> ... </decision>
  <flow ...> ... </flow>
  <split ...> ... </split>
</job>

The listener element specifies a batch artifact whose methods are invoked before and after the execution of the job. The batch artifact is an implementation of the javax.batch.api.listener.JobListenerinterface. See The Listener Batch Artifacts for an example of a job listener implementation.

The first step, flow, or split element inside the job element executes first.

The step Element

The step element can be a child of the job and flow elements. Its main attributes are id and next. The step element can contain the following elements.

One chunk element for chunk-oriented steps or one batchlet element for task-oriented steps.
One properties element (optional).
This element specifies a set of properties that batch artifacts can access using batch context objects.
One listener element (optional); one listeners element if more than one listener is specified.
This element specifies listener artifacts that intercept various phases of step execution.
For chunk steps, the batch artifacts for these listeners can be implementations of the following interfaces: StepListener,ItemReadListener, ItemProcessListener, ItemWriteListener,ChunkListener, RetryReadListener, RetryProcessListener,RetryWriteListener, SkipReadListener, SkipProcessListener, andSkipWriteListener.
For task steps, the batch artifact for these listeners must be an implementation of the StepListener interface.
One partition element (optional).
This element is used in partitioned steps which execute in more than one thread.
One end element if this is the last step in a job.
This element sets the batch status to COMPLETED.
One stop element (optional) to stop a job at this step.
This element sets the batch status to STOPPED.
One fail element (optional) to terminate a job at this step.
This element sets the batch status to FAILED.
One or more next elements if the next attribute is not specified.
This element is associated with an exit status and refers to another step, a flow, a split, or a decision element.

The following is an example of a chunk step:

<step id="stepA" next="stepB">
  <properties> ... </properties>
  <listeners>
    <listener ref="MyItemReadListenerImpl"/>
    ...
  </listeners>
  <chunk ...> ... </chunk>
  <partition> ... </partition>
  <end on="COMPLETED" exit-status="MY_COMPLETED_EXIT_STATUS"/>
  <stop on="MY_TEMP_ISSUE_EXIST_STATUS" restart="step0"/>
  <fail on="MY_ERROR_EXIT_STATUS" exit-status="MY_ERROR_EXIT_STATUS"/>
</step>

The following is an example of a task step:

<step id="stepB" next="stepC">
  <batchlet ...> ... </batchlet>
  <properties> ... </properties>
  <listener ref="MyStepListenerImpl"/>
</step>

The chunk Element

The chunk element is a child of the step element for chunk-oriented steps. The attributes of this element are listed in Table 58-2.

Table 58-2 Attributes of the chunk Element

Attribute Name	Description	Default Value
checkpoint-policy	Specifies how to commit the results of processing each chunk: "item": the chunk is committed after processing item-count items "custom": the chunk is committed according to a checkpoint algorithm specified with the checkpoint-algorithm element The checkpoint is updated when the results of a chunk are committed. Every chunk is processed in a global Java EE transaction. If the processing of one item in the chunk fails, the transaction is rolled back and no processed items from this chunk are stored.	"item"
item-count	Specifies the number of items to process before committing the chunk and taking a checkpoint.	10
time-limit	Specifies the number of seconds before committing the chunk and taking a checkpoint when checkpoint-policy="item". If item-count items have not been processed by time-limit seconds, the chunk is committed and a checkpoint is taken.	0 (no limit)
buffer-items	Specifies if processed items are buffered until it is time to take a checkpoint. If true, a single call to the item writer is made with a list of the buffered items before committing the chunk and taking a checkpoint.	true
skip-limit	Specifies the number of skippable exceptions to skip in this step during chunk processing. Skippable exception classes are specified with the skippable-exception-classes element.	No limit
retry-limit	Specifies the number of attempts to execute this step if retryable exceptions occur. Retryable exception classes are specified with the retryable-exception-classes element.	No limit

The chunk element can contain the following elements.

One reader element.
This element specifies a batch artifact that implements the ItemReaderinterface.
One processor element.
This element specifies a batch artifact that implements theItemProcessor interface.
One writer element.
This element specifies a batch artifact that implements the ItemWriterinterface.
One checkpoint-algorithm element (optional).
This element specifies a batch artifact that implements theCheckpointAlgorithm interface and provides a custom checkpoint policy.
One skippable-exception-classes element (optional).
This element specifies a set of exceptions thrown from the reader, writer, and processor batch artifacts that chunk processing should skip. The skip-limit attribute from the chunk element specifies the maximum number of skipped exceptions.
One retryable-exception-classes element (optional).
This element specifies a set of exceptions thrown from the reader, writer, and processor batch artifacts that chunk processing will retry. The retry-limit attribute from the chunk element specifies the maximum number of attempts.
One no-rollback-exception-classes element (optional).
This element specifies a set of exceptions thrown from the reader, writer, and processor batch artifacts that should not cause the batch runtime to roll back the current chunk, but to retry the current operation without a rollback instead.
For exception types not specified in this element, the current chunk is rolled back by default when an exception occurs.

The following is an example of a chunk-oriented step:

<step id="stepC" next="stepD">
  <chunk checkpoint-policy="item" item-count="5" time-limit="180"
         buffer-items="true" skip-limit="10" retry-limit="3">
    <reader ref="pkg.MyItemReaderImpl"></reader>
    <processor ref="pkg.MyItemProcessorImpl"></processor>
    <writer ref="pkg.MyItemWriterImpl"></writer>
    <skippable-exception-classes>
      <include class="pkg.MyItemException"/>
      <exclude class="pkg.MyItemSeriousSubException"/>
    </skippable-exception-classes>
    <retryable-exception-classes>
      <include class="pkg.MyResourceTempUnavailable"/>
    </retryable-exception-classes>
  </chunk>
</step>

This example defines a chunk step and specifies its reader, processor, and writer artifacts. The step updates a checkpoint and commits each chunk after processing five items. It skips all MyItemExceptionexceptions and all its subtypes, except for MyItemSeriousSubException, up to a maximum of ten skipped exceptions. The step retries a chunk when a MyResourceTempUnavailable exception occurs, up to a maximum of three attempts.

The batchlet Element

The batchlet element is a child of the step element for task-oriented steps. This element only has the ref attribute, which specifies a batch artifact that implements the Batchlet interface. Thebatch element can contain a properties element.

The following is an example of a task-oriented step:

<step id="stepD" next="stepE">
  <batchlet ref="pkg.MyBatchletImpl">
    <properties>
      <property name="pname" value="pvalue"/>
    </properties>
  </batchlet>
</step>

This example defines a batch step and specifies its batch artifact.

The partition Element

The partition element is a child of the step element. It indicates that a step is partitioned. Most partitioned steps are chunk steps where the processing of each item does not depend on the results of processing previous items. You specify the number of partitions in a step and provide each partition with specific information on which items to process, such as the following.

A range of items. For example, partition 1 processes items 1 through 500, and partition 2 processes items 501 through 1000.
An input source. For example, partition 1 processes the items ininput1.txt and partition 2 processes the items in input2.txt.

When the number of partitions, the number of items, and the input sources for a partitioned step are known at development or deployment time, you can use partition properties in the job definition file to specify partition-specific information and access these properties from the step batch artifacts. The runtime creates as many instances of the step batch artifacts (reader, processor, and writer) as partitions, and each artifact instance receives the properties specific to its partition.

In most cases, the number of partitions, the number of items, or the input sources for a partitioned step can only be determined at runtime. Instead of specifying partition-specific properties statically in the job definition file, you provide a batch artifact that can access your data sources at runtime and determine how many partitions are needed and what range of items each partition should process. This batch artifact is an implementation of the PartitionMapper interface. The batch runtime invokes this artifact and then uses the information it provides to instantiate the step batch artifacts (reader, writer, and processor) for each partition and to pass them partition-specific data as parameters.

The rest of this section describes the partition element in detail and shows two examples of job definition files: one that uses partition properties to specify a range of items for each partition, and one that relies on a PartitionMapper implementation to determine partition-specific information.

The partition element can contain the following elements.

One plan element, if the mapper element is not specified.
This element defines the number of partitions, the number of threads, and the properties for each partition in the job definition file. Theplan element is useful when this information is known at development or deployment time.
One mapper element, if the plan element is not specified.
This element specifies a batch artifact that provides the number of partitions, the number of threads, and the properties for each partition. The batch artifact is an implementation of thePartitionMapper interface. You use this option when the information required for each partition is only known at runtime.
One reducer element (optional).
This element specifies a batch artifact that receives control when a partitioned step begins, ends, or rolls back. The batch artifact enables you to merge results from different partitions and perform other related operations. The batch artifact is an implementation of thePartitionReducer interface.
One collector element (optional).
This element specifies a batch artifact that sends intermediary results from each partition to a partition analyzer. The batch artifact sends the intermediary results after each checkpoint for chunk steps and at the end of the step for task steps. The batch artifact is an implementation of the PartitionCollector interface.
One analyzer element (optional).
This element specifies a batch artifact that analyzes the intermediary results from the partition collector instances. The batch artifact is an implementation of the PartitionAnalyzer interface.

The following is an example of a partitioned step using the planelement:

<step id="stepE" next="stepF">
  <chunk>
    <reader ...></reader>
    <processor ...></processor>
    <writer ...></writer>
  </chunk>
  <partition>
    <plan partitions="2" threads="2">
      <properties partition="0">
        <property name="firstItem" value="0"/>
        <property name="lastItem" value="500"/>
      </properties>
      <properties partition="1">
        <property name="firstItem" value="501"/>
        <property name="lastItem" value="999"/>
      </properties>
    </plan>
  </partition>
  <reducer ref="MyPartitionReducerImpl"/>
  <collector ref="MyPartitionCollectorImpl"/>
  <analyzer ref="MyPartitionAnalyzerImpl"/>
</step>

In this example, the plan element specifies the properties for each partition in the job definition file.

The following example uses a mapper element instead of a planelement. The PartitionMapper implementation dynamically provides the same information as the plan element provides in the job definition file:

<step id="stepE" next="stepF">
  <chunk>
    <reader ...></reader>
    <processor ...></processor>
    <writer ...></writer>
  </chunk>
  <partition>
    <mapper ref="MyPartitionMapperImpl"/>
    <reducer ref="MyPartitionReducerImpl"/>
    <collector ref="MyPartitionCollectorImpl"/>
    <analyzer ref="MyPartitionAnalyzerImpl"/>
  </partition>
</step>

The flow Element

The flow element can be a child of the job, flow, and splitelements. Its attributes are id and next. Flows can transition to flows, steps, splits, and decision elements. The flow element can contain the following elements:

One or more step elements
One or more flow elements (optional)
One or more split elements (optional)
One or more decision elements (optional)

The last step in a flow is the one with no next attribute or nextelement. Steps and other elements in a flow cannot transition to elements outside the flow.

The following is an example of the flow element:

<flow id="flowA" next="stepE">
  <step id="flowAstepA" next="flowAstepB">...</step>
  <step id="flowAstepB" next="flowAflowC">...</step>
  <flow id="flowAflowC" next="flowAsplitD">...</flow>
  <split id="flowAsplitD" next="flowAstepE">...</split>
  <step id="flowAstepE">...</step>
</flow>

This example flow contains three steps, one flow, and one split. The last step does not have the next attribute. The flow transitions tostepE when its last step completes.

The split Element

The split element can be a child of the job and flow elements. Its attributes are id and next. Splits can transition to splits, steps, flows, and decision elements. The split element can only contain one or more flow elements that can only transition to other flowelements in the split.

The following is an example of a split with three flows that execute concurrently:

<split id="splitA" next="stepB">
  <flow id="splitAflowA">...</flow>
  <flow id="splitAflowB">...</flow>
  <flow id="splitAflowC">...</flow>
</split>

The decision Element

The decision element can be a child of the job and flow elements. Its attributes are id and next. Steps, flows, and splits can transition to a decision element. This element specifies a batch artifact that decides the next step, flow, or split to execute based on information from the execution of the previous step, flow, or split. The batch artifact implements the Decider interface. The decisionelement can contain the following elements.

One or more end elements (optional).
This element sets the batch status to COMPLETED.
One or more stop elements (optional).
This element sets the batch status to STOPPED.
One or more fail elements (optional).
This element sets the batch status to FAILED.
One or more next elements (optional).
One properties element (optional).

The following is an example of the decider element:

<decision id="decisionA" ref="MyDeciderImpl">
  <fail on="FAILED" exit-status="FAILED_AT_DECIDER"/>
  <end on="COMPLETED" exit-status="COMPLETED_AT_DECIDER"/>
  <stop on="MY_TEMP_ISSUE_EXIST_STATUS" restart="step2"/>
</decision>