Run Batch workloads on cost-effective Spot VMs

Azure Batch offers Spot virtual machines (VMs) to reduce the cost of Batch workloads. Spot VMs make new types of Batch workloads possible by providing a large amount of compute power at a low cost.

Spot VMs take advantage of surplus capacity in Azure. When you specify Spot VMs in your pools, Azure Batch uses this surplus whenever it's available.

The tradeoff for using Spot VMs is that those VMs might not always be available, or they might get preempted at any time, depending on available capacity. For this reason, Spot VMs are most suitable for batch and asynchronous processing workloads where the job completion time is flexible and the work is distributed across many VMs.

Spot VMs are offered at a reduced price compared with dedicated VMs. To learn more about pricing, see Batch pricing.

Differences between Spot and low-priority VMs

Batch offers two types of low-cost preemptible VMs: Spot VMs and low-priority VMs.

The type of node you get depends on your Batch account's pool allocation mode, which can be set during account creation. Batch accounts that use the user subscription pool allocation mode always get Spot VMs. Batch accounts that use the Batch managed pool allocation mode always get low-priority VMs.

Azure Spot VMs and Batch low-priority VMs are similar but have a few differences in behavior.

| | Spot VMs | Low-priority VMs |
|---|---|---|
| Supported Batch accounts | User-subscription Batch accounts | Batch-managed Batch accounts |
| Supported Batch pool configurations | Virtual Machine Configuration | Virtual Machine Configuration and Cloud Service Configuration (deprecated) |
| Available regions | All regions that support Spot VMs | All regions except Microsoft Azure operated by 21Vianet |
| Customer eligibility | Not available for some subscription offer types. See more about Spot limitations. | Available for all Batch customers |
| Possible reasons for eviction | Capacity | Capacity |
| Pricing model | Variable discounts relative to standard VM prices | Fixed discounts relative to standard VM prices |
| Quota model | Subject to core quotas on your subscription | Subject to core quotas on your Batch account |
| Availability SLA | None | None |

Batch support for Spot VMs

Azure Batch provides several capabilities that make it easy to consume and benefit from Spot VMs:

Considerations and use cases

Many Batch workloads are a good fit for Spot VMs. Consider using Spot VMs when jobs are broken into many parallel tasks, or when you have many jobs that are scaled out and distributed across many VMs.

Some examples of batch processing use cases that are well suited for Spot VMs are:

Batch pools can be configured to use Spot VMs in a few ways:

Keep in mind the following practices when planning your use of Spot VMs:

Create and manage pools with Spot VMs

A Batch pool can contain both dedicated and Spot VMs (also referred to as compute nodes). You can set the target number of compute nodes for both dedicated and Spot VMs. The target number of nodes specifies the number of VMs you want to have in the pool.

The following example creates a pool using Azure virtual machines, in this case Linux VMs, with a target of 5 dedicated VMs and 20 Spot VMs:

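// Requires the Microsoft.Azure.Batch package and an authenticated BatchClient (batchClient).
// Ubuntu 20.04 is published under the "0001-com-ubuntu-server-focal" offer;
// confirm the offer and SKU against the images Batch currently supports.
ImageReference imageRef = new ImageReference(
    publisher: "Canonical",
    offer: "0001-com-ubuntu-server-focal",
    sku: "20_04-lts",
    version: "latest");

// Pair the image with the matching Batch node agent SKU
VirtualMachineConfiguration virtualMachineConfiguration =
    new VirtualMachineConfiguration(
        imageReference: imageRef,
        nodeAgentSkuId: "batch.node.ubuntu 20.04");

// Define the pool: 5 dedicated nodes and 20 Spot (low-priority) nodes
CloudPool pool = batchClient.PoolOperations.CreatePool(
    poolId: "vmpool",
    targetDedicatedComputeNodes: 5,
    targetLowPriorityComputeNodes: 20,
    virtualMachineSize: "Standard_D2_v2",
    virtualMachineConfiguration: virtualMachineConfiguration);

// CreatePool only builds a local object; Commit sends it to the Batch service
pool.Commit();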

You can get the current number of nodes for both dedicated and Spot VMs:

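// Fetch the pool from the service so that the current counts are populated
CloudPool pool1 = batchClient.PoolOperations.GetPool("vmpool");
int? numDedicated = pool1.CurrentDedicatedComputeNodes;
int? numLowPri = pool1.CurrentLowPriorityComputeNodes;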

Each compute node in a pool has a property that indicates whether it's a dedicated or Spot VM:

bool? isNodeDedicated = poolNode.IsDedicated;

Spot VMs might occasionally be preempted. When preemption happens, tasks that were running on the preempted nodes are requeued and run again when capacity returns.

For Virtual Machine Configuration pools, Batch also performs the following behaviors:
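To see how a pool's nodes break down at a given moment, you can list the nodes and inspect each one. The following is a minimal sketch, assuming the Microsoft.Azure.Batch .NET client library and an already authenticated BatchClient. NodeSummary and its methods are hypothetical names, and treating an unset IsDedicated value as dedicated is an assumption made for illustration. The sketch relies on preempted nodes reporting the ComputeNodeState.Preempted state.

```csharp
using System;
using System.Collections.Generic;
using Microsoft.Azure.Batch;
using Microsoft.Azure.Batch.Common;

// Hypothetical helper, not part of the Batch SDK.
public static class NodeSummary
{
    // Counts dedicated nodes, Spot nodes, and nodes currently in the Preempted state.
    public static (int Dedicated, int Spot, int Preempted) Count(IEnumerable<ComputeNode> nodes)
    {
        int dedicated = 0, spot = 0, preempted = 0;
        foreach (ComputeNode node in nodes)
        {
            // IsDedicated is nullable; an unset value is counted as dedicated (assumption).
            if (node.IsDedicated ?? true) dedicated++; else spot++;

            // A preempted node is still returned by the list operation.
            if (node.State == ComputeNodeState.Preempted) preempted++;
        }
        return (dedicated, spot, preempted);
    }

    // Lists the nodes of the given pool and prints the three counts.
    public static void Print(BatchClient batchClient, string poolId)
    {
        var (dedicated, spot, preempted) =
            Count(batchClient.PoolOperations.ListComputeNodes(poolId));
        Console.WriteLine($"dedicated={dedicated}, spot={spot}, preempted={preempted}");
    }
}
```

For the pool created earlier, a call such as NodeSummary.Print(batchClient, "vmpool") would print the three counts on one line.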

Scale pools containing Spot VMs

As with pools solely consisting of dedicated VMs, it's possible to scale a pool containing Spot VMs by calling the Resize method or by using autoscale.

The pool resize operation takes a second optional parameter, targetLowPriorityComputeNodes, that sets the target number of Spot nodes:

pool.Resize(targetDedicatedComputeNodes: 0, targetLowPriorityComputeNodes: 25);
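Because Spot capacity isn't guaranteed, a resize can settle with fewer Spot nodes than you asked for. The sketch below, which assumes the Microsoft.Azure.Batch .NET client library and an authenticated BatchClient, resizes a pool, polls until the allocation state is steady, and reports how many Spot nodes are still missing. SpotResize, its method names, and the 30-second polling interval are illustrative choices rather than anything prescribed by Batch.

```csharp
using System;
using System.Threading;
using Microsoft.Azure.Batch;
using Microsoft.Azure.Batch.Common;

// Hypothetical helper, not part of the Batch SDK.
public static class SpotResize
{
    // Resizes the pool, waits for allocation to settle, and returns the Spot node shortfall.
    public static int ResizeAndReportShortfall(
        BatchClient batchClient, string poolId, int targetDedicated, int targetSpot)
    {
        CloudPool pool = batchClient.PoolOperations.GetPool(poolId);

        // The service rejects a resize while a previous resize is still in progress.
        pool.Resize(
            targetDedicatedComputeNodes: targetDedicated,
            targetLowPriorityComputeNodes: targetSpot);

        do
        {
            Thread.Sleep(TimeSpan.FromSeconds(30)); // illustrative polling interval
            pool.Refresh();                          // reload the pool state from the service
        } while (pool.AllocationState != AllocationState.Steady);

        return Shortfall(targetSpot, pool.CurrentLowPriorityComputeNodes);
    }

    // Number of requested nodes that are not currently present (never negative).
    public static int Shortfall(int target, int? current) =>
        Math.Max(0, target - (current ?? 0));
}
```

A nonzero result is a hint to fall back to dedicated nodes or to retry later; which of the two is appropriate depends on how flexible the job's completion time is.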

The pool autoscale formula supports Spot VMs as follows:
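Autoscale formulas can read and set the service-defined variable $TargetLowPriorityNodes, and can read $CurrentLowPriorityNodes and $PreemptedNodeCount. As an illustration, the following sketch builds a formula that keeps a pool on Spot nodes and substitutes a dedicated node for each preempted one, up to a cap. It assumes the Microsoft.Azure.Batch .NET client library and an authenticated BatchClient. SpotAutoscale, the cap passed in by the caller, the 180-second sample window, and the 5-minute evaluation interval are all illustrative; test any formula against your own workload before relying on it.

```csharp
using System;
using Microsoft.Azure.Batch;

// Hypothetical helper, not part of the Batch SDK.
public static class SpotAutoscale
{
    // Builds an autoscale formula that replaces preempted Spot nodes with dedicated nodes.
    public static string BuildFormula(int maxNodes) => string.Join("\n", new[]
    {
        $"maxNumberofVMs = {maxNodes};",
        // One dedicated node per recently preempted node, capped at maxNumberofVMs.
        "$TargetDedicatedNodes = min(maxNumberofVMs, $PreemptedNodeCount.GetSample(180 * TimeInterval_Second));",
        // Fill the rest of the pool with Spot nodes.
        "$TargetLowPriorityNodes = min(maxNumberofVMs, maxNumberofVMs - $TargetDedicatedNodes);",
        // Let running tasks finish before a node is removed.
        "$NodeDeallocationOption = taskcompletion;"
    });

    // Enables autoscale on an existing pool using the formula above.
    public static void Enable(BatchClient batchClient, string poolId, int maxNodes)
    {
        batchClient.PoolOperations.EnableAutoScale(
            poolId,
            BuildFormula(maxNodes),
            TimeSpan.FromMinutes(5)); // illustrative evaluation interval
    }
}
```

Keeping the formula in a small builder makes the node cap easy to change per pool without editing the formula text by hand.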

Configure jobs and tasks

Jobs and tasks may require some extra configuration for Spot nodes:
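Two settings are typically relevant here: the AllowLowPriorityNode property of a job's job manager task, which controls whether that task may be scheduled on a Spot node, and the AZ_BATCH_NODE_IS_DEDICATED environment variable, which a task application can read to find out what kind of node it's running on. The sketch below shows both, assuming the Microsoft.Azure.Batch .NET client library. SpotJobConfig, its method names, and the "jobmanager" task ID are hypothetical, and returning null when the variable is missing or unparseable is a design choice made for this example.

```csharp
using System;
using Microsoft.Azure.Batch;

// Hypothetical helper, not part of the Batch SDK.
public static class SpotJobConfig
{
    // Parses the value of AZ_BATCH_NODE_IS_DEDICATED; null if missing or not a boolean.
    public static bool? ParseIsDedicated(string value) =>
        bool.TryParse(value, out bool parsed) ? parsed : (bool?)null;

    // Called from inside a task: true on a dedicated node, false on a Spot node.
    public static bool? IsRunningOnDedicatedNode() =>
        ParseIsDedicated(Environment.GetEnvironmentVariable("AZ_BATCH_NODE_IS_DEDICATED"));

    // Builds a job whose job manager task is restricted to dedicated nodes.
    // The returned job is not yet submitted; the caller must call Commit on it.
    public static CloudJob CreateJobWithDedicatedManager(
        BatchClient batchClient, string jobId, string poolId, string managerCommandLine)
    {
        CloudJob job = batchClient.JobOperations.CreateJob(
            jobId, new PoolInformation { PoolId = poolId });

        job.JobManagerTask = new JobManagerTask("jobmanager", managerCommandLine)
        {
            // false: never schedule the job manager on a Spot node, so a
            // preemption can't interrupt job orchestration.
            AllowLowPriorityNode = false
        };
        return job;
    }
}
```

A task could use IsRunningOnDedicatedNode to decide, for example, how often to save intermediate results, since work on a Spot node is more likely to be interrupted.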

View metrics for Spot VMs

New metrics are available in the Azure portal for Spot nodes. These metrics are:

To view these metrics in the Azure portal:

  1. Navigate to your Batch account in the Azure portal.
  2. Select Metrics from the Monitoring section.
  3. Select the metrics you want from the Metric list.

Limitations

Next steps