Asynchronous Inference Request — OpenVINO™ documentation

Asynchronous Inference Request runs an inference pipeline asynchronously in one or several task executors, depending on the device pipeline structure. OpenVINO Runtime Plugin API provides the base ov::IAsyncInferRequest class for this purpose.

AsyncInferRequest Class#

OpenVINO Runtime Plugin API provides the base ov::IAsyncInferRequest class for a custom asynchronous inference request implementation:

class AsyncInferRequest : public ov::IAsyncInferRequest {
public:
    AsyncInferRequest(const std::shared_ptr<InferRequest>& request,
                      const std::shared_ptr<ov::threading::ITaskExecutor>& task_executor,
                      const std::shared_ptr<ov::threading::ITaskExecutor>& wait_executor,
                      const std::shared_ptr<ov::threading::ITaskExecutor>& callback_executor);

    ~AsyncInferRequest();
    void cancel() override;

private:
    std::function<void()> m_cancel_callback;
    std::shared_ptr<ov::threading::ITaskExecutor> m_wait_executor;
};
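
The asynchronous request is typically created by the plugin's compiled model, which wraps the synchronous request and supplies the executors. The sketch below shows one possible factory method; the create_sync_infer_request(), get_task_executor(), and get_callback_executor() calls refer to the base ov::ICompiledModel, and the plugin-owned m_wait_executor member is an assumption, so treat the names as illustrative rather than the exact template plugin code.

// Hypothetical sketch of a compiled model factory method (member names are assumptions).
std::shared_ptr<ov::IAsyncInferRequest> CompiledModel::create_infer_request() const {
    // Wrap the synchronous request created by the base class helper.
    auto sync_request = std::static_pointer_cast<InferRequest>(create_sync_infer_request());
    return std::make_shared<AsyncInferRequest>(sync_request,
                                               get_task_executor(),      // CPU stages
                                               m_wait_executor,          // device wait stage
                                               get_callback_executor()); // user callback stage
}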

Class Fields#

The example class has two additional fields:

- m_cancel_callback - a callback that cancels the underlying synchronous request (used in cancel())
- m_wait_executor - a task executor that waits for a response from a device about device task completion

Note

If a plugin can work with several instances of a device, m_wait_executor must be device-specific. Otherwise, having a single task executor for several devices does not allow them to work in parallel.
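
One way to satisfy this requirement is to create and cache a wait executor per device instance, for example by requesting an executor with a device-specific name from ov::threading::executor_manager(). The helper below is only a sketch under that assumption; the function itself and the "TEMPLATE_WAIT_" naming are not part of any plugin API.

#include <map>
#include <memory>
#include <mutex>
#include <string>

#include "openvino/runtime/threading/executor_manager.hpp"
#include "openvino/runtime/threading/itask_executor.hpp"

// Hypothetical helper (not part of the plugin API): returns a wait executor that is
// unique per device instance, so that waiting on one device does not serialize the others.
std::shared_ptr<ov::threading::ITaskExecutor> get_device_wait_executor(const std::string& device_id) {
    static std::mutex mutex;
    static std::map<std::string, std::shared_ptr<ov::threading::ITaskExecutor>> executors;

    std::lock_guard<std::mutex> lock(mutex);
    auto& executor = executors[device_id];
    if (!executor) {
        // executor_manager() caches executors by name, so a device-specific name
        // (the "TEMPLATE_WAIT_" prefix here is arbitrary) yields a dedicated executor per device.
        executor = ov::threading::executor_manager()->get_executor("TEMPLATE_WAIT_" + device_id);
    }
    return executor;
}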

AsyncInferRequest()#

The main goal of the AsyncInferRequest constructor is to define a device pipeline m_pipeline. The example below demonstrates m_pipeline creation with the following stages:

- infer_preprocess_and_start_pipeline - prepares input tensors and starts the pipeline on a device
- wait_pipeline - waits for a response from a device about pipeline completion
- infer_postprocess - post-processes the results after the device pipeline has finished

ov::template_plugin::AsyncInferRequest::AsyncInferRequest(
    const std::shared_ptr<ov::template_plugin::InferRequest>& request,
    const std::shared_ptr<ov::threading::ITaskExecutor>& task_executor,
    const std::shared_ptr<ov::threading::ITaskExecutor>& wait_executor,
    const std::shared_ptr<ov::threading::ITaskExecutor>& callback_executor)
    : ov::IAsyncInferRequest(request, task_executor, callback_executor),
      m_wait_executor(wait_executor) {
    // The current implementation has CPU-only tasks and does not need two executors,
    // so by default a single-stage pipeline is created.
    // This stage executes InferRequest::infer() using cpuTaskExecutor.
    // But if a remote asynchronous device is used, the pipeline can be split into tasks
    // executed by cpuTaskExecutor and waiting tasks. Waiting tasks can block the execution
    // thread, so they use separate threads from another executor.
    constexpr const auto remoteDevice = false;

    m_cancel_callback = [request] {
        request->cancel();
    };
    if (remoteDevice) {
        m_pipeline = {{task_executor,
                       [this, request] {
                           OV_ITT_SCOPED_TASK(itt::domains::TemplatePlugin,
                                              "TemplatePlugin::AsyncInferRequest::infer_preprocess_and_start_pipeline");
                           request->infer_preprocess();
                           request->start_pipeline();
                       }},
                      {m_wait_executor,
                       [this, request] {
                           OV_ITT_SCOPED_TASK(itt::domains::TemplatePlugin,
                                              "TemplatePlugin::AsyncInferRequest::wait_pipeline");
                           request->wait_pipeline();
                       }},
                      {task_executor, [this, request] {
                           OV_ITT_SCOPED_TASK(itt::domains::TemplatePlugin,
                                              "TemplatePlugin::AsyncInferRequest::infer_postprocess");
                           request->infer_postprocess();
                       }}};
    }
}

The stages are distributed between the two task executors in the following way:

- infer_preprocess_and_start_pipeline and infer_postprocess run on task_executor, which handles CPU work.
- wait_pipeline runs on m_wait_executor; waiting can block the executor thread, so it is kept separate from the CPU task executor.

Note

m_callback_executor is also passed to the constructor. It is used by the base ov::IAsyncInferRequest class, which appends a pair of this executor and the callback function set by the user to the end of the pipeline.
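
For context, this is how the whole pipeline, including the final callback stage, is driven through the public API. The minimal sketch below assumes a model file "model.xml" and that the TEMPLATE device is available; any model and device would do.

#include <exception>
#include <iostream>

#include "openvino/runtime/core.hpp"

int main() {
    ov::Core core;
    // Assumes "model.xml" exists and the TEMPLATE plugin is registered.
    auto compiled_model = core.compile_model("model.xml", "TEMPLATE");
    auto request = compiled_model.create_infer_request();

    // The user callback runs as the last pipeline stage, on the callback executor.
    request.set_callback([](std::exception_ptr ex) {
        if (ex) {
            try {
                std::rethrow_exception(ex);
            } catch (const std::exception& e) {
                std::cerr << "Inference failed: " << e.what() << "\n";
            }
            return;
        }
        std::cout << "Inference finished\n";
    });

    request.start_async();  // runs the pipeline stages on their executors
    request.wait();         // blocks until the pipeline completes
    return 0;
}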

~AsyncInferRequest()#

In the asynchronous request destructor, it is necessary to wait for the pipeline to finish. This can be done using the ov::IAsyncInferRequest::stop_and_wait method of the base class.

ov::template_plugin::AsyncInferRequest::~AsyncInferRequest() {
    ov::IAsyncInferRequest::stop_and_wait();
}

cancel()#

The method cancels execution of the inference request:

void ov::template_plugin::AsyncInferRequest::cancel() {
    ov::IAsyncInferRequest::cancel();
    m_cancel_callback();
}
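
From the application side, cancellation goes through the same method on the public ov::InferRequest. The brief sketch below again assumes a model file "model.xml" and the TEMPLATE device; a cancelled request is expected to report cancellation via an ov::Cancelled exception from wait().

#include "openvino/runtime/core.hpp"

int main() {
    ov::Core core;
    auto compiled_model = core.compile_model("model.xml", "TEMPLATE");  // assumed model and device
    auto request = compiled_model.create_infer_request();

    request.start_async();
    request.cancel();  // dispatches to the plugin's AsyncInferRequest::cancel() shown above
    try {
        request.wait();
    } catch (const ov::Cancelled&) {
        // The request was cancelled before it finished.
    }
    return 0;
}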