Pipelines

The pipelines are a great and easy way to use models for inference. These pipelines are objects that abstract most of the complex code from the library, offering a simple API dedicated to several tasks, including Named Entity Recognition, Masked Language Modeling, Sentiment Analysis, Feature Extraction and Question Answering. See the task summary for examples of use.

There are two categories of pipeline abstractions to be aware of:

- The pipeline() abstraction, which is the most powerful object encapsulating all other pipelines.
- Task-specific pipelines, available for audio, computer vision, natural language processing, and multimodal tasks.

The pipeline abstraction

The pipeline abstraction is a wrapper around all the other available pipelines. It is instantiated like any other pipeline but can provide additional quality-of-life features.

Simple call on one item:

```python
>>> pipe = pipeline("text-classification")
>>> pipe("This restaurant is awesome")
[{'label': 'POSITIVE', 'score': 0.9998743534088135}]
```

If you want to use a specific model from the Hub, you can ignore the task if the model on the Hub already defines it:

```python
>>> pipe = pipeline(model="FacebookAI/roberta-large-mnli")
>>> pipe("This restaurant is awesome")
[{'label': 'NEUTRAL', 'score': 0.7313136458396912}]
```

To call a pipeline on many items, you can call it with a list.

```python
>>> pipe = pipeline("text-classification")
>>> pipe(["This restaurant is awesome", "This restaurant is awful"])
[{'label': 'POSITIVE', 'score': 0.9998743534088135}, {'label': 'NEGATIVE', 'score': 0.9996669292449951}]
```

To iterate over full datasets, it is recommended to use a dataset directly. This means you don’t need to allocate the whole dataset at once, nor do you need to do batching yourself. This should work just as fast as custom loops on GPU. If it doesn’t, don’t hesitate to create an issue.

```python
import datasets
from transformers import pipeline
from transformers.pipelines.pt_utils import KeyDataset
from tqdm.auto import tqdm

pipe = pipeline("automatic-speech-recognition", model="facebook/wav2vec2-base-960h", device=0)
dataset = datasets.load_dataset("superb", name="asr", split="test")

# KeyDataset (only *pt*) will simply return the item in the dict returned by the dataset item,
# as we're not interested in the "target" part of the dataset.
for out in tqdm(pipe(KeyDataset(dataset, "file"))):
    print(out)
```

For ease of use, a generator is also possible:

```python
from transformers import pipeline

pipe = pipeline("text-classification")


def data():
    while True:
        # This could come from a dataset, a database, a queue, or an HTTP request in a server.
        yield "This is a test"


for out in pipe(data()):
    print(out)
```

#### transformers.pipeline

< source >

( task: typing.Optional[str] = None model: typing.Union[str, ForwardRef('PreTrainedModel'), ForwardRef('TFPreTrainedModel'), NoneType] = None config: typing.Union[str, transformers.configuration_utils.PretrainedConfig, NoneType] = None tokenizer: typing.Union[str, transformers.tokenization_utils.PreTrainedTokenizer, ForwardRef('PreTrainedTokenizerFast'), NoneType] = None feature_extractor: typing.Union[str, ForwardRef('SequenceFeatureExtractor'), NoneType] = None image_processor: typing.Union[str, transformers.image_processing_utils.BaseImageProcessor, NoneType] = None processor: typing.Union[str, transformers.processing_utils.ProcessorMixin, NoneType] = None framework: typing.Optional[str] = None revision: typing.Optional[str] = None use_fast: bool = True token: typing.Union[str, bool, NoneType] = None device: typing.Union[int, str, ForwardRef('torch.device'), NoneType] = None device_map = None torch_dtype = None trust_remote_code: typing.Optional[bool] = None model_kwargs: typing.Optional[typing.Dict[str, typing.Any]] = None pipeline_class: typing.Optional[typing.Any] = None **kwargs ) → Pipeline

Parameters

Returns

Pipeline

A suitable pipeline for the task.

Utility factory method to build a Pipeline.

A pipeline consists of:

- A tokenizer in charge of mapping raw textual input to tokens.
- A model to make predictions from the inputs.
- Some (optional) post-processing for enhancing the model’s output.

While there are such optional arguments as `tokenizer`, `feature_extractor`, `image_processor`, and `processor`, they shouldn't be specified all at once. If these components are not provided, `pipeline` will try to load required ones automatically. In case you want to provide these components explicitly, please refer to a specific pipeline in order to get more details regarding what components are required.

Examples:

```python
>>> from transformers import pipeline, AutoModelForTokenClassification, AutoTokenizer

>>> # Sentiment analysis pipeline
>>> analyzer = pipeline("sentiment-analysis")

>>> # Question answering pipeline, specifying the checkpoint identifier
>>> oracle = pipeline(
...     "question-answering", model="distilbert/distilbert-base-cased-distilled-squad", tokenizer="google-bert/bert-base-cased"
... )

>>> # Named entity recognition pipeline, passing in a specific model and tokenizer
>>> model = AutoModelForTokenClassification.from_pretrained("dbmdz/bert-large-cased-finetuned-conll03-english")
>>> tokenizer = AutoTokenizer.from_pretrained("google-bert/bert-base-cased")
>>> recognizer = pipeline("ner", model=model, tokenizer=tokenizer)
```

Pipeline batching

All pipelines can use batching. This will work whenever the pipeline uses its streaming ability (i.e., when passing lists, a Dataset, or a generator).

```python
from transformers import pipeline
from transformers.pipelines.pt_utils import KeyDataset
import datasets

dataset = datasets.load_dataset("imdb", name="plain_text", split="unsupervised")
pipe = pipeline("text-classification", device=0)
for out in pipe(KeyDataset(dataset, "text"), batch_size=8, truncation="only_first"):
    print(out)
```

However, this is not automatically a win for performance. It can be either a 10x speedup or a 5x slowdown depending on hardware, data, and the actual model being used.

Example where it’s mostly a speedup:

```python
from transformers import pipeline
from torch.utils.data import Dataset
from tqdm.auto import tqdm

pipe = pipeline("text-classification", device=0)


class MyDataset(Dataset):
    def __len__(self):
        return 5000

    def __getitem__(self, i):
        return "This is a test"


dataset = MyDataset()

for batch_size in [1, 8, 64, 256]:
    print("-" * 30)
    print(f"Streaming batch_size={batch_size}")
    for out in tqdm(pipe(dataset, batch_size=batch_size), total=len(dataset)):
        pass
```

On a GTX 970:

```
Streaming no batching
100%|██████████████████████████████████████████████████████████████████████| 5000/5000 [00:26<00:00, 187.52it/s]

Streaming batch_size=8
100%|█████████████████████████████████████████████████████████████████████| 5000/5000 [00:04<00:00, 1205.95it/s]

Streaming batch_size=64
100%|█████████████████████████████████████████████████████████████████████| 5000/5000 [00:02<00:00, 2478.24it/s]

Streaming batch_size=256
100%|█████████████████████████████████████████████████████████████████████| 5000/5000 [00:01<00:00, 2554.43it/s]
(diminishing returns, saturated the GPU)
```

Example where it’s mostly a slowdown:

```python
class MyDataset(Dataset):
    def __len__(self):
        return 5000

    def __getitem__(self, i):
        if i % 64 == 0:
            n = 100
        else:
            n = 1
        return "This is a test" * n
```

This dataset yields an occasional very long sentence compared to the others. In that case, the whole batch will need to be 400 tokens long, so the whole batch will be [64, 400] instead of [64, 4], leading to the big slowdown. Even worse, on bigger batches, the program simply crashes.

```
Streaming no batching
100%|█████████████████████████████████████████████████████████████████████| 1000/1000 [00:05<00:00, 183.69it/s]

Streaming batch_size=8
100%|█████████████████████████████████████████████████████████████████████| 1000/1000 [00:03<00:00, 265.74it/s]

Streaming batch_size=64
100%|██████████████████████████████████████████████████████████████████████| 1000/1000 [00:26<00:00, 37.80it/s]

Streaming batch_size=256
  0%|                                                                                 | 0/1000 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "/home/nicolas/src/transformers/test.py", line 42, in <module>
    for out in tqdm(pipe(dataset, batch_size=256), total=len(dataset)):
....
    q = q / math.sqrt(dim_per_head)
RuntimeError: CUDA out of memory. Tried to allocate 376.00 MiB (GPU 0; 3.95 GiB total capacity; 1.72 GiB already allocated; 354.88 MiB free; 2.46 GiB reserved in total by PyTorch)
```

There are no good (general) solutions for this problem, and your mileage may vary depending on your use cases. For users, a rule of thumb is:

- Measure performance on your load, with your hardware. Measure, measure, and keep measuring. Real numbers are the only way to go.
- If you are latency constrained (live product doing inference), don’t batch.
- If you are using CPU, don’t batch.
- If you are using throughput (you want to run your model on a bunch of static data), on GPU, then:
  - If you have no clue about the size of the sequence_length (“natural” data), by default don’t batch; measure and try tentatively to add it, and add OOM checks to recover when it fails (and it will, at some point, if you don’t control the sequence_length).
  - If your sequence_length is super regular, then batching is more likely to be VERY interesting; measure and push it until you get OOMs.
  - The larger the GPU, the more likely batching is going to be interesting.
- As soon as you enable batching, make sure you can handle OOMs nicely.

Pipeline chunk batching

zero-shot-classification and question-answering are slightly specific, in the sense that a single input might yield multiple forward passes of a model. Under normal circumstances, this would cause issues with the batch_size argument.

In order to circumvent this issue, both of these pipelines are a bit specific: they are ChunkPipeline instead of regular Pipeline. In short:

```python
preprocessed = pipe.preprocess(inputs)
model_outputs = pipe.forward(preprocessed)
outputs = pipe.postprocess(model_outputs)
```

Now becomes:

```python
all_model_outputs = []
for preprocessed in pipe.preprocess(inputs):
    model_outputs = pipe.forward(preprocessed)
    all_model_outputs.append(model_outputs)
outputs = pipe.postprocess(all_model_outputs)
```

This should be very transparent to your code because the pipelines are used in the same way.

This is a simplified view, since the pipeline can handle the batching automatically. This means you don’t have to care about how many forward passes your inputs will actually trigger; you can optimize the batch_size independently of the inputs. The caveats from the previous section still apply.
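For instance, a question-answering pipeline can be given a batch_size even though a single long question/context pair may expand into several forward passes internally. A minimal sketch (the checkpoint is just an illustrative choice):

```python
from transformers import pipeline

qa = pipeline("question-answering", model="distilbert/distilbert-base-cased-distilled-squad")

# batch_size controls the internal forward-pass batching of the ChunkPipeline,
# regardless of how many chunks each (question, context) pair expands into.
result = qa(
    question="Where do I live?",
    context="My name is Wolfgang and I live in Berlin. " * 50,  # long context -> multiple chunks
    batch_size=8,
)
print(result)
```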

Pipeline FP16 inference

Models can be run in FP16 which can be significantly faster on GPU while saving memory. Most models will not suffer noticeable performance loss from this. The larger the model, the less likely that it will.

To enable FP16 inference, you can simply pass torch_dtype=torch.float16 or torch_dtype='float16' to the pipeline constructor. Note that this only works for models with a PyTorch backend. Your inputs will be converted to FP16 internally.
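A minimal sketch (any PyTorch model on GPU would do; the task and checkpoint defaults are illustrative):

```python
import torch
from transformers import pipeline

# Run the default text-classification model in half precision on GPU 0;
# inputs are converted to FP16 internally.
pipe = pipeline("text-classification", device=0, torch_dtype=torch.float16)
print(pipe("This restaurant is awesome"))
```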

Pipeline custom code

If you want to override a specific pipeline, don’t hesitate to create an issue for your task at hand; the goal of the pipeline is to be easy to use and support most cases, so transformers could maybe support your use case.

If you simply want to try it out, you can:

```python
from transformers import TextClassificationPipeline, pipeline


class MyPipeline(TextClassificationPipeline):
    def postprocess(self, model_outputs, **kwargs):
        scores = super().postprocess(model_outputs, **kwargs)
        # Your custom post-processing goes here, e.g.:
        # scores = scores * 100
        return scores


my_pipeline = MyPipeline(model=model, tokenizer=tokenizer)

# or, if you use the pipeline() factory:
my_pipeline = pipeline(model="xxxx", pipeline_class=MyPipeline)
```

That should enable you to do all the custom code you want.

Implementing a pipeline

See the dedicated guide: Implementing a new pipeline.

Audio

Pipelines available for audio tasks include the following.

AudioClassificationPipeline

class transformers.AudioClassificationPipeline

< source >

( *args **kwargs )

Parameters

Audio classification pipeline using any AutoModelForAudioClassification. This pipeline predicts the class of a raw waveform or an audio file. In case of an audio file, ffmpeg should be installed to support multiple audio formats.

Example:

```python
>>> from transformers import pipeline

>>> classifier = pipeline(model="superb/wav2vec2-base-superb-ks")
>>> classifier("https://huggingface.co/datasets/Narsil/asr_dummy/resolve/main/1.flac")
[{'score': 0.997, 'label': 'unknown'}, {'score': 0.002, 'label': 'left'}, {'score': 0.0, 'label': 'yes'}, {'score': 0.0, 'label': 'down'}, {'score': 0.0, 'label': 'stop'}]
```

Learn more about the basics of using a pipeline in the pipeline tutorial

This pipeline can currently be loaded from pipeline() using the following task identifier: "audio-classification".

See the list of available models on huggingface.co/models.

__call__

< source >

( inputs: typing.Union[numpy.ndarray, bytes, str] **kwargs ) → A list of dict with the following keys

Parameters

Returns

A list of dict with the following keys

Classify the sequence(s) given as inputs. See the AutomaticSpeechRecognitionPipeline documentation for more information.

AutomaticSpeechRecognitionPipeline

class transformers.AutomaticSpeechRecognitionPipeline

< source >

( model: PreTrainedModel feature_extractor: typing.Union[ForwardRef('SequenceFeatureExtractor'), str] = None tokenizer: typing.Optional[transformers.tokenization_utils.PreTrainedTokenizer] = None decoder: typing.Union[ForwardRef('BeamSearchDecoderCTC'), str, NoneType] = None device: typing.Union[int, ForwardRef('torch.device')] = None torch_dtype: typing.Union[str, ForwardRef('torch.dtype'), NoneType] = None **kwargs )

Parameters

Pipeline that aims at extracting spoken text contained within some audio.

The input can be either a raw waveform or an audio file. In the case of an audio file, ffmpeg should be installed to support multiple audio formats.

Example:

```python
>>> from transformers import pipeline

>>> transcriber = pipeline(model="openai/whisper-base")
>>> transcriber("https://huggingface.co/datasets/Narsil/asr_dummy/resolve/main/1.flac")
{'text': ' He hoped there would be stew for dinner, turnips and carrots and bruised potatoes and fat mutton pieces to be ladled out in thick, peppered flour-fatten sauce.'}
```
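A raw waveform can also be passed directly, either as a numpy array or as a dict carrying the sampling rate. A sketch, assuming 16 kHz mono audio (the silent array just stands in for a real recording):

```python
import numpy as np
from transformers import pipeline

transcriber = pipeline(model="openai/whisper-base")

# One second of silence at 16 kHz, standing in for real audio.
waveform = np.zeros(16000, dtype=np.float32)

# Pass a dict when the sampling rate needs to be specified explicitly.
print(transcriber({"raw": waveform, "sampling_rate": 16000}))
```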

Learn more about the basics of using a pipeline in the pipeline tutorial

__call__

< source >

( inputs: typing.Union[numpy.ndarray, bytes, str] **kwargs ) → Dict

Parameters

Returns

Dict

A dictionary with the following keys:

Transcribe the audio sequence(s) given as inputs to text. See the AutomaticSpeechRecognitionPipeline documentation for more information.

TextToAudioPipeline

class transformers.TextToAudioPipeline

< source >

( *args vocoder = None sampling_rate = None **kwargs )

Text-to-audio generation pipeline using any AutoModelForTextToWaveform or AutoModelForTextToSpectrogram. This pipeline generates an audio file from an input text and optional other conditional inputs.

Example:

```python
>>> from transformers import pipeline

>>> pipe = pipeline(model="suno/bark-small")
>>> output = pipe("Hey it's HuggingFace on the phone!")

>>> audio = output["audio"]
>>> sampling_rate = output["sampling_rate"]
```

Learn more about the basics of using a pipeline in the pipeline tutorial

You can specify parameters passed to the model by using TextToAudioPipeline.__call__.forward_params or TextToAudioPipeline.__call__.generate_kwargs.

Example:

from transformers import pipeline

music_generator = pipeline(task="text-to-audio", model="facebook/musicgen-small", framework="pt")

generate_kwargs = { ... "do_sample": True, ... "temperature": 0.7, ... "max_new_tokens": 35, ... }

outputs = music_generator("Techno music with high melodic riffs", generate_kwargs=generate_kwargs)

This pipeline can currently be loaded from pipeline() using the following task identifiers: "text-to-speech" or "text-to-audio".

See the list of available models on huggingface.co/models.

__call__

< source >

( text_inputs: typing.Union[str, typing.List[str]] **forward_params ) → A dict or a list of dict

Parameters

Returns

A dict or a list of dict

The dictionaries have two keys:

Generates speech/audio from the inputs. See the TextToAudioPipeline documentation for more information.

ZeroShotAudioClassificationPipeline

class transformers.ZeroShotAudioClassificationPipeline

< source >

( **kwargs )

Parameters

Zero shot audio classification pipeline using ClapModel. This pipeline predicts the class of an audio when you provide an audio and a set of candidate_labels.

The default hypothesis_template is: "This is a sound of {}.". Make sure you update it for your usage.

Example:

```python
>>> from transformers import pipeline
>>> from datasets import load_dataset

>>> dataset = load_dataset("ashraq/esc50")
>>> audio = next(iter(dataset["train"]["audio"]))["array"]
>>> classifier = pipeline(task="zero-shot-audio-classification", model="laion/clap-htsat-unfused")
>>> classifier(audio, candidate_labels=["Sound of a dog", "Sound of vaccum cleaner"])
[{'score': 0.9996, 'label': 'Sound of a dog'}, {'score': 0.0004, 'label': 'Sound of vaccum cleaner'}]
```
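As noted above, the hypothesis template can be overridden at call time. A sketch reusing the classifier and audio from the example:

```python
# Replace the default "This is a sound of {}." template with a custom one.
classifier(
    audio,
    candidate_labels=["a dog barking", "a vacuum cleaner humming"],
    hypothesis_template="This is a recording of {}.",
)
```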

Learn more about the basics of using a pipeline in the pipeline tutorial.

This audio classification pipeline can currently be loaded from pipeline() using the following task identifier: "zero-shot-audio-classification".

See the list of available models on huggingface.co/models.

__call__

< source >

( audios: typing.Union[numpy.ndarray, bytes, str] **kwargs )

Parameters

Assign labels to the audio(s) passed as inputs.

Computer vision

Pipelines available for computer vision tasks include the following.

DepthEstimationPipeline

class transformers.DepthEstimationPipeline

< source >

( *args **kwargs )

Parameters

Depth estimation pipeline using any AutoModelForDepthEstimation. This pipeline predicts the depth of an image.

Example:

```python
>>> from transformers import pipeline

>>> depth_estimator = pipeline(task="depth-estimation", model="LiheYoung/depth-anything-base-hf")
>>> output = depth_estimator("http://images.cocodataset.org/val2017/000000039769.jpg")

>>> # This is a tensor with the values being the depth expressed in meters for each pixel
>>> output["predicted_depth"].shape
torch.Size([1, 384, 384])
```

Learn more about the basics of using a pipeline in the pipeline tutorial

This depth estimation pipeline can currently be loaded from pipeline() using the following task identifier: "depth-estimation".

See the list of available models on huggingface.co/models.

__call__

< source >

( inputs: typing.Union[str, typing.List[str], ForwardRef('Image.Image'), typing.List[ForwardRef('Image.Image')]] = None **kwargs )

Parameters

Predict the depth(s) of the image(s) passed as inputs.

ImageClassificationPipeline

class transformers.ImageClassificationPipeline

< source >

( *args **kwargs )

Parameters

Image classification pipeline using any AutoModelForImageClassification. This pipeline predicts the class of an image.

Example:

```python
>>> from transformers import pipeline

>>> classifier = pipeline(model="microsoft/beit-base-patch16-224-pt22k-ft22k")
>>> classifier("https://huggingface.co/datasets/Narsil/image_dummy/raw/main/parrots.png")
[{'score': 0.442, 'label': 'macaw'}, {'score': 0.088, 'label': 'popinjay'}, {'score': 0.075, 'label': 'parrot'}, {'score': 0.073, 'label': 'parodist, lampooner'}, {'score': 0.046, 'label': 'poll, poll_parrot'}]
```

Learn more about the basics of using a pipeline in the pipeline tutorial

This image classification pipeline can currently be loaded from pipeline() using the following task identifier: "image-classification".

See the list of available models on huggingface.co/models.

__call__

< source >

( inputs: typing.Union[str, typing.List[str], ForwardRef('Image.Image'), typing.List[ForwardRef('Image.Image')]] = None **kwargs )

Parameters

Assign labels to the image(s) passed as inputs.

ImageSegmentationPipeline

class transformers.ImageSegmentationPipeline

< source >

( *args **kwargs )

Parameters

Image segmentation pipeline using any AutoModelForXXXSegmentation. This pipeline predicts masks of objects and their classes.

Example:

```python
>>> from transformers import pipeline

>>> segmenter = pipeline(model="facebook/detr-resnet-50-panoptic")
>>> segments = segmenter("https://huggingface.co/datasets/Narsil/image_dummy/raw/main/parrots.png")
>>> len(segments)
2

>>> segments[0]["label"]
'bird'

>>> segments[1]["label"]
'bird'

>>> type(segments[0]["mask"])
<class 'PIL.Image.Image'>

>>> segments[0]["mask"].size
(768, 512)
```

This image segmentation pipeline can currently be loaded from pipeline() using the following task identifier: "image-segmentation".

See the list of available models on huggingface.co/models.

__call__

< source >

( inputs = None **kwargs )

Parameters

Perform segmentation (detect masks & classes) in the image(s) passed as inputs.

ImageToImagePipeline

class transformers.ImageToImagePipeline

< source >

( *args **kwargs )

Parameters

Image to Image pipeline using any AutoModelForImageToImage. This pipeline generates an image based on a previous image input.

Example:

```python
>>> from PIL import Image
>>> import requests

>>> from transformers import pipeline

>>> upscaler = pipeline("image-to-image", model="caidas/swin2SR-classical-sr-x2-64")
>>> img = Image.open(requests.get("http://images.cocodataset.org/val2017/000000039769.jpg", stream=True).raw)
>>> img = img.resize((64, 64))
>>> upscaled_img = upscaler(img)
>>> img.size
(64, 64)

>>> upscaled_img.size
(144, 144)
```

This image to image pipeline can currently be loaded from pipeline() using the following task identifier: "image-to-image".

See the list of available models on huggingface.co/models.

__call__

< source >

( images: typing.Union[str, typing.List[str], ForwardRef('Image.Image'), typing.List[ForwardRef('Image.Image')]] **kwargs )

Parameters

Transform the image(s) passed as inputs.

ObjectDetectionPipeline

class transformers.ObjectDetectionPipeline

< source >

( *args **kwargs )

Parameters

Object detection pipeline using any AutoModelForObjectDetection. This pipeline predicts bounding boxes of objects and their classes.

Example:

```python
>>> from transformers import pipeline

>>> detector = pipeline(model="facebook/detr-resnet-50")
>>> detector("https://huggingface.co/datasets/Narsil/image_dummy/raw/main/parrots.png")
[{'score': 0.997, 'label': 'bird', 'box': {'xmin': 69, 'ymin': 171, 'xmax': 396, 'ymax': 507}}, {'score': 0.999, 'label': 'bird', 'box': {'xmin': 398, 'ymin': 105, 'xmax': 767, 'ymax': 507}}]
```

Learn more about the basics of using a pipeline in the pipeline tutorial

This object detection pipeline can currently be loaded from pipeline() using the following task identifier: "object-detection".

See the list of available models on huggingface.co/models.

__call__

< source >

( *args **kwargs )

Parameters

Detect objects (bounding boxes & classes) in the image(s) passed as inputs.

VideoClassificationPipeline

class transformers.VideoClassificationPipeline

< source >

( *args **kwargs )

Parameters

Video classification pipeline using any AutoModelForVideoClassification. This pipeline predicts the class of a video.

This video classification pipeline can currently be loaded from pipeline() using the following task identifier: "video-classification".

See the list of available models on huggingface.co/models.
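A minimal usage sketch (the checkpoint and file path are illustrative choices; decoding videos also requires an extra backend such as av to be installed):

```python
from transformers import pipeline

# Illustrative checkpoint; any video classification model works here.
video_classifier = pipeline(task="video-classification", model="MCG-NJU/videomae-base-finetuned-kinetics")

# Accepts a local path or a URL to a video file.
results = video_classifier("path/to/video.mp4")
print(results)
```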

__call__

< source >

( inputs: typing.Union[str, typing.List[str], NoneType] = None **kwargs )

Parameters

Assign labels to the video(s) passed as inputs.

ZeroShotImageClassificationPipeline

class transformers.ZeroShotImageClassificationPipeline

< source >

( **kwargs )

Parameters

Zero shot image classification pipeline using CLIPModel. This pipeline predicts the class of an image when you provide an image and a set of candidate_labels.

Example:

from transformers import pipeline

classifier = pipeline(model="google/siglip-so400m-patch14-384") classifier( ... "https://huggingface.co/datasets/Narsil/image_dummy/raw/main/parrots.png", ... candidate_labels=["animals", "humans", "landscape"], ... ) [{'score': 0.965, 'label': 'animals'}, {'score': 0.03, 'label': 'humans'}, {'score': 0.005, 'label': 'landscape'}]

classifier( ... "https://huggingface.co/datasets/Narsil/image_dummy/raw/main/parrots.png", ... candidate_labels=["black and white", "photorealist", "painting"], ... ) [{'score': 0.996, 'label': 'black and white'}, {'score': 0.003, 'label': 'photorealist'}, {'score': 0.0, 'label': 'painting'}]

Learn more about the basics of using a pipeline in the pipeline tutorial

This image classification pipeline can currently be loaded from pipeline() using the following task identifier: "zero-shot-image-classification".

See the list of available models on huggingface.co/models.

__call__

< source >

( image: typing.Union[str, typing.List[str], ForwardRef('Image'), typing.List[ForwardRef('Image')]] = None **kwargs )

Parameters

Assign labels to the image(s) passed as inputs.

ZeroShotObjectDetectionPipeline

class transformers.ZeroShotObjectDetectionPipeline

< source >

( **kwargs )

Parameters

Zero shot object detection pipeline using OwlViTForObjectDetection. This pipeline predicts bounding boxes of objects when you provide an image and a set of candidate_labels.

Example:

from transformers import pipeline

detector = pipeline(model="google/owlvit-base-patch32", task="zero-shot-object-detection") detector( ... "http://images.cocodataset.org/val2017/000000039769.jpg", ... candidate_labels=["cat", "couch"], ... ) [{'score': 0.287, 'label': 'cat', 'box': {'xmin': 324, 'ymin': 20, 'xmax': 640, 'ymax': 373}}, {'score': 0.254, 'label': 'cat', 'box': {'xmin': 1, 'ymin': 55, 'xmax': 315, 'ymax': 472}}, {'score': 0.121, 'label': 'couch', 'box': {'xmin': 4, 'ymin': 0, 'xmax': 642, 'ymax': 476}}]

detector( ... "https://huggingface.co/datasets/Narsil/image_dummy/raw/main/parrots.png", ... candidate_labels=["head", "bird"], ... ) [{'score': 0.119, 'label': 'bird', 'box': {'xmin': 71, 'ymin': 170, 'xmax': 410, 'ymax': 508}}]

Learn more about the basics of using a pipeline in the pipeline tutorial

This object detection pipeline can currently be loaded from pipeline() using the following task identifier: "zero-shot-object-detection".

See the list of available models on huggingface.co/models.

__call__

< source >

( image: typing.Union[str, ForwardRef('Image.Image'), typing.List[typing.Dict[str, typing.Any]]] candidate_labels: typing.Union[str, typing.List[str], NoneType] = None **kwargs )

Parameters

Detect objects (bounding boxes & classes) in the image(s) passed as inputs.

Natural Language Processing

Pipelines available for natural language processing tasks include the following.

FillMaskPipeline

class transformers.FillMaskPipeline

< source >

( model: typing.Union[ForwardRef('PreTrainedModel'), ForwardRef('TFPreTrainedModel')] tokenizer: typing.Optional[transformers.tokenization_utils.PreTrainedTokenizer] = None feature_extractor: typing.Optional[ForwardRef('SequenceFeatureExtractor')] = None image_processor: typing.Optional[transformers.image_processing_utils.BaseImageProcessor] = None processor: typing.Optional[transformers.processing_utils.ProcessorMixin] = None modelcard: typing.Optional[transformers.modelcard.ModelCard] = None framework: typing.Optional[str] = None task: str = '' args_parser: ArgumentHandler = None device: typing.Union[int, ForwardRef('torch.device')] = None torch_dtype: typing.Union[str, ForwardRef('torch.dtype'), NoneType] = None binary_output: bool = False **kwargs )

Parameters

Masked language modeling prediction pipeline using any ModelWithLMHead. See the masked language modeling examples for more information.

Example:

from transformers import pipeline

fill_masker = pipeline(model="google-bert/bert-base-uncased") fill_masker("This is a simple [MASK].") [{'score': 0.042, 'token': 3291, 'token_str': 'problem', 'sequence': 'this is a simple problem.'}, {'score': 0.031, 'token': 3160, 'token_str': 'question', 'sequence': 'this is a simple question.'}, {'score': 0.03, 'token': 8522, 'token_str': 'equation', 'sequence': 'this is a simple equation.'}, {'score': 0.027, 'token': 2028, 'token_str': 'one', 'sequence': 'this is a simple one.'}, {'score': 0.024, 'token': 3627, 'token_str': 'rule', 'sequence': 'this is a simple rule.'}]

Learn more about the basics of using a pipeline in the pipeline tutorial

This mask filling pipeline can currently be loaded from pipeline() using the following task identifier: "fill-mask".

The models that this pipeline can use are models that have been trained with a masked language modeling objective, which includes the bi-directional models in the library. See the up-to-date list of available models on huggingface.co/models.

This pipeline only works for inputs with exactly one token masked. Experimental: We added support for multiple masks. The returned values are raw model output, and correspond to disjoint probabilities where one might expect joint probabilities (See discussion).
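The pipeline also accepts a top_k parameter and a targets parameter that restricts scoring to a set of candidate tokens. A sketch reusing the fill_masker from the example above:

```python
# Only score the given candidate words, and return the single best one.
fill_masker(
    "This is a simple [MASK].",
    targets=["question", "problem"],
    top_k=1,
)
```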

This pipeline now supports tokenizer_kwargs. For example try:

from transformers import pipeline

fill_masker = pipeline(model="google-bert/bert-base-uncased") tokenizer_kwargs = {"truncation": True} fill_masker( ... "This is a simple [MASK]. " + "...with a large amount of repeated text appended. " * 100, ... tokenizer_kwargs=tokenizer_kwargs, ... )

__call__

< source >

( inputs **kwargs ) → A list or a list of list of dict

Parameters

Returns

A list or a list of list of dict

Each result comes as list of dictionaries with the following keys:

Fill the masked token in the text(s) given as inputs.

QuestionAnsweringPipeline

class transformers.QuestionAnsweringPipeline

< source >

( model: typing.Union[ForwardRef('PreTrainedModel'), ForwardRef('TFPreTrainedModel')] tokenizer: PreTrainedTokenizer modelcard: typing.Optional[transformers.modelcard.ModelCard] = None framework: typing.Optional[str] = None task: str = '' **kwargs )

Parameters

Question Answering pipeline using any ModelForQuestionAnswering. See the question answering examples for more information.

Example:

from transformers import pipeline

oracle = pipeline(model="deepset/roberta-base-squad2") oracle(question="Where do I live?", context="My name is Wolfgang and I live in Berlin") {'score': 0.9191, 'start': 34, 'end': 40, 'answer': 'Berlin'}

Learn more about the basics of using a pipeline in the pipeline tutorial

This question answering pipeline can currently be loaded from pipeline() using the following task identifier: "question-answering".

The models that this pipeline can use are models that have been fine-tuned on a question answering task. See the up-to-date list of available models on huggingface.co/models.

__call__

< source >

( *args **kwargs ) → A dict or a list of dict

Parameters

Returns

A dict or a list of dict

Each result comes as a dictionary with the following keys:

Answer the question(s) given as inputs by using the context(s).

create_sample

< source >

( question: typing.Union[str, typing.List[str]] context: typing.Union[str, typing.List[str]] ) → One or a list of SquadExample

Parameters

Returns

One or a list of SquadExample

The corresponding SquadExample grouping question and context.

QuestionAnsweringPipeline leverages the SquadExample internally. This helper method encapsulates all the logic for converting question(s) and context(s) to SquadExample.

We currently support extractive question answering.

span_to_answer

< source >

( text: str start: int end: int ) → Dictionary like `{'answer': str, 'start': int, 'end': int}`

Parameters

Returns

Dictionary like `{'answer': str, 'start': int, 'end': int}`

When decoding from token probabilities, this method maps token indexes to actual words in the initial context.

SummarizationPipeline

class transformers.SummarizationPipeline

< source >

( *args **kwargs )

Parameters

Summarize news articles and other documents.

This summarizing pipeline can currently be loaded from pipeline() using the following task identifier:"summarization".

The models that this pipeline can use are models that have been fine-tuned on a summarization task, which currently includes ’_bart-large-cnn_’, ’_google-t5/t5-small_’, ’_google-t5/t5-base_’, ’_google-t5/t5-large_’, ’_google-t5/t5-3b_’ and ’_google-t5/t5-11b_’. See the up-to-date list of available models on huggingface.co/models. For a list of available parameters, see the following documentation.

Usage:

```python
# use bart in pytorch
summarizer = pipeline("summarization")
summarizer("An apple a day, keeps the doctor away", min_length=5, max_length=20)

# use t5 in tf
summarizer = pipeline("summarization", model="google-t5/t5-base", tokenizer="google-t5/t5-base", framework="tf")
summarizer("An apple a day, keeps the doctor away", min_length=5, max_length=20)
```

__call__

< source >

( *args **kwargs ) → A list or a list of list of dict

Parameters

Returns

A list or a list of list of dict

Each result comes as a dictionary with the following keys:

Summarize the text(s) given as inputs.

TableQuestionAnsweringPipeline

class transformers.TableQuestionAnsweringPipeline

< source >

( args_parser = <TableQuestionAnsweringArgumentHandler object> *args **kwargs )

Parameters

Table Question Answering pipeline using a ModelForTableQuestionAnswering. This pipeline is only available in PyTorch.

Example:

from transformers import pipeline

oracle = pipeline(model="google/tapas-base-finetuned-wtq") table = { ... "Repository": ["Transformers", "Datasets", "Tokenizers"], ... "Stars": ["36542", "4512", "3934"], ... "Contributors": ["651", "77", "34"], ... "Programming language": ["Python", "Python", "Rust, Python and NodeJS"], ... } oracle(query="How many stars does the transformers repository have?", table=table) {'answer': 'AVERAGE > 36542', 'coordinates': [(0, 1)], 'cells': ['36542'], 'aggregator': 'AVERAGE'}

Learn more about the basics of using a pipeline in the pipeline tutorial

This tabular question answering pipeline can currently be loaded from pipeline() using the following task identifier: "table-question-answering".

The models that this pipeline can use are models that have been fine-tuned on a tabular question answering task. See the up-to-date list of available models on huggingface.co/models.

__call__

< source >

( *args **kwargs ) → A dictionary or a list of dictionaries containing results

Parameters

Returns

A dictionary or a list of dictionaries containing results

Each result is a dictionary with the following keys:

Answers queries according to a table. The pipeline accepts several types of inputs which are detailed below:

The table argument should be a dict or a DataFrame built from that dict, containing the whole table:

Example:

```python
data = {
    "actors": ["brad pitt", "leonardo di caprio", "george clooney"],
    "age": ["56", "45", "59"],
    "number of movies": ["87", "53", "69"],
    "date of birth": ["7 february 1967", "10 june 1996", "28 november 1967"],
}
```

This dictionary can be passed in as such, or can be converted to a pandas DataFrame:

Example:

```python
import pandas as pd

table = pd.DataFrame.from_dict(data)
```

TextClassificationPipeline

class transformers.TextClassificationPipeline

< source >

( **kwargs )

Parameters

Text classification pipeline using any ModelForSequenceClassification. See the sequence classification examples for more information.

Example:

from transformers import pipeline

classifier = pipeline(model="distilbert/distilbert-base-uncased-finetuned-sst-2-english") classifier("This movie is disgustingly good !") [{'label': 'POSITIVE', 'score': 1.0}]

classifier("Director tried too much.") [{'label': 'NEGATIVE', 'score': 0.996}]

Learn more about the basics of using a pipeline in the pipeline tutorial

This text classification pipeline can currently be loaded from pipeline() using the following task identifier: "sentiment-analysis" (for classifying sequences according to positive or negative sentiments).

If multiple classification labels are available (model.config.num_labels >= 2), the pipeline will run a softmax over the results. If there is a single label, the pipeline will run a sigmoid over the result. In the case of regression tasks (model.config.problem_type == "regression"), no function is applied to the output.
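Both behaviors can be controlled at call time with top_k (how many labels to return) and function_to_apply. A sketch reusing the classifier from the example above:

```python
# Return scores for all labels instead of just the best one,
# and force the activation applied to the logits.
classifier("This movie is disgustingly good !", top_k=None, function_to_apply="sigmoid")
```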

The models that this pipeline can use are models that have been fine-tuned on a sequence classification task. See the up-to-date list of available models on huggingface.co/models.

__call__

< source >

( inputs **kwargs ) → A list or a list of list of dict

Parameters

Returns

A list or a list of list of dict

Each result comes as list of dictionaries with the following keys:

If top_k is used, one such dictionary is returned per label.

Classify the text(s) given as inputs.

TextGenerationPipeline

class transformers.TextGenerationPipeline

< source >

( *args **kwargs )

Parameters

Language generation pipeline using any ModelWithLMHead. This pipeline predicts the words that will follow a specified text prompt. When the underlying model is a conversational model, it can also accept one or more chats, in which case the pipeline will operate in chat mode and will continue the chat(s) by adding its response(s). Each chat takes the form of a list of dicts, where each dict contains “role” and “content” keys.

Examples:

from transformers import pipeline

generator = pipeline(model="openai-community/gpt2") generator("I can't believe you did such a ", do_sample=False) [{'generated_text': "I can't believe you did such a icky thing to me. I'm so sorry. I'm so sorry. I'm so sorry. I'm so sorry. I'm so sorry. I'm so sorry. I'm so sorry. I"}]

outputs = generator("My tart needs some", num_return_sequences=4, return_full_text=False)

from transformers import pipeline

generator = pipeline(model="HuggingFaceH4/zephyr-7b-beta")

generator([{"role": "user", "content": "What is the capital of France? Answer in one word."}], do_sample=False, max_new_tokens=2) [{'generated_text': [{'role': 'user', 'content': 'What is the capital of France? Answer in one word.'}, {'role': 'assistant', 'content': 'Paris'}]}]

Learn more about the basics of using a pipeline in the pipeline tutorial. You can pass text generation parameters to this pipeline to control stopping criteria, decoding strategy, and more. Learn more about text generation parameters in Text generation strategies and Text generation.
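For instance, common generation kwargs can be passed straight through the call. A minimal sketch reusing the GPT-2 generator from the first example:

```python
# Sample up to 40 new tokens with nucleus sampling instead of greedy decoding.
generator(
    "Once upon a time,",
    max_new_tokens=40,
    do_sample=True,
    top_p=0.95,
    temperature=0.8,
)
```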

This language generation pipeline can currently be loaded from pipeline() using the following task identifier: "text-generation".

The models that this pipeline can use are models that have been trained with an autoregressive language modeling objective. See the list of available text completion models and the list of conversational models on huggingface.co/models.

__call__

< source >

( text_inputs **kwargs ) → A list or a list of lists of dict

Parameters

Returns

A list or a list of lists of dict

Returns one of the following dictionaries (cannot return a combination of both generated_text and generated_token_ids):

Complete the prompt(s) given as inputs.

Text2TextGenerationPipeline

class transformers.Text2TextGenerationPipeline

< source >

( *args **kwargs )

Parameters

Pipeline for text to text generation using seq2seq models.

Example:

from transformers import pipeline

generator = pipeline(model="mrm8488/t5-base-finetuned-question-generation-ap") generator( ... "answer: Manuel context: Manuel has created RuPERTa-base with the support of HF-Transformers and Google" ... ) [{'generated_text': 'question: Who created the RuPERTa-base?'}]

Learn more about the basics of using a pipeline in the pipeline tutorial. You can pass text generation parameters to this pipeline to control stopping criteria, decoding strategy, and more. Learn more about text generation parameters in Text generation strategies and Text generation.

This Text2TextGenerationPipeline pipeline can currently be loaded from pipeline() using the following task identifier: "text2text-generation".

The models that this pipeline can use are models that have been fine-tuned on a translation task. See the up-to-date list of available models on huggingface.co/models. For a list of available parameters, see the following documentation.

Usage:

```python
text2text_generator = pipeline("text2text-generation")
text2text_generator("question: What is 42 ? context: 42 is the answer to life, the universe and everything")
```

__call__

< source >

( *args **kwargs ) → A list or a list of list of dict

Parameters

Returns

A list or a list of list of dict

Each result comes as a dictionary with the following keys:

Generate the output text(s) using text(s) given as inputs.

check_inputs

< source >

( input_length: int min_length: int max_length: int )

Checks whether there might be something wrong with the given input with regard to the model.

TokenClassificationPipeline

class transformers.TokenClassificationPipeline

< source >

( args_parser = <TokenClassificationArgumentHandler object> *args **kwargs )

Parameters

Named Entity Recognition pipeline using any ModelForTokenClassification. See the named entity recognition examples for more information.

Example:

from transformers import pipeline

token_classifier = pipeline(model="Jean-Baptiste/camembert-ner", aggregation_strategy="simple") sentence = "Je m'appelle jean-baptiste et je vis à montréal" tokens = token_classifier(sentence) tokens [{'entity_group': 'PER', 'score': 0.9931, 'word': 'jean-baptiste', 'start': 12, 'end': 26}, {'entity_group': 'LOC', 'score': 0.998, 'word': 'montréal', 'start': 38, 'end': 47}]

token = tokens[0]

sentence[token["start"] : token["end"]] ' jean-baptiste'

syntaxer = pipeline(model="vblagoje/bert-english-uncased-finetuned-pos", aggregation_strategy="simple") syntaxer("My name is Sarah and I live in London") [{'entity_group': 'PRON', 'score': 0.999, 'word': 'my', 'start': 0, 'end': 2}, {'entity_group': 'NOUN', 'score': 0.997, 'word': 'name', 'start': 3, 'end': 7}, {'entity_group': 'AUX', 'score': 0.994, 'word': 'is', 'start': 8, 'end': 10}, {'entity_group': 'PROPN', 'score': 0.999, 'word': 'sarah', 'start': 11, 'end': 16}, {'entity_group': 'CCONJ', 'score': 0.999, 'word': 'and', 'start': 17, 'end': 20}, {'entity_group': 'PRON', 'score': 0.999, 'word': 'i', 'start': 21, 'end': 22}, {'entity_group': 'VERB', 'score': 0.998, 'word': 'live', 'start': 23, 'end': 27}, {'entity_group': 'ADP', 'score': 0.999, 'word': 'in', 'start': 28, 'end': 30}, {'entity_group': 'PROPN', 'score': 0.999, 'word': 'london', 'start': 31, 'end': 37}]

Learn more about the basics of using a pipeline in the pipeline tutorial

This token recognition pipeline can currently be loaded from pipeline() using the following task identifier: "ner" (for predicting the classes of tokens in a sequence: person, organisation, location or miscellaneous).

The models that this pipeline can use are models that have been fine-tuned on a token classification task. See the up-to-date list of available models on huggingface.co/models.

__call__

< source >

( inputs: typing.Union[str, typing.List[str]] **kwargs ) → A list or a list of list of dict

Parameters

Returns

A list or a list of list of dict

Each result comes as a list of dictionaries (one for each token in the corresponding input, or each entity if this pipeline was instantiated with an aggregation_strategy) with the following keys:

Classify each token of the text(s) given as inputs.

aggregate_words

< source >

( entities: typing.List[dict] aggregation_strategy: AggregationStrategy )

Override tokens from a given word that disagree to force agreement on word boundaries.

Example: micro|soft| com|pany| B-ENT I-NAME I-ENT I-ENT will be rewritten with first strategy as microsoft| company| B-ENT I-ENT

gather_pre_entities

< source >

( sentence: str input_ids: ndarray scores: ndarray offset_mapping: typing.Optional[typing.List[typing.Tuple[int, int]]] special_tokens_mask: ndarray aggregation_strategy: AggregationStrategy )

Fuse various numpy arrays into dicts with all the information needed for aggregation

group_entities

< source >

( entities: typing.List[dict] )

Parameters

Find and group together the adjacent tokens with the same entity predicted.

group_sub_entities

< source >

( entities: typing.List[dict] )

Parameters

Group together the adjacent tokens with the same entity predicted.

TranslationPipeline

class transformers.TranslationPipeline

< source >

( *args **kwargs )

Parameters

Translates from one language to another.

This translation pipeline can currently be loaded from pipeline() using the following task identifier: "translation_xx_to_yy".

The models that this pipeline can use are models that have been fine-tuned on a translation task. See the up-to-date list of available models on huggingface.co/models. For a list of available parameters, see the following documentation

Usage:

```python
en_fr_translator = pipeline("translation_en_to_fr")
en_fr_translator("How old are you?")
```

__call__

< source >

( *args **kwargs ) → A list or a list of list of dict

Parameters

Returns

A list or a list of list of dict

Each result comes as a dictionary with the following keys:

Translate the text(s) given as inputs.

ZeroShotClassificationPipeline

class transformers.ZeroShotClassificationPipeline

< source >

( args_parser = <ZeroShotClassificationArgumentHandler object> *args **kwargs )

Parameters

NLI-based zero-shot classification pipeline using a ModelForSequenceClassification trained on NLI (natural language inference) tasks. Equivalent of text-classification pipelines, but these models don’t require a hardcoded number of potential classes; they can be chosen at runtime. This usually means it is slower, but it is much more flexible.

Any combination of sequences and labels can be passed and each combination will be posed as a premise/hypothesis pair and passed to the pretrained model. Then, the logit for entailment is taken as the logit for the candidate label being valid. Any NLI model can be used, but the id of the entailment label must be included in the model config’s :attr:~transformers.PretrainedConfig.label2id.

Example:

from transformers import pipeline

oracle = pipeline(model="facebook/bart-large-mnli") oracle( ... "I have a problem with my iphone that needs to be resolved asap!!", ... candidate_labels=["urgent", "not urgent", "phone", "tablet", "computer"], ... ) {'sequence': 'I have a problem with my iphone that needs to be resolved asap!!', 'labels': ['urgent', 'phone', 'computer', 'not urgent', 'tablet'], 'scores': [0.504, 0.479, 0.013, 0.003, 0.002]}

oracle( ... "I have a problem with my iphone that needs to be resolved asap!!", ... candidate_labels=["english", "german"], ... ) {'sequence': 'I have a problem with my iphone that needs to be resolved asap!!', 'labels': ['english', 'german'], 'scores': [0.814, 0.186]}
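Two call-time parameters worth knowing are hypothesis_template, which controls how each candidate label is turned into an NLI hypothesis, and multi_label, which scores each label independently instead of normalizing across candidates. A sketch reusing the oracle above:

```python
# Score labels independently and customize the hypothesis wording.
oracle(
    "I have a problem with my iphone that needs to be resolved asap!!",
    candidate_labels=["urgent", "phone", "computer"],
    hypothesis_template="This text is about {}.",
    multi_label=True,
)
```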

Learn more about the basics of using a pipeline in the pipeline tutorial

This NLI pipeline can currently be loaded from pipeline() using the following task identifier: "zero-shot-classification".

The models that this pipeline can use are models that have been fine-tuned on an NLI task. See the up-to-date list of available models on huggingface.co/models.

__call__

< source >

( sequences: typing.Union[str, typing.List[str]] *args **kwargs ) → A dict or a list of dict

Parameters

Returns

A dict or a list of dict

Each result comes as a dictionary with the following keys:

Classify the sequence(s) given as inputs. See the ZeroShotClassificationPipeline documentation for more information.

Multimodal

Pipelines available for multimodal tasks include the following.

DocumentQuestionAnsweringPipeline

class transformers.DocumentQuestionAnsweringPipeline

< source >

( *args **kwargs )

Parameters

Document Question Answering pipeline using any AutoModelForDocumentQuestionAnswering. The inputs/outputs are similar to the (extractive) question answering pipeline; however, the pipeline takes an image (and optional OCR’d words/boxes) as input instead of text context.

Example:

from transformers import pipeline

document_qa = pipeline(model="impira/layoutlm-document-qa") document_qa( ... image="https://huggingface.co/spaces/impira/docquery/resolve/2359223c1837a7587402bda0f2643382a6eefeab/invoice.png", ... question="What is the invoice number?", ... ) [{'score': 0.425, 'answer': 'us-001', 'start': 16, 'end': 16}]

Learn more about the basics of using a pipeline in the pipeline tutorial

This document question answering pipeline can currently be loaded from pipeline() using the following task identifier: "document-question-answering".

The models that this pipeline can use are models that have been fine-tuned on a document question answering task. See the up-to-date list of available models on huggingface.co/models.

__call__

< source >

( image: typing.Union[ForwardRef('Image.Image'), str] question: typing.Optional[str] = None word_boxes: typing.Optional[typing.Tuple[str, typing.List[float]]] = None **kwargs ) → A dict or a list of dict

Parameters

Returns

A dict or a list of dict

Each result comes as a dictionary with the following keys:

Answer the question(s) given as inputs by using the document(s). A document is defined as an image and an optional list of (word, box) tuples which represent the text in the document. If the word_boxes are not provided, it will use the Tesseract OCR engine (if available) to extract the words and boxes automatically for LayoutLM-like models which require them as input. For Donut, no OCR is run.

You can invoke the pipeline several ways:

- pipeline(image=image, question=question)
- pipeline({"image": image, "question": question})
- pipeline([{"image": image, "question": question}])
- pipeline([{"image": image, "question": question}, {"image": image, "question": question}])
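As a concrete sketch, the invocation forms listed above are equivalent (reusing the invoice image and question from the example):

```python
image = "https://huggingface.co/spaces/impira/docquery/resolve/2359223c1837a7587402bda0f2643382a6eefeab/invoice.png"
question = "What is the invoice number?"

# All of these call the pipeline on the same (image, question) pair:
document_qa(image=image, question=question)
document_qa({"image": image, "question": question})
document_qa([{"image": image, "question": question}])
```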

FeatureExtractionPipeline

class transformers.FeatureExtractionPipeline

( model: typing.Union[ForwardRef('PreTrainedModel'), ForwardRef('TFPreTrainedModel')] tokenizer: typing.Optional[transformers.tokenization_utils.PreTrainedTokenizer] = None feature_extractor: typing.Optional[ForwardRef('SequenceFeatureExtractor')] = None image_processor: typing.Optional[transformers.image_processing_utils.BaseImageProcessor] = None processor: typing.Optional[transformers.processing_utils.ProcessorMixin] = None modelcard: typing.Optional[transformers.modelcard.ModelCard] = None framework: typing.Optional[str] = None task: str = '' args_parser: ArgumentHandler = None device: typing.Union[int, ForwardRef('torch.device')] = None torch_dtype: typing.Union[str, ForwardRef('torch.dtype'), NoneType] = None binary_output: bool = False **kwargs )

Parameters

Feature extraction pipeline uses no model head. This pipeline extracts the hidden states from the base transformer, which can be used as features in downstream tasks.

Example:

from transformers import pipeline

extractor = pipeline(model="google-bert/bert-base-uncased", task="feature-extraction") result = extractor("This is a simple test.", return_tensors=True) result.shape
torch.Size([1, 8, 768])

Learn more about the basics of using a pipeline in the pipeline tutorial

This feature extraction pipeline can currently be loaded from pipeline() using the task identifier: "feature-extraction".

All models may be used for this pipeline. See a list of all models, including community-contributed models, on huggingface.co/models.

__call__

( *args **kwargs ) → A nested list of float

Parameters

Returns

A nested list of float

The features computed by the model.

Extract the features of the input(s).

ImageFeatureExtractionPipeline

class transformers.ImageFeatureExtractionPipeline

( model: typing.Union[ForwardRef('PreTrainedModel'), ForwardRef('TFPreTrainedModel')] tokenizer: typing.Optional[transformers.tokenization_utils.PreTrainedTokenizer] = None feature_extractor: typing.Optional[ForwardRef('SequenceFeatureExtractor')] = None image_processor: typing.Optional[transformers.image_processing_utils.BaseImageProcessor] = None processor: typing.Optional[transformers.processing_utils.ProcessorMixin] = None modelcard: typing.Optional[transformers.modelcard.ModelCard] = None framework: typing.Optional[str] = None task: str = '' args_parser: ArgumentHandler = None device: typing.Union[int, ForwardRef('torch.device')] = None torch_dtype: typing.Union[str, ForwardRef('torch.dtype'), NoneType] = None binary_output: bool = False **kwargs )

Parameters

Image feature extraction pipeline uses no model head. This pipeline extracts the hidden states from the base transformer, which can be used as features in downstream tasks.

Example:

from transformers import pipeline

extractor = pipeline(model="google/vit-base-patch16-224", task="image-feature-extraction") result = extractor("https://huggingface.co/datasets/Narsil/image_dummy/raw/main/parrots.png", return_tensors=True) result.shape
torch.Size([1, 197, 768])

Learn more about the basics of using a pipeline in the pipeline tutorial

This image feature extraction pipeline can currently be loaded from pipeline() using the task identifier: "image-feature-extraction".

All vision models may be used for this pipeline. See a list of all models, including community-contributed models, on huggingface.co/models.

__call__

( *args **kwargs ) → A nested list of float

Parameters

Returns

A nested list of float

The features computed by the model.

Extract the features of the input(s).

ImageToTextPipeline

class transformers.ImageToTextPipeline

< source >

( *args **kwargs )

Parameters

Image To Text pipeline using an AutoModelForVision2Seq. This pipeline predicts a caption for a given image.

Example:

from transformers import pipeline

captioner = pipeline(model="ydshieh/vit-gpt2-coco-en") captioner("https://huggingface.co/datasets/Narsil/image_dummy/raw/main/parrots.png") [{'generated_text': 'two birds are standing next to each other '}]

Learn more about the basics of using a pipeline in the pipeline tutorial

This image to text pipeline can currently be loaded from pipeline() using the following task identifier: "image-to-text".

See the list of available models on huggingface.co/models.

__call__

< source >

( inputs: typing.Union[str, typing.List[str], ForwardRef('Image.Image'), typing.List[ForwardRef('Image.Image')]] = None **kwargs ) → A list or a list of list of dict

Parameters

Returns

A list or a list of list of dict

Each result comes as a dictionary with the following key:

Assign labels to the image(s) passed as inputs.

ImageTextToTextPipeline

class transformers.ImageTextToTextPipeline

< source >

( *args **kwargs )

Parameters

Image-text-to-text pipeline using an AutoModelForImageTextToText. This pipeline generates text given an image and text. When the underlying model is a conversational model, it can also accept one or more chats, in which case the pipeline will operate in chat mode and will continue the chat(s) by adding its response(s). Each chat takes the form of a list of dicts, where each dict contains “role” and “content” keys.

Example:

from transformers import pipeline

pipe = pipeline(task="image-text-to-text", model="Salesforce/blip-image-captioning-base") pipe("https://huggingface.co/datasets/Narsil/image_dummy/raw/main/parrots.png", text="A photo of") [{'generated_text': 'a photo of two birds'}]

from transformers import pipeline

pipe = pipeline("image-text-to-text", model="llava-hf/llava-interleave-qwen-0.5b-hf") messages = [ { "role": "user", "content": [ { "type": "image", "url": "https://qianwen-res.oss-cn-beijing.aliyuncs.com/Qwen-VL/assets/demo.jpeg", }, {"type": "text", "text": "Describe this image."}, ], }, { "role": "assistant", "content": [ {"type": "text", "text": "There is a dog and"}, ], }, ] pipe(text=messages, max_new_tokens=20, return_full_text=False) [{'input_text': [{'role': 'user', 'content': [{'type': 'image', 'url': 'https://qianwen-res.oss-cn-beijing.aliyuncs.com/Qwen-VL/assets/demo.jpeg'}, {'type': 'text', 'text': 'Describe this image.'}]}, {'role': 'assistant', 'content': [{'type': 'text', 'text': 'There is a dog and'}]}], 'generated_text': ' a person in the image. The dog is sitting on the sand, and the person is sitting on'}]

Learn more about the basics of using a pipeline in the pipeline tutorial

This image-text to text pipeline can currently be loaded from pipeline() using the following task identifier: "image-text-to-text".

See the list of available models on huggingface.co/models.

__call__

< source >

( images: typing.Union[str, typing.List[str], typing.List[typing.List[str]], ForwardRef('Image.Image'), typing.List[ForwardRef('Image.Image')], typing.List[typing.List[ForwardRef('Image.Image')]], NoneType] = None text: typing.Union[str, typing.List[str], typing.List[dict], NoneType] = None **kwargs ) → A list or a list of list of dict

Parameters

Returns

A list or a list of list of dict

Each result comes as a dictionary with the following key (cannot return a combination of both generated_text and generated_token_ids):

Generate a text given text and the image(s) passed as inputs.

MaskGenerationPipeline

class transformers.MaskGenerationPipeline

< source >

( **kwargs )

Parameters

Automatic mask generation for images using SamForMaskGeneration. This pipeline predicts binary masks for a given image. It is a ChunkPipeline because the points can be separated into mini-batches in order to avoid OOM issues. Use the points_per_batch argument to control the number of points that will be processed at the same time. The default is 64.

The pipeline works in 3 steps:

  1. preprocess: A grid of 1024 evenly separated points is generated along with bounding boxes and point labels. For more details on how the points and bounding boxes are created, check the _generate_crop_boxes function. The image is also preprocessed using the image_processor. This function yields a minibatch of points_per_batch.
  2. forward: feeds the outputs of preprocess to the model. The image embedding is computed only once. Calls self.model.get_image_embeddings and makes sure that the gradients are not computed, and that the tensors and model are on the same device.
  3. postprocess: The most important part of the automatic mask generation happens here. Three steps are induced:
    • image_processor.postprocess_masks (run on each minibatch loop): takes in the raw output masks, resizes them according to the image size, and transforms them to binary masks.
    • image_processor.filter_masks (on each minibatch loop): uses both pred_iou_thresh and stability_scores. Also applies a variety of filters based on non maximum suppression to remove bad masks.
    • image_processor.postprocess_masks_for_amg applies the NMS on the masks to only keep relevant ones.

Example:

from transformers import pipeline

generator = pipeline(model="facebook/sam-vit-base", task="mask-generation") outputs = generator( ... "http://images.cocodataset.org/val2017/000000039769.jpg", ... )

outputs = generator( ... "https://huggingface.co/datasets/Narsil/image_dummy/raw/main/parrots.png", points_per_batch=128 ... )

Learn more about the basics of using a pipeline in the pipeline tutorial

This segmentation pipeline can currently be loaded from pipeline() using the following task identifier: "mask-generation".

See the list of available models on huggingface.co/models.

__call__

< source >

( image *args num_workers = None batch_size = None **kwargs ) → Dict

Parameters

Returns

Dict

A dictionary with the following keys:

Generates binary segmentation masks

VisualQuestionAnsweringPipeline

class transformers.VisualQuestionAnsweringPipeline

< source >

( *args **kwargs )

Parameters

Visual Question Answering pipeline using an AutoModelForVisualQuestionAnswering. This pipeline is currently only available in PyTorch.

Example:

from transformers import pipeline

oracle = pipeline(model="dandelin/vilt-b32-finetuned-vqa") image_url = "https://huggingface.co/datasets/Narsil/image_dummy/raw/main/lena.png" oracle(question="What is she wearing ?", image=image_url) [{'score': 0.948, 'answer': 'hat'}, {'score': 0.009, 'answer': 'fedora'}, {'score': 0.003, 'answer': 'clothes'}, {'score': 0.003, 'answer': 'sun hat'}, {'score': 0.002, 'answer': 'nothing'}]

oracle(question="What is she wearing ?", image=image_url, top_k=1) [{'score': 0.948, 'answer': 'hat'}]

oracle(question="Is this a person ?", image=image_url, top_k=1) [{'score': 0.993, 'answer': 'yes'}]

oracle(question="Is this a man ?", image=image_url, top_k=1) [{'score': 0.996, 'answer': 'no'}]

Learn more about the basics of using a pipeline in the pipeline tutorial

This visual question answering pipeline can currently be loaded from pipeline() using the following task identifiers: "visual-question-answering", "vqa".

The models that this pipeline can use are models that have been fine-tuned on a visual question answering task. See the up-to-date list of available models on huggingface.co/models.

__call__

< source >

( image: typing.Union[ForwardRef('Image.Image'), str, typing.List[ForwardRef('Image.Image')], typing.List[str], ForwardRef('KeyDataset')] question: typing.Union[str, typing.List[str], NoneType] = None **kwargs ) → A dictionary or a list of dictionaries containing the result. The dictionaries contain the following keys

Parameters

Returns

A dictionary or a list of dictionaries containing the result. The dictionaries contain the following keys

Answers open-ended questions about images. The pipeline accepts several types of inputs which are detailed below:

Parent class: Pipeline

class transformers.Pipeline

< source >

( model: typing.Union[ForwardRef('PreTrainedModel'), ForwardRef('TFPreTrainedModel')] tokenizer: typing.Optional[transformers.tokenization_utils.PreTrainedTokenizer] = None feature_extractor: typing.Optional[ForwardRef('SequenceFeatureExtractor')] = None image_processor: typing.Optional[transformers.image_processing_utils.BaseImageProcessor] = None processor: typing.Optional[transformers.processing_utils.ProcessorMixin] = None modelcard: typing.Optional[transformers.modelcard.ModelCard] = None framework: typing.Optional[str] = None task: str = '' args_parser: ArgumentHandler = None device: typing.Union[int, ForwardRef('torch.device')] = None torch_dtype: typing.Union[str, ForwardRef('torch.dtype'), NoneType] = None binary_output: bool = False **kwargs )

Parameters

The Pipeline class is the class from which all pipelines inherit. Refer to this class for methods shared across different pipelines.

Base class implementing pipelined operations. Pipeline workflow is defined as a sequence of the following operations:

Input -> Tokenization -> Model Inference -> Post-Processing (task dependent) -> Output

Pipeline supports running on CPU or GPU through the device argument (see below).

Some pipelines, like for instance FeatureExtractionPipeline ('feature-extraction'), output large tensor objects as nested lists. In order to avoid dumping such large structures as textual data, we provide the binary_output constructor argument. If set to True, the output will be stored in the pickle format.

check_model_type

< source >

( supported_models: typing.Union[typing.List[str], dict] )

Parameters

Check if the model class is supported by the pipeline.

device_placement

Context manager allowing tensor allocation on the user-specified device in a framework-agnostic way.

Examples:

```python
pipe = pipeline(..., device=0)
with pipe.device_placement():
    # every framework-specific tensor allocation will be done on the requested device
    output = pipe(...)
```

ensure_tensor_on_device

< source >

( **inputs ) → Dict[str, torch.Tensor]

Parameters

Returns

Dict[str, torch.Tensor]

The same as inputs but on the proper device.

Ensure PyTorch tensors are on the specified device.

postprocess

< source >

( model_outputs: ModelOutput **postprocess_parameters: typing.Dict )

Postprocess will receive the raw outputs of the _forward method, generally tensors, and reformat them into something more friendly. Generally it will output a list or a dict of results (containing just strings and numbers).

predict

Scikit / Keras interface to transformers’ pipelines. This method will forward to __call__().

preprocess

< source >

( input_: typing.Any **preprocess_parameters: typing.Dict )

Preprocess will take the input_ of a specific pipeline and return a dictionary of everything necessary for _forward to run properly. It should contain at least one tensor, but might have arbitrary other items.

push_to_hub

< source >

( repo_id: str use_temp_dir: typing.Optional[bool] = None commit_message: typing.Optional[str] = None private: typing.Optional[bool] = None token: typing.Union[bool, str, NoneType] = None max_shard_size: typing.Union[int, str, NoneType] = '5GB' create_pr: bool = False safe_serialization: bool = True revision: typing.Optional[str] = None commit_description: typing.Optional[str] = None tags: typing.Optional[list[str]] = None **deprecated_kwargs )

Parameters

Upload the pipeline file to the 🤗 Model Hub.

Examples:

```python
from transformers import pipeline

pipe = pipeline("google-bert/bert-base-cased")

# Push the pipe to your namespace with the name "my-finetuned-bert".
pipe.push_to_hub("my-finetuned-bert")

# Push the pipe to an organization with the name "my-finetuned-bert".
pipe.push_to_hub("huggingface/my-finetuned-bert")
```

save_pretrained

< source >

( save_directory: typing.Union[str, os.PathLike] safe_serialization: bool = True **kwargs )

Parameters

Save the pipeline’s model and tokenizer.
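A minimal sketch (the directory path is an arbitrary choice):

```python
from transformers import pipeline

pipe = pipeline("text-classification")

# Writes the model, tokenizer, and configuration to the given directory so the
# pipeline can be reloaded later, e.g. with pipeline(model="./my-text-classifier").
pipe.save_pretrained("./my-text-classifier")
```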

transform

Scikit / Keras interface to transformers’ pipelines. This method will forward to __call__().
