This document is relevant for: Inf1, Inf2, Trn1, Trn2
NeuronPerf FAQ
Table of contents
- When should I use NeuronPerf?
- When should I not use NeuronPerf?
- Which frameworks does NeuronPerf support?
- Which Neuron instance types does NeuronPerf support?
- Is NeuronPerf Open Source?
- What is the secret to obtaining the best numbers?
- What are the “best practices” that NeuronPerf uses?
When should I use NeuronPerf?
When you want to measure the highest achievable performance for your model with Neuron.
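For example, a benchmarking run on a Neuron-compiled TorchScript model might look like the sketch below. The model filename and input shape are placeholders, and the exact benchmark() signature varies by framework; see the NeuronPerf Framework Notes for details.

```python
import torch
import neuronperf as npf
import neuronperf.torch  # framework submodules also exist for tensorflow and mxnet

# Placeholder: a model already compiled for Neuron and saved to disk.
filename = "model_neuron_b1.pt"

# Example input matching the shape/batch size the model was compiled for.
inputs = torch.zeros((1, 3, 224, 224), dtype=torch.float32)

# Benchmark and print throughput/latency statistics.
reports = npf.torch.benchmark(filename, inputs, batch_sizes=[1])
npf.print_reports(reports)
```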
When should I not use NeuronPerf?
When measuring end-to-end performance that includes your network serving stack. Instead, you should compare your end-to-end numbers with those obtained by NeuronPerf to identify and reduce your serving overhead.
Which frameworks does NeuronPerf support?
See NeuronPerf Framework Notes.
Which Neuron instance types does NeuronPerf support?
PyTorch and TensorFlow are supported on all instance types. MXNet support is limited to Inf1.
Is NeuronPerf Open Source?
Yes. You can download the source here.
What is the secret to obtaining the best numbers?
There is no secret sauce. NeuronPerf follows best practices.
What are the “best practices” that NeuronPerf uses?
- These vary slightly by framework and how your model was compiled
- For a model compiled for a single NeuronCore (DataParallel):
  - To maximize throughput, for N models, use 2 * N worker threads (see the sketch after this list)
  - To minimize latency, use 1 worker thread per model
- Use a new Python process for each model to avoid GIL contention
- Ensure you benchmark long enough for your numbers to stabilize
- Ignore outliers at the start and end of inference benchmarking
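As a rough, self-contained illustration of these practices (not NeuronPerf's actual implementation), the sketch below runs each model copy in its own Python process, drives it with 2 worker threads, and discards warm-up iterations before recording latencies. Here, load_model and run_inference are hypothetical placeholders for your framework's load and forward calls.

```python
import time
import threading
import multiprocessing as mp

# Hypothetical placeholders for your framework's calls, e.g. loading a
# Neuron-compiled TorchScript model and running a forward pass on it.
def load_model(path):
    ...

def run_inference(model, example):
    ...

def worker(model, example, latencies, stop, warmup=100):
    """Run inferences in a loop, discarding the first `warmup` results."""
    count = 0
    while not stop.is_set():
        start = time.monotonic()
        run_inference(model, example)
        count += 1
        if count > warmup:  # ignore outliers at the start of benchmarking
            latencies.append(time.monotonic() - start)

def benchmark_one_model(path, example, duration=60, threads_per_model=2):
    """Benchmark one model copy; run this in its own process to avoid GIL contention."""
    model = load_model(path)
    latencies, stop = [], threading.Event()
    threads = [
        threading.Thread(target=worker, args=(model, example, latencies, stop))
        for _ in range(threads_per_model)  # 2 threads per model for throughput, 1 for latency
    ]
    for t in threads:
        t.start()
    time.sleep(duration)  # benchmark long enough for the numbers to stabilize
    stop.set()
    for t in threads:
        t.join()
    return latencies  # also trim the tail before reporting percentiles

if __name__ == "__main__":
    n_models = 4  # e.g. one model copy per NeuronCore
    with mp.Pool(n_models) as pool:  # a separate Python process per model
        all_latencies = pool.starmap(
            benchmark_one_model, [("model_neuron.pt", None)] * n_models
        )
```

In practice NeuronPerf handles this orchestration for you; the sketch only makes the guidance above concrete.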