NeuronPerf API — AWS Neuron Documentation

This document is relevant for: Inf1, Inf2, Trn1, Trn2

NeuronPerf API#

Due to a bug in Sphinx, some of the type annotations may be incomplete. The source code is available for download; in the future, it will be hosted in a more browsable way.

compile(compile_fn, model, inputs, batch_sizes: Union[int, List[int]] = None, pipeline_sizes: Union[int, List[int]] = None, performance_levels: Union[str, List[int]] = None, models_dir: str = 'models', filename: str = None, compiler_args: dict = None, verbosity: int = 1, *args, **kwargs) → str:#

Compiles the provided model with each provided example input, pipeline size, and performance level. Any additional compiler_args passed will be forwarded to the compiler on every invocation.

Parameters:

Returns:

A model index filename. If a configuration fails to compile, it will not be included in the index and an error will be logged.

Return type:

str
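As a conceptual sketch (not the real NeuronPerf implementation), compile can be thought of as building one configuration per combination of batch size, pipeline size, and performance level, and recording each successful compile in the model index. The helper below illustrates how int-or-list parameters expand into that cross-product; the function name is hypothetical.

```python
import itertools

def enumerate_configs(batch_sizes, pipeline_sizes, performance_levels):
    """Yield every (batch_size, pipeline_size, performance_level) combination."""
    # Normalize single values to lists, since the API accepts int or List[int].
    as_list = lambda x: x if isinstance(x, list) else [x]
    return list(itertools.product(
        as_list(batch_sizes), as_list(pipeline_sizes), as_list(performance_levels)
    ))

configs = enumerate_configs([1, 8], 1, [1, 2])
# Four configurations: (1, 1, 1), (1, 1, 2), (8, 1, 1), (8, 1, 2)
```

Each tuple would correspond to one compiler invocation; configurations that fail to compile are simply omitted from the resulting index.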

benchmark(load_fn: Callable[[str, int], Any], model_filename: str, inputs: Any, batch_sizes: Union[int, List[int]] = None, duration: float = BENCHMARK_SECS, n_models: Union[int, List[int]] = None, pipeline_sizes: Union[int, List[int]] = None, cast_modes: Union[str, List[str]] = None, workers_per_model: Union[int, None] = None, env_setup_fn: Callable[[int, Dict], None] = None, setup_fn: Callable[[int, Dict, Any], None] = None, preprocess_fn: Callable[[Any], Any] = None, postprocess_fn: Callable[[Any], Any] = None, dataset_loader_fn: Callable[[Any, int], Any] = None, verbosity: int = 1, multiprocess: bool = True, multiinterpreter: bool = False, return_timers: bool = False, device_type: str = 'neuron') → List[Dict]:#

Benchmarks the model index or individual model using the provided inputs. If a model index is provided, additional fields such as pipeline_sizes and performance_levels can be used to filter the models to benchmark. The default behavior is to benchmark all configurations in the model index.

Parameters:

Returns:

A list of benchmarking results.

Return type:

list[dict]
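The filtering behavior described above can be sketched as follows. This is a conceptual illustration with an assumed index layout, not NeuronPerf's actual schema: with no filters, every config in the index is benchmarked; passing a filter such as batch_sizes narrows the selection.

```python
# Hypothetical helper illustrating benchmark()'s config selection.
def select_configs(index, batch_sizes=None, pipeline_sizes=None):
    """Return configs matching the requested filters (None means no filter)."""
    def keep(cfg):
        if batch_sizes is not None and cfg["batch_size"] not in batch_sizes:
            return False
        if pipeline_sizes is not None and cfg["pipeline_size"] not in pipeline_sizes:
            return False
        return True
    return [cfg for cfg in index["model_configs"] if keep(cfg)]

index = {"model_configs": [
    {"batch_size": 1, "pipeline_size": 1},
    {"batch_size": 8, "pipeline_size": 1},
]}
all_cfgs = select_configs(index)               # default: everything runs
big_only = select_configs(index, batch_sizes=[8])
```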

get_reports(results)#

Summarizes and combines the detailed results from neuronperf.benchmark, when run with return_timers=True. One report dictionary is produced per model configuration benchmarked. The list of reports can be fed directly to other reporting utilities, such as neuronperf.write_csv.

Parameters:

Returns:

A list of dictionaries that summarize the results for each model configuration.

Return type:

list[dict]
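To make the summarization step concrete, here is a simplified sketch of turning raw per-inference timings into a report dictionary. The field names mirror those shown in the docs (throughput_avg, latency_ms_p50, latency_ms_p99), but the statistics below are deliberately simplified and are not NeuronPerf's exact computation.

```python
def summarize(latencies_ms, duration_s):
    """Reduce raw latency samples to a summary report (simplified percentiles)."""
    lat = sorted(latencies_ms)
    # Nearest-rank style percentile, clamped to the last sample.
    pct = lambda p: lat[min(len(lat) - 1, int(p / 100 * len(lat)))]
    return {
        "throughput_avg": len(lat) / duration_s,
        "latency_ms_p50": pct(50),
        "latency_ms_p99": pct(99),
    }

report = summarize([5.9, 6.0, 6.1, 6.2], duration_s=2.0)
# 4 inferences over 2 seconds -> throughput_avg of 2.0
```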

print_reports(reports, cols=SUMMARY_COLS, sort_by='throughput_peak', reverse=False)#

Print a report to the terminal. Example of default behavior:

neuronperf.print_reports(reports)
throughput_avg  latency_ms_p50  latency_ms_p99  n_models  pipeline_size  workers_per_model  batch_size  model_filename
329.667         6.073           6.109           1         1              2                  1           models/model_b1_p1_83bh3hhs.pt

Parameters:
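The tabular display can be sketched as a sort-then-format pass over the reports. This is an illustrative stand-in (the column set and widths are assumptions, not SUMMARY_COLS), but it shows the role of the sort_by and reverse parameters.

```python
def format_reports(reports, cols, sort_by="throughput_peak", reverse=False):
    """Sort reports by one column, then render the chosen columns as a table."""
    rows = sorted(reports, key=lambda r: r[sort_by], reverse=reverse)
    header = "  ".join(f"{c:>16}" for c in cols)
    lines = ["  ".join(f"{str(r[c]):>16}" for c in cols) for r in rows]
    return "\n".join([header] + lines)

reports = [
    {"throughput_peak": 300, "batch_size": 1},
    {"throughput_peak": 500, "batch_size": 8},
]
table = format_reports(reports, ["throughput_peak", "batch_size"])
print(table)
```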

write_csv(reports: list[dict], filename: str = None, cols=REPORT_COLS)#

Write benchmarking reports to CSV file.

Parameters:

Returns:

The filename written.

Return type:

str
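A minimal sketch of the CSV-writing step, using an in-memory buffer so it is self-contained; the column list here is illustrative, not NeuronPerf's REPORT_COLS. Fields not in the column list are dropped rather than raising an error.

```python
import csv
import io

def reports_to_csv(reports, cols):
    """Render report dicts as CSV text with a fixed column order."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=cols, extrasaction="ignore")
    writer.writeheader()
    writer.writerows(reports)
    return buf.getvalue()

csv_text = reports_to_csv(
    [{"throughput_avg": 329.667, "batch_size": 1, "extra": "dropped"}],
    cols=["throughput_avg", "batch_size"],
)
```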

write_json(reports: list[dict], filename: str = None)#

Writes benchmarking reports to a JSON file.

Parameters:

reports (list[dict]) – Results from neuronperf.get_reports.

filename (str) – Filename to write. If not provided, generated from model_name in report and current timestamp.

Returns:

The filename written.

Return type:

str
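The default-filename behavior can be sketched as below. The exact format NeuronPerf uses is not documented here, so the pattern (model name, then a timestamp, then .json) is an assumption for illustration.

```python
import json
import time

def default_filename(report):
    """Derive an output filename from the report's model name and a timestamp."""
    stamp = time.strftime("%Y%m%d-%H%M%S")
    return f"{report.get('model_name', 'model')}.{stamp}.json"

def reports_to_json(reports):
    """Serialize the reports list for writing to disk."""
    return json.dumps(reports, indent=2)

name = default_filename({"model_name": "resnet50"})
# e.g. "resnet50.20240101-120000.json"
```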

model_index.append(*model_indexes: Union[str, dict]) → dict:#

Appends the model indexes non-destructively into a new model index, without modifying any of the internal data.

This is useful if you have benchmarked multiple related models and wish to combine their respective model indexes into a single index.

Model name will be taken from the first index provided. Duplicate configs will be filtered.

Parameters:

model_indexes – Model indexes or paths to model indexes to combine.

Returns:

A new dictionary representing the combined model index.

Return type:

dict
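The merge semantics described above (name taken from the first index, duplicate configs filtered, inputs left unmodified) can be sketched with plain dicts. The index layout here is an assumption for illustration, not NeuronPerf's actual schema.

```python
def append_indexes(*indexes):
    """Combine model indexes non-destructively into a new index."""
    combined = {"model_name": indexes[0]["model_name"], "model_configs": []}
    seen = []
    for idx in indexes:
        for cfg in idx["model_configs"]:
            if cfg not in seen:  # filter duplicate configs
                seen.append(cfg)
                combined["model_configs"].append(cfg)
    return combined

a = {"model_name": "resnet50", "model_configs": [{"batch_size": 1}]}
b = {"model_name": "other", "model_configs": [{"batch_size": 1}, {"batch_size": 8}]}
merged = append_indexes(a, b)
# Name comes from the first index; the duplicate batch_size=1 config is dropped.
```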

model_index.copy(old_index: Union[str, dict], new_index: str, new_dir: str) → str:#

Copy an index to a new location. Will rename old_index to new_index and copy all model files into new_dir, updating the index paths.

This is useful for pulling individual models out of a pool.

Returns the path to the new index.

model_index.create(filename, input_idx=0, batch_size=1, pipeline_size=1, cast_mode=DEFAULT_CAST, compile_s=None)#

Create a new model index from a pre-compiled model.

Parameters:

Returns:

A new dictionary representing a model index.

Return type:

dict
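Conceptually, create wraps a single pre-compiled model file in a one-entry index so it can be benchmarked like a compiled set. The sketch below assumes a simplified index layout for illustration.

```python
def create_index(filename, input_idx=0, batch_size=1, pipeline_size=1):
    """Build a minimal one-entry model index for a pre-compiled model."""
    return {
        "model_configs": [{
            "filename": filename,
            "input_idx": input_idx,
            "batch_size": batch_size,
            "pipeline_size": pipeline_size,
        }]
    }

index = create_index("models/model_b1_p1.pt", batch_size=1)
```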

model_index.delete(filename: str):#

Deletes the model index and all associated models referenced by the index.

model_index.filter(index: Union[str, dict], **kwargs) → dict:#

Filters provided model index on provided criteria and returns a new index. Each kwarg is a standard (k, v) pair, where k is treated as a filter name and v may be one or more values used to filter model configs.
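The (k, v) filtering rule can be sketched as follows, again with an assumed index layout: each kwarg names a config field, and its value may be a single value or a list of accepted values.

```python
def filter_index(index, **kwargs):
    """Return a new index containing only configs matching every kwarg filter."""
    def matches(cfg):
        for key, accepted in kwargs.items():
            accepted = accepted if isinstance(accepted, list) else [accepted]
            if cfg.get(key) not in accepted:
                return False
        return True
    return {**index, "model_configs": [c for c in index["model_configs"] if matches(c)]}

index = {"model_configs": [{"batch_size": 1}, {"batch_size": 4}, {"batch_size": 8}]}
small = filter_index(index, batch_size=[1, 4])
# keeps only the batch_size 1 and 4 configs
```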

model_index.load(filename) → dict:#

Load a NeuronPerf model index from a file.

model_index.move(old_index: str, new_index: str, new_dir: str) → str:#

This is the same as copy followed by delete on the old index.

model_index.save(model_index, filename: str = None, root_dir=None) → str:#

Save a NeuronPerf model index to a file.
