Speed Optimization - ESP32 - ESP-IDF Programming Guide latest documentation

Overview

Optimizing execution speed is a key element of software performance. Code that executes faster can also have other positive effects, e.g., reducing overall power consumption. However, improving execution speed may have trade-offs with other aspects of performance such as Minimizing Binary Size.

Choose What to Optimize

If a function in the application firmware is executed once per week in the background, it may not matter if that function takes 10 ms or 100 ms to execute. If a function is executed constantly at 10 Hz, it matters greatly if it takes 10 ms or 100 ms to execute: each period is 100 ms long, so the function consumes either 10% or 100% of the available CPU time.

Most kinds of application firmware only have a small set of functions that require optimal performance. Perhaps those functions are executed very often, or have to meet some application requirements for latency or throughput. Optimization efforts should be targeted at these particular functions.

Measuring Performance

The first step to improving something is to measure it.

Basic Performance Measurements

You may be able to measure performance directly against an external interaction with the world. For example, see the examples wifi/iperf and ethernet/iperf for measuring general network performance. Alternatively, you can use an oscilloscope or logic analyzer to measure the timing of an interaction with a device peripheral.

Otherwise, one way to measure performance is to augment the code to take timing measurements:

#include <inttypes.h>
#include <stdio.h>
#include "esp_timer.h"

void measure_important_function(void)
{
    const unsigned MEASUREMENTS = 5000;
    int64_t start = esp_timer_get_time(); // microseconds since boot

    for (unsigned retries = 0; retries < MEASUREMENTS; retries++) {
        important_function(); // This is the thing you need to measure
    }

    int64_t end = esp_timer_get_time();

    printf("%u iterations took %" PRId64 " milliseconds (%" PRId64 " microseconds per invocation)\n",
           MEASUREMENTS, (end - start) / 1000, (end - start) / MEASUREMENTS);
}

Executing the target multiple times helps average out factors such as RTOS context switches and the overhead of the measurements themselves.

External Tracing

The Application Level Tracing Library allows measuring code execution with minimal impact on the code itself.

Tasks

If the option CONFIG_FREERTOS_GENERATE_RUN_TIME_STATS is enabled, then the FreeRTOS API vTaskGetRunTimeStats() can be used to retrieve runtime information about the processor time used by each FreeRTOS task.
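As a rough sketch of how this might look in practice, the helper below prints the run-time table for all tasks. The function name print_task_runtime_stats and the 48-bytes-per-task buffer estimate are assumptions made for illustration, and the call is only available when the FreeRTOS stats formatting functions are also enabled in the project configuration.

```c
#include <stdio.h>
#include <stdlib.h>
#include "freertos/FreeRTOS.h"
#include "freertos/task.h"

// Hypothetical helper: prints one line of run-time information per task.
void print_task_runtime_stats(void)
{
    // Assumption: roughly 48 bytes of text per task is enough.
    // Increase this if the table appears truncated or task names are long.
    size_t buf_size = (size_t)uxTaskGetNumberOfTasks() * 48 + 1;
    char *buf = malloc(buf_size);
    if (buf == NULL) {
        return; // not enough heap to build the report
    }

    vTaskGetRunTimeStats(buf); // fills buf with a NUL-terminated text table
    fputs(buf, stdout);
    free(buf);
}
```

Calling such a helper periodically from a low-priority task keeps the reporting itself from disturbing the tasks being measured more than necessary.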

SEGGER SystemView is an excellent tool for visualizing task execution and looking for performance issues or improvements in the system as a whole.

Improving Overall Speed

The following optimizations improve the execution of nearly all code, including boot times, throughput, latency, etc.:

Reduce Logging Overhead

Although standard output is buffered, it is possible for an application to be limited by the rate at which it can print data to log output once buffers are full. This is particularly relevant for startup time if a lot of output is logged, but such problems can happen at other times as well. There are multiple ways to solve this problem:

Not Recommended

The following options also increase execution speed, but are not recommended as they also reduce the debuggability of the firmware application and may increase the severity of any bugs.

Targeted Optimizations

The following changes increase the speed of a chosen part of the firmware application:

Improving Startup Time

In addition to the overall performance improvements shown above, the following options can be tweaked to specifically reduce startup time:

The example project system/startup_time is pre-configured to optimize startup time. The file system/startup_time/sdkconfig.defaults contains all of these settings. You can append these to the end of your project's own sdkconfig file to merge the settings, but please read the documentation for each setting first.

Task Priorities

As ESP-IDF FreeRTOS is a real-time operating system, it is necessary to ensure that high-throughput or low-latency tasks are granted a high priority in order to run immediately. Priority is set when calling xTaskCreate() or xTaskCreatePinnedToCore() and can be changed at runtime by calling vTaskPrioritySet().

It is also necessary to ensure that tasks yield the CPU (by calling vTaskDelay(), sleep(), or by blocking on semaphores, queues, task notifications, etc.) so that they do not starve lower-priority tasks and cause problems for the overall system. The Task Watchdog Timer (TWDT) provides a mechanism to automatically detect task starvation. However, note that a TWDT timeout does not always indicate a problem, because sometimes the correct operation of the firmware requires some long-running computation. In these cases, tweaking the TWDT timeout or even disabling the TWDT may be necessary.

Built-in Task Priorities

ESP-IDF starts a number of system tasks at fixed priority levels. Some are automatically started during the boot process, while some are started only if the application firmware initializes a particular feature. To optimize performance, structure the task priorities of your application properly to ensure the tasks are not delayed by the system tasks, while also not starving system tasks and impacting other functions of the system.

This may require splitting up a particular task. For example, perform a time-critical operation in a high-priority task or an interrupt handler and do the non-time-critical part in a lower-priority task.
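The split described above could be sketched as follows, using a FreeRTOS task notification to hand work from a high-priority task to a low-priority one. The functions wait_for_event, handle_event_quickly and process_result_slowly, the task names, the stack sizes, and the priority values 10 and 2 are all hypothetical placeholders; suitable priorities depend on the system tasks present in a given application.

```c
#include "freertos/FreeRTOS.h"
#include "freertos/task.h"

// Hypothetical application functions, assumed to be defined elsewhere.
extern void wait_for_event(void);        // blocks until there is work to do
extern void handle_event_quickly(void);  // short, time-critical part
extern void process_result_slowly(void); // long-running, non-time-critical part

static TaskHandle_t s_worker_task;

// Low priority: does the slow work whenever it has been notified.
static void worker_task(void *arg)
{
    for (;;) {
        ulTaskNotifyTake(pdTRUE, portMAX_DELAY); // block until notified
        process_result_slowly();
    }
}

// High priority: reacts immediately, then defers the rest to the worker.
static void critical_task(void *arg)
{
    for (;;) {
        wait_for_event();
        handle_event_quickly();
        xTaskNotifyGive(s_worker_task);
    }
}

void start_split_tasks(void)
{
    // Create the worker first so its handle is valid before it is notified.
    xTaskCreate(worker_task, "worker", 4096, NULL, 2, &s_worker_task);
    xTaskCreate(critical_task, "critical", 4096, NULL, 10, NULL);
}
```

Because both tasks block while idle, neither starves lower-priority tasks, and the slow processing can be preempted whenever a new event arrives.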

The header components/esp_system/include/esp_task.h contains macros for the priority levels used by the built-in ESP-IDF system tasks. See Background Tasks for more details about the system tasks.

Common priorities are:

Choosing Task Priorities of the Application

Note

Task execution is always completely suspended when writing to the built-in SPI flash chip. Only IRAM-Safe Interrupt Handlers continue executing.

Improving Interrupt Performance

ESP-IDF supports dynamic Interrupt Allocation with interrupt preemption. Each interrupt in the system has a priority, and higher-priority interrupts preempt lower-priority ones.

Interrupt handlers execute in preference to any task, provided the task is not inside a critical section. For this reason, it is important to minimize the amount of time spent executing an interrupt handler.

To obtain the best performance for a particular interrupt handler:

Improving Network Speed

Improving I/O Performance

Using standard C library functions like fread and fwrite instead of platform-specific unbuffered syscalls such as read and write may result in slower performance.

The fread and fwrite functions are designed for portability rather than speed, introducing some overhead due to their buffered nature. Check the example storage/fatfs/getting_started to see how to use these two functions.

In contrast, the read and write functions are standard POSIX APIs that can be used directly when working with FatFs through VFS, with ESP-IDF handling the underlying implementation. Check the example storage/fatfs/fs_operations to see how to use these two functions.
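To make the difference concrete, here is a minimal sketch that places the unbuffered POSIX calls next to their buffered C library counterpart. The helper names are hypothetical and error handling is reduced to a bare minimum; the path argument is assumed to point into a mounted filesystem.

```c
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

// Unbuffered write: the data is handed to the filesystem layer in one call.
// Returns the number of bytes written, or -1 on error.
long write_with_posix(const char *path, const void *data, size_t len)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0) {
        return -1;
    }
    ssize_t n = write(fd, data, len);
    close(fd);
    return (long)n;
}

// Unbuffered read: returns the number of bytes read into out, or -1 on error.
long read_with_posix(const char *path, void *out, size_t len)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0) {
        return -1;
    }
    ssize_t n = read(fd, out, len);
    close(fd);
    return (long)n;
}

// Buffered read: fread goes through the C library's stream buffer first.
// Returns the number of bytes read into out, or -1 if the file cannot be opened.
long read_with_stdio(const char *path, void *out, size_t len)
{
    FILE *f = fopen(path, "rb");
    if (f == NULL) {
        return -1;
    }
    size_t n = fread(out, 1, len, f);
    fclose(f);
    return (long)n;
}
```

The unbuffered variants tend to pay off when the application already transfers data in large blocks, since the extra copy through the stream buffer then adds little value.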

Additional tips are provided below, and further details can be found in FAT Filesystem Support.