uctypes – access binary data in a structured way — MicroPython latest documentation (original) (raw)

This module implements “foreign data interface” for MicroPython. The idea behind it is similar to CPython’s ctypes modules, but the actual API is different, streamlined and optimized for small size. The basic idea of the module is to define data structure layout with about the same power as the C language allows, and then access it using familiar dot-syntax to reference sub-fields.

Warning

uctypes module allows access to arbitrary memory addresses of the machine (including I/O and control registers). Uncareful usage of it may lead to crashes, data loss, and even hardware malfunction.

See also

Module struct

Standard Python way to access binary data structures (doesn’t scale well to large and complex structures).

Usage examples:

import uctypes

Example 1: Subset of ELF file header

https://wikipedia.org/wiki/Executable_and_Linkable_Format#File_header

ELF_HEADER = { "EI_MAG": (0x0 | uctypes.ARRAY, 4 | uctypes.UINT8), "EI_DATA": 0x5 | uctypes.UINT8, "e_machine": 0x12 | uctypes.UINT16, }

"f" is an ELF file opened in binary mode

buf = f.read(uctypes.sizeof(ELF_HEADER, uctypes.LITTLE_ENDIAN)) header = uctypes.struct(uctypes.addressof(buf), ELF_HEADER, uctypes.LITTLE_ENDIAN) assert header.EI_MAG == b"\x7fELF" assert header.EI_DATA == 1, "Oops, wrong endianness. Could retry with uctypes.BIG_ENDIAN." print("machine:", hex(header.e_machine))

Example 2: In-memory data structure, with pointers

COORD = { "x": 0 | uctypes.FLOAT32, "y": 4 | uctypes.FLOAT32, }

STRUCT1 = { "data1": 0 | uctypes.UINT8, "data2": 4 | uctypes.UINT32, "ptr": (8 | uctypes.PTR, COORD), }

Suppose you have address of a structure of type STRUCT1 in "addr"

uctypes.NATIVE is optional (used by default)

struct1 = uctypes.struct(addr, STRUCT1, uctypes.NATIVE) print("x:", struct1.ptr[0].x)

Example 3: Access to CPU registers. Subset of STM32F4xx WWDG block

WWDG_LAYOUT = { "WWDG_CR": (0, { # BFUINT32 here means size of the WWDG_CR register "WDGA": 7 << uctypes.BF_POS | 1 << uctypes.BF_LEN | uctypes.BFUINT32, "T": 0 << uctypes.BF_POS | 7 << uctypes.BF_LEN | uctypes.BFUINT32, }), "WWDG_CFR": (4, { "EWI": 9 << uctypes.BF_POS | 1 << uctypes.BF_LEN | uctypes.BFUINT32, "WDGTB": 7 << uctypes.BF_POS | 2 << uctypes.BF_LEN | uctypes.BFUINT32, "W": 0 << uctypes.BF_POS | 7 << uctypes.BF_LEN | uctypes.BFUINT32, }), }

WWDG = uctypes.struct(0x40002c00, WWDG_LAYOUT)

WWDG.WWDG_CFR.WDGTB = 0b10 WWDG.WWDG_CR.WDGA = 1 print("Current counter:", WWDG.WWDG_CR.T)

Defining structure layout

Structure layout is defined by a “descriptor” - a Python dictionary which encodes field names as keys and other properties required to access them as associated values:

{ "field1": , "field2": , ... }

Currently, uctypes requires explicit specification of offsets for each field. Offset are given in bytes from the structure start.

Following are encoding examples for various field types:

Module contents

class uctypes.struct(addr, descriptor, layout_type=NATIVE, /)

Instantiate a “foreign data structure” object based on structure address in memory, descriptor (encoded as a dictionary), and layout type (see below).

uctypes.LITTLE_ENDIAN

Layout type for a little-endian packed structure. (Packed means that every field occupies exactly as many bytes as defined in the descriptor, i.e. the alignment is 1).

uctypes.BIG_ENDIAN

Layout type for a big-endian packed structure.

uctypes.NATIVE

Layout type for a native structure - with data endianness and alignment conforming to the ABI of the system on which MicroPython runs.

uctypes.sizeof(struct, layout_type=NATIVE, /)

Return size of data structure in bytes. The struct argument can be either a structure class or a specific instantiated structure object (or its aggregate field).

uctypes.addressof(obj)

Return address of an object. Argument should be bytes, bytearray or other object supporting buffer protocol (and address of this buffer is what actually returned).

uctypes.bytes_at(addr, size)

Capture memory at the given address and size as bytes object. As bytes object is immutable, memory is actually duplicated and copied into bytes object, so if memory contents change later, created object retains original value.

uctypes.bytearray_at(addr, size)

Capture memory at the given address and size as bytearray object. Unlike bytes_at() function above, memory is captured by reference, so it can be both written too, and you will access current value at the given memory address.

uctypes.UINT8

uctypes.INT8

uctypes.UINT16

uctypes.INT16

uctypes.UINT32

uctypes.INT32

uctypes.UINT64

uctypes.INT64

Integer types for structure descriptors. Constants for 8, 16, 32, and 64 bit types are provided, both signed and unsigned.

uctypes.FLOAT32

uctypes.FLOAT64

Floating-point types for structure descriptors.

uctypes.VOID

VOID is an alias for UINT8, and is provided to conveniently define C’s void pointers: (uctypes.PTR, uctypes.VOID).

uctypes.PTR

uctypes.ARRAY

Type constants for pointers and arrays. Note that there is no explicit constant for structures, it’s implicit: an aggregate type without PTRor ARRAY flags is a structure.

Structure descriptors and instantiating structure objects

Given a structure descriptor dictionary and its layout type, you can instantiate a specific structure instance at a given memory address using uctypes.struct() constructor. Memory address usually comes from following sources:

Structure objects

Structure objects allow accessing individual fields using standard dot notation: my_struct.substruct1.field1. If a field is of scalar type, getting it will produce a primitive value (Python integer or float) corresponding to the value contained in a field. A scalar field can also be assigned to.

If a field is an array, its individual elements can be accessed with the standard subscript operator [] - both read and assigned to.

If a field is a pointer, it can be dereferenced using [0] syntax (corresponding to C * operator, though [0] works in C too). Subscripting a pointer with other integer values but 0 are also supported, with the same semantics as in C.

Summing up, accessing structure fields generally follows the C syntax, except for pointer dereference, when you need to use [0] operator instead of *.

Limitations

1. Accessing non-scalar fields leads to allocation of intermediate objects to represent them. This means that special care should be taken to layout a structure which needs to be accessed when memory allocation is disabled (e.g. from an interrupt). The recommendations are:

2. Range of offsets supported by the uctypes module is limited. The exact range supported is considered an implementation detail, and the general suggestion is to split structure definitions to cover from a few kilobytes to a few dozen of kilobytes maximum. In most cases, this is a natural situation anyway, e.g. it doesn’t make sense to define all registers of an MCU (spread over 32-bit address space) in one structure, but rather a peripheral block by peripheral block. In some extreme cases, you may need to split a structure in several parts artificially (e.g. if accessing native data structure with multi-megabyte array in the middle, though that would be a very synthetic case).