dlpack — Python array API standard 2024.12 documentation
array.__dlpack__(*, stream: int | Any | None = None, max_version: tuple[int, int] | None = None, dl_device: tuple[Enum, int] | None = None, copy: bool | None = None) → PyCapsule
Exports the array for consumption by from_dlpack() as a DLPack capsule.
Parameters:
- self (array) – array instance.
- stream (Optional[Union[int, Any]]) –
for CUDA and ROCm, a Python integer representing a pointer to a stream, on devices that support streams. stream is provided by the consumer to the producer to instruct the producer to ensure that operations can safely be performed on the array (e.g., by inserting a dependency between streams via "wait for event"). The pointer must be an integer larger than or equal to -1 (see below for allowed values on each platform). If stream is -1, the value may be used by the consumer to signal "producer must not perform any synchronization". The ownership of the stream stays with the consumer. On CPU and other device types without streams, only None is accepted.
For other device types which do have a stream, queue, or similar synchronization/ordering mechanism, the most appropriate type to use for stream is not yet determined. E.g., for SYCL one may want to use an object containing an in-order cl::sycl::queue. This is allowed when libraries agree on such a convention, and may be standardized in a future version of this API standard.
Note
Support for a stream value other than None is optional and implementation-dependent.
Device-specific values of stream for CUDA:
- None: producer must assume the legacy default stream (default).
- 1: the legacy default stream.
- 2: the per-thread default stream.
- > 2: stream number represented as a Python integer.
- 0 is disallowed due to its ambiguity: 0 could mean either None, 1, or 2.
Device-specific values of stream for ROCm:
- None: producer must assume the legacy default stream (default).
- 0: the default stream.
- > 2: stream number represented as a Python integer.
- Using 1 and 2 is not supported.
Note
When dl_device is provided explicitly, stream must be a valid construct for the specified device type. In particular, when kDLCPU is in use, stream must be None and a synchronization must be performed to ensure data safety.
Tip
It is recommended that implementers explicitly handle streams. If they use the legacy default stream, specifying 1 (CUDA) or 0 (ROCm) is preferred. None is a safe default for developers who do not want to think about stream handling at all, potentially at the cost of more synchronizations than necessary. A consumer-side sketch of choosing a stream value is shown after this parameter list.
- max_version (Optional[tuple[int, int]]) – the maximum DLPack version that the consumer (i.e., the caller of __dlpack__) supports, in the form of a 2-tuple (major, minor). This method may return a capsule of version max_version (recommended if it does support that), or of a different version. This means the consumer must verify the version even when max_version is passed.
- dl_device (Optional[tuple[enum.Enum, int]]) –
the DLPack device type. Default is None, meaning the exported capsule should be on the same device as self is. When specified, the format must be a 2-tuple, following that of the return value of array.__dlpack_device__(). If the device type cannot be handled by the producer, this function must raise BufferError.
The v2023.12 standard only mandates that a compliant library should offer a way for __dlpack__ to return a capsule referencing an array whose underlying memory is accessible to the Python interpreter (represented by the kDLCPU enumerator in DLPack). If a copy must be made to enable this support but copy is set to False, the function must raise BufferError.
Other device kinds will be considered for standardization in a future version of this API standard.
- copy (Optional[bool]) –
boolean indicating whether or not to copy the input. If True, the function must always copy (performed by the producer; see also Copy keyword argument behavior). If False, the function must never copy, and raise a BufferError in case a copy is deemed necessary (e.g. if a cross-device data movement is requested, and it is not possible without a copy). If None, the function must reuse the existing memory buffer if possible and copy otherwise. Default: None.
When a copy happens, the DLPACK_FLAG_BITMASK_IS_COPIED flag must be set.
Note
If a copy happens, and if the consumer-provided stream and dl_device can be understood by the producer, the copy must be performed over stream.
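As an illustration of the parameters above, the following consumer-side sketch (not part of the standard) shows one way to pick a stream value per the rules described for CUDA, ROCm, and stream-less devices. The helper name export_for_consumer, the stream_ptr argument, and the locally defined DLDeviceType enum are assumptions for illustration; the integer device codes follow DLPack's dlpack.h.

```python
from enum import IntEnum


class DLDeviceType(IntEnum):
    # Device type codes as defined in DLPack's dlpack.h (subset).
    kDLCPU = 1
    kDLCUDA = 2
    kDLROCM = 10


def export_for_consumer(x, stream_ptr=None):
    """Hypothetical consumer helper: request a DLPack capsule from producer array x.

    stream_ptr is the integer address of the consumer's CUDA/ROCm stream,
    or None to let the producer assume the default stream.
    """
    device_type, _ = x.__dlpack_device__()
    if device_type == DLDeviceType.kDLCUDA:
        # On CUDA, 1 means "legacy default stream"; values > 2 are stream pointers.
        stream = stream_ptr if stream_ptr is not None else 1
    elif device_type == DLDeviceType.kDLROCM:
        # On ROCm, 0 means the default stream; values > 2 are stream pointers.
        stream = stream_ptr if stream_ptr is not None else 0
    else:
        # CPU and other stream-less device types only accept None.
        stream = None
    # Older producers may raise TypeError for max_version; see the Notes section below.
    return x.__dlpack__(stream=stream, max_version=(1, 0))
```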
Returns:
capsule (PyCapsule) – a DLPack capsule for the array. See Data interchange mechanisms for details.
Raises:
BufferError – Implementations should raise BufferError
when the data cannot be exported as DLPack (e.g., incompatible dtype or strides). Other errors are raised when export fails for other reasons (e.g., incorrect arguments passed or out of memory).
Notes
The DLPack version scheme is SemVer, where the major DLPack versions represent ABI breaks, and minor versions represent ABI-compatible additions (e.g., new enum values for new data types or device types).
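As a small illustration (not taken from the standard), (major, minor) version tuples compare lexicographically in Python, which is what comparisons such as max_version >= our_own_dlpack_version in the producer logic below rely on:

```python
# Hypothetical version tuples, for illustration only.
producer_version = (1, 0)   # max DLPack version the producer implements
consumer_max = (1, 2)       # the consumer's max_version argument

# Tuples compare lexicographically: major first, then minor.
assert consumer_max >= producer_version

# ABI compatibility only requires the major versions to match.
assert consumer_max[0] == producer_version[0]
```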
The max_version
keyword was introduced in v2023.12, and goes together with the DLManagedTensorVersioned
struct added in DLPack 1.0. This keyword may not be used by consumers until a later time after introduction, because producers may implement the support at a different point in time.
It is recommended for the producer to use this logic in the implementation of __dlpack__:
if max_version is None:
    # Keep and use the DLPack 0.X implementation
    # Note: from March 2025 onwards (but ideally as late as
    # possible), it's okay to raise BufferError here
else:
    # We get to produce DLManagedTensorVersioned now. Note that
    # our_own_dlpack_version is the max version that the producer
    # supports and fills in the DLManagedTensorVersioned::version
    # field
    if max_version >= our_own_dlpack_version:
        # Consumer understands us, just return a Capsule with our max version
    elif max_version[0] == our_own_dlpack_version[0]:
        # major versions match, we should still be fine here -
        # return our own max version
    else:
        # if we're at a higher major version internally, did we
        # keep an implementation of the older major version around?
        # For example, if the producer is on DLPack 1.x and the consumer
        # is 0.y, can the producer still export a capsule containing
        # DLManagedTensor and not DLManagedTensorVersioned?
        # If so, use that. Else, the producer should raise a BufferError
        # here to tell users that the consumer's max_version is too
        # old to allow the data exchange to happen.
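A rough, runnable sketch of that branching is given below, assuming a hypothetical producer class. The names OUR_DLPACK_VERSION, FakeProducerArray, _legacy_capsule, and _versioned_capsule are illustrative stand-ins for the C-level code that would actually build DLManagedTensor / DLManagedTensorVersioned capsules; they are not part of the standard.

```python
OUR_DLPACK_VERSION = (1, 0)  # assumption: the max DLPack version this producer implements


class FakeProducerArray:
    _supports_legacy_export = True  # does this library still ship the DLPack 0.x path?

    def _legacy_capsule(self, *, stream):
        # Placeholder for C-level code that builds a capsule named "dltensor"
        # wrapping a DLManagedTensor.
        return "legacy capsule"

    def _versioned_capsule(self, *, stream, dl_device, copy):
        # Placeholder for C-level code that builds a capsule named
        # "dltensor_versioned" wrapping a DLManagedTensorVersioned whose
        # version field is set to OUR_DLPACK_VERSION.
        return "versioned capsule"

    def __dlpack__(self, *, stream=None, max_version=None, dl_device=None, copy=None):
        if max_version is None:
            # Legacy consumer: hand out an old-style DLManagedTensor capsule.
            return self._legacy_capsule(stream=stream)
        if max_version[0] >= OUR_DLPACK_VERSION[0]:
            # Consumer understands our major version (or a newer one): export a
            # DLManagedTensorVersioned capsule stamped with OUR_DLPACK_VERSION.
            return self._versioned_capsule(stream=stream, dl_device=dl_device, copy=copy)
        # Consumer only supports an older major version: fall back to the legacy
        # capsule if we still implement it, otherwise refuse.
        if self._supports_legacy_export:
            return self._legacy_capsule(stream=stream)
        raise BufferError("consumer's max_version is too old for this producer")
```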
And this logic for the consumer in from_dlpack():
try:
    x.__dlpack__(max_version=(1, 0), ...)
    # if it succeeds, store info from the capsule named "dltensor_versioned",
    # and need to set the name to "used_dltensor_versioned" when we're done
except TypeError:
    x.__dlpack__(...)
This logic is also applicable to handling of the new dl_device and copy keywords.
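For example, a consumer that wants a versioned, CPU-accessible capsule could use a fallback like the following sketch (the helper name request_cpu_capsule is hypothetical; the device code 1 corresponds to kDLCPU):

```python
def request_cpu_capsule(x):
    try:
        # Producers implementing v2023.12+ accept these keyword arguments.
        return x.__dlpack__(max_version=(1, 0), dl_device=(1, 0), copy=None)
    except TypeError:
        # Older producer that predates max_version/dl_device/copy: fall back to
        # the bare call and be prepared to receive an unversioned "dltensor"
        # capsule that may live on a non-CPU device.
        return x.__dlpack__()
```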
DLPack 1.0 added a flag to indicate that the array is read-only (DLPACK_FLAG_BITMASK_READ_ONLY
). A consumer that does not support read-only arrays should ignore this flag (this is preferred over raising an exception; the user is then responsible for ensuring the memory isn’t modified).
Changed in version 2022.12: Added BufferError.
Changed in version 2023.12: Added the max_version, dl_device, and copy keyword arguments.
Changed in version 2023.12: Added recommendation for handling read-only arrays.
Changed in version 2024.12: Resolved conflicting exception guidance.