NeuralODELayer
Neural ODE layer
Since R2023b
Description
A neural ODE layer outputs the solution of an ODE.
Creation
Syntax
Description
`layer` = neuralODELayer(`net`,`tspan`) creates a neural ODE layer and sets the Network and TimeInterval properties.
`layer` = neuralODELayer(`net`,`tspan`,`Name=Value`) specifies additional properties using one or more name-value arguments.
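The two creation syntaxes can be sketched as follows. This is a minimal illustration, assuming a small hypothetical ODE network (4 features in, 4 features out) that does not come from this page; the variable names `layerA` and `layerB` are also illustrative.

```matlab
% Hypothetical ODE network: the input and output sizes match (4 features).
layersODE = [
    featureInputLayer(4)
    fullyConnectedLayer(4)
    tanhLayer];
netODE = dlnetwork(layersODE);

% Two-argument syntax: sets only the Network and TimeInterval properties.
layerA = neuralODELayer(netODE,[0 1]);

% Name-value syntax: additionally sets the layer name and error tolerances.
layerB = neuralODELayer(netODE,[0 1], ...
    Name="ode", ...
    RelativeTolerance=1e-4, ...
    AbsoluteTolerance=1e-7);

disp(layerA.TimeInterval)
disp(layerB.RelativeTolerance)
```

The tolerances chosen here are arbitrary; they only show where name-value arguments go.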
Properties
Network
Neural network characterizing the neural ODE function, specified as a dlnetwork object.
If Network has one input, then predict(net,Y) defines the ODE system, where net is the network. If Network has two inputs, then predict(net,T,Y) defines the ODE system, where T is a time step repeated over the batch dimension.
The size and format of the network inputs and outputs must match.
When GradientMode is "adjoint", the network State property must be empty. To use a network with a nonempty State property, set GradientMode to "direct".
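The requirements on Network can be checked before creating the layer. The following is a sketch, assuming a hypothetical one-input network with three features; it evaluates the ODE function once with predict and compares the size and format of the input and output.

```matlab
% Hypothetical one-input ODE network (3 features in, 3 features out).
netODE = dlnetwork([
    featureInputLayer(3)
    fullyConnectedLayer(3)
    tanhLayer]);

% Sample state: 3 channels, batch of 5 observations.
Y = dlarray(rand(3,5,"single"),"CB");

% For a one-input network, predict(net,Y) is the ODE right-hand side.
dYdt = predict(netODE,Y);

% The output must match the input in size and format.
sameSize = isequal(size(dYdt),size(Y));
sameFormat = strcmp(dims(dYdt),dims(Y));

% The "adjoint" gradient mode additionally requires an empty State table.
stateIsEmpty = isempty(netODE.State);

fprintf("same size: %d, same format: %d, empty state: %d\n", ...
    sameSize,sameFormat,stateIsEmpty);
```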
TimeInterval
Interval of integration, specified as a numeric vector with two or more elements. The elements in TimeInterval must be all increasing or all decreasing.
The solver imposes the initial conditions given by the layer input Y0 at the initial time TimeInterval(1), then integrates the ODE function from TimeInterval(1) to TimeInterval(end).
- If TimeInterval has two elements, [t0 tf], then the solver returns the solution evaluated at point tf.
- If TimeInterval has more than two elements, [t0 t1 ... tf], then the solver returns the solution evaluated at the given points [t1 ... tf]. The solver does not step precisely to each point specified in TimeInterval. Instead, the solver uses its own internal steps to compute the solution, then evaluates the solution at the points specified in TimeInterval. The solutions produced at the specified points are of the same order of accuracy as the solutions computed at each internal step.
Specifying several intermediate points has little effect on the efficiency of computation, but for large systems it can negatively affect memory management.
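The difference between a two-element and a multi-element TimeInterval can be seen by comparing output sizes. This sketch assumes a hypothetical two-feature ODE network and wraps each layer in a small dlnetwork so that predict can be called; the names `netEnd` and `netPts` are illustrative.

```matlab
% Hypothetical ODE network with matching input and output sizes.
netODE = dlnetwork([
    featureInputLayer(2)
    fullyConnectedLayer(2)
    tanhLayer]);

% Initial conditions: 2 channels, batch of 4 observations.
Y0 = dlarray(rand(2,4,"single"),"CB");

% Two elements [t0 tf]: the layer returns the solution at tf only.
netEnd = dlnetwork([
    featureInputLayer(2)
    neuralODELayer(netODE,[0 1])]);
YEnd = predict(netEnd,Y0);

% Four elements [t0 t1 t2 tf]: the layer returns the solution at t1, t2, and tf.
netPts = dlnetwork([
    featureInputLayer(2)
    neuralODELayer(netODE,[0 0.25 0.5 1])]);
YPts = predict(netPts,Y0);

disp(dims(YEnd)); disp(size(YEnd))
disp(dims(YPts)); disp(size(YPts))
```

The second output gains a time dimension with one entry per requested point after t0.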
Solver
Since R2025a
Solver for neural ODE operation, specified as one of these values:
- "ode45": Solver for nonstiff differential equations using the explicit Runge-Kutta (4,5) formula, the Dormand-Prince pair. The "ode45" solver is well suited for most tasks and can be faster and more accurate than other solvers.
- "ode1": Solver for nonstiff differential equations using the Euler method. The "ode1" solver uses a fixed step size and can be better suited for code generation and Simulink tasks.
If Solver is "ode1", then the RelativeTolerance and AbsoluteTolerance properties have no effect.
If you specify the SolverOptions property, then the default value matches the corresponding solver. Otherwise, the default value is "ode45".
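Selecting the solver at creation time might look like the following sketch. It assumes R2025a or later (when the Solver property was introduced) and a hypothetical two-feature ODE network; the variable names are illustrative.

```matlab
% Hypothetical ODE network with matching input and output sizes.
netODE = dlnetwork([
    featureInputLayer(2)
    fullyConnectedLayer(2)
    tanhLayer]);

% Default: no Solver or SolverOptions specified, so the solver is "ode45".
layerDefault = neuralODELayer(netODE,[0 1]);

% Fixed-step Euler solver; the error tolerances have no effect here.
layerEuler = neuralODELayer(netODE,[0 1],Solver="ode1");

disp(layerDefault.Solver)
disp(layerEuler.Solver)
```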
SolverOptions
Since R2025a
Solver options object, specified as a deep.ode.options.ODE45 or a deep.ode.options.ODE1 object.
To set the solver options of the layer, use dot notation. For example, to set the relative tolerance to 1e-4, use layer.SolverOptions.RelativeTolerance = 1e-4, where layer is an instance of the neural ODE layer.
In most cases, you do not need to create the options object directly. If you specify the SolverOptions property, then you must not specify the RelativeTolerance, AbsoluteTolerance, and GradientMode properties.
To see which options the "ode45" and "ode1" solvers support, see deep.ode.options.ODE45 and deep.ode.options.ODE1, respectively.
GradientMode
Method to compute gradients with respect to the initial conditions and parameters when using the dlgradient function, specified as one of these values:
- "direct": Compute gradients by backpropagating through the operations undertaken by the numerical solver. This option best suits large mini-batch sizes or when TimeInterval contains many values.
- "adjoint": Compute gradients by solving the associated adjoint ODE system. This option best suits small mini-batch sizes or when TimeInterval contains a small number of values.
Tip
To customize the neural ODE solver options, use the Solver and SolverOptions properties. These properties are recommended because they provide additional control over the neural ODE solver.
When GradientMode is "adjoint", the network State property must be empty. To use a network with a nonempty State property, set GradientMode to "direct".
The dlaccelerate function does not support accelerating forward passes of networks with neural ODE layers with the GradientMode property set to "direct". To accelerate forward passes of networks with neural ODE layers, set the GradientMode property to "adjoint" or "adjoint-seminorm", or accelerate parts of your code that do not perform forward passes of the network.
Warning
When GradientMode is "adjoint", all layers in Network must support acceleration. Otherwise, the software can return unexpected results.
When GradientMode is "adjoint", the software traces the ODE function input to determine the computation graph used for automatic differentiation. This tracing process can take some time and can end up recomputing the same trace. By optimizing, caching, and reusing the traces, the software can speed up the gradient computation.
For more information on deep learning function acceleration, see Deep Learning Function Acceleration for Custom Training Loops.
The NeuralODELayer object stores this property as a character vector.
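As a rough sketch of how GradientMode enters a custom training loop, the following computes gradients through a network whose neural ODE layer uses the "adjoint" mode. The network sizes, the random data, and the helper names `lossFcn` and `gradFcn` are assumptions made for illustration; the ODE network has an empty State, as the adjoint mode requires.

```matlab
% Hypothetical ODE network (empty State, so "adjoint" is allowed).
netODE = dlnetwork([
    featureInputLayer(2)
    fullyConnectedLayer(2)
    tanhLayer]);

% Outer network: neural ODE layer followed by a regression head.
net = dlnetwork([
    featureInputLayer(2)
    neuralODELayer(netODE,[0 1],GradientMode="adjoint")
    fullyConnectedLayer(1)]);

% Random inputs and targets: batch of 8 observations.
X = dlarray(rand(2,8,"single"),"CB");
T = dlarray(rand(1,8,"single"),"CB");

% Loss and gradient functions; dlgradient must run inside dlfeval.
lossFcn = @(net,X,T) mse(forward(net,X),T);
gradFcn = @(net,X,T) dlgradient(lossFcn(net,X,T),net.Learnables);

% Gradients with respect to all learnable parameters.
gradients = dlfeval(gradFcn,net,X,T);
disp(gradients)
```

Switching GradientMode to "direct" changes only how the solver contribution to the gradients is computed; the calling code stays the same.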
RelativeTolerance
Relative error tolerance, specified as a positive scalar. This tolerance measures the error relative to the magnitude of each solution component. Roughly speaking, it controls the number of correct digits in all solution components, except those smaller than the absolute tolerance AbsoluteTolerance.
At each step, the ODE solver estimates the local error e in the ith component of the solution. To be successful, the step must have acceptable error, as determined by both the relative and absolute error tolerances:
|e(i)| <= max(RelativeTolerance*abs(y(i)),AbsoluteTolerance(i))
If Solver is "ode1", then the RelativeTolerance property has no effect.
Tip
To customize the neural ODE solver options, use the Solver and SolverOptions properties. These properties are recommended because they provide additional control over the neural ODE solver.
Data Types: single | double
AbsoluteTolerance
Absolute error tolerance, specified as a positive scalar or vector. This tolerance is a threshold below which the value of the solution becomes unimportant. If the solution |y| is smaller than AbsoluteTolerance, then the solver does not need to obtain any correct digits in |y|. For this reason, the value of AbsoluteTolerance should take into account the scale of the solution components.
If AbsoluteTolerance is a vector, then it must be the same length as the solution. If AbsoluteTolerance is a scalar, then the value applies to all solution components.
At each step, the ODE solver estimates the local error e in the ith component of the solution. To be successful, the step must have acceptable error, as determined by both the relative and absolute error tolerances:
|e(i)| <= max(RelativeTolerance*abs(y(i)),AbsoluteTolerance(i))
If Solver is "ode1", then the AbsoluteTolerance property has no effect.
Tip
To customize the neural ODE solver options, use the Solver and SolverOptions properties. These properties are recommended because they provide additional control over the neural ODE solver.
Data Types: single | double
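The step acceptance test can be worked through numerically. The values of `y` and `e` below are made up for illustration and are not produced by a solver; the sketch only evaluates the inequality for a scalar and a vector AbsoluteTolerance.

```matlab
RelativeTolerance = 1e-3;

y = [2.0; 1e-5];       % hypothetical solution components
e = [1.5e-3; 5e-7];    % hypothetical local error estimates

% Scalar absolute tolerance: the same floor applies to every component.
atolScalar = 1e-6;
okScalar = abs(e) <= max(RelativeTolerance*abs(y),atolScalar);
acceptedScalar = all(okScalar);   % both components pass

% Vector absolute tolerance: a tighter floor on the second component.
atolVector = [1e-6; 1e-8];
okVector = abs(e) <= max(RelativeTolerance*abs(y),atolVector);
acceptedVector = all(okVector);   % second component fails, step rejected

disp(okScalar.')
disp(okVector.')
```

With the scalar tolerance, the small second component is judged against the 1e-6 floor and passes. With the tighter vector entry, the floor drops to about 1e-8, so the same error estimate of 5e-7 is no longer acceptable.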
Examples
Create a neural ODE layer. Specify an ODE network containing a convolution layer followed by a tanh layer. Specify a time interval of [0, 1].
inputSize = [14 14 8];
layersODE = [
    imageInputLayer(inputSize)
    convolution2dLayer(3,8,Padding="same")
    tanhLayer];
netODE = dlnetwork(layersODE);
tspan = [0 1];
layer = neuralODELayer(netODE,tspan)
layer = 
  NeuralODELayer with properties:

                 Name: ''
         TimeInterval: [0 1]

   Learnable Parameters
              Network: [1×1 dlnetwork]

   Solver properties
         GradientMode: 'direct'
    RelativeTolerance: 1.0000e-03
    AbsoluteTolerance: 1.0000e-06
               Solver: ode45
Create a neural network containing a neural ODE layer.
layers = [
    imageInputLayer([28 28 1])
    convolution2dLayer([3 3],8,Padding="same",Stride=2)
    reluLayer
    neuralODELayer(netODE,tspan)
    fullyConnectedLayer(10)
    softmaxLayer];
net = dlnetwork(layers)
net = 
  dlnetwork with properties:

         Layers: [6×1 nnet.cnn.layer.Layer]
    Connections: [5×2 table]
     Learnables: [6×3 table]
          State: [0×3 table]
     InputNames: {'imageinput'}
    OutputNames: {'softmax'}
    Initialized: 1

  View summary with summary.
Since R2025a
Create a neural ODE layer. Specify an ODE network containing a convolution layer followed by a tanh layer. Specify a time interval of [0, 1].
inputSize = [14 14 8];
layersODE = [
    imageInputLayer(inputSize)
    convolution2dLayer(3,8,Padding="same")
    tanhLayer];
netODE = dlnetwork(layersODE);
tspan = [0 1];
layer = neuralODELayer(netODE,tspan)
layer = 
  NeuralODELayer with properties:

                 Name: ''
         TimeInterval: [0 1]

   Learnable Parameters
              Network: [1×1 dlnetwork]

   Solver properties
         GradientMode: 'direct'
    RelativeTolerance: 1.0000e-03
    AbsoluteTolerance: 1.0000e-06
               Solver: ode45
Specify an initial step size of 1e-3 and a maximum step size of 1e-2.
layer.SolverOptions.InitialStep = 1e-3;
layer.SolverOptions.MaxStep = 1e-2;
View the solver options.
layer.SolverOptions
ans = 
  ODE45 with properties:

          InitialStep: 1.0000e-03
              MaxStep: 0.0100
    RelativeTolerance: 1.0000e-03
    AbsoluteTolerance: 1.0000e-06
         GradientMode: 'direct'
Tips
- To apply the neural ODE operation in deep learning models defined as functions or in custom layer functions, use dlode45.
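For a model defined as a function, a call to dlode45 might look like the following sketch. The parameter structure `theta`, its field names, and the function handle `odeModel` are hypothetical; the ODE function takes the time, the state, and the learnable parameters, in that order.

```matlab
% Hypothetical learnable parameters for a single fully connected operation.
theta = struct;
theta.Weights = dlarray(0.1*randn(3,3));
theta.Bias = dlarray(zeros(3,1));

% ODE function f(t,y,theta); the time input is unused in this sketch.
odeModel = @(t,y,theta) tanh(theta.Weights*y + theta.Bias);

% Unformatted initial conditions: 3 channels, batch of 5 observations.
Y0 = dlarray(rand(3,5));

% Integrate from t = 0 to t = 1 and return the solution at t = 1.
Y = dlode45(odeModel,[0 1],Y0,theta,DataFormat="CB");

disp(size(Y))
```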
Algorithms
The neural ordinary differential equation (ODE) operation returns the solution of a specified ODE. In particular, given an input, a neural ODE operation outputs the numerical solution of the ODE y′=f(t,y,θ) for the time horizon (t0,t1) and with the initial condition y(t0) = y0, where t and y denote the ODE function inputs and θ is a set of learnable parameters. Typically, the initial condition y0 is either the network input or the output of another deep learning operation.
To apply the operation, NeuralODELayer uses the ode45 function by default, which is based on an explicit Runge-Kutta (4,5) formula, the Dormand-Prince pair. It is a single-step solver: in computing y(tn), it needs only the solution at the immediately preceding time point, y(tn-1) [2] [3].
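The single-step idea can be illustrated with the simplest such scheme, the fixed-step Euler method that the "ode1" solver is named after. This is a plain MATLAB sketch on the scalar test problem y' = -y, chosen here for illustration; it is not the layer's internal implementation and does not show the adaptive Runge-Kutta steps of ode45.

```matlab
% Test problem y' = -y with y(0) = 1; the exact solution is exp(-t).
f = @(t,y) -y;

t0 = 0;
tf = 1;
numSteps = 100;              % fixed number of steps
h = (tf - t0)/numSteps;      % fixed step size

t = t0;
y = 1;
for n = 1:numSteps
    % Single-step update: only the previous value y is needed.
    y = y + h*f(t,y);
    t = t + h;
end

absErr = abs(y - exp(-tf));
fprintf("Euler estimate %.5f, exact %.5f, error %.2e\n",y,exp(-tf),absErr);
```

Halving the step size roughly halves the error, which is the first-order behavior expected of the Euler method.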
Layers in a layer array or layer graph pass data to subsequent layers as formatted dlarray objects. The format of a dlarray object is a string of characters in which each character describes the corresponding dimension of the data. The format consists of one or more of these characters:
- "S": Spatial
- "C": Channel
- "B": Batch
- "T": Time
- "U": Unspecified
For example, you can describe 2-D image data that is represented as a 4-D array, where the first two dimensions correspond to the spatial dimensions of the images, the third dimension corresponds to the channels of the images, and the fourth dimension corresponds to the batch dimension, as having the format "SSCB" (spatial, spatial, channel, batch).
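The "SSCB" case described above can be reproduced with a few dlarray calls. The image size and batch size in this sketch are arbitrary.

```matlab
% A batch of 16 RGB images of size 28-by-28, labeled as "SSCB".
X = dlarray(rand(28,28,3,16,"single"),"SSCB");

% Query the format and locate the batch dimension by its label.
fmt = dims(X);
batchDim = finddim(X,"B");
batchSize = size(X,batchDim);

% Remove the format to obtain an unformatted dlarray with the same data.
XRaw = stripdims(X);

disp(fmt)
disp(batchSize)
disp(isempty(dims(XRaw)))
```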
You can interact with these dlarray objects in automatic differentiation workflows, such as those for developing a custom layer, using a functionLayer object, or using the forward and predict functions with dlnetwork objects.
This table shows the supported input formats of NeuralODELayer objects and the corresponding output format. If the software passes the output of the layer to a custom layer that does not inherit from the nnet.layer.Formattable class, or a FunctionLayer object with the Formattable property set to 0 (false), then the layer receives an unformatted dlarray object with dimensions ordered according to the formats in this table. The formats listed here are only a subset. The layer may support additional formats such as formats with additional "S" (spatial) or "U" (unspecified) dimensions.
If TimeInterval contains more than two elements, then the layer outputs data with a "T" (time) dimension.
Input Format | TimeInterval | Output Format |
---|---|---|
"CB" (channel, batch) | [t0 tf] | "CB" (channel, batch) |
"CB" (channel, batch) | [t0 t1 ... tf] | "CBT" (channel, batch, time) |
"SCB" (spatial, channel, batch) | [t0 tf] | "SCB" (spatial, channel, batch) |
"SCB" (spatial, channel, batch) | [t0 t1 ... tf] | "SCBT" (spatial, channel, batch, time) |
"SSCB" (spatial, spatial, channel, batch) | [t0 tf] | "SSCB" (spatial, spatial, channel, batch) |
"SSCB" (spatial, spatial, channel, batch) | [t0 t1 ... tf] | "SSCBT" (spatial, spatial, channel, batch, time) |
"SSSCB" (spatial, spatial, spatial, channel, batch) | [t0 tf] | "SSSCB" (spatial, spatial, spatial, channel, batch) |
"SSSCB" (spatial, spatial, spatial, channel, batch) | [t0 t1 ... tf] | "SSSCBT" (spatial, spatial, spatial, channel, batch, time) |
"SC" (spatial, channel) | [t0 tf] | "SC" (spatial, channel) |
"SC" (spatial, channel) | [t0 t1 ... tf] | "SCT" (spatial, channel, time) |
"SSC" (spatial, spatial, channel) | [t0 tf] | "SSC" (spatial, spatial, channel) |
"SSC" (spatial, spatial, channel) | [t0 t1 ... tf] | "SSCT" (spatial, spatial, channel, time) |
"SSSC" (spatial, spatial, spatial, channel) | [t0 tf] | "SSSC" (spatial, spatial, spatial, channel) |
"SSSC" (spatial, spatial, spatial, channel) | [t0 t1 ... tf] | "SSSCT" (spatial, spatial, spatial, channel, time) |
"SB" (spatial, batch) | [t0 tf] | "SB" (spatial, batch) |
"SB" (spatial, batch) | [t0 t1 ... tf] | "SBT" (spatial, batch, time) |
"SSB" (spatial, spatial, batch) | [t0 tf] | "SSB" (spatial, spatial, batch) |
"SSB" (spatial, spatial, batch) | [t0 t1 ... tf] | "SSBT" (spatial, spatial, batch, time) |
"SSSB" (spatial, spatial, spatial, batch) | [t0 tf] | "SSSB" (spatial, spatial, spatial, batch) |
"SSSB" (spatial, spatial, spatial, batch) | [t0 t1 ... tf] | "SSSBT" (spatial, spatial, spatial, batch, time) |
"SS" (spatial, spatial) | [t0 tf] | "SS" (spatial, spatial) |
"SS" (spatial, spatial) | [t0 t1 ... tf] | "SST" (spatial, spatial, time) |
"SSS" (spatial, spatial, spatial) | [t0 tf] | "SSS" (spatial, spatial, spatial) |
"SSS" (spatial, spatial, spatial) | [t0 t1 ... tf] | "SSST" (spatial, spatial, spatial, time) |
Version History
Introduced in R2023b
Specify the solver using the Solver property. To further customize the "ode45" or "ode1" solvers, use the SolverOptions argument.