smoothdata - Smooth noisy data - MATLAB
Syntax
Description
[B](#bvhejau-B) = smoothdata([A](#bvhejau-A))
smooths entries of A using a moving average. smoothdata determines the moving window size from the entries in A. The window slides down the length of the vector, computing an average over the elements within each window.
- If A is a matrix, then smoothdata computes the moving average down each column of A.
- If A is a multidimensional array, then smoothdata operates along the first dimension of A whose size does not equal 1.
- If A is a table or timetable with numeric variables, then smoothdata operates on each variable of A separately.
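The default-dimension rules above can be sketched with small arrays. This is only an illustration: the data are made up, and an explicit three-element "movmean" window is assumed so that every call uses the same window size rather than one chosen automatically.

```matlab
% Illustrative data only. The explicit window of 3 is an assumption made
% so that the window size is identical in every call.
A = magic(4);                              % 4-by-4 matrix
B = smoothdata(A,"movmean",3);             % smooths down each column

% Smoothing one column on its own should match the same column of B.
b2 = smoothdata(A(:,2),"movmean",3);
colErr = max(abs(B(:,2) - b2));

% A 1-by-1-by-6 array: the first dimension whose size is not 1 is dim 3.
C = reshape([1 4 2 8 5 7],1,1,6);
D = smoothdata(C,"movmean",3);
d = smoothdata([1 4 2 8 5 7],"movmean",3); % same values as a row vector
ndErr = max(abs(squeeze(D).' - d));
```

In both cases the output has the same size as the input; only the dimension along which the window slides differs.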
[B](#bvhejau-B) = smoothdata([A](#bvhejau-A),[dim](#bvhejau-dim))
specifies the dimension of A to operate along. For example, if A is a matrix, then smoothdata(A,2) smooths the data in each row of A.
[B](#bvhejau-B) = smoothdata(___,[method](#bvhejau-method))
specifies the smoothing method for either of the previous syntaxes. For example, smoothdata(A,"sgolay") uses a Savitzky-Golay filter to smooth the data in A.
[B](#bvhejau-B) = smoothdata(___,[method](#bvhejau-method),[window](#bvhejau-window))
specifies the window size for the smoothing method. For example, smoothdata(A,"movmedian",5) smooths the data in A by taking the median over a five-element sliding window.
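To make the sliding-window median concrete, here is a minimal sketch that compares smoothdata against a hand-written loop. The input vector and the three-element window are illustrative choices, and the loop assumes the window is truncated at the ends of the data, as described for the moving-statistic methods under Moving Window Size.

```matlab
A = [4 8 6 -1 -2 -3 -1 3 4 5];       % illustrative data
k = 3;                               % odd window length, centered
B = smoothdata(A,"movmedian",k);

% Hand-rolled equivalent: at each position take the median of the
% elements within (k-1)/2 positions, truncating the window at the ends.
h = (k-1)/2;
M = zeros(size(A));
for i = 1:numel(A)
    lo = max(1,i-h);
    hi = min(numel(A),i+h);
    M(i) = median(A(lo:hi));
end
% M is [6 6 6 -1 -2 -2 -1 3 4 4.5]
```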
[B](#bvhejau-B) = smoothdata(___,[nanflag](#mw%5Fe409a946-1017-45d9-989d-f7c241e07f4a))
specifies whether to omit or include NaN values in A. For example, smoothdata(A,"includenan") includes all NaN values when smoothing. By default, smoothdata ignores NaN values.
[B](#bvhejau-B) = smoothdata(___,[Name,Value](#namevaluepairarguments))
specifies additional parameters for smoothing using one or more name-value arguments. For example, if t is a vector of time values, then smoothdata(A,"SamplePoints",t) smooths the data in A relative to the times in t.
[[B](#bvhejau-B),[winsize](#bvhejau-window-dup1)] = smoothdata(___)
also returns the moving window size.
Alternative
You can use smoothdata functionality interactively by adding the Smooth Data task to a live script.
Examples
Smooth Data Using Moving Average
Create a vector containing noisy data, and smooth the data with a moving average.
    x = 1:100;
    rng(0,"twister")
    A = cos(2*pi*0.05*x+2*pi*rand) + 0.5*randn(1,100);
    B = smoothdata(A);
Plot the original and smoothed data.
    plot(x,A)
    hold on
    plot(x,B)
    legend("Input Data","Smoothed Data")
Matrix of Noisy Data
Create a matrix whose rows represent three noisy signals. Smooth the three signals using a moving average, and plot the smoothed data.
    x = 1:100;
    rng(0,"twister")
    s1 = cos(2*pi*0.03*x+2*pi*rand) + 0.5*randn(1,100);
    s2 = cos(2*pi*0.04*x+2*pi*rand) + 0.4*randn(1,100) + 5;
    s3 = cos(2*pi*0.05*x+2*pi*rand) + 0.3*randn(1,100) - 5;
    A = [s1; s2; s3];
    B = smoothdata(A,2);
    plot(x,B(1,:))
    hold on
    plot(x,B(2,:))
    plot(x,B(3,:))
    legend("s1","s2","s3")
Gaussian Filter
Smooth a vector of noisy data with a Gaussian-weighted moving average filter. Display the window size used by the filter.
    x = 1:100;
    rng(0,"twister")
    A = cos(2*pi*0.05*x+2*pi*rand) + 0.5*randn(1,100);
    [B,winsize] = smoothdata(A,"gaussian");
    winsize
Smooth the original data with a larger window containing 20 elements. Plot the smoothed data for both window sizes.
    C = smoothdata(A,"gaussian",20);
    plot(x,B)
    hold on
    plot(x,C)
    legend("Small Window","Large Window")
Smoothing Involving Missing Values
Create a noisy vector containing NaN values, and smooth the data ignoring NaN values.
    rng(0,"twister")
    A = [NaN randn(1,48) NaN randn(1,49) NaN];
    B = smoothdata(A);
Smooth the data including NaN values. The average in a window containing any NaN value is NaN.
    C = smoothdata(A,"includenan");
Plot the smoothed data in B and C.
    plot(1:100,B,"-o")
    hold on
    plot(1:100,C,"-x")
    legend("Ignore Missing","Include Missing")
Smooth Data with Sample Points
Create a vector of noisy data that corresponds to a time vector t. Smooth the data relative to the times in t, and plot the original data and the smoothed data.
    x = 1:100;
    rng(0,"twister")
    A = cos(2*pi*0.05*x+2*pi*rand) + 0.5*randn(1,100);
    t = datetime(2017,1,1,0,0,0) + hours(0:99);
    B = smoothdata(A,"SamplePoints",t);
    plot(t,A)
    hold on
    plot(t,B)
    legend("Input Data","Smoothed Data")
Input Arguments
A - Input data
vector | matrix | multidimensional array | table | timetable
Input data, specified as a vector, matrix, multidimensional array, table, or timetable. If A is a table or timetable, then either the variables must be numeric, or you must use the DataVariables name-value argument to list numeric variables explicitly. Specifying variables is useful when you are working with a table that also contains nonnumeric variables.
Data Types: double | single | int8 | int16 | int32 | int64 | uint8 | uint16 | uint32 | uint64 | logical | table | timetable
Complex Number Support: Yes
dim - Dimension to operate along
positive integer scalar
Dimension to operate along, specified as a positive integer scalar. If you do not specify the dimension, then the default is the first array dimension whose size does not equal 1.
Consider an m-by-n input matrix, A:
- smoothdata(A,1) smooths the data in each column of A and returns an m-by-n matrix.
- smoothdata(A,2) smooths the data in each row of A and returns an m-by-n matrix.
For table or timetable input data, dim is not supported and operation is along each table or timetable variable separately.
method - Smoothing method
"movmean" (default) | "movmedian" | "gaussian" | "lowess" | "loess" | "rlowess" | "rloess" | "sgolay"
Smoothing method, specified as one of these values:
- "movmean": Average over each window of A. This method is useful for reducing periodic trends in data.
- "movmedian": Median over each window of A. This method is useful for reducing periodic trends in data when outliers are present.
- "gaussian": Gaussian-weighted average over each window of A.
- "lowess": Linear regression over each window of A. This method can be computationally expensive, but results in fewer discontinuities.
- "loess": Quadratic regression over each window of A. This method is slightly more computationally expensive than "lowess".
- "rlowess": Robust linear regression over each window of A. This method is a more computationally expensive version of the method "lowess", but it is more robust to outliers.
- "rloess": Robust quadratic regression over each window of A. This method is a more computationally expensive version of the method "loess", but it is more robust to outliers.
- "sgolay": Savitzky-Golay filter, which smooths according to a quadratic polynomial that is fitted over each window of A. This method can be more effective than other methods when the data varies rapidly.
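The difference in outlier sensitivity between methods can be seen on a small synthetic signal. The following is a sketch under stated assumptions: the sine wave, the single artificial spike at index 50, and the window sizes 5 and 15 are all illustrative choices, not recommended settings.

```matlab
x = 1:100;
clean = sin(2*pi*x/50);              % noise-free reference signal
A = clean;
A(50) = A(50) + 10;                  % single artificial outlier

Bmean   = smoothdata(A,"movmean",5);
Bmedian = smoothdata(A,"movmedian",5);
Brloess = smoothdata(A,"rloess",15);

% Error at the outlier location relative to the clean signal.
errMean   = abs(Bmean(50)   - clean(50));
errMedian = abs(Bmedian(50) - clean(50));
```

For this input the moving mean spreads the spike over the whole window, while the moving median is barely affected by it, so errMedian is much smaller than errMean.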
window - Window size
positive integer or duration scalar | two-element vector of nonnegative integer or duration values
Window size, specified as a positive integer or duration scalar or a two-element vector of nonnegative integer or duration values. smoothdata defines the window relative to the sample points.
- When window is a positive integer scalar, the window has length window and is centered about the current element.
- When window is a two-element vector of nonnegative integers [b f], the window contains the current element, b preceding elements, and f succeeding elements.
When A is a timetable or SamplePoints contains datetime or duration values, window must be of type duration.
For more information about the window position, see Moving Window Size.
Example: smoothdata(A,"movmean",4)
Example: smoothdata(A,"movmedian",[2 3])
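The three ways of specifying a window described above can be compared on a five-element vector. This is a minimal sketch with made-up data; the variable name Signal and the one-second row spacing are assumptions chosen so that the duration window covers the same samples as the integer window.

```matlab
A = [2 4 6 8 10];                    % illustrative data

% Scalar window: 3 elements centered on the current element.
Bc = smoothdata(A,"movmean",3);      % [3 4 6 8 9]

% Two-element window [b f]: 1 preceding element, 0 succeeding elements,
% that is, a trailing two-point average.
Bt = smoothdata(A,"movmean",[1 0]);  % [2 3 5 7 9]

% Timetable input: the window must be a duration. With row times spaced
% 1 second apart, a 3-second window covers the same 3 samples as Bc.
t  = seconds(0:4)';
TT = timetable(t,A','VariableNames',{'Signal'});
Bd = smoothdata(TT,"movmean",seconds(3));
```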
nanflag - Missing value condition
"omitmissing" (default) | "omitnan" | "includemissing" | "includenan"
Missing value condition, specified as one of these values:
- "omitmissing" or "omitnan": Ignore NaN values in A when smoothing. If all elements in the window are NaN, then the corresponding elements in B are NaN. "omitmissing" and "omitnan" have the same behavior.
- "includemissing" or "includenan": Include NaN values in A when smoothing. If any element in the window is NaN, then the corresponding elements in B are NaN. "includemissing" and "includenan" have the same behavior.
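A short worked case may help with the two conditions. The vector and the three-element "movmean" window below are illustrative assumptions; "includenan" is used rather than "includemissing" so that the sketch also runs in releases before R2023a.

```matlab
A = [1 NaN 3 4 5];                                  % illustrative data

% Default: NaN values are ignored within each window.
Bomit = smoothdata(A,"movmean",3);                  % [1 2 3.5 4 4.5]

% Any window that contains a NaN produces NaN.
Binclude = smoothdata(A,"movmean",3,"includenan");  % [NaN NaN NaN 4 4.5]
```

The first three windows each contain the NaN at position 2, which is why the first three elements of Binclude are NaN.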
Name-Value Arguments
Specify optional pairs of arguments as Name1=Value1,...,NameN=ValueN, where Name is the argument name and Value is the corresponding value. Name-value arguments must appear after other arguments, but the order of the pairs does not matter.
Example: smoothdata(A,SmoothingFactor=0.5)
Before R2021a, use commas to separate each name and value, and enclose Name in quotes.
Example: smoothdata(A,"SmoothingFactor",0.5)
Data Options
SamplePoints - Sample points
vector | table variable name | scalar | function handle | table vartype subscript
Sample points, specified as a vector of sample point values or, when the input data is a table, one of the options in the following table. The sample points represent the _x_-axis locations of the data, and must be sorted and contain unique elements. Sample points do not need to be uniformly spaced. The vector [1 2 3 ...] is the default.
When the input data is a table, you can specify the sample points as a table variable using one of these options:
Indexing Scheme | Examples |
---|---|
Variable name: A string scalar or character vector | "A" or 'A': a variable named A |
Variable index: An index number that refers to the location of a variable in the table, or a logical vector. Typically, this vector is the same length as the number of variables, but you can omit trailing 0 or false values | 3: the third variable from the table; [false false true]: the third variable |
Function handle: A function handle that takes a table variable as input and returns a logical scalar | @isnumeric: one variable containing numeric values |
Variable type: A vartype subscript that selects one variable of a specified type | vartype("numeric"): one variable containing numeric values |
Note
This name-value argument is not supported when the input data is a timetable. Timetables use the vector of row times as the sample points. To use different sample points, you must edit the timetable so that the row times contain the desired sample points.
Moving windows are defined relative to the sample points. For example, if t is a vector of times corresponding to the input data, then smoothdata(rand(1,10),"movmean",3,"SamplePoints",t) has a window that represents the time interval between t(i)-1.5 and t(i)+1.5.
When the sample points vector has data type datetime or duration, the window size must have type duration.
Example: smoothdata(A,"SamplePoints",0:0.1:10)
Example: smoothdata(T,"SamplePoints","Var1")
Data Types: double | single | datetime | duration
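With nonuniform sample points, the number of elements in each window varies because the window is measured in the units of the sample points. The sketch below compares smoothdata with a hand-written loop; the data are made up, and the loop assumes the half-open interval [t(i)-w/2, t(i)+w/2) given for scalar windows under Moving Window Size.

```matlab
t = [0 1 2 10 11];          % nonuniform sample points (illustrative)
A = [1 2 3 100 101];
w = 3;                      % window length in the units of t

B = smoothdata(A,"movmean",w,"SamplePoints",t);

% Hand-rolled equivalent: average every element whose sample point
% falls in the half-open interval [t(i)-w/2, t(i)+w/2).
M = zeros(size(A));
for i = 1:numel(A)
    inWin = (t >= t(i) - w/2) & (t < t(i) + w/2);
    M(i) = mean(A(inWin));
end
% M is [1.5 2 2.5 100.5 100.5]
```

Because the gap between t = 2 and t = 10 is wider than the window, the two clusters of points are never averaged together.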
DataVariables - Table variables to operate on
table variable name | scalar | vector | cell array | pattern | function handle | table vartype subscript
Table variables to operate on, specified as one of the options in this table. The DataVariables value indicates which variables of the input table to smooth.
Other variables in the table not specified by DataVariables pass through to the output without being smoothed.
Indexing Scheme | Values to Specify | Examples |
---|---|---|
Variable names | A string scalar or character vector; a string array or cell array of character vectors; a pattern object | "A" or 'A': a variable named A; ["A" "B"] or {'A','B'}: two variables named A and B; "Var"+digitsPattern(1): variables named "Var" followed by a single digit |
Variable index | An index number that refers to the location of a variable in the table; a vector of numbers; a logical vector. Typically, this vector is the same length as the number of variables, but you can omit trailing 0 (false) values. | 3: the third variable from the table; [2 3]: the second and third variables from the table; [false false true]: the third variable |
Function handle | A function handle that takes a table variable as input and returns a logical scalar | @isnumeric: all the variables containing numeric values |
Variable type | A vartype subscript that selects variables of a specified type | vartype("numeric"): all the variables containing numeric values |
Example: smoothdata(T,"DataVariables",["Var1" "Var2" "Var4"])
ReplaceValues - Replace values indicator
true or 1 (default) | false or 0
Replace values indicator, specified as one of these values when A is a table or timetable:
- true or 1: Replace input table variables with table variables containing smoothed data.
- false or 0: Append table variables containing smoothed data to the input table variables.
For vector, matrix, or multidimensional array input data, ReplaceValues is not supported.
Example: smoothdata(T,"ReplaceValues",false)
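The interaction of DataVariables and ReplaceValues can be sketched on a small mixed-type table. The table contents and the variable names Time, Signal, and Label are hypothetical, and the explicit three-element "movmean" window is an assumption made to keep the numbers easy to check. ReplaceValues requires R2022a or later.

```matlab
% Hypothetical table with one nonnumeric variable.
T = table((1:5)',[1;4;2;5;3],["a";"b";"c";"d";"e"], ...
    'VariableNames',{'Time','Signal','Label'});

% Smooth only Signal; Time and Label pass through unchanged.
R = smoothdata(T,"movmean",3,"DataVariables","Signal");

% Keep the original Signal and append a smoothed copy instead.
S = smoothdata(T,"movmean",3,"DataVariables","Signal", ...
    "ReplaceValues",false);
```

Here R has the same three variables as T, while S has four: the original three plus one appended variable holding the smoothed Signal values.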
Smoothing Options
SmoothingFactor - Window size factor
scalar ranging from 0 to 1
Window size factor, specified as a scalar ranging from 0 to 1. Generally, the value of SmoothingFactor adjusts the level of smoothing by scaling the window size that smoothdata determines from the entries in A. Values near 0 produce smaller moving window sizes, resulting in less smoothing. Values near 1 produce larger moving window sizes, resulting in more smoothing. In some cases, depending on the entries that smoothdata uses to determine the window size, the value of SmoothingFactor may not have a significant impact on the window size.
SmoothingFactor is 0.25 by default. You can only specify SmoothingFactor when you do not specify window.
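One way to see the effect of SmoothingFactor is to request the second output and compare the window sizes that smoothdata selects. The sketch below reuses the noisy cosine from the examples; the factors 0.1 and 0.9 and the explicit window of 7 are arbitrary illustrative values, and the actual window sizes depend on the data.

```matlab
x = 1:100;
rng(0,"twister")
A = cos(2*pi*0.05*x+2*pi*rand) + 0.5*randn(1,100);

% Let smoothdata choose the window, scaled by the smoothing factor.
[Blow,wLow]   = smoothdata(A,"SmoothingFactor",0.1);
[Bhigh,wHigh] = smoothdata(A,"SmoothingFactor",0.9);

% Alternatively, fix the window explicitly (SmoothingFactor then cannot
% be specified). The returned window size equals the requested one.
[Bfix,wFix] = smoothdata(A,"movmean",7);
```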
Degree - Savitzky-Golay degree
nonnegative integer
Savitzky-Golay degree, specified as a nonnegative integer. This name-value argument can only be specified when "sgolay" is the specified smoothing method. The value of Degree corresponds to the degree of the polynomial in the Savitzky-Golay filter that fits the data within each window, which is 2 by default.
The value of Degree must be less than the window size for uniform sample points. For nonuniform sample points, the value must be less than the maximum number of points in any window.
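The role of Degree can be illustrated with data that are exactly quadratic: a local polynomial fit of degree 2 should reproduce such data up to round-off, while a degree-1 fit cannot. This is a sketch with made-up data and an arbitrary seven-element window.

```matlab
x = 1:20;
A = x.^2;                                   % exactly quadratic data

B2 = smoothdata(A,"sgolay",7,"Degree",2);   % default degree, stated explicitly
B1 = smoothdata(A,"sgolay",7,"Degree",1);   % local straight-line fit

err2 = max(abs(B2 - A));                    % expected to be round-off only
err1 = max(abs(B1 - A));                    % expected to be clearly nonzero
```

A higher degree follows rapid variation more closely but removes less noise, so the choice is a trade-off rather than a case of one setting being better.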
Output Arguments
B - Smoothed data
vector | matrix | multidimensional array | table | timetable
Smoothed data, returned as a vector, matrix, multidimensional array, table, or timetable.
B is the same size as A unless the value of ReplaceValues is false. If the value of ReplaceValues is false, then the width of B is the sum of the input data width and the number of data variables specified.
winsize - Window size
positive integer or duration scalar | two-element vector of nonnegative integer or duration values
Window size, returned as a positive integer or duration scalar or a two-element vector of nonnegative integer or duration values.
If you specify window as an input argument, then winsize is the same as window. If you do not specify window as an input argument, then smoothdata determines the window size from the entries in A.
More About
Moving Window Size
This table illustrates the window position across the default uniformly spaced sample points vector [1 2 3 4 5 6 7].
Description | Window Size and Location | Sample Points in Window |
---|---|---|
For a scalar window size, the leading edge of the window is included and the trailing edge of the window is excluded. | window = 3, current sample point = 4; the window spans the range [2.5, 5.5) | 3, 4, 5 |
Same as above (scalar window size). | window = 4, current sample point = 4; the window spans the range [2, 6) | 2, 3, 4, 5 |
For a vector window size, the leading edge and the trailing edge are included. | window = [2 2], current sample point = 4 | 2, 3, 4, 5, 6 |
For sample points near the endpoints of the input data, the moving statistic smoothing methods "movmean", "movmedian", and "gaussian" truncate the window so it begins at the first sample point or ends at the last sample point. | window = [2 2], current sample point = 2 | 1, 2, 3, 4 |
For sample points near the endpoints of the input data, the local regression smoothing methods "lowess", "loess", "rlowess", "rloess", and "sgolay" shift the window to include the first or last sample point. | window = [2 2], current sample point = 2 | 1, 2, 3, 4, 5 |
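Two rows of the table can be checked numerically with "movmean". The sketch below is illustrative only: the data values are arbitrary, and the comparison of window = 4 with window = [2 1] follows from the sample points listed in the table rather than from a separately documented equivalence.

```matlab
A = [1 2 3 4 5 6 7].^2;     % values at sample points 1..7 (illustrative)

% window = 4 at sample point 4 covers points 2, 3, 4, 5, which is the
% same set of points as the two-element window [2 1].
B4  = smoothdata(A,"movmean",4);
B21 = smoothdata(A,"movmean",[2 1]);

% window = [2 2] at sample point 2 is truncated to points 1..4 for
% moving-statistic methods such as "movmean".
B22 = smoothdata(A,"movmean",[2 2]);

centerCheck   = abs(B4(4)  - mean(A(2:5)));
endpointCheck = abs(B22(2) - mean(A(1:4)));
```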
Algorithms
When the window size for the smoothing method is not specified, smoothdata computes a default window size based on a heuristic. For a smoothing factor τ, the heuristic estimates a moving average window size that attenuates approximately 100*τ percent of the energy of the input data.
Extended Capabilities
Tall Arrays
Calculate with arrays that have more rows than fit in memory.
The smoothdata function supports tall arrays with the following usage notes and limitations:
- Tall timetables are not supported.
- The "rlowess" and "rloess" methods are not supported.
- Multiple outputs are not supported.
- You must specify the window size. Heuristic determination of the window size by smoothdata is not supported.
- The SamplePoints and SmoothingFactor name-value arguments are not supported.
- The value of DataVariables cannot be a function handle.
For more information, see Tall Arrays.
C/C++ Code Generation
Generate C and C++ code using MATLAB® Coder™.
Usage notes and limitations:
- dim must be constant.
- For complex input A, the window argument must be specified.
- Variable-size window arguments are not supported.
- For fixed-size code generation, all input arguments other than A must be constant.
- For datetime SamplePoints values or timetable input data with datetime RowTimes, a window size must be specified.
Thread-Based Environment
Run code in the background using MATLAB® backgroundPool or accelerate code with Parallel Computing Toolbox™ ThreadPool.
This function fully supports thread-based environments. For more information, see Run MATLAB Functions in Thread-Based Environment.
Version History
Introduced in R2017a
R2023a: Specify missing value condition
Omit or include missing values in the input data when smoothing by using the "omitmissing" or "includemissing" options. These options have the same behavior as the "omitnan" and "includenan" options, respectively.
R2022a: Append smoothed values
For table or timetable input data, append, instead of replace, input table variables with table variables containing smoothed data by setting the ReplaceValues name-value argument to false.
R2021b: Specify sample points as table variable
For table input data, specify the sample points as a table variable using the SamplePoints name-value argument.