Time Series
Time series data is a sequence of data points in which insights are gained by analyzing changes over time.
Time series data is generally composed of these components:
- Time when the data point was recorded.
- Metadata (sometimes referred to as source), which is a label or tag that identifies a data series and rarely changes. Metadata is stored in a metaField. You cannot add metaFields to time series documents after you create them. For more information on metaField behavior and selection, see metaFields.
- Measurements (sometimes referred to as metrics or values), which are the data points tracked at increments in time. Generally these are key-value pairs that change over time.
This table shows examples of time series data:
Example | Measurement | Metadata |
---|---|---|
Stock data | Stock price | Stock ticker, exchange |
Weather data | Temperature | Sensor identifier, location |
Website visitors | View count | URL |
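As a rough illustration of the three components, the sketch below models one weather reading as a Python dictionary and separates it into its parts. The field names (`timestamp`, `metadata`, `sensorId`, `location`, `temperature`) and the helper `split_components` are assumptions made for this example, not names required by MongoDB.

```python
from datetime import datetime, timezone

# Hypothetical weather reading; every field name here is an illustrative assumption.
reading = {
    "timestamp": datetime(2024, 5, 18, 12, 0, tzinfo=timezone.utc),  # time
    "metadata": {"sensorId": 5578, "location": "rooftop"},  # identifies the series
    "temperature": 21.4,  # measurement
}


def split_components(doc, time_field="timestamp", meta_field="metadata"):
    """Separate a document into its time, metadata, and measurement parts."""
    measurements = {
        key: value
        for key, value in doc.items()
        if key not in (time_field, meta_field)
    }
    return doc[time_field], doc.get(meta_field), measurements


time_value, metadata, measurements = split_components(reading)
```

Everything that is neither the time field nor the metadata field is treated as a measurement here, which matches the key-value description above.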
For efficient time series data storage, MongoDB provides time series collections.
New in version 5.0.
Time series collections efficiently store time series data. In time series collections, writes are organized so that data from the same source is stored alongside other data points from a similar point in time.
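A minimal sketch of creating a time series collection from Python follows. It assumes the PyMongo driver, whose `create_collection` method accepts a `timeseries` option. The collection name `weather`, the field names, and the helper functions are hypothetical.

```python
def timeseries_options(time_field, meta_field=None, granularity=None):
    """Build the `timeseries` options document for collection creation."""
    options = {"timeField": time_field}
    if meta_field is not None:
        options["metaField"] = meta_field
    if granularity is not None:
        options["granularity"] = granularity  # "seconds", "minutes", or "hours"
    return options


def create_weather_collection(db):
    """Create a time series collection named "weather" on a PyMongo Database."""
    return db.create_collection(
        "weather",
        timeseries=timeseries_options("timestamp", "metadata", "hours"),
    )


# Usage, assuming PyMongo is installed and a deployment is reachable:
#   from pymongo import MongoClient
#   create_weather_collection(MongoClient()["test"])
```

Keeping the options builder separate from the driver call makes the options easy to inspect without a running server.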
Important
Backwards-Incompatible Feature
You must drop time series collections before downgrading from:
- MongoDB 6.0 or later to MongoDB 5.0.7 or earlier.
- MongoDB 5.3 to MongoDB 5.0.5 or earlier.
Compared to normal collections, storing time series data in time series collections improves query efficiency and reduces the disk usage for time series data and secondary indexes. MongoDB 6.3 and later automatically creates a compound index on the time and metadata fields for new time series collections.
Time series collections use an underlying columnar storage format and store data in time-order. This format provides the following benefits:
- Reduced complexity for working with time series data
- Improved query efficiency
- Reduced disk usage
- Reduced I/O for read operations
- Increased WiredTiger cache usage
Time series collections are optimal for analyzing data over time. The following table illustrates use cases for time series data:
Industry | Examples |
---|---|
Internet of Things (IoT) | Sensor data (for example, smart home devices or fleet logistics); machine learning and artificial intelligence scraping |
Financial Services | High frequency trading; financial quantitative analysis; banking data (for example, accounting of banking transactions over time); stock market data |
Retail and E-Commerce | Transaction, sales, and price analysis; inventory management |
DevOps | Application logging; infrastructure and network monitoring |
Time series collections are not intended for the following types of data:
- Unordered data
- Data that is not time-dependent
Time series collections generally behave like other MongoDB collections. You insert and query data as usual.
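The sketch below shows what inserting and querying might look like from Python. It assumes a PyMongo-style collection object with `insert_many` and `find`. The sensor fields and helper names are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def make_readings(sensor_id, start, count):
    """Generate hourly readings for one hypothetical sensor."""
    return [
        {
            "timestamp": start + timedelta(hours=i),
            "metadata": {"sensorId": sensor_id},
            "temperature": 20.0 + i,
        }
        for i in range(count)
    ]


def recent_readings_filter(sensor_id, since):
    """Filter on a metadata subfield and a lower bound on the time field."""
    return {"metadata.sensorId": sensor_id, "timestamp": {"$gte": since}}


def load_and_query(coll, sensor_id, start):
    """Insert three sample readings, then query those from the second hour on."""
    coll.insert_many(make_readings(sensor_id, start, 3))
    since = start + timedelta(hours=1)
    return list(coll.find(recent_readings_filter(sensor_id, since)))
```

With a real collection, `load_and_query(db["weather"], 5578, datetime(2024, 1, 1, tzinfo=timezone.utc))` would run the same calls you would use on a regular collection.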
Warning
Match expressions in update commands can only specify the metaField. You can't update other fields in a time series document. For more details, see Time Series Update Limitations.
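To make the restriction concrete, the following sketch builds a filter and an update that reference only metaField subfields, and includes a simple check for that property. The helper names and the `metadata` field name are assumptions. The check only inspects top-level keys, so it does not cover filters that nest conditions under operators such as `$and`.

```python
def touches_only_meta_field(doc, meta_field):
    """Return True if every top-level key is the metaField or one of its subfields."""
    return all(
        key == meta_field or key.startswith(meta_field + ".")
        for key in doc
    )


def build_meta_update(meta_field, match, new_values):
    """Build (filter, update) documents that reference only metaField subfields."""
    filter_doc = {f"{meta_field}.{key}": value for key, value in match.items()}
    update_doc = {
        "$set": {f"{meta_field}.{key}": value for key, value in new_values.items()}
    }
    return filter_doc, update_doc


filter_doc, update_doc = build_meta_update(
    "metadata", {"sensorId": 5578}, {"location": "basement"}
)
# With PyMongo, these could be passed to coll.update_many(filter_doc, update_doc).
```

A filter such as `{"temperature": {"$gt": 30}}` would fail `touches_only_meta_field`, since it matches on a measurement rather than the metaField.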
MongoDB treats time series collections as writable non-materialized views backed by an internal collection. When you insert data, the internal collection automatically organizes time series data into an optimized storage format.
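The following is a conceptual sketch only. It is not MongoDB's internal format or algorithm. It groups documents by source (metadata) and by a fixed time window, then stores each field as a column of values, to show the general idea of keeping data from the same source and a similar point in time together. The one-hour window and all names are assumptions.

```python
from collections import defaultdict
from datetime import datetime, timezone


def bucket_key(doc, time_field, meta_field, window_seconds):
    """Group key: the metadata value plus the start of the time window."""
    ts = doc[time_field].timestamp()
    window_start = int(ts // window_seconds) * window_seconds
    meta = repr(sorted(doc[meta_field].items()))  # hashable stand-in for metadata
    return meta, window_start


def to_buckets(docs, time_field="timestamp", meta_field="metadata",
               window_seconds=3600):
    """Group documents by source and time window, one list (column) per field."""
    buckets = defaultdict(lambda: defaultdict(list))
    for doc in sorted(docs, key=lambda d: d[time_field]):
        key = bucket_key(doc, time_field, meta_field, window_seconds)
        for field, value in doc.items():
            if field != meta_field:  # metadata is stored once, in the key
                buckets[key][field].append(value)
    return buckets
```

In this toy layout the metadata appears once per group instead of once per document, and each measurement is read as a contiguous, time-ordered list.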
Starting in MongoDB 6.3, if you create a new time series collection, MongoDB also generates a compound index on the metaField and timeField fields. To improve query performance, queries on time series collections use the new compound index. The compound index also uses the optimized storage format.
Warning
Do not attempt to create a time series collection or view with the name system.profile. MongoDB 6.3 and later versions return an IllegalOperation error if you attempt to do so. Earlier MongoDB versions crash.
Starting in MongoDB 8.0, use of the timeField as a shard key in a time series collection is deprecated.
Also, starting in MongoDB 8.0, if you create a time series collection with a shard key containing the timeField, a log message is added to the log file on the primary shard. In addition, a log message is added every 12 hours on the primary node of the config server replica set. The log messages state that using the timeField as a shard key in a time series collection is deprecated and you must reshard your collection using the metaField.
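As a small sketch of how an application might guard against the deprecated pattern, the helpers below check whether a proposed shard key includes the timeField and build a key on the metaField instead. The function names and the `metadata.sensorId` subfield are hypothetical, and the check covers only the key document itself.

```python
def shard_key_uses_time_field(shard_key, time_field):
    """Return True if any field in the shard key document is the timeField."""
    return time_field in shard_key


def meta_shard_key(meta_field, subfield=None):
    """Build an ascending shard key on the metaField or one of its subfields."""
    field = meta_field if subfield is None else f"{meta_field}.{subfield}"
    return {field: 1}


deprecated_key = {"metadata.sensorId": 1, "timestamp": 1}
preferred_key = meta_shard_key("metadata", "sensorId")
```

Here `deprecated_key` would be flagged by `shard_key_uses_time_field(deprecated_key, "timestamp")`, while `preferred_key` would not.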
Time series documents can contain a metaField with metadata about each document. MongoDB uses the metaField to group sets of documents, both for internal storage optimization and query efficiency. For more information about the metaField, see metaField Considerations.
MongoDB automatically creates a compound index on both the metaField and timeField of a time series collection.
Zone sharding does not support time series collections. The balancer always distributes data in sharded time series collections evenly across all shards in the cluster.
To get started with time series collections, see the tutorials on the following pages: