BSON Types

BSON is a binary serialization format used to store documents and make remote procedure calls in MongoDB. The BSON specification is located at bsonspec.org.

Each BSON type has both integer and string identifiers as listed in the following table:

Type                 Number  Alias         Notes
Double               1       "double"
String               2       "string"
Object               3       "object"
Array                4       "array"
Binary data          5       "binData"
Undefined            6       "undefined"   Deprecated.
ObjectId             7       "objectId"
Boolean              8       "bool"
Date                 9       "date"
Null                 10      "null"
Regular Expression   11      "regex"
DBPointer            12      "dbPointer"   Deprecated.
JavaScript           13      "javascript"
Symbol               14      "symbol"      Deprecated.
32-bit integer       16      "int"
Timestamp            17      "timestamp"
64-bit integer       18      "long"
Decimal128           19      "decimal"
Min key              -1      "minKey"
Max key              127     "maxKey"

To determine a field's type, see Type Checking.

If you convert BSON to JSON, see the Extended JSON reference.

The following sections describe special considerations for particular BSON types.

A BSON binary binData value is a byte array. A binData value has a subtype that indicates how to interpret the binary data. The following table shows the subtypes:

Number Description
0 Generic binary subtype
1 Function data
2 Binary (old)
3 UUID (old)
4 UUID
5 MD5
6 Encrypted BSON value
7 Compressed time series data. New in version 5.2.
8 Sensitive data, such as a key or secret. MongoDB does not log literal values for binary data with subtype 8. Instead, MongoDB logs a placeholder value of ###.
9 Vector data, which is densely packed arrays of numbers of the same type.
128 Custom data
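
To illustrate how a subtype travels with the bytes, the following sketch encodes a binary payload using the layout given in the BSON specification at bsonspec.org: a little-endian int32 length, one subtype byte, then the payload. The helper names encodeBinary and decodeBinary are hypothetical and are not part of any MongoDB driver. The sketch ignores the nested length used by the old subtype 2.

```javascript
// Hypothetical helpers, not a driver API: encode/decode the body of a BSON binary value.
// Assumed layout: int32 length (little-endian) | subtype byte | payload bytes.
function encodeBinary(payload, subtype) {
  const out = Buffer.alloc(4 + 1 + payload.length);
  out.writeInt32LE(payload.length, 0); // payload length, excluding the subtype byte
  out.writeUInt8(subtype, 4);          // subtype tells readers how to interpret the bytes
  Buffer.from(payload).copy(out, 5);
  return out;
}

function decodeBinary(buf) {
  const length = buf.readInt32LE(0);
  const subtype = buf.readUInt8(4);
  const payload = buf.subarray(5, 5 + length);
  return { subtype, payload };
}

const uuidBytes = Buffer.from("00112233445566778899aabbccddeeff", "hex");
const encoded = encodeBinary(uuidBytes, 4); // subtype 4 = UUID
const decoded = decodeBinary(encoded);
console.log(encoded.toString("hex"));
console.log(decoded.subtype, decoded.payload.toString("hex"));
```

In practice a driver's own binary type handles this encoding; the sketch only shows where the subtype byte sits relative to the data.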

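ObjectIds are small, likely unique, fast to generate, and ordered. ObjectId values are 12 bytes in length, consisting of a 4-byte timestamp representing the ObjectId's creation time in seconds since the Unix epoch, a 5-byte random value generated once per client-side process, and a 3-byte incrementing counter initialized to a random value.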

For timestamp and counter values, the most significant bytes appear first in the byte sequence (big-endian). This is unlike other BSON values, where the least significant bytes appear first (little-endian).

If an integer value is used to create an ObjectId, the integer replaces the timestamp.
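
The byte layout can be sketched in plain JavaScript. This example assumes the 12 bytes are a 4-byte big-endian timestamp, a 5-byte random value, and a 3-byte big-endian counter. The function names are hypothetical and are not the mongosh ObjectId() API; the random bytes and counter are fixed inputs here so that the result is reproducible.

```javascript
// Sketch only: assumes a 4-byte timestamp, 5-byte random value, and 3-byte counter.
// buildObjectIdHex and parseObjectIdHex are hypothetical names, not a MongoDB API.
function buildObjectIdHex(seconds, randomBytes, counter) {
  const buf = Buffer.alloc(12);
  buf.writeUInt32BE(seconds, 0);               // timestamp, most significant byte first
  Buffer.from(randomBytes).copy(buf, 4, 0, 5); // 5-byte per-process random value
  buf.writeUIntBE(counter & 0xffffff, 9, 3);   // 3-byte counter, most significant byte first
  return buf.toString("hex");
}

function parseObjectIdHex(hex) {
  const buf = Buffer.from(hex, "hex");
  const seconds = buf.readUInt32BE(0);
  return {
    seconds,
    created: new Date(seconds * 1000),
    random: buf.subarray(4, 9).toString("hex"),
    counter: buf.readUIntBE(9, 3),
  };
}

// An integer seed takes the place of the timestamp: 32 becomes the bytes 00 00 00 20.
const idHex = buildObjectIdHex(32, [0xf5, 0x1b, 0xb4, 0x36, 0x2e], 0xee2a4d);
const parts = parseObjectIdHex(idHex);
console.log(idHex);
console.log(parts.seconds, parts.random, parts.counter.toString(16));
```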

In MongoDB, each document stored in a standard collection requires a unique _id field that acts as a primary key. If an inserted document omits the _id field, the MongoDB driver automatically generates an ObjectId for the _id field.

This also applies to documents inserted through update operations with upsert: true.

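MongoDB clients should add an _id field with a unique ObjectId. Using ObjectIds for the _id field provides the following additional benefits: you can access the ObjectId creation time in mongosh using the ObjectId.getTimestamp() method, and because ObjectIds are approximately ordered by creation time, sorting on an _id field that stores ObjectId values is roughly equivalent to sorting by creation time.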

Important

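While ObjectId values should increase over time, they are not necessarily monotonic. This is because they contain only one second of temporal resolution, so ObjectId values created within the same second have no guaranteed ordering, and because they are generated by clients, which may have differing system clocks.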

Use the ObjectId() methods to set and retrieve ObjectId values.

Starting in MongoDB 5.0, mongosh replaces the legacy mongo shell. The ObjectId() methods work differently in mongosh than in the legacy mongo shell. For more information on the legacy methods, see Legacy mongo Shell.

BSON strings are UTF-8. In general, drivers for each programming language convert from the language's string format to UTF-8 when serializing and deserializing BSON. This makes it possible to store most international characters in BSON strings with ease. [1] In addition, MongoDB $regex queries support UTF-8 in the regex string.

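BSON has a special timestamp type for internal MongoDB use that is not associated with the regular Date type. This internal timestamp type is a 64-bit value where the most significant 32 bits are a time_t value (seconds since the Unix epoch) and the least significant 32 bits are an incrementing ordinal for operations within a given second.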

While the BSON format is little-endian, and therefore stores the least significant bits first, the mongod instance always compares the time_t value before the ordinal value on all platforms, regardless of endianness.
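
The ordering rule can be sketched with BigInt arithmetic. This example assumes the high 32 bits hold the time_t value in seconds and the low 32 bits hold the ordinal. The helper names are hypothetical and do not correspond to a server or driver API.

```javascript
// Sketch with hypothetical helper names: pack and compare BSON timestamp values.
// Assumes the high 32 bits hold time_t (seconds) and the low 32 bits hold the ordinal.
function packTimestamp(seconds, ordinal) {
  return (BigInt(seconds) << 32n) | BigInt(ordinal);
}

function unpackTimestamp(value) {
  return {
    seconds: Number(value >> 32n),
    ordinal: Number(value & 0xffffffffn),
  };
}

// Compare time_t first, then the ordinal, whatever the byte order used for storage.
function compareTimestamps(a, b) {
  const x = unpackTimestamp(a);
  const y = unpackTimestamp(b);
  if (x.seconds !== y.seconds) return x.seconds < y.seconds ? -1 : 1;
  if (x.ordinal !== y.ordinal) return x.ordinal < y.ordinal ? -1 : 1;
  return 0;
}

const first = packTimestamp(1589228054, 1);
const second = packTimestamp(1589228054, 2);
const later = packTimestamp(1589228055, 1);
console.log(unpackTimestamp(second));
console.log(compareTimestamps(first, second), compareTimestamps(later, second));
```

Two operations in the same second share a time_t value and differ only in the ordinal, so the ordinal decides their order.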

In replication, the oplog has a ts field. The values in this field reflect the operation time, which uses a BSON timestamp value.

Within a single mongod instance, timestamp values in the oplog are always unique.

Note

The BSON timestamp type is for internal MongoDB use. For most cases, in application development, you will want to use the BSON date type. See Date for more information.

BSON Date is a 64-bit integer that represents the number of milliseconds since the Unix epoch (Jan 1, 1970). This results in a representable date range of about 290 million years into the past and future.

The official BSON specification refers to the BSON Date type as the UTC datetime.

The BSON Date type is signed. [2] Negative values represent dates before 1970.
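
The range figure follows from simple arithmetic on a signed 64-bit millisecond count, sketched below. The year length of 365.25 days is an approximation chosen for this example. The JavaScript Date used for the negative value has a narrower range than BSON Date, so it only illustrates the sign convention.

```javascript
// Rough arithmetic behind the "about 290 million years" range of a signed 64-bit
// millisecond count. A year is approximated as 365.25 days (an assumption).
const maxMillis = 2n ** 63n - 1n;
const millisPerYear = 31557600000n; // 365.25 days * 24 h * 60 min * 60 s * 1000 ms
const yearsEachWay = maxMillis / millisPerYear; // BigInt division truncates
console.log(yearsEachWay.toString());

// A negative millisecond count falls before the Unix epoch.
const beforeEpoch = new Date(-1);
console.log(beforeEpoch.toISOString()); // 1969-12-31T23:59:59.999Z
```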

To construct a Date in mongosh, you can use the new Date() or ISODate() constructor.

To construct a Date with the new Date() constructor, run the following command:

var mydate1 = new Date()

The mydate1 variable outputs a date and time wrapped as an ISODate:


ISODate("2020-05-11T20:14:14.796Z")

To construct a Date using the ISODate() constructor, run the following command:

var mydate2 = ISODate()

The mydate2 variable stores a date and time wrapped as an ISODate:


ISODate("2020-05-11T20:14:14.796Z")

To print the Date in a string format, use the toString() method:

mydate1.toString()


Mon May 11 2020 13:14:14 GMT-0700 (Pacific Daylight Time)

You can also return the month portion of the Date value with the getMonth() method. Months are zero-indexed, so that January is month 0.
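
The zero-based month can be checked with a plain JavaScript Date, as in the sketch below. ISODate() is specific to mongosh, so this example uses new Date() with an ISO 8601 string instead. It reads the month with getUTCMonth() so that the result does not depend on the local time zone, whereas toString() and getMonth() report local time.

```javascript
// Plain JavaScript Date; ISODate() is mongosh-only and is not used here.
const sample = new Date("2020-05-11T20:14:14.796Z");
console.log(sample.toISOString()); // 2020-05-11T20:14:14.796Z
console.log(sample.getUTCMonth()); // 4: May is the fifth month and January is 0
console.log(sample.getTime());     // milliseconds since the Unix epoch
```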

decimal128 is a 128-bit decimal representation for storing very large or very precise numbers in cases where decimal rounding matters. It was introduced in August 2008 as part of the IEEE 754-2008 revision of the floating-point standard. When you need high precision when working with BSON data types, use decimal128.

decimal128 supports 34 decimal digits of precision (the significand), along with an exponent range of -6143 to +6144. The significand is not normalized in the decimal128 standard, allowing for multiple possible representations of the same value: 10 x 10^-1 = 1 x 10^0 = .1 x 10^1 = .01 x 10^2, and so on. The ability to store maximum and minimum values on the order of 10^6144 and 10^-6143, respectively, covers a very wide range of magnitudes.
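
As a minimal sketch of a non-normalized significand, the example below models a decimal as an integer coefficient and a power-of-ten exponent, then checks that different pairs denote the same value. This object model and the sameValue helper are illustrative assumptions. They are not the decimal128 bit encoding or a MongoDB API.

```javascript
// Sketch: model a decimal as { coefficient, exponent }, meaning coefficient x 10^exponent.
// This mirrors the idea of a non-normalized significand; it is not the decimal128 encoding.
function sameValue(a, b) {
  // Scale both coefficients to the smaller exponent so they can be compared exactly.
  const exp = a.exponent < b.exponent ? a.exponent : b.exponent;
  const scale = (d) => d.coefficient * 10n ** BigInt(d.exponent - exp);
  return scale(a) === scale(b);
}

const tenTenths = { coefficient: 10n, exponent: -1 };          // 10 x 10^-1
const oneUnit = { coefficient: 1n, exponent: 0 };              // 1 x 10^0
const hundredHundredths = { coefficient: 100n, exponent: -2 }; // 100 x 10^-2
console.log(sameValue(tenTenths, oneUnit));         // true
console.log(sameValue(oneUnit, hundredHundredths)); // true
```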

In MongoDB, you can store data in decimal128 format using the NumberDecimal() constructor. If you pass in the decimal value as a string, MongoDB stores the value in the database as follows:


NumberDecimal("9823.1297")

You can also pass in the decimal value as a double:


NumberDecimal(1234.99999999999)

You should also consider the usage and support your programming language has for decimal128. The following languages don’t natively support this feature and require a plugin or additional package to get the functionality:

When you perform mathematical calculations programmatically, you can sometimes receive unexpected results. The following example in Node.js yields incorrect results:

> 0.1
0.1
> 0.2
0.2
> 0.1 * 0.2
0.020000000000000004
> 0.1 * 0.1
0.010000000000000002

Similarly, the following example in Java produces incorrect output:

class Main {
   public static void main(String[] args) {
      System.out.println("0.1 * 0.2:");
      // Multiplies two binary doubles, so the product is not exactly 0.02
      System.out.println(0.1 * 0.2);
   }
}


0.1 * 0.2:
0.020000000000000004

The same computations in Python, Ruby, Rust, and other languages produce the same results. This happens because binary floating-point numbers do not represent base 10 values well.

For example, the 0.1 used in the above examples has no finite binary expansion. It is stored as an approximation that begins 0.0001100110011001101 in binary. Most of the time, this does not cause any significant issues. However, in applications such as finance or banking where precision is important, use decimal128 as your data type.
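
The sketch below prints the binary expansion of the stored double and contrasts it with exact decimal arithmetic on an integer coefficient and a power-of-ten exponent. The multiplyDecimal helper is hypothetical and only illustrates the idea behind a decimal type. It is not NumberDecimal() or a driver API.

```javascript
// The double nearest to 0.1 has a long binary expansion that only approximates 0.1.
console.log((0.1).toString(2));
console.log(0.1 * 0.2 === 0.02); // false with binary doubles

// Exact alternative sketch: keep decimal values as integer coefficient x 10^exponent.
// multiplyDecimal is a hypothetical helper, not NumberDecimal or a driver API.
function multiplyDecimal(a, b) {
  return {
    coefficient: a.coefficient * b.coefficient,
    exponent: a.exponent + b.exponent,
  };
}

const product = multiplyDecimal(
  { coefficient: 1n, exponent: -1 }, // 0.1
  { coefficient: 2n, exponent: -1 }  // 0.2
);
console.log(product); // coefficient 2n with exponent -2, which is exactly 0.02
```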