bitcask

A high-performance key/value store written in Go with predictable read/write performance and high throughput. It uses a Bitcask on-disk layout (LSM+WAL) similar to Riak.

For a more feature-complete, Redis-compatible, distributed key/value server, have a look at Bitraft, which uses this library as its backend. Use Bitcask as a starting point or if you want to embed a store in your application; use Bitraft if you need a complete, highly available server/client solution with a Redis-compatible API.

Features

Migrating from Bitcask v1

If you are migrating from Bitcask v1 (git.mills.io/prologic/bitcask) to Bitcask v2 (go.mills.io/bitcask/v2), please update your code as follows:

Is Bitcask right for my project?

Note

Please read this carefully to identify whether using Bitcask is suitable for your needs.

bitcask is a great fit for:

bitcask is not suited for:

Note

However, storing large amounts of data in the values themselves is totally fine. In other words, thousands to millions of keys with large values will work just fine.

Transactions

Bitcask supports transactions with ACID semantics. A call to Txn() returns a new transaction that is a snapshot of the current trie of keys. Keys written to a transaction are committed as a single batch operation, providing Atomicity.

As writes are performed in the transaction, an internal cache of the new entries written within it is maintained. Thus, any follow-up read of the same key by this transaction sees the write, but other transactions do not, providing Isolation and Consistency.
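The snapshot-plus-write-cache behaviour described above can be illustrated with a small, self-contained sketch. This is a toy model, not Bitcask's implementation or API: the `store` and `txn` types and their methods are hypothetical names invented for illustration, and a plain map stands in for the trie of keys.

```go
package main

import "fmt"

// store is a hypothetical stand-in for the committed key space.
type store struct {
	data map[string]string
}

// txn holds a snapshot of the store taken at begin time plus a private
// cache of writes that have not been committed yet.
type txn struct {
	db       *store
	snapshot map[string]string
	writes   map[string]string
}

// begin copies the current keys so later commits by others stay invisible.
func (s *store) begin() *txn {
	snap := make(map[string]string, len(s.data))
	for k, v := range s.data {
		snap[k] = v
	}
	return &txn{db: s, snapshot: snap, writes: make(map[string]string)}
}

// put records the write in the transaction's private cache only.
func (tx *txn) put(key, value string) { tx.writes[key] = value }

// get consults the private cache first, then falls back to the snapshot.
func (tx *txn) get(key string) (string, bool) {
	if v, ok := tx.writes[key]; ok {
		return v, true
	}
	v, ok := tx.snapshot[key]
	return v, ok
}

// commit applies every cached write to the store as one batch.
func (tx *txn) commit() {
	for k, v := range tx.writes {
		tx.db.data[k] = v
	}
	tx.writes = make(map[string]string)
}

func main() {
	db := &store{data: map[string]string{}}
	a, b := db.begin(), db.begin()

	a.put("foo", "bar")
	_, seenByA := a.get("foo")
	_, seenByB := b.get("foo")
	fmt.Println(seenByA, seenByB) // the writer sees its own write, b does not

	a.commit()
	_, seenByNew := db.begin().get("foo")
	fmt.Println(seenByNew) // a transaction started after the commit sees it
}
```

The toy copies the whole map to take a snapshot, which is only acceptable for illustration; it also omits conflict detection and locking.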

Finally, Durability in Bitcask is provided by a write-ahead log of the current datafile and can be strengthened further by enabling synchronous writes with the WithSyncWrites(true) option.

Warning

A transaction is not thread safe and should only be used by a single goroutine.

Development

Install

Usage (library)

Install the package into your project with go get go.mills.io/bitcask/v2.

See the GoDoc for further documentation and other examples.

See also examples

Configuration Options

If no options are specified, the defaults give a Bitcask instance with:

The defaults are designed with high performance in mind, with recovery on startup, and support limits of ~16M keys and ~1GB of persistent storage with the default file descriptor limits on most Linux systems.

Any of these options can be changed with any of the WithXXX(...) options.

Note

If you require reliability over performance, please enable synchronous writes with the WithSyncWrites(true) option.

Bitcask is an embedded key/value store designed for handling write-intensive workloads. However, frequent writes that create a large number of new key-value pairs over time can result in issues like "Too many open files" (#193) errors, because numerous data files are created. These problems can be mitigated by periodically compacting the data with a .Merge() operation, increasing the maximum datafile size with the MaxDatafileSize() option, and increasing the process file descriptor limit. Example: with MaxDatafileSize(1<<30) (1GB) and a file descriptor limit of 1M (one million) files, you can store up to about 1PB (petabyte) of (compacted) data before you hit "Too many open files", assuming a single machine can even handle this.

You should consider your read/write workloads carefully and ensure you set appropriate file descriptor limits with ulimit -n that suit your needs.
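If it helps to check the limit from inside a Go program rather than from the shell, the following sketch reads the open-file limits for the current process using the standard library's syscall package. It is Unix-only, and `openFileLimits` is a hypothetical helper name, not part of Bitcask.

```go
package main

import (
	"fmt"
	"syscall"
)

// openFileLimits returns the soft and hard limits on open file descriptors
// currently in effect for this process.
func openFileLimits() (soft, hard uint64, err error) {
	var lim syscall.Rlimit
	if err := syscall.Getrlimit(syscall.RLIMIT_NOFILE, &lim); err != nil {
		return 0, 0, err
	}
	return uint64(lim.Cur), uint64(lim.Max), nil
}

func main() {
	soft, hard, err := openFileLimits()
	if err != nil {
		panic(err)
	}
	fmt.Printf("open files: soft=%d hard=%d\n", soft, hard)
}
```

The soft limit is the one that triggers "Too many open files". Note that some Go versions raise the soft limit towards the hard limit at program startup, so the value reported here can differ from what ulimit -n shows in the launching shell.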

Usage (tool)

Usage (server)

There is also a very simple built-in Redis-compatible server called bitcaskd:

Example session:

Docker

You can also use the Bitcask Docker Image:

Performance

For 128B values:

The full benchmark above shows linear performance as you increase key/value sizes.

As far as benchmarks go, this is all contrived and generally not typical of any real workloads. These benchmarks were run on a 2022 Mac Studio M1 Max with 32GB of RAM. Your results may differ.

Contributors

Thank you to all those who have contributed to this project, battle-tested it, used it in their own projects or products, fixed bugs, improved performance and even fixed tiny typos in the documentation! Thank you and keep contributing!

You can find an AUTHORS file where we keep a list of contributors to the project. If you contribute a PR please consider adding your name there.

License

bitcask is licensed under the terms of the MIT License.