Sometimes an application has a need for a local key/value store. In these scenarios, there are several options, including RocksDB. Today’s exploration will be to dig into using RocksDB with F#.
If you’re unfamiliar with RocksDB, it is a local key/value store that you can embed in your application. I’ve found it to be a valuable addition to the application toolbox. For anyone following along, the examples below use .NET 5 and version 6.2.2 of the RocksDbSharp wrapper library. I’ve also included some simple setup and helper functions that are used in the later examples.
    $ dotnet add package RocksDbNative --version 6.2.2

    open RocksDbSharp
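As a rough idea of the kind of helper that comes in handy with a byte-oriented store, here is a minimal sketch of string/byte-array conversion functions. The names `toBytes` and `ofBytes` are placeholders of my own and not part of RocksDbSharp, and UTF-8 is an assumed encoding choice.

```fsharp
open System.Text

/// Encode a string as UTF-8 bytes, suitable for use as a key or value.
/// (Hypothetical helper name; UTF-8 is an assumption.)
let toBytes (s: string) : byte[] = Encoding.UTF8.GetBytes s

/// Decode UTF-8 bytes back into a string.
/// (Hypothetical helper name.)
let ofBytes (bytes: byte[]) : string = Encoding.UTF8.GetString bytes

// Round trip: a string survives conversion to bytes and back.
let roundTripped = "worker1" |> toBytes |> ofBytes
```

Anything more structured than a string (records, numbers) would need its own serialization step before it reaches functions like these.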
The obvious place to get started is with some simple saving and retrieval of key/value pairs. Before jumping right in, it is useful to know that RocksDB stores keys and values as byte arrays. This provides a good deal of flexibility, but it puts the responsibility on the developer to determine the best serialization method for object storage. Depending on the data being stored, this can be an extra step to worry about, but I like the power a raw interface provides. To this end, RocksDbSharp supports direct interactions using byte array keys and values. For convenience, it also supports the common case of accepting strings as keys and values, converting them to byte arrays under the covers.

For the following example, the scenario is storing multiple worker states in the key/value store. The first thing to do is open the database. In this particular case, I’ll also create the database if it doesn’t exist. The library supports many of the standard RocksDB database configuration options. Once the database is open, I can start to do something useful. Data is added using Put, retrieved using Get, and deleted using Remove. It also provides a handy MultiGet for retrieving multiple values into a collection.
    let dbPath = "/var/data/worker-data"
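To make the Put/Get/Remove/MultiGet flow concrete, here is a minimal sketch using the string overloads. The worker keys and state values are made up for illustration, and I use a temp-directory path (an assumption, so the sketch runs without special permissions) rather than the path above.

```fsharp
open System.IO
open RocksDbSharp

// Assumed location: a temp directory stands in for the article's /var/data path.
let dbPath = Path.Combine(Path.GetTempPath(), "worker-data-sketch")

let basicDemo () =
    // Create the database on first use.
    let options = DbOptions().SetCreateIfMissing(true)
    use db = RocksDb.Open(options, dbPath)

    // Store a few (made-up) worker states.
    db.Put("worker1", "running")
    db.Put("worker2", "idle")
    db.Put("worker3", "stopped")

    // Read a single value back.
    let state = db.Get("worker1")

    // Delete a key; a later Get for it returns null.
    db.Remove("worker3")
    let removed = db.Get("worker3")

    // Fetch several keys in one call.
    let states = db.MultiGet([| "worker1"; "worker2" |])
    for kv in states do
        printfn "%s -> %s" kv.Key kv.Value

    state, removed, states.Length
```

Because a missing key comes back as null rather than an exception, wrapping Get in an Option-returning helper is one way to make that case harder to overlook.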
Anyone who has worked with a key/value store recognizes the inherent challenge of key organization. A lot can be done using naming conventions, but sometimes there is a need for better segmentation. Although separate databases are an option, RocksDB has a nicer one: column families. A column family groups related data into its own structure within the same database. By specifying the column family when doing gets and puts, the data is segmented appropriately. In the previous example I was storing just worker states. If I need to support different types of data, it potentially makes sense to separate worker states from user session data. Naming conventions for keys can provide simple groupings, but column families bring a more proper segmentation of data. Note that this isn’t a security boundary, but a structural one to assist with data interactions.
Looking at the example below, there are a couple of key parts. The first is that the database must be opened with the column family definitions, and these must include every column family in the database. In this case, I’m defining two column families: one for worker states, and one for user session data. The second is that Get and Put must specify the column family where the data is located. Beyond that, the interactions are similar to the previous example.
    let dbPath = "/var/data/state-data"
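Here is a minimal sketch of the two key parts just described: opening with the full set of column family definitions, then passing a column family handle to Put and Get. The family names "workers" and "sessions", the keys, and the temp-directory path are assumptions for illustration.

```fsharp
open System.IO
open RocksDbSharp

// Assumed location: a temp directory stands in for the article's /var/data path.
let dbPath = Path.Combine(Path.GetTempPath(), "state-data-sketch")

let columnFamilyDemo () =
    let options =
        DbOptions()
            .SetCreateIfMissing(true)
            .SetCreateMissingColumnFamilies(true)

    // Every column family in the database must be listed when opening it.
    let columnFamilies = ColumnFamilies()
    columnFamilies.Add("workers", ColumnFamilyOptions())
    columnFamilies.Add("sessions", ColumnFamilyOptions())

    use db = RocksDb.Open(options, dbPath, columnFamilies)
    let workers = db.GetColumnFamily("workers")
    let sessions = db.GetColumnFamily("sessions")

    // The same key can live in both families without colliding.
    db.Put("id1", "running", workers)
    db.Put("id1", "user=alice", sessions)

    db.Get("id1", workers), db.Get("id1", sessions)
```

The identical key "id1" resolving to different values is the segmentation at work: the handle, not the key name, decides which structure is consulted.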
RocksDB isn’t limited to single key lookups. It also supports iterators. Say, for example, that I want to grab a set of session data. Below is an example of how to do that. To start out, I create a fake set of sessions and store them in the database, so I have something to query against. The iterator can be limited to a range of keys. The lower bound is defined by the initial Seek() call. The upper bound is defined by the SetIterateUpperBound() read option supplied when opening the iterator. An upper bound isn’t strictly required; without one, the iterator reads to the end of the keys. The example below returns all key/value pairs where the key is >= session_40 and < session_50.
    // Setup db options
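A minimal sketch of the bounded iteration might look like the following. For simplicity it uses the default column family, a temp-directory path, and invented session values, all of which are assumptions. I also seed ids 10 through 99 so that every key has the same width: keys are compared as bytes, so an unpadded key like session_5 would otherwise sort between session_40 and session_50.

```fsharp
open System.IO
open RocksDbSharp

// Assumed location: a temp directory, so the sketch runs without special permissions.
let dbPath = Path.Combine(Path.GetTempPath(), "session-iterator-sketch")

let sessionsInRange () =
    let options = DbOptions().SetCreateIfMissing(true)
    use db = RocksDb.Open(options, dbPath)

    // Fake session data to query against (two-digit ids only).
    for i in 10 .. 99 do
        db.Put(sprintf "session_%d" i, sprintf "data for session %d" i)

    // The upper bound is exclusive: iteration stops before session_50.
    let readOptions = ReadOptions().SetIterateUpperBound("session_50")
    use iterator = db.NewIterator(null, readOptions)

    // Seek positions the iterator at the first key >= session_40.
    iterator.Seek("session_40") |> ignore

    let results = ResizeArray<string * string>()
    while iterator.Valid() do
        results.Add((iterator.StringKey(), iterator.StringValue()))
        iterator.Next() |> ignore

    List.ofSeq results
```

If ids can vary in width, zero-padding them when building keys keeps byte order and numeric order in agreement.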
Sometimes it is useful to save a set of changes to the database in a single batch, or to work with a set of data before actually writing it to the database. RocksDB provides a WriteBatch interface that permits just that. It supports the common actions like Get/Put/Remove. This makes it possible to keep data in memory for manipulation while still using the familiar database interface. Once the new data is in the desired state, it can be saved to the database by calling Write. This call atomically saves the batch to the key/value store.
    let options = DbOptions().SetCreateIfMissing(true)
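As a minimal sketch of the batch flow, the following queues two puts and a delete, then applies them in one Write call. It sticks to the byte-array Put and Delete methods on the plain WriteBatch type, which are the calls I'm confident about; the keys, values, `bytes` helper, and temp-directory path are assumptions for illustration.

```fsharp
open System.IO
open System.Text
open RocksDbSharp

// Assumed location: a temp directory, so the sketch runs without special permissions.
let dbPath = Path.Combine(Path.GetTempPath(), "write-batch-sketch")

// Hypothetical helper: UTF-8 encode a string for the byte-array overloads.
let bytes (s: string) = Encoding.UTF8.GetBytes s

let batchDemo () =
    let options = DbOptions().SetCreateIfMissing(true)
    use db = RocksDb.Open(options, dbPath)

    // Existing data that the batch will delete.
    db.Put("worker3", "stopped")

    // Queue up changes in memory; nothing reaches the database yet.
    use batch = new WriteBatch()
    batch.Put(bytes "worker1", bytes "running") |> ignore
    batch.Put(bytes "worker2", bytes "idle") |> ignore
    batch.Delete(bytes "worker3") |> ignore

    // Apply all queued changes atomically.
    db.Write(batch)

    db.Get("worker1"), db.Get("worker3")
```

After the Write, worker1 is readable and worker3 comes back as null, since the puts and the delete were applied together.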
Next I want to discuss transaction support. This is also the perfect time to address some deficiencies and features of this particular library. RocksDbSharp does not have a nice wrapper for every bit of RocksDB functionality, and transaction support is one of the gaps. Nicely enough, it does provide a reasonable middle ground to fill in those gaps: “Native” wrappers around the raw APIs. This means that even where RocksDbSharp doesn’t have a nice wrapper, it at least provides a lower-level mechanism to access the underlying RocksDB APIs. This is great, and exactly what I need to get transactions working. For example purposes I’m going to keep all the Native calls together. In a real project, I’d put them into their own module/class to properly abstract the underlying APIs from the rest of the application.
Since this is all lower level, it won’t be as clean as the previous code, but it gets me where I need to go. The most obvious thing about the code below is all the calls using the Native.Instance.<method> syntax. This is the RocksDbSharp interface to the lower-level APIs. Although I try to avoid them when possible, I need some mutable variables in order to manually clean up objects with .Dispose() calls.
Now to walk through the process. First, RocksDB uses a specific transaction database object, which requires its own options object. For this particular example I needed to increase the transaction expiration timeout because the default was just too short. Your mileage may vary. The transaction object needs a write options object, so I set that up as well. I then set up a couple of mutable variables so I can properly dispose of them in the finally block.
There are a few things to call out for the general flow.
- Open the transaction-supporting database
    dbTrans <- Native.Instance.rocksdb_transactiondb_open(options.Handle, transactionOptions, dbPath)
- Begin the transaction
    txn <- Native.Instance.rocksdb_transaction_begin(dbTrans, writeOptions, transactionOptions, nullptr)
- Add a key/value pair to the transaction
    Native.Instance.rocksdb_transaction_put(txn, key, unativeint key.Length, value, unativeint value.Length, ref err)
- Rollback the transaction if necessary
    Native.Instance.rocksdb_transaction_rollback(txn)
- Commit the transaction
    Native.Instance.rocksdb_transaction_commit(txn)
- In the finally block, perform all the cleanup necessary
    // Define worker states
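As for the module/class abstraction mentioned earlier, one possible shape is a small wrapper that owns the begin/commit/rollback/cleanup sequence so callers only supply the work to run. This is a sketch under assumptions: `TxnOps` and `withTransaction` are hypothetical names of my own, and the record fields are stand-ins that would delegate to the corresponding Native.Instance calls rather than anything RocksDbSharp provides.

```fsharp
/// Hypothetical bundle of the raw transaction operations. In a real module
/// each field would wrap the matching Native.Instance call.
type TxnOps<'txn> =
    { Begin: unit -> 'txn
      Commit: 'txn -> unit
      Rollback: 'txn -> unit
      Destroy: 'txn -> unit }

/// Run `work` inside a transaction: commit on success, roll back on any
/// exception, and always release the transaction handle afterwards.
let withTransaction (ops: TxnOps<'txn>) (work: 'txn -> 'a) : Result<'a, exn> =
    let txn = ops.Begin()
    try
        try
            let result = work txn
            ops.Commit txn
            Ok result
        with ex ->
            // Any failure (including a failed commit) undoes the transaction.
            ops.Rollback txn
            Error ex
    finally
        // Mirrors the cleanup done in the finally block of the walkthrough.
        ops.Destroy txn
```

Returning a Result keeps the mutable handles and the try/finally plumbing in one place, and lets the rest of the application decide how to react to a rolled-back transaction.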
Using a database is more than just saving and retrieving data. Backups and snapshots are often something that needs to be handled. RocksDB and the RocksDbSharp library provide a simple way to address these issues using checkpoints. A checkpoint is an easy way to snapshot the database state, either as a point-in-time reference or as a full data store backup.
    let options = DbOptions().SetCreateIfMissing(true)
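A minimal sketch of taking a checkpoint could look like this. It assumes the Checkpoint wrapper (`db.Checkpoint()` and `Save`) is available in the version of RocksDbSharp you are using; the paths and the sample key are made up. One thing to keep in mind is that RocksDB expects the checkpoint directory not to exist yet, so the sketch generates a fresh directory name each time.

```fsharp
open System
open System.IO
open RocksDbSharp

// Assumed location: a temp directory, so the sketch runs without special permissions.
let dbPath = Path.Combine(Path.GetTempPath(), "checkpoint-source-sketch")

let checkpointDemo () =
    let options = DbOptions().SetCreateIfMissing(true)
    use db = RocksDb.Open(options, dbPath)
    db.Put("worker1", "running")

    // The target directory must not already exist, so use a unique name.
    let checkpointPath =
        Path.Combine(Path.GetTempPath(), "checkpoint-" + Guid.NewGuid().ToString("N"))

    // Snapshot the current database state into the new directory.
    use checkpoint = db.Checkpoint()
    checkpoint.Save(checkpointPath)

    // A checkpoint is itself a RocksDB database and can be opened like one.
    use snapshot = RocksDb.Open(DbOptions(), checkpointPath)
    snapshot.Get("worker1")
```

Opening the checkpoint directory as its own database is a quick way to confirm the snapshot holds the data you expect before relying on it as a backup.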
There is a lot more of RocksDB I could cover, but the goal is to give a taste of how the RocksDbSharp library can be leveraged. Hopefully this gives you enough of a start to take your F# project further using RocksDB. Until next time, rock on.