October 11, 2019

Why You Should Ditch Your Database for a Key-Value DB

We explore using a key-value store as your main database instead of a relational database and how it makes building applications simpler and faster.

As developers, we like to build, more often than not, solutions to problems we think we have. When starting a new project, the first task is often gathering the tools, libraries, and services you’ll need to deploy your eventual project and scale it to billions of users. While I take no issue with dreaming big, developers love the latest trendy frameworks and services. The “nobody was ever fired for buying IBM” adage is still valid today, albeit more aptly phrased as “if all the other cool developers are doing it, it can’t possibly be wrong!”

Why is it so hard to resist the urge to build a new wheel?

As developers, our job is to build. We design a system inside our heads and the act of coding transforms our understanding of it into instructions. However, much of the discovery process is lost as thoughts turn into code and comments become stale. Take a look at any code you wrote six months ago and you’ll need time to figure out what was going on and what you were thinking when you wrote it, never mind if it was somebody else’s code.

So then how should we approach building applications if the natural inclination is to want to build large future-proof software projects? The answer lies in the motivation for building. If it’s to get the project in front of users, then it’s paramount to release it as quickly as possible in the form of a minimum viable product (MVP). This is true regardless of whether it’s a “chargeable product” or not. Even a UI-less library needs an MVP to get real-world use.

When you’re initially working on an app, build quickly, break well-established rules, and shamelessly reuse code and resources. The initial development stage is not the time to think about elegance. It is by its very nature unscalable: if it were well-architected, you would have wasted valuable development time on ancillary aspects that don’t yet matter for apps without users.

For many apps, a database is vital, but I’m here to caution you that it doesn’t matter which one you choose; you’ll end up redesigning your data model if and when you scale. The best course of action is to use the simplest possible data store, like text files in the file system. Whether you’re writing a serverless app or not, integrating a key-value database is an easy choice to make.

By distilling your data model into a series of keys and values, you can better understand the relationships in your data design. If the app ever needs to scale, that understanding carries over, and key-value stores tend to be highly scalable beasts.

Of course, one could argue that you should use a relational model from the get-go. While there are complex apps that benefit from one, not every app needs a relational database. A simple key-value store will often suffice. Remember, it’s easier to over-engineer than under-engineer.

So what operations does a key-value database offer?

  • Get a value by its key
  • Set a key and its value
  • Delete a key and its associated value
  • Enumerate keys, starting at some arbitrary position

It doesn’t seem like much, but with these primitives, you can implement any conceivable data store. Key enumeration is what makes everything work: accessing the right portions of the key space using starting and ending keys to navigate through it.
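To make the four primitives concrete, here is a minimal in-memory sketch. The class name `TinyKV` and its method names are made up for illustration and don’t correspond to any particular product’s API; the one assumption that matters is that keys are enumerated in sorted order, which is what makes start and end keys useful.

```python
import bisect
from typing import Dict, Iterator, List, Optional, Tuple


class TinyKV:
    """Hypothetical in-memory key-value store that keeps keys in sorted order."""

    def __init__(self) -> None:
        self._keys: List[str] = []         # all keys, kept sorted
        self._values: Dict[str, str] = {}  # key -> value

    def get(self, key: str) -> Optional[str]:
        """Get a value by its key (None if the key is absent)."""
        return self._values.get(key)

    def set(self, key: str, value: str) -> None:
        """Set a key and its value, inserting the key in sorted position."""
        if key not in self._values:
            bisect.insort(self._keys, key)
        self._values[key] = value

    def delete(self, key: str) -> None:
        """Delete a key and its associated value, if present."""
        if key in self._values:
            del self._values[key]
            self._keys.pop(bisect.bisect_left(self._keys, key))

    def scan(self, start: str = "", end: Optional[str] = None) -> Iterator[Tuple[str, str]]:
        """Enumerate (key, value) pairs with start <= key < end, in key order."""
        i = bisect.bisect_left(self._keys, start)
        while i < len(self._keys) and (end is None or self._keys[i] < end):
            key = self._keys[i]
            yield key, self._values[key]
            i += 1


db = TinyKV()
db.set("users:2", "bob")
db.set("users:1", "alice")
db.set("posts:1", "hello")
print(list(db.scan("users:")))  # [('users:1', 'alice'), ('users:2', 'bob')]
```

Because enumeration starts at an arbitrary key and proceeds in order, a shared prefix such as `users:` acts like a table name: everything under it sits together in the key space.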

Most apps will need to keep track of registered users and store information like email address, password hash, and other metadata. With a relational database, you’d typically model this with a users table and define columns for each attribute.

Let’s model it instead in the key-value domain:

users:<id> = {
    "email": "user@example.com",
    "created_at": "2019-09-25T02:03:04Z",
    ...
}

While it may be easy to work with a JSON value, the data isn’t denormalized, which makes querying by anything other than the user id difficult. A more optimized decomposition for a key-value model would add an additional key for email:

users:<id> = {
    "email": "user@example.com",
    "created_at": "2019-09-25T02:03:04Z",
    ...
}
users:email:<email> = <user id>

Now, when we implement our login system, we can:

  1. Look up the user by email to get the user id
  2. Look up the user by id to get the actual user details
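The two-step lookup can be sketched as follows. A plain Python dict stands in for the key-value store, and the helper names `create_user` and `find_user_by_email` are hypothetical; a real store would replace the dict reads and writes with its own get and set calls.

```python
import json
from typing import Dict, Optional

# Assumption: a plain dict stands in for the key-value store.
store: Dict[str, str] = {}


def create_user(user_id: str, email: str, created_at: str) -> None:
    """Write the user record plus the email -> user id index key."""
    store[f"users:{user_id}"] = json.dumps({"email": email, "created_at": created_at})
    store[f"users:email:{email}"] = user_id


def find_user_by_email(email: str) -> Optional[dict]:
    """Step 1: email -> user id. Step 2: user id -> user details."""
    user_id = store.get(f"users:email:{email}")
    if user_id is None:
        return None
    raw = store.get(f"users:{user_id}")
    return json.loads(raw) if raw is not None else None


create_user("42", "user@example.com", "2019-09-25T02:03:04Z")
print(find_user_by_email("user@example.com"))
```

Note that the index key has to be written whenever the user record is created, since nothing in the store maintains it for you.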

What we’ve essentially done is implement an index like in a traditional database, but what if we go a step further? Let’s duplicate the data so we only have to make one query instead of two:

users:<id> = {
    "email": "user@example.com",
    "created_at": "2019-09-25T02:03:04Z",
    ...
}
users:email:<email> = {
    "email": "user@example.com",
    "created_at": "2019-09-25T02:03:04Z",
    ...
}

A user lookup then just involves a single retrieval of users:email:<email>. While the duplicate values in the database may seem off-putting and even wrong at first, it turns out that key-value stores are well suited to exactly this kind of denormalization!
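Here is a rough sketch of the duplicated layout, again with a plain dict standing in for the store and hypothetical helper names. The read is a single lookup, and the cost moves to the write path, which now has to update both copies.

```python
import json
from typing import Dict, Optional

# Assumption: a plain dict stands in for the key-value store.
store: Dict[str, str] = {}


def save_user(user_id: str, user: dict) -> None:
    """Write the same serialized record under both the id key and the email key."""
    payload = json.dumps(user)
    store[f"users:{user_id}"] = payload
    store[f"users:email:{user['email']}"] = payload


def find_user_by_email(email: str) -> Optional[dict]:
    """A single retrieval returns the full user details."""
    raw = store.get(f"users:email:{email}")
    return json.loads(raw) if raw is not None else None


save_user("42", {"email": "user@example.com", "created_at": "2019-09-25T02:03:04Z"})
print(find_user_by_email("user@example.com"))
```

The trade-off is that every change to a user must touch every copy, and a changed email address also means deleting the old users:email:<email> key. This sketch does not handle that case or make the two writes atomic.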

“There are only two hard things in computer science: cache invalidation and naming things.”

Phil Karlton

With this model, just think “how will I query the data?” and design your key space schema accordingly. There is a subtlety to naming your keys in a way that makes range queries possible. Say we wanted to query users by the date they joined. We would add a new key-value pair:

users:date:<date>:<user id> = <user id>

Then it becomes trivial to perform a range query using the prefix users:date:<date>: where date is a normalized ISO 8601 date, yielding a natural sort order. (As in the previous examples, you could store the full user details instead of just the user id.)
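As a small illustration of why the ISO 8601 format matters, the sketch below emulates ordered key enumeration by sorting the keys of a plain dict. The helper names are hypothetical, and the `\uffff` sentinel used as an upper bound is an assumption that user ids never start with that character.

```python
from typing import Dict, List

# Assumption: a plain dict stands in for the store; sorting its keys
# emulates the ordered enumeration a key-value store would provide.
store: Dict[str, str] = {}


def index_user_by_date(user_id: str, created_at: str) -> None:
    """Add users:date:<date>:<user id> = <user id> for an ISO 8601 UTC timestamp."""
    date = created_at[:10]  # "2019-09-25T02:03:04Z" -> "2019-09-25"
    store[f"users:date:{date}:{user_id}"] = user_id


def users_joined_between(start_date: str, end_date: str) -> List[str]:
    """Return ids of users who joined from start_date to end_date, inclusive."""
    lo = f"users:date:{start_date}:"
    hi = f"users:date:{end_date}:\uffff"  # sorts after any id under end_date
    return [store[key] for key in sorted(store) if lo <= key <= hi]


index_user_by_date("1", "2019-09-25T02:03:04Z")
index_user_by_date("2", "2019-09-26T10:00:00Z")
index_user_by_date("3", "2019-10-01T08:30:00Z")
print(users_joined_between("2019-09-25", "2019-09-30"))  # ['1', '2']
```

Zero-padded year-month-day strings compare the same way lexicographically as they do chronologically, so plain string ordering of the keys is enough and no date parsing is needed at query time.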

As you can see, key-value stores offer a more free-form approach to modeling your data, but they let you focus on only the things that matter and are a great alternative to SQL databases. Often, we end up spending countless hours setting up databases, schemas, and dealing with ORM systems, when we could simply treat data as keys and values. Of course, key-value databases are not a panacea, but you may be surprised how often they are a great fit.

In an upcoming blog post, we’ll cover how to practically perform these kinds of range queries with a more complete sample app, so stay tuned!

If you’d like to try building an app using only a key-value database and see how simple it is to apply these principles, I encourage you to give KVdb.io a try!


Tags: Getting Started Key Value Design Denormalization

