Schema design considerations for NoSQL databases
Here at Spot Trading, we try to stay current with technology, and in doing so, have explored some NoSQL Database solutions. Everyone has heard the hype of NoSQL being a solution to all data problems, and while NoSQL databases can be really awesome, if you don’t plan your schema correctly you can end up with a terrible schema and a system with serious lag time even when the database you pick is perfect for your data. I’ll be talking about SaltStack‘s mongo returner as an example of this.
We’ve been using SaltStack for a while and we chose MongoDB as the backend to store all return data because, in theory, the data is a perfect fit for MongoDB. This is a case where MongoDB is a great choice; however, that doesn’t matter if you don’t consider your schema at all. Now there are plenty of other returners that you could use and quickly swap out from using MongoDB that SaltStack supports, but we’ve been using MongoDB for a while, and I had a strong belief that MongoDB wasn’t at fault and that it was using a bad schema. So I decided to rewrite the returner and test if it improves our performance before we just scrap the database and blindly try something else.
In this instance, the initial returner for MongoDB was set up in such a way that, for every return the Salt Minions were sending, it would store those values in MongoDB in a separate collection named by the minion names with the returns as follows:
<code ” >
“_id” : ObjectId(“54ac07ae4dcec601d6000000”),
“fun” : “test.ping”,
“20150106100501937842” : true
The problem with having this type of schema is that it makes it very difficult to search for any jobs because you are using the job ID as a key, instead of as a value. To find the job ID in question, you will need to scan your entire collection to find the key that matches your job ID.
Secondly, it stores the jobs executed on the master by creating a collection with the job ID as the collection name, and then dumping the json generated by SaltStack into that collection. If you were running a test.ping across all minions, the return it sends to the master would look something like this:
<code ” >
“_id” : ObjectId(“534325be4f710b3d8b0012f9”)
, “tgt_type” : “glob”
, “jid” : “20140407172502227967”
, “tgt” : “*”
, “cmd” : “publish”
, “ret” : “mongo”
, “user” : “root”
, “arg” : [ ]
, “fun” : “test.ping”
What is nice about this return is that it includes the job ID as a value and not a key; what is less awesome about this is that you have to search the collection with the job ID name instead of being able to search for a set of jobs in a single collection.
When thinking about how to change the schema for the SaltStack mongo returner, what stood out to me is that it would make sense to have a jobs collection to store the command sent by the master, and then a saltReturns collection to store all the return data being sent by the minions. The jobs collection didn’t require too much modification since the load information already has the job ID as a value – the only update was to change it from saving a collection with the name of a job ID to simply save all jobs in one <code>jobs</code> collection. For the saltReturns collection, I updated the job ID field to have the key <code>jid</code> and value of the job ID and then to store the return as a value in the field <code>return</code>.
As a side note: I have changed the jobs and saltReturns collections to be capped collections, so that we use our disk space more efficiently. Since we run nightly backups on our mongodata, we have the ability to look back in time as needed; but with a capped collection it keeps us from having to manage our disk space manually or using a TTL (time to live) which does not use space very effectively.
The update to the returner made a huge difference to our times. We went from returns regularly taking over 10 seconds, to coming back to ~2 seconds, including a 2 second sleep that we have configured for reasons that are irrelevant to this blog post. You can see the drop in times in the graph in the upper right hand graph:
This graph shows how our lock % in MongoDB dramatically decreases after this change: