Why should I think about being smart when using MongoDB?

Siranjeevi Mahendran
3 min read · Jan 8, 2021

Building a sustainable system requires either a smart user or a smart product.

Illusion 1: Read and write ticket availability on MongoDB clusters

A common perception is that the number of read/write tickets controls the number of connections and sessions that clients open on MongoDB clusters. Actually, that's not the case. Read and write tickets control concurrency in WiredTiger, the default storage engine used by MongoDB. The number of available read/write tickets limits how many operations run in parallel inside the storage engine, not how many concurrent operations the database accepts.

Default values:
read: 128
write: 128
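
Before touching the defaults, it helps to see how many tickets are actually in use. A minimal mongo-shell sketch, assuming the WiredTiger engine and the 4.x-era serverStatus layout:

// Inspect WiredTiger ticket usage from the mongo shell
var tickets = db.serverStatus().wiredTiger.concurrentTransactions;
// "out" = tickets currently checked out, "available" = tickets remaining
print("read  -> out: " + tickets.read.out + ", available: " + tickets.read.available);
print("write -> out: " + tickets.write.out + ", available: " + tickets.write.available);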

MongoDB has a separate algorithm for queueing and yielding operations. A ticket count of zero is potentially caused by the two reasons below:

1) Huge load

2) Slow performance

This can be due either to slow hardware or to unoptimized application interactions with the database. Increasing the default values may help relieve the operation queue pressure on the system (CPU bound or I/O bound), but it does not increase the system's capacity. Below are some of the downsides to expect when altering the default values.

  • Sustained performance degradation
  • Expensive context switches on the CPU
  • Decreased throughput (due to increased queue pressure)
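
If, after weighing these downsides, the limits still need to be raised, the ticket counts are tunable at runtime. A minimal sketch, assuming admin privileges; the value 256 is purely illustrative:

// Raise the WiredTiger ticket limits at runtime (requires admin privileges)
db.adminCommand({ setParameter: 1, wiredTigerConcurrentReadTransactions: 256 })
db.adminCommand({ setParameter: 1, wiredTigerConcurrentWriteTransactions: 256 })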

Illusion 2: Multiple granularity locking

Consider a simple service connecting to MongoDB, deployed in k8s. It listens to inventory events, transforms and persists them, and accepts external client requests via REST to query and manipulate the (to be) provisioned data. When millions of incoming events need to be processed in real time, the services are deployed across multiple data centers to enable parallel consumption and processing (thanks to pods and partitions in k8s and Kafka, respectively).

Here comes the bottleneck: when multiple pods interact with the same MongoDB collection (at times, the same document, due to the nature of the events), MongoDB ends up locking the different sessions from the service clients, and at times this exhausts the write tickets, leaving the server stuck. On most occasions, one is more interested in bringing the services down (thanks to the scale-up/down button in the k8s dashboard) than in analyzing the type of locks acquired by the sessions. This scribbling is to address those immature behaviors and how to handle them efficiently from both the server and the client side.

A lock/mutex (mutual exclusion) comes into the picture for concurrency control, and there are several types of locks available in different database systems to accomplish this synchronization mechanism. MongoDB uses multiple granularity locking to guarantee serializability.

MongoDB follows a tree pattern when locking a resource in intent mode: Global -> Database -> Collection -> Document (document-level locking is available only on the WiredTiger storage engine).

Apart from the classical lock modes, S (read/shared) and X (write/exclusive), there are three additional lock modes in multiple granularity locking (a sketch for inspecting them live follows the list):

  • Intention-Shared (IS): signals the intent to lock a lower-level resource of the tree, but only with shared locks.
  • Intention-Exclusive (IX): signals the intent to lock a lower-level resource with exclusive or shared locks.
  • Shared & Intention-Exclusive (SIX): the subtree rooted at that node is locked explicitly in shared mode, and explicit locking is done at a lower level with exclusive-mode locks.
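
These modes can be observed live on a running server. A minimal mongo-shell sketch; waitingForLock and locks are standard fields of the currentOp output, and the mode letters reported by serverStatus().locks map as r = IS, w = IX, R = S, W = X:

// List in-progress operations blocked while waiting for a lock
db.currentOp({ waitingForLock: true }).inprog.forEach(function (op) {
  // "locks" shows the mode held per level: Global, Database, Collection, ...
  print("opid: " + op.opid + ", locks: " + JSON.stringify(op.locks));
});
// Server-wide lock acquisition counters per mode
printjson(db.serverStatus().locks);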

How can I impose intelligence on the application side when (over)using it?

Server-side:

One can write custom JS scripts to monitor and handle long-running sessions. One such instance is shown below.

To find and kill all long-running operations:

db.currentOp().inprog.forEach(function (opLog) {
  // Log every operation running beyond 20 mins
  if (opLog.secs_running > 1200) {
    printjson(opLog);
    // Not recommended to kill blindly; do so only when the operation has
    // crossed the threshold because of concurrent writes.
    // Caution: do not kill if there are recent truncate operations on the
    // same query (q) holding IX locks for writes (w).
    db.killOp(opLog.opid);
  }
});
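
Keep the killOp call inside the threshold check: invoking it unconditionally would terminate every in-progress operation on the server, including healthy client sessions and internal system operations.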

Client-side:

  • Channelize related sets of events to a specific partition so that redundant updates of the same document from different threads stay under control
  • Custom application code to set a timeout and retry at the operation level, especially for inserts/updates (see the sketch after this list)
  • Manual implementation of a locking mechanism at the app level: https://blog.codecentric.de/en/2012/10/mongodb-pessimistic-locking/
  • Usage of an asynchronous driver for “fire and forget” actions where performance is not a priority
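
For the second item, here is a minimal sketch in mongo-shell JS of an operation-level timeout combined with a simple retry loop; the collection name ("inventory"), the field names, and the 2-second/3-attempt figures are illustrative assumptions:

// Run an update with a server-side time limit, retrying on failure
function updateWithRetry(filter, update, maxAttempts) {
  for (var attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      // maxTimeMS caps the server-side execution time of this command
      var res = db.runCommand({
        update: "inventory",
        updates: [{ q: filter, u: update }],
        maxTimeMS: 2000
      });
      if (res.ok === 1) return res;
    } catch (e) {
      print("Attempt " + attempt + " failed: " + e.message);
    }
  }
  throw new Error("Update failed after " + maxAttempts + " attempts");
}

updateWithRetry({ sku: "ABC-123" }, { $set: { qty: 1 } }, 3);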

