MongoDB and its application (a case study of MongoDB in Healthcare)

Today I am sharing my notes on MongoDB and how it can be used in healthcare, along with a comparison of MongoDB with CouchDB.

These days, any product development demands more than what we planned initially in terms of scalability, flexibility and performance. So, considering these growing demands, we should pick our tools and technologies carefully before we start development.

I needed to create a massive and flexible platform for the healthcare industry that could serve countries like India, China, the US and others, so I analyzed various tools and technologies to find the right fit for my demands.

Some of the objectives of my application:

  1. Scalable – with the ability for quick turnaround
  2. Constraint-less – my application should be highly customizable / configurable
  3. High performance and availability
  4. Compatible with Desktop, Mobile and other devices
  5. Cloud Friendly and more…

Why NoSQL or Document Stores and not only RDBMS for healthcare?

Healthcare information is not just about the number of records or entities. It deals with hugely diversified data from different specialties (like cardiology, neurology, etc.) and has to cover Clinical Care, Medication, Labs, History, Demographics, Reports and more. We should also clearly understand that all of this data changes every now and then. For example, a healthcare standards organization may ask healthcare providers to introduce / modify / remove a procedure in any given specialty at any time. Hence the platform design should accommodate all these demands without compromise.

Need for Dynamic Contents and Forms:

Healthcare, like any platform solution, needs configurable or dynamically built forms. In healthcare we should always plan to build all the forms and content dynamically. We can build an administrative portal that is used to create the forms or content. This helps to add / update / remove any procedures or protocols imposed on a healthcare entity, as discussed in the previous paragraph. Such a portal has to deal with various types of data elements and components, which in turn calls for a database that does not enforce a fixed schema (schema-less).

Considering all these aspects, a healthcare solution cannot be built using only an RDBMS or traditional methodologies. We can mix and match various tools for the corresponding needs. Any document-oriented database can be used to store patient records, healthcare records, medication records and lab records, as they are subject to change quite often and also differ from one healthcare provider to another. We also need a database that is highly configurable and free of constraints in order to build dynamic content and forms. There is quite a large set of document stores / NoSQL databases available to us. For this case, we picked MongoDB as it is a document-oriented store that does not enforce a fixed schema.

Please note that MongoDB need not be considered the complete database solution across the whole platform. I would recommend using other technologies alongside it, like Vertica (a columnar database) for reporting and BI solutions around the same platform, and Hadoop for free-text analysis in internal BI tools.
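To make the dynamic-form idea above a bit more concrete, here is a minimal sketch in Python. It is only an illustration under my own assumptions: the form definition, the field names (`blood_pressure`, `ejection_fraction`, `stress_test_result`) and the helper `validate_record` are hypothetical and are not part of any MongoDB API. It shows a form kept as a plain document gaining a field with no schema migration, while older records with a different shape stay readable.

```python
import json

# Hypothetical form definition, stored as a plain document.
# An admin portal could edit this without touching application code.
cardiology_form = {
    "specialty": "cardiology",
    "fields": [
        {"name": "blood_pressure", "type": "string", "required": True},
        {"name": "ejection_fraction", "type": "number", "required": False},
    ],
}


def validate_record(form, record):
    """Return a list of problems found when checking a record against a form."""
    expected = {"string": str, "number": (int, float)}
    problems = []
    for field in form["fields"]:
        name = field["name"]
        if name not in record:
            if field["required"]:
                problems.append("missing required field: " + name)
            continue
        if not isinstance(record[name], expected[field["type"]]):
            problems.append("wrong type for field: " + name)
    return problems


# A standards body introduces a new procedure: add one field to the form document.
cardiology_form["fields"].append(
    {"name": "stress_test_result", "type": "string", "required": True}
)

old_record = {"blood_pressure": "120/80"}
new_record = {"blood_pressure": "130/85", "stress_test_result": "normal"}

print(validate_record(cardiology_form, old_record))
print(validate_record(cardiology_form, new_record))
print(json.dumps(new_record))
```

The old record is still a valid document as far as storage goes; only the application-level check reports the newly required field. That separation is the point of keeping the form definition in data rather than in a table schema.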

I am going to talk only about MongoDB today. But why MongoDB? Why not other document stores / NoSQL databases? There are a few reasons for us to consider MongoDB.

Directly uses BSON / JSON Format

It is easy for any developer to use MongoDB as it deals with JSON objects. This eliminates data transformation / manipulation in the controller layer. For example, if we use an RDBMS or some other database in the backend with an RIA in the front end (most RIAs consume and produce JSON data), we need to convert the JSON to a result set or some POJOs before we persist it in the DB. With a document store, we can directly store the JSON data coming from the UI, provided both ends are properly designed.
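The difference described above can be sketched in a few lines of Python. This is a hedged illustration, not code from my platform: the payload, its field names and the helper `to_rows` are made up for the example. With a driver such as PyMongo, a dict like `document` is what one would hand to `insert_one`; that call is left out so the sketch runs without a database.

```python
import json

# Hypothetical JSON payload, as a rich front end might post it.
payload = (
    '{"name": "Jane Doe",'
    ' "demographic": {"age": 42, "city": "Chennai"},'
    ' "medication": [{"drug": "aspirin", "dose_mg": 75}]}'
)

# Document-store path: parsing is the only step; the nested structure survives as-is.
document = json.loads(payload)


# Relational-style path: the same data has to be flattened into rows first.
def to_rows(doc):
    """Flatten one patient document into (table, row) pairs."""
    rows = [("patient", {"name": doc["name"], **doc["demographic"]})]
    for med in doc["medication"]:
        rows.append(("medication", {"patient_name": doc["name"], **med}))
    return rows


print(document["medication"][0]["drug"])
print(to_rows(document))
```

Every new nested section in the payload means another branch in something like `to_rows` (and another table), whereas the document path stays unchanged.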

Rich Driver Support

MongoDB drivers are available for most middleware and server-side scripting languages, which you can refer to here. The drivers provide a very good implementation of all the MongoDB operations.

Performance

MongoDB scores well when compared against its counterparts. Most of its counterparts offer only a REST interface, whereas MongoDB offers REST access as well as direct protocol access through its native drivers. The native drivers have some advantages over accessing the DB only through REST. MongoDB also uses a preallocation strategy to store data, which may further improve performance to a considerable level when there is a lot of traffic to and from the DB.

I am glad to share some of the test results I gathered while evaluating databases for my healthcare platform.

Test Case Details: MongoDB vs CouchDB (check another blog here)

The metrics are as follows:

Assume that one patient document has four different entities, such as demographic, medication, lab and more. As the test results show, I changed the number of entities per parent entity between runs.
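As a rough sketch of what such a test document could look like, the Python snippet below builds one patient document with a configurable number of child entries per entity. The two shapes (100/100/100/100 and 100/1000/500/100) are the ones used in my test runs; everything else, including the helper name `build_patient` and the placeholder entry contents, is an assumption for illustration. The real documents carried far more data per entry.

```python
import json


def build_patient(name, demographic, medication, lab, other):
    """Build one patient document with the given number of child entries per entity."""

    def entries(prefix, count):
        # Placeholder content; real entries would hold clinical data.
        return [{"id": i, "value": prefix + "-" + str(i)} for i in range(count)]

    return {
        "name": name,
        "demographic": entries("demographic", demographic),
        "medication": entries("medication", medication),
        "lab": entries("lab", lab),
        "other": entries("other", other),
    }


small = build_patient("patient-1", 100, 100, 100, 100)
large = build_patient("patient-2", 100, 1000, 500, 100)

for doc in (small, large):
    child_count = sum(len(v) for v in doc.values() if isinstance(v, list))
    print(doc["name"], child_count, len(json.dumps(doc)))
```

The "get 1 attribute (parent : name)" measurement in the tables corresponds to reading only the top-level `name` of such a document.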

Download / View: MongoDB_VS_CouchDB_TEST_RESULTS

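Test Result of CouchDB – Couch4j Library

| No of Records | Demographic | Medication | Lab | Other columns | Size in MB | Time taken to save / to get 1 attribute (parent : name) / to create |
|---|---|---|---|---|---|---|
| 1 as 1 | 100 | 100 | 100 | 100 | 1.8 | 665 / 644 / 450 |
| 100 as 1 | 100 | 100 | 100 | 100 | 180.1 | 58501 / 43595 / 18394 |
| 950 as 50 | 100 | 100 | 100 | 100 | 1700 | 561915, 150221 * |
| 10000 as 1 | 100 | 100 | 100 | 100 | | |
| 1 as 1 | 100 | 1000 | 500 | 100 | 12 | 3885, 1266 * |
| 25 as 1 | 100 | 1000 | 500 | 100 | 297.9 | 96821, 18730 * |
| 30 as 1 | 100 | 1000 | 500 | 100 | | |
| 40 as 1 | 100 | 1000 | 500 | 100 | | |
| 475 as 1 | 100 | 1000 | 500 | 100 | 5500 | 1541609 / 1463199 / 333686 |
| 4500 as 1 | 100 | 1000 | 500 | 100 | | |

* Only two of the three timings are available for this row; they are listed in their recorded order.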

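Test Result of MongoDB

| No of Records | Demographic | Medication | Lab | History | Max No of records can be stored in 32 Bit (2.5 GB Limitation) | Size | Time taken to store / to get 1 attribute (parent : name) / to create 1 object | # Data Files |
|---|---|---|---|---|---|---|---|---|
| 1 as 1 | 100 | 100 | 100 | 100 | 1000 | 64 | 389 / 249 / 652 | 1 + 1 |
| 100 as 1 | 100 | 100 | 100 | 100 | | 190 | 5460 / 17565 / 11276 | 2 + 1 |
| 950 as 50 | 100 | 100 | 100 | 100 | | 1930 | 68161, 138568 * | 5 + 1 |
| 10000 as 1 | 100 | 100 | 100 | 100 | | 11900 | 757029, 1342676 * | 7 + 1 |
| 1 as 1 | 100 | 1000 | 500 | 100 | 100 | 208 | 693, 1381 * | 2 + 1 |
| 25 as 1 | 100 | 1000 | 500 | 100 | | 448 | 9814, 20550 * | 3 + 1 |
| 30 as 1 | 100 | 1000 | 500 | 100 | | 448 | 17196, 24230 * | 3 + 1 |
| 40 as 1 | 100 | 1000 | 500 | 100 | | 448 | 15259, 33704 * | 3 + 1 |
| 475 as 1 | 100 | 1000 | 500 | 100 | | 5900 | 192082, 383862 * | 7 + 1 |
| 4500 as 1 | 100 | 1000 | 500 | 100 | | 31900 | 1766936, 3516905 * | 20 + 1 |

* Only two of the three timings are available for this row; they are listed in their recorded order.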

Note: This case study is purely based on my own scenarios and test cases. Results may vary depending on your scenarios. I am sharing the results I got at that time, based on my cases only. You need not treat this as a complete reference.

Performance Score based on test results above: 

If you look at the score card above, it is evident that MongoDB wins on performance. For example, in the 100-record run, saving took 58501 with CouchDB against 5460 with MongoDB (same time unit).
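To put numbers on that comparison, the small Python sketch below divides the CouchDB timings by the MongoDB timings for the two runs where all three timings are recorded for both databases (1 record and 100 records, each with 100 entries per entity). The figures are copied from the tables above; the dictionary layout and the helper name `ratios` are my own choices for the example.

```python
# Timings copied from the complete rows of the tables above (unit as recorded there).
couchdb = {
    "1 record": {"save": 665, "get": 644, "create": 450},
    "100 records": {"save": 58501, "get": 43595, "create": 18394},
}
mongodb = {
    "1 record": {"save": 389, "get": 249, "create": 652},
    "100 records": {"save": 5460, "get": 17565, "create": 11276},
}


def ratios(slow, fast):
    """Return CouchDB time divided by MongoDB time, per run and operation."""
    return {
        run: {op: round(slow[run][op] / fast[run][op], 1) for op in slow[run]}
        for run in slow
    }


for run, ops in ratios(couchdb, mongodb).items():
    print(run, ops)
```

A ratio above 1 means MongoDB was faster. In these two runs that holds everywhere except object creation in the single-record run, where CouchDB was quicker (450 against 652).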

Memory Score based on test results above:

You may notice that the data file size is much larger for MongoDB than for CouchDB. This is because of its preallocation mechanism for storing data: MongoDB preallocates its data files and fills them with zeros up front, so that it avoids the cost of allocating space on each and every write, which again earns it some performance credit. Nowadays we have the luxury of using more storage, as it is widely available and not too costly.
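The effect of preallocation on file size can be worked out with a little arithmetic. The sketch below assumes the doubling scheme I understand MongoDB's data files of that era to follow: the first file is 64 MB, each further file is twice the previous one, capped at 2 GB. Those defaults, and the function name `preallocated_total_mb`, are assumptions for illustration and are not taken from my test setup. I also assume the "+ 1" in the "# Data Files" column is the separate namespace file, which is not counted here.

```python
def preallocated_total_mb(data_files, first_mb=64, cap_mb=2048):
    """Total size in MB of preallocated files that double from first_mb up to cap_mb."""
    total, size = 0, first_mb
    for _ in range(data_files):
        total += size
        size = min(size * 2, cap_mb)
    return total


for n in (1, 2, 3):
    print(n, preallocated_total_mb(n))
```

Under these assumptions one data file gives 64 MB and three give 448 MB, which lines up with the sizes recorded for the "1 + 1" and "3 + 1" rows in the MongoDB table. The other rows match less exactly, so treat this as an explanation of the trend rather than an exact model.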

Compatibility (32 bit system and 64 bit systems):

In this respect, CouchDB scores higher than MongoDB on 32-bit systems, because on a 32-bit system MongoDB cannot store more than approximately 2 GB of data. You can see more details here.

Compatibility with reporting tools and technologies:

MongoDB has adapters / drivers for various frameworks beyond the data access drivers. It has adapters for BIRT reporting and for Pentaho (ETL and reporting). Since drivers are provided for many different languages, we can also write our own adapters to fit our needs.

Platform (PaaS) availability:

There is a large set of PaaS providers for MongoDB, as listed here. They relieve developers, administrators and the business of the burden of maintaining the databases on their own, so scaling the database becomes a piece of cake for users.

Support:

10gen is behind MongoDB, and they have very good premium support options. For public support, there is a large number of community groups available to help us.

Overall:

If we re-assess all the needs of the healthcare platform, it is pretty clear that we should use a tool like MongoDB as the database that stores the clinical care, medication, lab and history data and that manages the dynamic content and forms as well. This improved our development turnaround time a lot and helped make our platform as configurable and scalable as possible.

Important Notice: This study and the information given are purely based on my own analysis and assessment. It may change based on the respective business model and needs. Kindly do a detailed analysis before finalizing any tool for your system.


2 thoughts on “MongoDB and its application (a case study of MongoDB in Healthcare)”

  1. kousikraj Post author

    Now, I am seeing my analysis working fine in real world. At this moment, we are successful in handling thousands of patients with lot of records per patients in real world without any compromise in performance and highly customized data. But still, I wanted to see MongoDB working with millions of records. Will continue blogging about the real world cases as and when I encounter any interesting stats related to MongoDB.

  2. Dharshan Rangegowda

    Kousik,

    Thanks for your insightful analysis. Are you deploying MongoDB in your own enterprise or are you using a cloud platform like EC2? I am the co-founder of a Seattle based startup (www.scalegrid.net). We help enterprises deploy MongoDB as a service on their existing private cloud environments like VMWare, CloudStack and OpenStack. We are adding support for public clouds like EC2, Rackspace in further releases. Our product helps easily deploy and manage large replica sets and shards. Other functions like backup, restore, clone etc are also provided. More details are here – http://www.scalegrid.net/mongodirector.htm. If you are interested I would be happy to chat further and understand your use case.
