Today I am writing about MongoDB and how it can be used in healthcare, along with a comparison of MongoDB and CouchDB.
These days, any product ends up demanding more than we initially planned for in terms of scalability, flexibility and performance. Given these growing demands, we should pick tools and technologies carefully before we start development.
I needed to build a massive and flexible platform for the healthcare industry, one that could serve countries like India, China, the US and others, so I analyzed various tools and technologies to find the right fit for my demands.
Some of the objectives of my application:
- Scalable, with the ability for quick turnaround
- Constraint-less: my application should be highly customizable / configurable
- High performance and availability
- Compatible with desktop, mobile and other devices
- Cloud friendly and more…
I am going to talk only about MongoDB today. But why MongoDB? Why not other document stores / NoSQL databases? There are a few reasons to consider MongoDB.
Directly uses BSON / JSON Format
It is easy for any developer to use MongoDB because it deals with JSON objects. This eliminates data transposition / manipulation in the controller layer. For example, if we use an RDBMS or some other database in the backend with an RIA in the front end (most RIAs consume and produce JSON), we need to convert the JSON to a result set or to POJOs before persisting it in the DB. With a document store, we can store JSON data from the UI directly, provided both ends are properly designed.
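To illustrate the point about skipping the conversion layer, here is a minimal sketch in Python. The payload and its field names are made up for illustration and are not taken from my application. The parsed JSON is already shaped like a document, so there is no result set or POJO mapping step in between.

```python
import json

# Hypothetical payload as a rich client might post it; the field names
# are illustrative only.
payload = """
{
  "name": "Jane Doe",
  "demographic": {"age": 42, "city": "Chennai"},
  "medication": [{"drug": "Metformin", "dose_mg": 500}]
}
"""


def to_document(json_text):
    """Parse posted JSON into a dict that is already document-shaped."""
    doc = json.loads(json_text)
    if not isinstance(doc, dict):
        raise ValueError("expected a JSON object at the top level")
    return doc


doc = to_document(payload)

# With a driver such as PyMongo, this dict could be handed over as is,
# for example: collection.insert_one(doc)
# It is left as a comment so the sketch runs without a database.
print(doc["medication"][0]["drug"])
```

The nested medication list stays nested all the way through, which is exactly the part that needs mapping code in a relational setup.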
Rich Driver Support
MongoDB drivers are available for most middleware and server-side scripting languages, which you can refer to here. The drivers provide good implementations of the MongoDB operations.
Performance
MongoDB scores well when compared against its counterparts. Most of its counterparts offer only a REST interface, whereas MongoDB offers direct protocol access through its native drivers as well as REST access. Native drivers have some advantages over accessing the DB purely through REST. MongoDB also uses a preallocation strategy to store data, which may improve performance considerably when there is a lot of traffic to and from the DB.
I am glad to share some of the results of the tests I ran with my healthcare platform in mind.
Test Case Details: MongoDB vs CouchDB (check another blog here)
The metrics are as follows.
Assume that one patient document has four different kinds of entities, such as demographic, medication, lab and more. As the test results show, I changed the number of entities per parent entity between runs.
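To make the shape of the test data concrete, here is a rough sketch of how such a patient document could be generated. The function name, field names and dummy values are assumptions for illustration, not the code used in the tests. Only the entity counts (100/100/100/100 and 100/1000/500/100) come from the result tables, and the fourth entity is called history here.

```python
def make_patient(name, n_demographic, n_medication, n_lab, n_history):
    """Build one patient document with the requested number of child entries."""

    def entries(kind, count):
        # Each child entry is a small dummy record; real entries would
        # carry clinical fields instead.
        return [{"type": kind, "seq": i, "value": f"{kind}-{i}"} for i in range(count)]

    return {
        "name": name,
        "demographic": entries("demographic", n_demographic),
        "medication": entries("medication", n_medication),
        "lab": entries("lab", n_lab),
        "history": entries("history", n_history),
    }


# The two document shapes that appear in the result tables.
small = make_patient("patient-1", 100, 100, 100, 100)
large = make_patient("patient-2", 100, 1000, 500, 100)

print(len(small["lab"]), len(large["medication"]))
```

Changing the four counts changes the size of every document, which is the knob that was varied between test runs.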
Download / View: MongoDB_VS_CouchDB_TEST_RESULTS
Test result of CouchDB (Couch4j library):
No of Records | Demographic | Medication | Lab | Other columns | Size in MB | Time taken to save | Time taken to get 1 attribute (parent : name) | Time taken to create
1 as 1 | 100 | 100 | 100 | 100 | 1.8 | 665 | 644 | 450
100 as 1 | 100 | 100 | 100 | 100 | 180.1 | 58501 | 43595 | 18394
950 as 50 | 100 | 100 | 100 | 100 | 1700 | 561915 | 150221 |
10000 as 1 | 100 | 100 | 100 | 100 | | | |
1 as 1 | 100 | 1000 | 500 | 100 | 12 | 3885 | 1266 |
25 as 1 | 100 | 1000 | 500 | 100 | 297.9 | 96821 | 18730 |
30 as 1 | 100 | 1000 | 500 | 100 | | | |
40 as 1 | 100 | 1000 | 500 | 100 | | | |
475 as 1 | 100 | 1000 | 500 | 100 | 5500 | 1541609 | 1463199 | 333686
4500 as 1 | 100 | 1000 | 500 | 100 | | | |
Test result of MongoDB:
No of Records | Demographic | Medication | Lab | History | Max No of records that can be stored in 32 Bit (2.5 GB limitation) | Size | Time taken to store | Time taken to get 1 attribute (parent : name) | Time taken to create 1 object | # Data Files
1 as 1 | 100 | 100 | 100 | 100 | 1000 | 64 | 389 | 249 | 652 | 1 + 1
100 as 1 | 100 | 100 | 100 | 100 | | 190 | 5460 | 17565 | 11276 | 2 + 1
950 as 50 | 100 | 100 | 100 | 100 | | 1930 | 68161 | 138568 | | 5 + 1
10000 as 1 | 100 | 100 | 100 | 100 | | 11900 | 757029 | 1342676 | | 7 + 1
1 as 1 | 100 | 1000 | 500 | 100 | 100 | 208 | 693 | 1381 | | 2 + 1
25 as 1 | 100 | 1000 | 500 | 100 | | 448 | 9814 | 20550 | | 3 + 1
30 as 1 | 100 | 1000 | 500 | 100 | | 448 | 17196 | 24230 | | 3 + 1
40 as 1 | 100 | 1000 | 500 | 100 | | 448 | 15259 | 33704 | | 3 + 1
475 as 1 | 100 | 1000 | 500 | 100 | | 5900 | 192082 | 383862 | | 7 + 1
4500 as 1 | 100 | 1000 | 500 | 100 | | 31900 | 1766936 | 3516905 | | 20 + 1
Note: This case study is based purely on my own scenarios and test cases. Your results may vary depending on your scenarios. I am sharing the results I got at that time for my cases only, so you need not treat this as a complete reference.
Performance Score based on test results above:
Looking at the scores above, it is evident that MongoDB wins on performance.
Memory Score based on test results above:
You may notice that the data files are much larger for MongoDB than for CouchDB. This is because of the preallocation mechanism it uses to store data. MongoDB allocates space ahead of time and fills it with dummy data, so that it avoids the cost of allocating space on each and every write, which again earns it some performance credit. Nowadays we have the luxury of using more space, as it is widely available and not too costly.
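The effect of preallocation on size can be sketched with a little arithmetic. This assumes the memory-mapped storage engine MongoDB used at the time, where the first data file is 64 MB and each new file doubles in size up to a 2 GB cap. It also assumes that an entry like "3 + 1" in the # Data Files column means three data files plus one namespace file. Neither assumption is stated in the test notes, so treat this as an illustration only.

```python
def preallocated_mb(n_files, first_mb=64, cap_mb=2048):
    """Total size in MB of n data files when each file doubles up to a cap."""
    total = 0
    size = first_mb
    for _ in range(n_files):
        total += size
        size = min(size * 2, cap_mb)
    return total


for n in (1, 2, 3):
    print(n, preallocated_mb(n))
```

Under these assumptions one data file comes to 64 and three data files come to 448, which lines up with the sizes reported for the "1 + 1" and "3 + 1" runs in the MongoDB results. It would also explain why storing a single small record still occupies tens of megabytes on disk.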
Compatibility (32 bit system and 64 bit systems):
Here CouchDB scores higher than MongoDB on 32-bit systems, because on a 32-bit system MongoDB cannot store more than approximately 2 GB of data. You can see more details here.
Compatibility with reporting tools and technologies:
Besides its data access drivers, MongoDB has highly compatible adapters / drivers for different frameworks. It has adapters for BIRT reporting and for Pentaho (ETL and reporting). Since there are a lot of adapters provided for different languages, we can also write our own adapters to fit our needs.
Platform (PaaS) availability:
There is a large set of PaaS providers for MongoDB, as listed here. They relieve developers, administrators and the business of the burden of maintaining the databases on their own, so scaling databases becomes a piece of cake for users.
Support:
10gen is behind MongoDB, and they offer very good premium support options. In terms of groups and public support, there is a large list of communities available to help us.
Overall:
If we re-assess all the needs of the healthcare platform, it is pretty clear that we should use a tool like MongoDB as the database that stores clinical care, medication, lab and history data and that manages the dynamic content and forms as well. It has improved our development turnaround time a lot and helped make our platform as configurable and scalable as possible.
Important Notice: This study and the information given here are based purely on my own analysis and assessment. The conclusions may change depending on the business model and needs. Kindly do a detailed analysis before finalizing any tool for your system.
Now I am seeing my analysis hold up in the real world. At this moment, we are successfully handling thousands of patients, with lots of records per patient and highly customized data, without any compromise in performance. Still, I want to see MongoDB working with millions of records. I will continue blogging about real-world cases as and when I encounter interesting stats related to MongoDB.
Kousik,
Thanks for your insightful analysis. Are you deploying MongoDB in your own enterprise or are you using a cloud platform like EC2? I am the co-founder of a Seattle based startup (www.scalegrid.net). We help enterprises deploy MongoDB as a service on their existing private cloud environments like VMWare, CloudStack and OpenStack. We are adding support for public clouds like EC2, Rackspace in further releases. Our product helps easily deploy and manage large replica sets and shards. Other functions like backup, restore, clone etc are also provided. More details are here – http://www.scalegrid.net/mongodirector.htm. If you are interested I would be happy to chat further and understand your use case.