The name Mongo is derived from the word humongous, a nod to the database’s goal of storing mammoth amounts of data, or what is commonly called ‘big data’.
MongoDB is a document-oriented database, unlike traditional relational databases such as SQL Server, DB2, Oracle, and PostgreSQL. It represents data in a JSON-like syntax made up of name-value pairs.
Because MongoDB is a document database, one of its rules is that every document in a collection must be unique, so each document has an ID, stored in its _id field. It’s also important to remember that a single document can be at most 16MB in size.
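To make the document model concrete, here is a minimal Python sketch. It is not MongoDB's implementation and does not use a MongoDB driver. It models a document as a dict of name-value pairs and uses a hypothetical in-memory `ToyCollection` class that refuses a second document with the same `_id`. The class, its `insert` and `get` methods, the locally defined `DuplicateKeyError`, and the sample data are all assumptions made for illustration.

```python
import uuid


class DuplicateKeyError(Exception):
    """Raised when a document reuses an existing _id (defined locally for this sketch)."""


class ToyCollection:
    """Hypothetical in-memory stand-in for a collection; not a MongoDB API."""

    def __init__(self):
        self._docs = {}  # maps _id -> document

    def insert(self, doc):
        # Copy so the caller's dict is not mutated.
        doc = dict(doc)
        # If no _id is supplied, generate one. A random hex string stands in
        # for the identifier a real database would generate.
        doc.setdefault("_id", uuid.uuid4().hex)
        if doc["_id"] in self._docs:
            raise DuplicateKeyError(f"duplicate _id: {doc['_id']!r}")
        self._docs[doc["_id"]] = doc
        return doc["_id"]

    def get(self, _id):
        return self._docs.get(_id)


users = ToyCollection()
users.insert({"_id": 1, "name": "Ada", "languages": ["Python", "C"]})
generated_id = users.insert({"name": "Grace"})  # _id filled in automatically

try:
    users.insert({"_id": 1, "name": "Someone else"})
    duplicate_rejected = False
except DuplicateKeyError:
    duplicate_rejected = True
```

Note that the two documents have different fields: nothing in the sketch forces them to share a schema, which is the flexibility the document model is known for.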
MongoDB is elastic: it scales horizontally across servers, whereas relational database systems typically scale vertically.
One of the key concepts of MongoDB is replication. Although it is not required, there should always be a copy of the primary database, and it is recommended that there be at least two copies. If the primary fails, one of the secondary servers can take over, so the data remains available.
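The idea of a primary with secondary copies can be sketched in a few lines of Python. This is a deliberately simplified toy, not MongoDB's replication protocol: the `ToyReplicaSet` class and the node names are hypothetical, writes are copied to every member immediately, and the first surviving member is promoted on failure. A real replica set replicates asynchronously and chooses a new primary through an election among the members.

```python
import copy


class ToyReplicaSet:
    """Hypothetical model of one primary plus secondaries; not MongoDB's protocol."""

    def __init__(self, names):
        # Every member holds its own copy of the data.
        self.members = {name: {} for name in names}
        self.primary = names[0]

    def write(self, key, value):
        # Simplification: the write is applied to the primary and copied
        # to every secondary immediately and synchronously.
        for data in self.members.values():
            data[key] = copy.deepcopy(value)

    def read(self, key):
        # Reads are served from whichever member is currently primary.
        return self.members[self.primary].get(key)

    def fail_primary(self):
        # Drop the failed primary and promote the first surviving member.
        del self.members[self.primary]
        if not self.members:
            raise RuntimeError("no members left to promote")
        self.primary = next(iter(self.members))
        return self.primary


rs = ToyReplicaSet(["node-a", "node-b", "node-c"])
rs.write("order:1", {"total": 42})
new_primary = rs.fail_primary()       # node-a is lost
surviving_value = rs.read("order:1")  # still readable from the new primary
```

Because each member kept its own copy, losing the primary does not lose the data, which is the point of keeping secondaries around.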
In the world we live in today, businesses are constantly evolving, and billions of people and ‘things’ are always communicating, changing the way organizations and customers interact with each other and with the environment around them. Data comes from different geographical locations and across multiple channels. Managing this explosion of high-velocity, dynamic data while maintaining customer privacy is a challenge with legacy systems, to say the least. Data is therefore of paramount importance for any organization, large or small.
The solution to support rapidly growing applications is to scale horizontally by adding servers instead of concentrating more capacity in a single server. Unlike relational databases, NoSQL databases usually support auto-sharding, meaning that they natively and automatically spread data across an arbitrary number of servers, without requiring the application to even be aware of the composition of the server pool. Data and query load are automatically balanced across servers, and when a server goes down, it can be quickly and transparently replaced with no application disruption. This is especially well suited to web applications.
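As a rough illustration of how data can be spread across servers without the caller knowing which server holds what, here is a small Python sketch of hash-based routing. It is an assumption-laden toy, not how MongoDB balances data: the `shard_for` function, the `ToyShardedStore` class, and the shard names are hypothetical, and each "server" is just a dict in memory.

```python
import hashlib


def shard_for(key, shards):
    """Pick a shard by hashing the key (illustrative only)."""
    digest = hashlib.md5(str(key).encode("utf-8")).digest()
    # Use the first 8 bytes of the digest as an integer, then wrap
    # it onto the list of available shards.
    return shards[int.from_bytes(digest[:8], "big") % len(shards)]


class ToyShardedStore:
    """Hypothetical store that hides shard placement from the caller."""

    def __init__(self, shard_names):
        self.shard_names = list(shard_names)
        self.shards = {name: {} for name in self.shard_names}

    def put(self, key, value):
        self.shards[shard_for(key, self.shard_names)][key] = value

    def get(self, key):
        # The same hash sends the lookup to the shard that stored the key.
        return self.shards[shard_for(key, self.shard_names)].get(key)


store = ToyShardedStore(["shard-0", "shard-1", "shard-2"])
for user_id in range(1000):
    store.put(user_id, {"user": user_id})

# How many documents landed on each shard.
counts = {name: len(docs) for name, docs in store.shards.items()}
```

The caller only ever uses `put` and `get`, never a shard name. One caveat of this simple modulo scheme is that changing the number of shards moves most keys, which is why production systems use more elaborate placement strategies.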
It is true that NoSQL databases have traditionally lacked the transaction support and semantics that give relational databases their guarantees about data consistency and persistence. This is a deliberate tradeoff based on MongoDB’s goal of being simple, fast, and scalable. Once you leave those heavyweight features at the door, it becomes much easier to scale horizontally.
Some of the largest technology companies, such as Google, Facebook, and LinkedIn, use NoSQL databases, which fit very well with their business models. MongoDB itself has become a global company, with a community of over 50,000 members and 100 user groups around the world. As it stands right now, MongoDB has 29 offices in 14 different countries and over 3,000 customers.
What’s in MongoDB
- Key-value stores (JSON-like syntax, i.e. key/value pairs)
- Column Family Stores (hierarchical schemas)
- Document Databases
- Multi-nested data
- High-velocity data arriving at a very high rate
- Graph Databases
- Unstructured data, handled without enforcing transactional consistency
- Document-oriented databases are schema-agnostic, which allows for agility and highly iterative development
- Durability when deployed across at least three servers
- Profiling Queries
Types of NoSQL Databases
- MongoDB (document store) & Redis (key-value store)
- Cassandra, HBase & Hypertable (wide-column stores, optimized for queries over large datasets)
- Neo4j, InfiniteGraph, OrientDB & FlockDB (graph databases, mainly used for storing data about networks, social connections et al.)
Is MongoDB the Right Choice?
While it’s all fine and dandy to jump to this shiny new NoSQL database, you ought to step back and ask yourself the following questions:
- What are some of the characteristics of my data?
- What are the business needs of my data?
- Am I required to query across multiple tables and possibly across multiple databases?
- Do I need my data to be transactional?
Compared to relational databases, NoSQL databases are generally easier to scale out and can provide better performance for the workloads they are designed for. Following are some of their benefits:
- Speed and large volumes of rapidly changing structured, semi-structured and unstructured data
- Open source, hence mostly free
- Object-oriented programming that is easy to use and flexible
- Simplicity (virtually no complex rules such as tables and relationships, and less object-relational impedance mismatch)
- Geographically distributed scale-out architecture instead of expensive, monolithic architecture
MongoDB uses an open data format called BSON, a binary format similar to JSON. The BSON data format was developed by the MongoDB team. This format is special in that it allows documents to be searched rapidly and adds types that JSON lacks, including one for handling binary data. MongoDB stores data as BSON documents, each of which is self-contained.
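To give a feel for what a binary, length-prefixed format looks like, the following Python sketch hand-encodes a tiny subset of BSON using only the standard library. It is not a replacement for a real BSON library: the `encode_element` and `encode_document` helpers are hypothetical names, and only string and 32-bit integer values are handled. The layout follows the BSON specification as I understand it: a little-endian int32 total size, a sequence of typed elements, and a terminating zero byte.

```python
import struct


def encode_element(name, value):
    """Encode one name-value pair; only str and 32-bit int are handled here."""
    key = name.encode("utf-8") + b"\x00"  # element name as a C string
    if isinstance(value, bool):
        # bool is a subclass of int in Python, so reject it explicitly.
        raise TypeError("bool is not handled in this sketch")
    if isinstance(value, str):
        data = value.encode("utf-8") + b"\x00"
        # Type 0x02: int32 byte length (including the trailing NUL), then the bytes.
        return b"\x02" + key + struct.pack("<i", len(data)) + data
    if isinstance(value, int):
        # Type 0x10: little-endian signed 32-bit integer.
        return b"\x10" + key + struct.pack("<i", value)
    raise TypeError(f"unsupported type: {type(value).__name__}")


def encode_document(doc):
    """Encode a flat dict as a BSON document (subset only)."""
    body = b"".join(encode_element(k, v) for k, v in doc.items())
    # Total size counts the 4-byte length prefix and the trailing NUL terminator.
    return struct.pack("<i", len(body) + 5) + body + b"\x00"


encoded = encode_document({"hello": "world"})
# 22 bytes: b"\x16\x00\x00\x00\x02hello\x00\x06\x00\x00\x00world\x00\x00"
```

Because every document and every string carries its length up front, a reader can skip over fields it does not care about without parsing them, which is part of why the format is quick to traverse.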
Relational databases are designed to scale vertically: a single server has to host the entire database to ensure acceptable performance for cross-table joins and transactions. This can become an expensive proposition, and it places limits on scale. The alternative for rapidly growing applications is to scale horizontally, by adding servers or cloud instances instead of concentrating more capacity in a single server.