MongoDB - Jordi Torres

in 10 minutes
Mohannad El Dafrawy
Sara Rodriguez
Lino Valdivia Jr
What is MongoDB?
Document database
Data is structured as schema-less JSON documents
One of the most popular NoSQL solutions
Cross-platform and open source
written in C++
supports Windows, Linux, Mac OS X, Solaris
Features (I)
Document-based storage and querying
o Queries themselves are JSON documents
Full Index Support
o Allows indexing on any attribute, just like in a
traditional SQL solution
Replication & High Availability
o Supports mirroring of data for scalability
Features (II)
Auto-Sharding (horizontal scaling)
o Large data sets can be divided and distributed over
multiple shards
Fast In-Place Updates
o Update operations are atomic for contention-free
Integrated Map/Reduce framework
o Can perform map/reduce operations on top of the
First developed by 10gen (later MongoDB,
Inc.) in 2007
Name comes from “humongous”
Became open source in 2009
Latest stable release (2.4.9) released Jan
Basic Ideas
_id: 1234,
author: { name: “Bob Jones”, email: “[email protected]” },
post: “In these troubled times I like to ...“,
date: { $date: “2014-03-12 13:23UTC” },
● Collections of JSON objects
location: [ -121.2322, 48.1223222 ],
rating: 2.2,
comments: [
{ user: “[email protected]”,
● Embed objects within a
single document
upVotes: 22,
downVotes: 14,
text: “Great point! I agree” },
● Flexible schema
{ user: “[email protected]”,
upVotes: 421,
downVotes: 22,
text: “You are a...” }
tags: [ “databases”, “mongo” ]
● References
Query Example
db.posts.find({ “mike” })
db.posts.find({ rating: { $gt: 2 }})
db.posts.find({ tags: “software” })
db.posts.find().sort({date: -1}).limit(10)
// select * from posts where ‘economy’ in tags order by ts DESC
db.posts find( {tags :‘economy’}) .sort({ts :-1 }).limit(10);
Note on internals
documents stored as
BSON (Binary JSON)
{_id: ObjectId(XXXXXXXXX),
memory-mapped files
indexes are B-Trees
hello: “world”}
\x27\x00\x00\x07 _i d\x00 X X X X X X X X\x02
h e l l o\x00\x06\x00
\x00\x00 w o r l d\x00\x00
Cassandra (1.2)
MongoDB (2.2)
Best used:
Best used:
• When you write more than you read (logging).
• If every component of the system must be in
• If you require Availability + Partition Tolerance
• If you need dynamic queries.
• If you prefer to define indexes, not map/reduce functions.
• If you need good performance on a big DB.
• If you require Consistency + Partition Tolerance
For example: Banking, financial industry (though
For example: For most things that you would do with MySQL
not necessarily for financial transactions, but these
or PostgreSQL, but having predefined columns really holds
industries are much bigger than that.) Writes are
you back.
faster than reads, so one natural niche is data
Why (and why not) MongoDB?
If you need dynamic queries
If you need good performance on a big DB
It doesn't support SQL
It doesn't have any built-in revisioning like
If you wanted CouchDB, but your data
changes too much, filling up disks
It lacks transactions, so if you're a bank,
don’t use it
If you prefer to define indexes, not
map/reduce functions
It doesn't have real full text searching
Production Users
•Archiving - Craigslist
•Content Management - MTV Networks
•E-Commerce - Customink
•Real-time Analytics - intuit
•Social Networking - Foursquare
Long-term goals for MongoDB
To add new features as:
Natural language processing
Full text search engine
More real-time search in data
Personal conclusion
Getting up to speed with MongoDB (document oriented and schema free)
Advanced usage (tons of features)
Administration (Easy to admin,replication,sharding)
Advanced usage (Index & aggregation)
BSON and Memory-Mapped
There are times where not all clients can read or write. CP (Consistency and Partition
• (
Wikipedia: MongoDB (
DB-Engines Ranking (
Interview about the future of MongoDB (
MongoDB Inside and Outside by Kyle Banker (
How This Web Site Uses MongoDB (
Cassandra and MongoDB comparison (

similar documents