MongoDB Tutorial 1 - Introduction

Sun, Jun 26, 2016 Last Modified: Jul 8, 2016
Category: tutorial Tags: [mongodb] [NoSQL] [database]

To run mongo commands from a JavaScript source file,

cat source.js | mongo # or
mongo < source.js # or just
mongo source.js

To import/export data,

$ mongoimport -d <database> -c <collection> --file <file>
$ mongoimport -d <database> -c <collection> < file.json
$ mongoexport -d <database> -c <collection> --out file.json
$ mongorestore -d <database> -c <collection> file.bson
# by default writes BSON file to dump/ in current directory
$ mongodump -d <database> -c <collection> [--out <path>]

What is MongoDB?

MongoDB is a document-based NoSQL database that stores records as JSON (JavaScript Object Notation) documents. An important advantage is that common data access patterns can be served by a single query, without joins. MongoDB was designed without relational-style joins (the $lookup aggregation stage added in version 3.2 is a limited exception), which makes it easier to shard and scale out. Joins and multi-table transactions are hard to run in parallel across machines, so relational databases tend to scale up instead, onto a single, more expensive server.

Application Architecture

The mongo shell and the drivers connect to the mongod server process over TCP. The course builds a blog website with MongoDB as the datastore. The Java course uses SparkJava and Freemarker. The Python course uses Bottle and its simple template engine. The drivers are the MongoDB Java driver and pymongo.

JSON and BSON Documents

For more details, see the JSON standard. The general format is a set of key-value pairs, written { key : value }. Each key must be a string, followed by a colon (:) and the corresponding value. Fields are separated by commas (,). Value types are string, number, boolean, null, array, and object; arrays and objects can be nested recursively.

The BSON spec describes the format in detail. MongoDB actually stores data in BSON, a binary JSON format. MongoDB drivers send and receive data as BSON and map it to language-appropriate data types. BSON is designed to be lightweight, traversable (for writing, reading, and indexing), and efficient (quick to encode and decode).

BSON supports more data types than JSON:

  • number (int32, int64, double)
  • date
  • binary, which can hold images, for example

How JSON documents are encoded as BSON:

//JSON
{ "hello" : "world" }
//BSON
"\x16\x00\x00\x00\x02hello\x00\x06\x00\x00\x00world\x00\x00"
//length of document, type of value, field length, null terminators, etc.
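
The byte layout above can be reproduced by hand. The following is a minimal Python 3 sketch, not the encoder any driver actually uses: it handles string values only, and the function names (encode_cstring, encode_string_element, encode_document) are made up for this illustration.

```python
import struct


def encode_cstring(text):
    """Encode a key as a BSON cstring: UTF-8 bytes plus a null terminator."""
    return text.encode("utf-8") + b"\x00"


def encode_string_element(key, value):
    """Encode one string field: type byte 0x02, key, int32 length, value."""
    data = value.encode("utf-8") + b"\x00"
    # The length prefix counts the value bytes including the trailing null.
    return b"\x02" + encode_cstring(key) + struct.pack("<i", len(data)) + data


def encode_document(fields):
    """Encode a dict of string fields as a BSON document (strings only)."""
    body = b"".join(encode_string_element(k, v) for k, v in fields.items())
    # Total length covers the 4-byte length itself, the body, and a final null.
    total = 4 + len(body) + 1
    return struct.pack("<i", total) + body + b"\x00"


encoded = encode_document({"hello": "world"})
print(len(encoded))  # 22, which is 0x16, the first byte of the document
print(encoded)
```

The leading \x16 is the document length in little-endian order, \x02 marks a string value, and \x06 is the length of "world" plus its null terminator.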

Installing MongoDB

Download MongoDB from the MongoDB download page. A tip for the Linux versions: after extracting the tarball, you can simply copy the executables from its bin folder into your virtualenv’s bin folder (path/to/venv/bin), assuming you are using pymongo in a virtualenv with Python 2. Alternatively, copy them to /usr/local/bin, as suggested, for system-wide use.

$ tar xvf mongodb-linux-x86_64-ubuntu1404-3.2.6.tgz
$ cp mongodb-linux-x86_64-ubuntu1404-3.2.6/bin/* path/to/venv/bin/
$ cd path/to/venv/
$ source bin/activate
$ sudo mkdir -p /data/db
$ sudo chmod 777 /data
$ sudo chmod 777 /data/db
(venv) $ mongod
# in another terminal window
(venv) $ mongo
MongoDB shell version: 3.2.6
connecting to: test
> db.names.insert({'name':'Andrew Erlicson'})
WriteResult({ "nInserted" : 1 })
> db.names.find()
{ "_id" : ObjectId("5778642550b9dd3f38d82b4e"), "name" : "Andrew Erlicson" }

On Windows, download the MSI installer and install as directed. Add the MongoDB bin folder (C:\Program Files\MongoDB\Server\3.2\bin) to PATH.

CRUD Operations

In mongo shell,

> help //list of mongo commands
> show dbs
local 0.070GB
test 0.001GB
> show collections
names
> // video.movies refers to video database movies collection
> db.names.find() // global variable db refers to current database
> use video // MongoDB creates the database lazily, when data is first inserted
> db.movies.insertOne({"title": "Jaws", "year": 1975, "imdb": "tt0073195"})
{ "acknowledged":true, "insertedId":ObjectId("5778b5782430a299a54686b5")}
// mongodb will add _id field if not specified
> db.movies.find()
{ "_id" : ObjectId("5778b5782430a299a54686b5"),
  "title" : "Jaws", "year" : 1975, "imdb" : "tt0073195" }
> db.movies.find({}).pretty()
{
  "_id" : ObjectId("5778b5782430a299a54686b5"),
  "title" : "Jaws",
  "year" : 1975,
  "imdb" : "tt0073195"
}
> var c = db.movies.find() // returns a cursor
> c.hasNext()
true
> c.next()
{
  "_id" : ObjectId("5778b5782430a299a54686b5"),
  "title" : "Jaws",
  "year" : 1975,
  "imdb" : "tt0073195"
}

Example Project Blog Site

Relational model for the blog: we would need six tables, fully normalized.

posts      comments    tags    post-tags  post-comments  authors
post_id    comment_id  tag_id  post_id    post_id        author_id
author_id  name        name    tag_id     comment_id     username
title      comment     -       -          -              password
post       email       -       -          -              -
date       -           -       -          -              -

To show a blog post with its comments and tags, we need to join all six tables.
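
To make the cost of the relational layout concrete, here is a rough sketch using Python 3's built-in sqlite3 module with an in-memory database. The column names follow the table above; the hyphenated table names are written with underscores (post_tags, post_comments) so they are valid SQL identifiers, and all sample rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE authors (author_id INTEGER PRIMARY KEY, username TEXT, password TEXT);
CREATE TABLE posts (post_id INTEGER PRIMARY KEY, author_id INTEGER,
                    title TEXT, post TEXT, date TEXT);
CREATE TABLE comments (comment_id INTEGER PRIMARY KEY, name TEXT,
                       comment TEXT, email TEXT);
CREATE TABLE tags (tag_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE post_tags (post_id INTEGER, tag_id INTEGER);
CREATE TABLE post_comments (post_id INTEGER, comment_id INTEGER);

-- Invented sample data: one post, two comments, two tags.
INSERT INTO authors VALUES (1, 'erlicson', '...');
INSERT INTO posts VALUES (1, 1, 'free online tutorial', '......', '2016-06-26');
INSERT INTO comments VALUES (1, 'joe', 'nice post', 'joe@example.com');
INSERT INTO comments VALUES (2, 'ann', 'thanks', NULL);
INSERT INTO tags VALUES (1, 'cycling');
INSERT INTO tags VALUES (2, 'education');
INSERT INTO post_tags VALUES (1, 1);
INSERT INTO post_tags VALUES (1, 2);
INSERT INTO post_comments VALUES (1, 1);
INSERT INTO post_comments VALUES (1, 2);
""")

# Displaying one post touches all six tables.
rows = conn.execute("""
SELECT p.title, a.username, c.name, t.name
FROM posts p
JOIN authors a ON a.author_id = p.author_id
JOIN post_comments pc ON pc.post_id = p.post_id
JOIN comments c ON c.comment_id = pc.comment_id
JOIN post_tags pt ON pt.post_id = p.post_id
JOIN tags t ON t.tag_id = pt.tag_id
WHERE p.post_id = 1
ORDER BY c.comment_id, t.tag_id
""").fetchall()

for row in rows:
    print(row)
```

Note that the single query returns one row per comment and tag combination (four rows here), so the application still has to regroup the result before rendering the post.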

As for the document model, for a post JSON document in posts collection:

{
  title : "free online tutorial",
  body : "......",
  author : "erlicson",
  date : ISODate(......),
  comments : [ { name: "joe biden", email : "joe@mongodb.org", comment:"..." },
              {.....}, {.....}
             ],
  tags: ["cycling", "education", "startups"]
}

We will also need an authors collection, with the username as the primary key (_id):

{
  _id : "erlicson",
  password : "..."
}

The data is hierarchical. If an email is missing from a comment, the field does not have to be there; you can simply leave it out. MongoDB is schemaless and flexible in that respect. We need only one collection, the posts collection, to display a blog post.
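
As a small illustration of what this means for application code, the sketch below treats a post document as a plain Python 3 dict, roughly the shape a driver such as pymongo would hand back. The render_post helper and the second commenter ("ann") are hypothetical, added only to show a comment with no email field.

```python
post = {
    "title": "free online tutorial",
    "body": "......",
    "author": "erlicson",
    "comments": [
        {"name": "joe", "email": "joe@mongodb.org", "comment": "..."},
        {"name": "ann", "comment": "..."},  # hypothetical: no email field
    ],
    "tags": ["cycling", "education", "startups"],
}


def render_post(doc):
    """Return the lines needed to display a post, using only this document."""
    lines = ["{} by {}".format(doc["title"], doc["author"])]
    lines.append("tags: " + ", ".join(doc["tags"]))
    for comment in doc["comments"]:
        # .get() tolerates comments that omit the optional email field.
        email = comment.get("email", "no email")
        lines.append("{} ({}): {}".format(comment["name"], email, comment["comment"]))
    return lines


rendered = render_post(post)
for line in rendered:
    print(line)
```

Everything the page needs arrives in one document, and a missing field is handled in the application rather than by a schema.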

Introduction to Schema Design

“To embed or not embed, that is the question.”

With a relational database, you consider normal forms (3rd, 4th, Boyce-Codd) and dependencies. You might start with 3rd normal form and then combine a few things.

With MongoDB, how do you know when to embed, for example, the tags and comments in the posts collection? The answer is that they are typically accessed together with the post. It is very rare to access a tag independently of a post, and a comment never applies to more than one post.

An operation like renaming the tag “cycling” to “biking” across the whole site would be easier in the relational world, but it is an unusual change, not something you do all the time.
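
The trade-off can be sketched in plain Python 3. With embedded tags, a rename has to visit every post that carries the tag, whereas the relational model would change a single row in the tags table. The rename_tag function and the sample posts are hypothetical; on a real server this would be a multi-document update rather than a loop in application code.

```python
def rename_tag(posts, old, new):
    """Replace `old` with `new` in every post's tags; return posts changed."""
    changed = 0
    for post in posts:
        if old in post["tags"]:
            post["tags"] = [new if tag == old else tag for tag in post["tags"]]
            changed += 1
    return changed


# Invented sample documents with embedded tag arrays.
posts = [
    {"title": "post one", "tags": ["cycling", "education"]},
    {"title": "post two", "tags": ["startups"]},
    {"title": "post three", "tags": ["cycling"]},
]

changed = rename_tag(posts, "cycling", "biking")
print(changed)  # number of posts that had to be rewritten
```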

Another practical concern is document size. In MongoDB, a document cannot be larger than 16 MB.

MongoDB Basics Cheatsheet

command                              effect
db.runCommand({dropDatabase: 1})     drop the current database, deleting associated files
db.dropDatabase()                    same as above
db.runCommand({drop: "collname"})    delete the collname collection and its indexes from the current database
db.collname.drop()                   same as above
show dbs                             show all databases
show collections                     show all collections in the current database
use dbname                           switch to the dbname database

Resources

  1. MongoDB University Classes
  2. MongoDB Docs


