find and findOne in MongoDB

This post continues our last post, where we learned the basic commands to run on MongoDB. There we saw that the find method searches for documents in a collection. In today's post we will see how to extend the find method and how to use findOne.
Before we start, let's create a collection and insert some records (documents) into it:

db.Employee.insert({
  Name:{FName:'Deepak',LName:'Sharma'},
  TechnicalSkill:['SQL Server','MSBI','mongoDB'],
  Experience:'8yrs',
  JobLocation:'Noida'
})
db.Employee.insert({
  Name:{FName:'Sachin',LName:'Sharma'},
  TechnicalSkill:['SQL Server','Sharepoint'],
  Experience:'4yrs',
  JobLocation:'Banglore'
})
db.Employee.insert({
  Name:{FName:'Abhishek'},
  TechnicalSkill:['Perl','C++','Testing'],
  Experience:'8yrs',
  JobLocation:'Noida'
})
db.Employee.insert({
  Name:{FName:'Ruby'},
  TechnicalSkill:['SQL','Testing'],
  Experience:'8yrs',
  JobLocation:'Noida'
})
db.Employee.insert({
  Name:{FName:'Suresh',LName:'Chaudhary'},
  TechnicalSkill:['SQL Server','MSBI','Informatica'],
  Experience:'8yrs',
  JobLocation:'Gurgaon'
})

1. db.collection.find()
This method selects all the documents that match the condition; if no condition is specified, it returns all documents in the collection.
In the mongo shell, find returns a cursor and the shell prints the first 20 documents by default; type it (the shell's iterate command) to get the next batch of results.
find takes two optional parameters: the search condition and the fields to be returned by the query.
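The 20-at-a-time behaviour can be pictured with a small plain-JavaScript sketch. It does not talk to MongoDB: makeCursor and nextBatch are hypothetical names standing in for the shell's cursor and its it command, and the 45 dummy documents are made up for illustration.

```javascript
// Hypothetical in-memory stand-in for a shell cursor (not the MongoDB API).
function makeCursor(docs, batchSize) {
  var position = 0;
  return {
    // Returns the next batch of documents, like typing "it" in the shell.
    nextBatch: function () {
      var batch = docs.slice(position, position + batchSize);
      position += batch.length;
      return batch;
    },
    // True while there are documents that have not been handed out yet.
    hasMore: function () {
      return position < docs.length;
    }
  };
}

// 45 dummy documents, handed out 20 at a time.
var docs = [];
for (var i = 1; i <= 45; i++) {
  docs.push({ n: i });
}

var cursor = makeCursor(docs, 20);
var firstBatch = cursor.nextBatch();   // documents 1..20
var secondBatch = cursor.nextBatch();  // documents 21..40
var thirdBatch = cursor.nextBatch();   // documents 41..45

console.log(firstBatch.length, secondBatch.length, thirdBatch.length, cursor.hasMore());
// 20 20 5 false
```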
Examples
a. Find all the documents in a collection
db.Employee.find()
This is the simplest form of the find method. Without any parameters, it returns all documents in the Employee collection.
b. Find with selection criteria
db.Employee.find({JobLocation:'Noida'})
In this query we restrict find to the documents where JobLocation is Noida. Note that the statement above returns all fields of each matching document.
c. Find with specified fields
db.Employee.find({},{_id:0,Name:1,JobLocation:1})
This query returns Name and JobLocation for all documents. _id is the default field that is returned by every find unless it is excluded; here 1 means true (include) and 0 means false (exclude). We cannot mix including and excluding fields in one statement; the only exception is _id. If we run
db.Employee.find({},{_id:false,Name:true,JobLocation:false}), it will throw the error "You cannot currently mix including and excluding fields. Contact us if this is an issue."
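The include/exclude rule can be sketched in plain JavaScript. This is only an illustration of the rule described above, not MongoDB's implementation: project is a hypothetical helper, the error text is shortened, and the sample document is a trimmed copy of the first Employee record with a made-up _id.

```javascript
// Hypothetical helper: applies a projection to one plain object.
function project(doc, projection) {
  var fields = Object.keys(projection).filter(function (f) { return f !== "_id"; });
  var including = fields.filter(function (f) { return Boolean(projection[f]); });
  var excluding = fields.filter(function (f) { return !projection[f]; });

  // Apart from _id, a projection must be all-include or all-exclude.
  if (including.length > 0 && excluding.length > 0) {
    throw new Error("cannot mix including and excluding fields");
  }

  // _id is returned unless it is explicitly switched off.
  var keepId = !("_id" in projection) || Boolean(projection._id);

  var result = {};
  Object.keys(doc).forEach(function (f) {
    if (f === "_id") {
      if (keepId) { result._id = doc._id; }
    } else if (including.length > 0) {
      if (including.indexOf(f) !== -1) { result[f] = doc[f]; }
    } else if (excluding.indexOf(f) === -1) {
      result[f] = doc[f];
    }
  });
  return result;
}

var employee = {
  _id: 1, // made-up id for the sketch
  Name: { FName: "Deepak", LName: "Sharma" },
  Experience: "8yrs",
  JobLocation: "Noida"
};

var projected = project(employee, { _id: 0, Name: 1, JobLocation: 1 });
console.log(JSON.stringify(projected));
// {"Name":{"FName":"Deepak","LName":"Sharma"},"JobLocation":"Noida"}

var mixedError = null;
try {
  project(employee, { _id: false, Name: true, JobLocation: false });
} catch (e) {
  mixedError = e.message;
}
console.log(mixedError); // cannot mix including and excluding fields
```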
d. Find with forEach
db.Employee.find().forEach(printjson)
This query returns all the documents with all their fields, printed in an indented, readable format.

e. Find with Limit
db.Employee.find().limit(2)
This query returns only 2 documents, with all their fields.
f. More options with the find method
db.Employee.find()[0]
This query returns the first document with all its fields. With no sort specified this is the first document in natural order, which here is the document we inserted first.
db.Employee.find()[0]._id
This query returns the ObjectId of that first document.
db.Employee.find()[0]._id.getTimestamp()
This returns the time at which the ObjectId was generated.
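getTimestamp works because an ObjectId carries its creation time: the first 4 of its 12 bytes hold the number of seconds since the Unix epoch. Below is a minimal plain-JavaScript sketch of that decoding. objectIdToDate is a hypothetical helper, and the hex string is a sample id chosen for illustration, not one taken from the Employee collection.

```javascript
// Hypothetical helper: reads the creation time out of an ObjectId hex string.
function objectIdToDate(hex) {
  if (!/^[0-9a-fA-F]{24}$/.test(hex)) {
    throw new Error("expected a 24-character hex string");
  }
  // The first 8 hex characters (4 bytes) are seconds since the Unix epoch.
  var seconds = parseInt(hex.substring(0, 8), 16);
  return new Date(seconds * 1000);
}

var created = objectIdToDate("507f1f77bcf86cd799439011");
console.log(created.toISOString()); // 2012-10-17T21:13:27.000Z
```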

2. db.collection.findOne()
This method returns a single document that satisfies the optional search criteria. If multiple documents match, it returns the first one in natural order, which usually reflects insertion order. Like find, findOne takes two optional parameters: the search condition and the fields to be returned by the query.
db.Employee.findOne()
returns only the first document. In this case it returns all the fields of the document where the name is Deepak Sharma, because that is the first document we inserted.

The difference between find and findOne shows up when we work with embedded documents, for example when we want the Name field from the documents above. find returns a cursor, while findOne returns the document itself.
db.Employee.find().Name does not return anything, but db.Employee.findOne().Name returns the FName and LName of the first record.
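A rough plain-JavaScript picture of why the two calls behave differently. findAll and findFirst are hypothetical stand-ins for find and findOne over an in-memory array; the real shell cursor has many more methods, but the point is the same: a cursor is a wrapper around documents, not a document.

```javascript
var employees = [
  { Name: { FName: "Deepak", LName: "Sharma" }, JobLocation: "Noida" },
  { Name: { FName: "Sachin", LName: "Sharma" }, JobLocation: "Banglore" }
];

// Hypothetical stand-in for find(): returns a cursor-like object.
function findAll(docs) {
  var index = 0;
  return {
    hasNext: function () { return index < docs.length; },
    next: function () { return docs[index++]; }
  };
}

// Hypothetical stand-in for findOne(): returns the first document itself.
function findFirst(docs) {
  return docs.length > 0 ? docs[0] : null;
}

var cursorName = findAll(employees).Name;        // undefined: the cursor has no Name field
var documentName = findFirst(employees).Name;    // the embedded Name document
var viaCursor = findAll(employees).next().Name;  // step to a document first, then read Name

console.log(cursorName, documentName, viaCursor);
```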

Finally, let's see how to use JavaScript in the MongoDB shell. The shell supports JavaScript directly, which means you can write your code and run it straight in the shell. The simplest example of JavaScript in the MongoDB shell is:
var json=db.Employee.findOne()
json
We declare a variable named json and assign it the result of db.Employee.findOne(). The query runs when the assignment is made; when we then type json in the shell and hit Enter, the shell prints the stored document.

MongoDB- MapReduce Example

In our last post we learned some basics of Map Reduce in MongoDB. In today's post we will discuss it in more detail, with an example. As already discussed, Map Reduce is a two-step process: Map and Reduce.
Step 1 – Map
The Map step groups the data into key-value pairs. The structure of a Map function is:

function(){…..
emit(key,value)
}
emit is a special method that a map function calls to produce its output. It takes two arguments: key, the value to group by, and value, the value to be reduced. A map function can call emit zero or more times for a document, depending on the conditions in the function. In the example below, emit runs only when the status of the customer is Active:

function(){
if(this.Customer_Status=='Active')
emit(this.Customer_ID,this.Order_Quantity)
}
Inside the Map function we refer to the current document with the keyword this.
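To see what the conditional emit produces, here is a small plain-JavaScript sketch of the map phase. It runs outside MongoDB, so some things are assumptions: runMap is a hypothetical helper, the three sample documents are invented, and emit is passed in as a parameter here whereas MongoDB provides it to the map function itself.

```javascript
// Invented sample documents for the sketch.
var customers = [
  { Customer_ID: "C1", Customer_Status: "Active", Order_Quantity: 2 },
  { Customer_ID: "C2", Customer_Status: "Inactive", Order_Quantity: 5 },
  { Customer_ID: "C1", Customer_Status: "Active", Order_Quantity: 6 }
];

// Hypothetical helper: runs a map function over every document and
// groups the emitted values by key.
function runMap(docs, mapFn) {
  var grouped = {};
  function emit(key, value) {
    if (!grouped[key]) { grouped[key] = []; }
    grouped[key].push(value);
  }
  docs.forEach(function (doc) {
    // call() binds the current document to "this", as MongoDB does.
    mapFn.call(doc, emit);
  });
  return grouped;
}

var groupedActive = runMap(customers, function (emit) {
  if (this.Customer_Status == "Active") {
    emit(this.Customer_ID, this.Order_Quantity);
  }
});

console.log(JSON.stringify(groupedActive)); // {"C1":[2,6]}
```

The inactive customer C2 never reaches emit, so it does not appear in the grouped output at all.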

Step 2 – Reduce
The Reduce step takes the output of Map as input, aggregates the values and returns the result. The basic structure of a Reduce function is:

function(key,values){…
return result;
}

MongoDB calls the Reduce function only for keys that have an array of more than one value; it is not called for a key that has a single value. The Reduce function can be invoked multiple times for the same key; in that case the output of one reduce call becomes part of the input to the next one.
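Because reduce may be skipped for single values and re-run on its own output, a reduce function should give the same answer however the values are chunked. The sketch below shows this in plain JavaScript under stated assumptions: reduceKey is a hypothetical driver that imitates the behaviour described above, and the keys and values are made up.

```javascript
// A reduce function in the (key, values) shape described above.
function sumReduce(key, values) {
  var total = 0;
  for (var i = 0; i < values.length; i++) {
    total += values[i];
  }
  return total;
}

// Hypothetical driver: skips reduce for single-value keys and
// otherwise reduces the values in chunks of chunkSize.
function reduceKey(key, values, reduceFn, chunkSize) {
  if (values.length === 1) {
    return values[0]; // reduce is not called for a single value
  }
  var result = reduceFn(key, values.slice(0, chunkSize));
  for (var start = chunkSize; start < values.length; start += chunkSize) {
    // The previous result is fed back in together with the next chunk.
    result = reduceFn(key, [result].concat(values.slice(start, start + chunkSize)));
  }
  return result;
}

var allAtOnce = reduceKey("K1", [2, 6, 1, 4], sumReduce, 4);
var inChunks = reduceKey("K1", [2, 6, 1, 4], sumReduce, 2);
var single = reduceKey("K2", [3], sumReduce, 2);

console.log(allAtOnce, inChunks, single); // 13 13 3
```

A sum passes this check; a reduce that, say, counted values.length would not, because the second call would count a partial result as one value.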

The next and final step is to pass these map and reduce functions to the mapReduce function.

Step 3 – mapReduce
The last step is to call the mapReduce function with three arguments: map, reduce and out. out specifies how the result is returned: in a collection or inline.
When we want the result of mapReduce stored in a collection, we have to specify the collection name. If the collection does not exist, mapReduce creates it; if it already exists, mapReduce overwrites its contents.

When we want the result returned inline, we can use inline in out.
mapReduce can take more arguments; we will discuss them later.

To demonstrate Map Reduce, first create a collection "Orders" and insert some documents into it:

db.Orders.insert({
  Customer_Name:"Deepak",
  Order_Date:new Date("2014-09-11"),
  Order_Quantity:2})

db.Orders.insert({
  Customer_Name:"Deepak",
  Order_Date:new Date("2014-09-28"),
  Order_Quantity:6})

db.Orders.insert({
  Customer_Name:"Sachin",
  Order_Date:new Date("2014-09-12"),
  Order_Quantity:4})

db.Orders.insert({
  Customer_Name:"Sachin",
  Order_Date:new Date("2014-08-12"),
  Order_Quantity:4})

db.Orders.insert({
  Customer_Name:"Abhishek",
  Order_Date:new Date("2014-08-01"),
  Order_Quantity:3})

Step 1 – Map

var map1=function(){
emit(this.Customer_Name,this.Order_Quantity)
}

In the map function we emit Customer_Name and Order_Quantity. emit takes Customer_Name as the key to group on, and MongoDB collects the Order_Quantity values for each key into an array.

Step 2 – Reduce

var reduce1=function(Customer_Name,arrOrder_Quantity){
return Array.sum(arrOrder_Quantity)
}

The reduce function receives the key Customer_Name and the array of values arrOrder_Quantity collected from the map function for that key, and applies the sum aggregate to that array. Array.sum is a helper provided by the MongoDB shell.

Step 3 – mapReduce

a. mapReduce with a collection as out
     db.Orders.mapReduce(map1,reduce1,{out:"Dropthis"})

It takes map1 and reduce1 as parameters and stores the result of mapReduce in a new collection "Dropthis". Running the following shows the output:
db.Dropthis.find()

b. mapReduce with inline as Out
    db.Orders.mapReduce(map1,reduce1,{out:{inline:1}})

It returns the aggregated result inline.

Conclusion: This is an introductory post on MapReduce in MongoDB. The examples in this post use very simple documents that have no embedded documents or arrays of values. We will look at more complex MapReduce examples later.

Map Reduce in MongoDB

Map Reduce is a data processing approach that takes a large volume of data as input and produces a useful aggregated result. We can compare it to "Group By" and aggregate functions in an RDBMS.

Map Reduce works with two functions: Map and Reduce. In the Map function, each input document (that meets the query condition) is arranged into key-value pairs, and some keys have multiple values. All values for the same key are collected into an array.

The Reduce function takes the output of the Map function as input, applies the aggregate function to it and writes the final result to a collection.

All Map Reduce functions in MongoDB are JavaScript code and run within the mongod process. Before doing Map Reduce in JavaScript, let's understand it with an example.

Suppose we have a Collection like:

Customer_Name   Order_Date   Order_Quantity
Deepak          12/07/2014   2
Sachin          13/07/2014   4
Deepak          29/07/2014   6
Abhishek        02/08/2014   3
Sachin          08/08/2014   4

Now, if we want to know the total quantity ordered by each customer, the answer would be:

Customer_Name   Order_Quantity
Deepak          8
Sachin          8
Abhishek        3

In SQL, we can write the same as:

SELECT Customer_Name, SUM(Order_Quantity) AS Order_Quantity FROM Customer_Orders
GROUP BY Customer_Name

The same is done by Map Reduce in MongoDB as follows:

Step 1: Map data: In this step the data is arranged in key-value pairs. The output looks like:

Deepak[2,6]
Sachin[4,4]
Abhishek[3]

Step 2: Reduce data: In this step the output of the Map function is used as input and an aggregate function is applied to it. We need the total of the orders, so SUM is used as the aggregate function and the result looks like:

Deepak [8]
Sachin [8]
Abhishek[3]
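The two steps can be traced in a few lines of plain JavaScript. This is only an in-memory sketch of the idea using the five sample orders from the table (dates left out); it does not call MongoDB, and the variable names mapped and reduced are chosen for the illustration.

```javascript
var orders = [
  { Customer_Name: "Deepak", Order_Quantity: 2 },
  { Customer_Name: "Sachin", Order_Quantity: 4 },
  { Customer_Name: "Deepak", Order_Quantity: 6 },
  { Customer_Name: "Abhishek", Order_Quantity: 3 },
  { Customer_Name: "Sachin", Order_Quantity: 4 }
];

// Step 1: map each order to a key-value pair and group the values by key.
var mapped = {};
orders.forEach(function (order) {
  var key = order.Customer_Name;
  if (!mapped[key]) { mapped[key] = []; }
  mapped[key].push(order.Order_Quantity);
});

// Step 2: reduce each array of values to its sum.
var reduced = {};
Object.keys(mapped).forEach(function (key) {
  reduced[key] = mapped[key].reduce(function (sum, qty) { return sum + qty; }, 0);
});

console.log(JSON.stringify(mapped));  // {"Deepak":[2,6],"Sachin":[4,4],"Abhishek":[3]}
console.log(JSON.stringify(reduced)); // {"Deepak":8,"Sachin":8,"Abhishek":3}
```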

I hope Map Reduce is clear now. In our next post we will discuss its implementation in MongoDB.

NoSQL- Introduction


Introduction
What NoSQL is
Types of NoSQL supported databases

Introduction: Today's market is full of new buzzwords: NoSQL, Big Data, Cloud and so on. In this post we will discuss one of the latest, NoSQL, which I sometimes think is the most confusing buzzword around. We have been using databases to store data electronically for decades (Oracle Version 1 was launched in 1978; source: Wikipedia). In these databases, data is stored in tables, and tables have rows and columns. Tables may be related to other tables, often as parent and child tables, so these databases are called relational database management systems (RDBMS). An RDBMS must follow the ACID rules: Atomicity, Consistency, Isolation and Durability. An RDBMS works well with structured, organized data. Each row has a fixed set of columns, and we have to decide the structure (schema) of a table before inserting data into it. For example, a table Car has predefined columns: ModelNo, Color, Make.

In today's world the nature of data is changing very rapidly. We can consider anything as data: my email id, my likes, my posts, my pictures, my browser history, my call details and in fact my geographic location can all be used as data. As the volume of data grows rapidly, the storage requirement grows with it. Storing and managing unstructured data is a big problem. Because the nature of data keeps changing, it is not easy to maintain it in traditional ways. More data means more space and more data management work, so scalability can be an issue with an RDBMS.

Considering these limitations, companies want a solution that supports a non-relational environment, is not tied to a fixed schema, supports dynamic schema, is easy to maintain and scales not only vertically but also horizontally. To overcome these limitations of RDBMS, a new term was introduced: NoSQL.

What NoSQL is: The term NoSQL came into the picture in early 2009 (although Carlo Strozzi had used the term NoSQL in 1998). Carlo pronounces NoSQL as no-seequel. Today NoSQL is usually read as "Not Only SQL". NoSQL is a concept that does not follow the traditional RDBMS model. The distinguishing features of NoSQL databases are: they do not use the relational model, they typically do not use SQL to query data, they support dynamic schema, and they are built to scale and to keep data available. Where an RDBMS supports ACID properties, NoSQL databases generally support BASE properties (Basically Available, Soft state, Eventual consistency).

Types of NoSQL supported databases: Based on the data model, query model and consistency model they use, NoSQL databases can be divided into four categories:

  • Document Databases
  • Key-Value Stores
  • Graph Databases
  • Wide-Column Stores

Document Model: Where relational databases store data in rows and columns, document databases store data in documents and fields. These documents use a structure similar to JSON (JavaScript Object Notation). A document contains one or more fields, and each field holds a value of a specific data type such as a string, date or array. Documents can contain arrays or even nested documents. In an RDBMS a record is spread over rows and columns in multiple tables, whereas in the document model each record and its associated data are typically stored in a single document. This makes data access very simple. In a document database the schema is dynamic: each document can contain different fields (as opposed to the fixed columns of every row in an RDBMS table). This makes life easier for developers, database programmers and database administrators when new fields have to be added to documents in the future.

Examples: MongoDB, CouchDB

Key-Value Stores: Key-value stores are the simplest NoSQL databases and are similar to document stores. In a key-value store every value is stored against a key. As with document stores, there is no need to define a schema. A key-value store requires the key when storing the data, and that key must be known to retrieve the record. This model stores the key-value pairs in a hash table.

Examples: Riak, Redis, FoundationDB
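The key-value idea can be sketched with a JavaScript Map, which engines typically implement as a hash table. This is only an illustration of the access pattern, not the API of Riak, Redis or FoundationDB: put and get are hypothetical helper names and the keys and values are made up.

```javascript
// The Map plays the role of the store's hash table.
var store = new Map();

// Store a value against a key; no schema is declared anywhere.
function put(key, value) {
  store.set(key, value);
}

// Retrieve a value; the exact key must be known.
function get(key) {
  return store.has(key) ? store.get(key) : null;
}

put("user:1001", { FName: "Deepak", LName: "Sharma" });
put("session:abc", "logged-in");

var user = get("user:1001");
var session = get("session:abc");
var missing = get("user:9999");

console.log(user, session, missing);
```

Values of different shapes (an object and a string here) sit side by side, and there is no way to ask for a record except by its key.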

Graph Databases: This type of NoSQL database is based on graph theory; data is stored in the form of graphs made of nodes, edges and properties. A very useful property of this type of database is index-free adjacency, which means every element directly links or points to its associated elements, so no index is needed to look them up. Nodes in a graph database are similar to entities in an E-R diagram, such as people, companies and departments. A graph database stores values in nodes, which have properties and are organized by relationships, which can also have properties. An edge is the connection or relationship between two nodes.

Examples: Neo4J, Titan, Infinite Graph
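A toy property graph in plain JavaScript shows nodes, edges and properties, and what following a direct link looks like. The helpers makeNode, connect and neighbours, the company name and the relationship types are all invented for this sketch; real graph databases expose their own query languages and APIs.

```javascript
// Each node holds its properties and direct references to its edges.
function makeNode(label, properties) {
  return { label: label, properties: properties, edges: [] };
}

// An edge is a typed relationship with its own properties.
function connect(from, type, to, properties) {
  from.edges.push({ type: type, to: to, properties: properties || {} });
}

var deepak = makeNode("Person", { name: "Deepak" });
var sachin = makeNode("Person", { name: "Sachin" });
var company = makeNode("Company", { name: "ExampleCorp" });

connect(deepak, "WORKS_AT", company, { since: 2010 });
connect(sachin, "WORKS_AT", company, { since: 2012 });
connect(deepak, "KNOWS", sachin);

// Follow the edges stored on the node itself; no separate index is consulted.
function neighbours(node, type) {
  return node.edges
    .filter(function (edge) { return edge.type === type; })
    .map(function (edge) { return edge.to.properties.name; });
}

var employers = neighbours(deepak, "WORKS_AT"); // ["ExampleCorp"]
var friends = neighbours(deepak, "KNOWS");      // ["Sachin"]

console.log(employers, friends);
```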

Wide-Column Stores: An RDBMS stores data in two dimensions, rows and columns, whereas a wide-column store organizes data by column. In this approach columns can be nested; these are called super columns. Example: In an RDBMS, the Employee table can have the structure below:

Employee_ID   First_Name   Last_Name   Salary
E001          Deepak       Sharma      10,000
E002          Sachin       Sharma      12,000

In column store database, the same structure can be stored as:

E001; E002
Deepak; Sachin
Sharma; Sharma
10,000; 12,000
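The change from the row layout to the column layout above can be sketched in plain JavaScript. toColumns is a hypothetical helper written for this illustration, and salaries are stored as plain numbers; an actual column store handles this on disk rather than in application code.

```javascript
// Row layout: one object per employee.
var rows = [
  { Employee_ID: "E001", First_Name: "Deepak", Last_Name: "Sharma", Salary: 10000 },
  { Employee_ID: "E002", First_Name: "Sachin", Last_Name: "Sharma", Salary: 12000 }
];

// Hypothetical helper: turns an array of rows into one array per column.
function toColumns(records) {
  var columns = {};
  records.forEach(function (record) {
    Object.keys(record).forEach(function (field) {
      if (!columns[field]) { columns[field] = []; }
      columns[field].push(record[field]);
    });
  });
  return columns;
}

var columns = toColumns(rows);
console.log(JSON.stringify(columns.Employee_ID)); // ["E001","E002"]
console.log(JSON.stringify(columns.Salary));      // [10000,12000]

// A column-wise aggregate only touches the one array it needs.
var totalSalary = columns.Salary.reduce(function (sum, s) { return sum + s; }, 0);
console.log(totalSalary); // 22000
```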

Examples: Cassandra, HBase (part of the Hadoop ecosystem), Cloudata.

Conclusion: RDBMSs are the most stable and trusted way to store data. NoSQL databases are still evolving, but stable releases of many NoSQL databases are available and companies have started to use them. Finally, I would like to say that NoSQL is not a replacement for SQL; it is an alternative to it.

MongoDB - Architecture

In our previous posts we discussed the installation steps and the general database concepts used in MongoDB. In this post we will discuss the architecture of MongoDB.

As discussed in our last post, a table in a relational database is equivalent to a collection in MongoDB, and a row is equivalent to a document.

MongoDB is a document-oriented database and supports dynamic schema. The document data model replaces the row of the relational model with a document, stored as BSON (Binary JSON). A BSON document has one or more fields, each with a typed value, and fields can also contain arrays and sub-documents. Allowing arrays and sub-documents gives MongoDB more flexibility: we can store a complex hierarchy in a single record.

The maximum BSON document size is 16 MB. Every document is stored in a record, and every record holds the document plus some extra space that the document can use when an update causes it to grow. Much as pages are stored contiguously in SQL Server, records are stored contiguously on disk. When a document grows beyond the space allocated to its record, MongoDB allocates a new record, moves the document into it and updates all the indexes that reference that document. All records are part of a collection (a table in a relational database), and a collection is a logical grouping of documents. A collection can also have indexes.

A MongoDB document can hold all the data of a single record in one document, whereas in a relational database the same data may be spread across multiple related tables. This is one more feature of MongoDB: data is more localized.

Dynamic schema in MongoDB means that documents can vary in structure, which is not possible in relational databases. MongoDB collections do not force you to define a document structure. Taking the Student collection from our previous post as an example, some documents can have only four fields (Student_ID, F_Name, L_Name and Status) while others have five (Student_ID, F_Name, L_Name, Status and Class). Fields can vary from document to document; if we need to add a new field to a document, we can add it without affecting any other document.
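A plain-JavaScript sketch of that Student example, with an in-memory array standing in for the collection. The field names come from the text above; the student values and the extra Email field are invented for the illustration.

```javascript
// Two documents in the same "collection" with different sets of fields.
var studentCollection = [
  { Student_ID: 1, F_Name: "Asha", L_Name: "Verma", Status: "Active" },
  { Student_ID: 2, F_Name: "Rohit", L_Name: "Gupta", Status: "Active", Class: "10" }
];

function fieldCounts(docs) {
  return docs.map(function (doc) { return Object.keys(doc).length; });
}

var fieldCountsBefore = fieldCounts(studentCollection); // [4, 5]

// Add a new field to one document only; the other document is untouched.
studentCollection[0].Email = "asha@example.com";

var fieldCountsAfter = fieldCounts(studentCollection);  // [5, 5]
var secondHasEmail = "Email" in studentCollection[1];   // false

console.log(fieldCountsBefore, fieldCountsAfter, secondHasEmail);
```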

Now let's look at the two approaches to storing related data in MongoDB collections: references and embedded documents.
References: As in a relational database, MongoDB supports references among collections by creating relationships between documents.

Embedded Data: In this approach we create the relationship by storing related data in a single document. We can do this in MongoDB with arrays or sub-documents. This approach lets us retrieve all related information from a single place.
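The two approaches can be compared side by side in plain JavaScript. Everything in this sketch is hypothetical: the collection names, the address_id field, the findAddress helper and the sample values are invented to show the shape of the data, not taken from an actual schema.

```javascript
// Reference approach: two collections linked by an id.
var studentsWithRefs = [
  { _id: 1, F_Name: "Asha", address_id: 101 }
];
var addresses = [
  { _id: 101, City: "Noida", Pin: "201301" }
];

// Resolving a reference needs a second lookup in the other collection.
function findAddress(student) {
  var matches = addresses.filter(function (address) {
    return address._id === student.address_id;
  });
  return matches.length > 0 ? matches[0] : null;
}

// Embedded approach: the related data lives inside the document itself,
// as a sub-document (Address) and an array (Phones).
var embeddedStudent = {
  _id: 1,
  F_Name: "Asha",
  Address: { City: "Noida", Pin: "201301" },
  Phones: ["111-111", "222-222"]
};

var cityViaReference = findAddress(studentsWithRefs[0]).City; // two steps
var cityViaEmbedding = embeddedStudent.Address.City;          // one read

console.log(cityViaReference, cityViaEmbedding);
```

Both routes reach the same city; the embedded form gets there with a single document read, while the reference form keeps the address in one place if several documents need to share it.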


In our next posts we will continue our discussion of the architecture of MongoDB. We will see how indexes work and how a query is processed.