Main Tutorials

MongoDB – Aggregate and Group example

mongodb-group-example

In this tutorial, we will show you how to use MongoDB aggregate function to group documents (data).

1. Test Data

Data in JSON format, shows the hosting provider for website.

website.json
{ "_id" : 1, "domainName" : "test1.com", "hosting" : "hostgator.com" }
{ "_id" : 2, "domainName" : "test2.com", "hosting" : "aws.amazon.com"}
{ "_id" : 3, "domainName" : "test3.com", "hosting" : "aws.amazon.com" }
{ "_id" : 4, "domainName" : "test4.com", "hosting" : "hostgator.com" }
{ "_id" : 5, "domainName" : "test5.com", "hosting" : "aws.amazon.com" }
{ "_id" : 6, "domainName" : "test6.com", "hosting" : "cloud.google.com" }
{ "_id" : 7, "domainName" : "test7.com", "hosting" : "aws.amazon.com" }
{ "_id" : 8, "domainName" : "test8.com", "hosting" : "hostgator.com" }
{ "_id" : 9, "domainName" : "test9.com", "hosting" : "cloud.google.com" }
{ "_id" : 10, "domainName" : "test10.com", "hosting" : "godaddy.com" }

Imports into a “website” collection.

> mongoimport -d testdb -c website --file website.json
connected to: 127.0.0.1
Mon Jan 13 14:30:22.662 imported 10 objects
Note
If the collection is existed, add --upsert option to override the data.


> mongoimport -d testdb -c website --file website.json --upsert

2. Grouping Example

Uses db.collection.aggregate and $group to perform the data grouping.

2.1 The following example groups by the “hosting” field, and display the total sum of each hosting.


> db.website.aggregate(
    { 
	$group : {_id : "$hosting", total : { $sum : 1 }}
    }
  );

Output


{
        "result" : [
                {
                        "_id" : "godaddy.com",
                        "total" : 1
                },
                {
                        "_id" : "cloud.google.com",
                        "total" : 2
                },
                {
                        "_id" : "aws.amazon.com",
                        "total" : 4
                },
                {
                        "_id" : "hostgator.com",
                        "total" : 3
                }
        ],
        "ok" : 1
}

The equivalent SQL.


SELECT hosting, SUM(hosting) AS total
       FROM website
       GROUP BY hosting

2.2 Add sorting with $sort.


>  db.website.aggregate(
     { 
	$group : {_id : "$hosting", total : { $sum : 1 }}
     },
     {
	$sort : {total : -1}
     }
  );

Output – Display “total” in descending order. For ascending order, uses $sort : {total : 1}.


{
        "result" : [
                {
                        "_id" : "aws.amazon.com",
                        "total" : 4
                },
                {
                        "_id" : "hostgator.com",
                        "total" : 3
                },
                {
                        "_id" : "cloud.google.com",
                        "total" : 2
                },
                {
                        "_id" : "godaddy.com",
                        "total" : 1
                }
        ],
        "ok" : 1
}

2.3 Add $match condition, groups by “hosting” for “aws.amazon.com” only.


> db.website.aggregate(
    { 
	$match : {hosting : "aws.amazon.com"}
    },
    { 
	$group : { _id : "$hosting", total : { $sum : 1 } }
    }
  );

Output


{
        "result" : [
                {
                        "_id" : "aws.amazon.com",
                        "total" : 4
                }
        ],
        "ok" : 1
}
More Examples
Refer to this official MongoDB Aggregation guide for more advance aggregation and group examples.

3. Exports Grouping Result to CSV or JSON

Often times, we need to export the grouping results in csv or JSON format. To solve it, inserts the group results in a new collection, and exports the new collection via mongoexport.

3.1 Set the group results in a variable. In this case, the variable name is “groupdata”.


> var groupdata = db.website.aggregate(
    { 
	$group : {_id : "$hosting", total : { $sum : 1 }}
    },
    {
	$sort : {total : -1}
    }
  );

3.2Inserts groupdata.toArray() into a new collection.


> db.websitegroup.insert(groupdata.toArray());

> db.websitegroup.find().pretty()
{ "_id" : "aws.amazon.com", "total" : 4 }
{ "_id" : "hostgator.com", "total" : 3 }
{ "_id" : "cloud.google.com", "total" : 2 }
{ "_id" : "godaddy.com", "total" : 1 }
>

3.3 Exports the collection “websitegroup” to a csv file.


c:\> mongoexport -d testdb -c websitegroup -f _id,total -o group.csv --csv
connected to: 127.0.0.1
exported 4 records
group.csv

_id,total
"aws.amazon.com",4.0
"cloud.google.com",2.0
"godaddy.com",1.0
"hostgator.com",3.0

3.4 Exports the collection “websitegroup” to a JSON file.


c:\> mongoexport -d testdb -c websitegroup -o group.json
connected to: 127.0.0.1
exported 4 records
group.json

{ "_id" : "aws.amazon.com", "total" : 4 }
{ "_id" : "cloud.google.com", "total" : 2 }
{ "_id" : "godaddy.com", "total" : 1 }
{ "_id" : "hostgator.com", "total" : 3 }

4. Large Sort Operation

Changed in version 2.6 – Read this Memory Restrictions
In MongoDB, the in-memory sorting have a limit of 100M, to perform a large sort, you need enable allowDiskUse option to write data to a temporary file for sorting.

To avoid the sort exceeded memory limit error, enable the allowDiskUse option.


db.website.aggregate(
[
	{$group : {_id : "$hosting", total : { $sum : 1 }}},
	{$sort : {total : -1}}
],
	{allowDiskUse: true}
);

References

  1. MongoDB Aggregation
  2. MongoDB db.collection.aggregate()
  3. Aggregation Pipeline Limits
  4. MongoDB Hello World Example

About Author

author image
Founder of Mkyong.com, love Java and open source stuff. Follow him on Twitter. If you like my tutorials, consider make a donation to these charities.

Comments

Subscribe
Notify of
10 Comments
Most Voted
Newest Oldest
Inline Feedbacks
View all comments
Ulysses Alexandre Alves
7 years ago

Your tutorial is much better than the official MongoDB documentation for aggretation and grouping! Thank you for sharing.

Tamilanban Rajendran
5 years ago

How to perform the group operation based on sub document in Java Springboot. I have the Mongo DB query..If I convert in Java its not working . Can anyone help me on this .

prtk
8 years ago

why {$sum:1} is use in this aggregation ?

rue lim
8 years ago

Hello Sir.

I need help with or condition in the where (‘$match’) query. I want something like this but it doesn’t work. Could you please help me?

> db.website.aggregate(
{
$match : {$or : [ {hosting : “aws.amazon.com”},{hosting: “cloud.google.com”} ] } },
{
$group : { _id : “$hosting”, total : { $sum : 1 } }
}
);

Suraj
5 years ago
Reply to  rue lim

Aggregate takes filters(match, group etc..) inside square brackets.
db.website.aggregate(
[
{
$match : {$or : [ {hosting : “aws.amazon.com”},{hosting: “cloud.google.com”} ] } },
{
$group : { _id : “$hosting”, total : { $sum : 1 } }
}
]
);

JH
8 years ago

And can I return multiful fields? not for query options but as results.

carl
9 years ago

Mkylong, thanks for this tutorial, also had the same problem that Erics reports and resolved as he says.
once again, thanks

Erik
9 years ago

Hi, tnx for the tutorial. This statement did not work:

db.websitegroup.insert(groupdata.result);

But changing it to

db.websitegroup.insert(groupdata.toArray());

did the trick.

Arslan
9 years ago

Hello Mkyong
This tutorial is very helpful I want to ask aboutHow to process unstructured data ? like any web data and also how to handle large files like audio,video etc Thanks

Udhayakumar GK
10 years ago

Great Tutorial. I got it useful.

Thanks,
Udhayakumar G K
http://saibabaprayerrequest.blogspot.in