MongoDB Aggregation: Powerful Data Processing Techniques

Learn about MongoDB's aggregation framework, a versatile tool for processing and analyzing data. Explore the different aggregation methods, including the aggregation pipeline and single-purpose methods, and discover how to use them to perform various data transformations and calculations.



MongoDB Aggregation

Aggregation in MongoDB processes multiple documents and returns computed results. You can use it to group values from multiple documents and to compute results such as counts, sums, or averages for each group.

Aggregation Methods

Aggregation can be performed in two ways:

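  • Aggregation pipeline: Uses an array of stages to process documents.
  • Single-purpose aggregation methods: Methods such as db.collection.estimatedDocumentCount(), db.collection.count(), and db.collection.distinct(), each of which performs one fixed operation on a single collection. Note that count() is deprecated in mongosh in favor of countDocuments().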
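To make the single-purpose methods concrete, the sketch below models what a document count and a distinct() call compute. It is plain JavaScript over a small in-memory array, not a live collection, and the docs array and the getPath, countDocs, and distinctValues helpers are hypothetical names introduced only for illustration; they are not part of MongoDB's API.

```javascript
// Hypothetical in-memory stand-in for a collection (not a real MongoDB collection).
const docs = [
  { _id: 1, department: { name: "HR" } },
  { _id: 2, department: { name: "Finance" } },
  { _id: 3, department: { name: "HR" } }
];

// Resolve a dotted field path such as "department.name" on one document.
function getPath(doc, path) {
  return path
    .split(".")
    .reduce((value, key) => (value == null ? undefined : value[key]), doc);
}

// Model of a document count: how many documents the collection holds.
function countDocs(collection) {
  return collection.length;
}

// Model of distinct(field): the unique values of one field.
// This model keeps first-seen order; a server may return a different order.
function distinctValues(collection, field) {
  return [...new Set(collection.map((doc) => getPath(doc, field)))];
}

console.log(countDocs(docs));                         // 3
console.log(distinctValues(docs, "department.name")); // [ 'HR', 'Finance' ]
```

In mongosh, the corresponding calls against a real collection would be of the form db.collection.estimatedDocumentCount() and db.collection.distinct("department.name").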

Aggregation Pipeline

The aggregation pipeline consists of one or more stages passed to the db.aggregate() or db.collection.aggregate() method.

Syntax

db.collection.aggregate([ {stage1}, {stage2}, {stage3}...])
        

Each stage receives the output of the previous stage, processes it, and passes its result to the next stage. The pipeline runs on the server and can use indexes, typically when a stage such as $match or $sort appears at the start of the pipeline.
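The way stages hand documents to one another can be sketched in plain JavaScript. This is only a conceptual model under simplifying assumptions, not how the server implements pipelines: each stage is treated as a function from an array of documents to a new array, and the match, sortBy, and runPipeline helpers and the sample input are hypothetical names made up for this illustration.

```javascript
// Each stage is modeled as a function from an array of documents to a new array.
// These helper names are hypothetical; they only mimic the shape of real stages.
const match = (predicate) => (docs) => docs.filter(predicate);

const sortBy = (field, direction) => (docs) =>
  [...docs].sort(
    (a, b) => (a[field] < b[field] ? -1 : a[field] > b[field] ? 1 : 0) * direction
  );

// Feed the output of each stage into the next one, left to right.
function runPipeline(docs, stages) {
  return stages.reduce((current, stage) => stage(current), docs);
}

// Made-up input documents for the demonstration.
const input = [
  { name: "a", score: 3 },
  { name: "b", score: 9 },
  { name: "c", score: 7 },
  { name: "d", score: 1 }
];

const result = runPipeline(input, [
  match((doc) => doc.score > 2), // keeps a, b, c
  sortBy("score", -1)            // orders them b, c, a (descending score)
]);

console.log(result.map((doc) => doc.name)); // [ 'b', 'c', 'a' ]
```

Because the second stage only ever sees what the first stage let through, placing a filtering stage early reduces the work done by every stage after it.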

Sample Data

Insert the following documents into the employees collection:

Insert Sample Data

db.employees.insertMany([
    { _id: 1, firstName: "John", lastName: "King", gender: "male", email: "john.king@abc.com", salary: 5000, department: { name: "HR" }},
    { _id: 2, firstName: "Sachin", lastName: "T", gender: "male", email: "sachin.t@abc.com", salary: 8000, department: { name: "Finance" }},
    { _id: 3, firstName: "James", lastName: "Bond", gender: "male", email: "jamesb@abc.com", salary: 7500, department: { name: "Marketing" }},
    { _id: 4, firstName: "Rosy", lastName: "Brown", gender: "female", email: "rosyb@abc.com", salary: 5000, department: { name: "HR" }},
    { _id: 5, firstName: "Kapil", lastName: "D", gender: "male", email: "kapil.d@abc.com", salary: 4500, department: { name: "Finance" }},
    { _id: 6, firstName: "Amitabh", lastName: "B", gender: "male", email: "amitabh.b@abc.com", salary: 7000, department: { name: "Marketing" }}
])
        

$match Stage

The $match stage filters documents so that only those matching the specified criteria pass to the next stage. It uses the same query syntax as the find() method.

Example: $match Stage

db.employees.aggregate([{ $match: { gender: 'female' } }])
        

Output:

[
  {
    _id: 4,
    firstName: 'Rosy',
    lastName: 'Brown',
    gender: 'female',
    email: 'rosyb@abc.com',
    salary: 5000,
    department: { name: 'HR' }
  }
]
    

$group Stage

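The $group stage groups input documents by the expression given in its _id field and returns one document per distinct group key. It can also accumulate values for each group. The order of the output documents is not guaranteed unless a $sort stage follows.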

Example: $group Stage

db.employees.aggregate([{ $group: { _id: '$department.name' } }])
        

Output:

[
  { _id: 'Marketing' },
  { _id: 'HR' },
  { _id: 'Finance' }
]
    

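To calculate the number of employees in each department, add an accumulator field. Here, { $sum: 1 } adds 1 to the group's total for every document in the group: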

Example: Get Accumulated Values

db.employees.aggregate([
    { $group: { _id: '$department.name', totalEmployees: { $sum: 1 } } }
])
        

Output:

[
  { _id: 'Marketing', totalEmployees: 2 },
  { _id: 'HR', totalEmployees: 2 },
  { _id: 'Finance', totalEmployees: 2 }
]
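As a rough illustration of what the counting accumulator does, the following plain JavaScript sketch groups an in-memory copy of the sample employees by department name and keeps a running total per group. The groupAndCount helper is a hypothetical name invented for this model, not a MongoDB function, and the sketch ignores everything else $group can do.

```javascript
// In-memory copy of the department field from the six sample employees.
const employees = [
  { _id: 1, department: { name: "HR" } },
  { _id: 2, department: { name: "Finance" } },
  { _id: 3, department: { name: "Marketing" } },
  { _id: 4, department: { name: "HR" } },
  { _id: 5, department: { name: "Finance" } },
  { _id: 6, department: { name: "Marketing" } }
];

// Hypothetical helper modeling
// { $group: { _id: <key>, totalEmployees: { $sum: 1 } } }.
function groupAndCount(docs, keyOf) {
  const groups = new Map();
  for (const doc of docs) {
    const key = keyOf(doc);
    // $sum: 1 adds 1 to the group's running total for every document seen.
    groups.set(key, (groups.get(key) || 0) + 1);
  }
  // Emit one output document per distinct key.
  return [...groups].map(([key, total]) => ({ _id: key, totalEmployees: total }));
}

const perDepartment = groupAndCount(employees, (doc) => doc.department.name);
console.log(perDepartment);
```

Each of the three departments ends up with a total of 2, matching the counts above. The order of the groups in this model follows first appearance in the array and may differ from the order a server returns.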
    

$sort Stage

The $sort stage sorts documents by the specified field. Use 1 for ascending order and -1 for descending order.

Example: Sort Documents

db.employees.aggregate([
    { $match: { gender: 'male' } },
    { $sort: { firstName: 1 } }
])
        

Output:

[
  {
    _id: 6,
    firstName: 'Amitabh',
    lastName: 'B',
    gender: 'male',
    email: 'amitabh.b@abc.com',
    salary: 7000,
    department: { name: 'Marketing' }
  },
  {
    _id: 3,
    firstName: 'James',
    lastName: 'Bond',
    gender: 'male',
    email: 'jamesb@abc.com',
    salary: 7500,
    department: { name: 'Marketing' }
  },
  {
    _id: 1,
    firstName: 'John',
    lastName: 'King',
    gender: 'male',
    email: 'john.king@abc.com',
    salary: 5000,
    department: { name: 'HR' }
  },
  {
    _id: 5,
    firstName: 'Kapil',
    lastName: 'D',
    gender: 'male',
    email: 'kapil.d@abc.com',
    salary: 4500,
    department: { name: 'Finance' }
  },
  {
    _id: 2,
    firstName: 'Sachin',
    lastName: 'T',
    gender: 'male',
    email: 'sachin.t@abc.com',
    salary: 8000,
    department: { name: 'Finance' }
  }
]
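The example above sorts in ascending order on one field. To show how a descending direction and a second sort key behave, here is a small plain JavaScript model of a sort specification such as { salary: -1, firstName: 1 }. It is a sketch over an in-memory array, and sortDocs is a hypothetical helper written for this illustration rather than anything provided by MongoDB.

```javascript
// First names and salaries copied from the sample employees.
const staff = [
  { firstName: "John", salary: 5000 },
  { firstName: "Sachin", salary: 8000 },
  { firstName: "James", salary: 7500 },
  { firstName: "Rosy", salary: 5000 },
  { firstName: "Kapil", salary: 4500 },
  { firstName: "Amitabh", salary: 7000 }
];

// Hypothetical helper modeling a $sort specification:
// keys are compared in the order listed, 1 = ascending, -1 = descending.
function sortDocs(docs, spec) {
  const keys = Object.entries(spec);
  return [...docs].sort((a, b) => {
    for (const [field, direction] of keys) {
      if (a[field] < b[field]) return -direction;
      if (a[field] > b[field]) return direction;
    }
    return 0; // equal on every listed key
  });
}

const sorted = sortDocs(staff, { salary: -1, firstName: 1 });
console.log(sorted.map((doc) => doc.firstName));
// [ 'Sachin', 'James', 'Amitabh', 'John', 'Rosy', 'Kapil' ]
```

John and Rosy share a salary of 5000, so the second key (firstName, ascending) decides their relative order.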
    

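You can also sort grouped data. In the following example the group key is an embedded document, so the $sort stage refers to the department name with dot notation ('_id.deptName'):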

Example: Sort Grouped Data

db.employees.aggregate([
    { $match: { gender: 'male' } },
    { $group: { _id: { deptName: '$department.name' }, totalEmployees: { $sum: 1 } } },
    { $sort: { '_id.deptName': 1 } }
])
        

Output:

[
  { _id: { deptName: 'Finance' }, totalEmployees: 2 },
  { _id: { deptName: 'HR' }, totalEmployees: 1 },
  { _id: { deptName: 'Marketing' }, totalEmployees: 2 }
]
    

Use aggregation pipelines to efficiently query and process data from MongoDB collections.