Time Series Collections

Learn how MongoDB time series collections work, when to use them, and how they improve on the manual bucket pattern for storing time-stamped data.

In the Schema Design section, we built the bucket pattern manually — grouping attendance records by month into single documents to avoid millions of tiny documents. That pattern works, but you had to design and maintain it yourself.

MongoDB 5.0 and later provide a built-in collection type for exactly this: time series collections. They bucket data automatically, store it more efficiently, and are optimized for the kind of data our attendance records represent: measurements tied to a timestamp.


What is Time-Series Data?

Time-series data is any data that consists of measurements over time:

  • Daily attendance records — present/absent per student per day
  • Sensor readings — temperature, humidity over time
  • Stock prices — price at each point in time
  • Server metrics — CPU usage, memory usage logged every minute
  • Login activity — timestamped login events

The defining characteristics:

  • Each document has a timestamp
  • Documents are mostly inserted, rarely updated
  • You usually query by time range
  • The volume of documents grows continuously over time

Creating a Time Series Collection

db.createCollection("attendance_ts", {
  timeseries: {
    timeField: "timestamp",     // required — the date field
    metaField: "studentId",     // optional — what the measurement is about
    granularity: "hours"        // optional — hint for how data is bucketed
  }
})

Required and Optional Fields

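| Option      | Required? | What it does                                                                  |
|-------------|-----------|-------------------------------------------------------------------------------|
| timeField   | Yes       | The field containing the timestamp; must be a Date                            |
| metaField   | No        | A field that identifies what is being measured; stays constant for a series   |
| granularity | No        | "seconds", "minutes", or "hours"; hints how frequently data arrives           |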
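
The options in the table can be assembled and checked before the collection is created. The following is a minimal sketch, not part of MongoDB itself: `buildTimeSeriesOptions` and `GRANULARITIES` are hypothetical names introduced here, and the checks cover only the three options listed above.

```javascript
// Hypothetical helper (not a MongoDB API): assembles the options object
// passed as the second argument to db.createCollection().
const GRANULARITIES = ["seconds", "minutes", "hours"];

function buildTimeSeriesOptions({ timeField, metaField, granularity }) {
  // timeField is the only required option.
  if (typeof timeField !== "string" || timeField.length === 0) {
    throw new Error("timeField is required and must be a field name");
  }
  // granularity is optional, but must be one of the three documented values.
  if (granularity !== undefined && !GRANULARITIES.includes(granularity)) {
    throw new Error(`granularity must be one of: ${GRANULARITIES.join(", ")}`);
  }
  const timeseries = { timeField };
  if (metaField !== undefined) timeseries.metaField = metaField;
  if (granularity !== undefined) timeseries.granularity = granularity;
  return { timeseries };
}

const options = buildTimeSeriesOptions({
  timeField: "timestamp",
  metaField: "studentId",
  granularity: "hours"
});
// In mongosh: db.createCollection("attendance_ts", options)
```

Validating up front gives a readable error in application code instead of a rejected command at creation time.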

Choosing metaField

metaField should be the value that stays the same across many measurements — in our case, studentId. Many attendance records share the same studentId but have different timestamp values. MongoDB groups documents with the same metaField value together internally for efficient storage.
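
When a series is identified by more than one value, for example a student within a course, the metaField can hold an embedded document instead of a single value. The sketch below is illustrative: the field name `metadata` and the helper `toMeasurement` are assumptions, and plain strings stand in for ObjectId values so the snippet runs outside mongosh.

```javascript
// Hypothetical helper: moves the identifying fields of a flat attendance
// record into one "metadata" subdocument. A collection using this shape
// would be created with metaField: "metadata" rather than "studentId".
function toMeasurement({ timestamp, studentId, courseId, status }) {
  return {
    timestamp,
    metadata: { studentId, courseId }, // constant across the whole series
    status                             // the value that changes per measurement
  };
}

const doc = toMeasurement({
  timestamp: new Date("2024-09-01T08:00:00Z"),
  studentId: "student-1",
  courseId: "course-1",
  status: "present"
});
// In mongosh, against a collection created with metaField: "metadata":
//   db.<collection>.insertOne(doc)
```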


Inserting Data

Inserting into a time series collection looks exactly like inserting into a normal collection:

db.attendance_ts.insertOne({
  timestamp: new Date("2024-09-01T08:00:00Z"),
  studentId: ObjectId("64a1f2c3e4b0a1b2c3d4e5f6"),
  status: "present",
  courseId: ObjectId("64a1f2c3e4b0a1b2c3d4e5f1")
})

// Insert many attendance records at once
db.attendance_ts.insertMany([
  {
    timestamp: new Date("2024-09-01T08:00:00Z"),
    studentId: ObjectId("64a1f2c3e4b0a1b2c3d4e5f6"),
    status: "present"
  },
  {
    timestamp: new Date("2024-09-02T08:00:00Z"),
    studentId: ObjectId("64a1f2c3e4b0a1b2c3d4e5f6"),
    status: "absent"
  },
  {
    timestamp: new Date("2024-09-01T08:00:00Z"),
    studentId: ObjectId("64a1f2c3e4b0a1b2c3d4e5f7"),
    status: "present"
  }
])

You write one document per measurement — one document per student per day. MongoDB handles the internal bucketing automatically. You never see or manage the buckets directly.


What Happens Behind the Scenes

Your view — one document per measurement:
{ timestamp: "2024-09-01", studentId: "abc", status: "present" }
{ timestamp: "2024-09-02", studentId: "abc", status: "absent"  }
{ timestamp: "2024-09-03", studentId: "abc", status: "present" }

MongoDB's internal storage (simplified), automatically bucketed:
{
  meta: { studentId: "abc" },
  data: {
    timestamp: ["2024-09-01", "2024-09-02", "2024-09-03"],
    status: ["present", "absent", "present"]
  }
}

This is exactly the same idea as the bucket pattern we built manually — but MongoDB does it for you, transparently. You query and insert one-document-per-measurement; MongoDB stores it efficiently.
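
To make the idea concrete, the grouping can be imitated in plain JavaScript. This is only an illustration of the concept: `bucketByMeta` is a hypothetical function, MongoDB's real bucket format contains additional fields and is compressed, and the sketch assumes every measurement has the same set of fields.

```javascript
// Illustrative only: groups measurements by their meta value and stores the
// remaining fields as parallel column arrays, one bucket per meta value.
function bucketByMeta(measurements, metaField) {
  const buckets = new Map();
  for (const m of measurements) {
    const key = m[metaField];
    if (!buckets.has(key)) {
      buckets.set(key, { meta: { [metaField]: key }, data: {} });
    }
    const { data } = buckets.get(key);
    for (const [field, value] of Object.entries(m)) {
      if (field === metaField) continue; // stored once in meta, not per row
      if (!data[field]) data[field] = [];
      data[field].push(value);
    }
  }
  return [...buckets.values()];
}

const buckets = bucketByMeta(
  [
    { timestamp: "2024-09-01", studentId: "abc", status: "present" },
    { timestamp: "2024-09-02", studentId: "abc", status: "absent" },
    { timestamp: "2024-09-01", studentId: "xyz", status: "present" }
  ],
  "studentId"
);
console.log(JSON.stringify(buckets, null, 2));
```

Three input documents collapse into two buckets, one per student, and the shared `studentId` is stored once per bucket rather than once per measurement.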


Querying Time Series Collections

Queries work exactly as they do on normal collections, with no special syntax needed.

// All attendance for one student
db.attendance_ts.find({
  studentId: ObjectId("64a1f2c3e4b0a1b2c3d4e5f6")
})

// Attendance in September 2024
db.attendance_ts.find({
  timestamp: {
    $gte: new Date("2024-09-01"),
    $lt: new Date("2024-10-01")
  }
})

// A student's attendance in a specific month
db.attendance_ts.find({
  studentId: ObjectId("64a1f2c3e4b0a1b2c3d4e5f6"),
  timestamp: {
    $gte: new Date("2024-09-01"),
    $lt: new Date("2024-10-01")
  }
}).sort({ timestamp: 1 })
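
The month filters above repeat the same half-open date range. A small helper can build it; `monthRange` is a hypothetical name introduced here, and the sketch assumes timestamps are stored in UTC, as in the examples above.

```javascript
// Hypothetical helper: returns the half-open UTC range
// [first day of the month, first day of the next month).
function monthRange(year, month) { // month is 1-12
  return {
    $gte: new Date(Date.UTC(year, month - 1, 1)),
    // Date.UTC rolls month index 12 over into January of the next year.
    $lt: new Date(Date.UTC(year, month, 1))
  };
}

const september = monthRange(2024, 9);
// In mongosh: db.attendance_ts.find({ timestamp: september })
```

Using `$gte` with `$lt` on the first day of the following month avoids having to know how many days the month has.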

Aggregation on Time Series Data

// Count present vs absent days per student for September
db.attendance_ts.aggregate([
  {
    $match: {
      timestamp: {
        $gte: new Date("2024-09-01"),
        $lt: new Date("2024-10-01")
      }
    }
  },
  {
    $group: {
      _id: { studentId: "$studentId", status: "$status" },
      count: { $sum: 1 }
    }
  },
  {
    $group: {
      _id: "$_id.studentId",
      attendance: {
        $push: { status: "$_id.status", count: "$count" }
      }
    }
  }
])

Result:

[
  {
    _id: ObjectId("64a1f2c3e4b0a1b2c3d4e5f6"),
    attendance: [
      { status: "present", count: 19 },
      { status: "absent", count: 3 }
    ]
  }
]
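
A result in this shape is easy to post-process in application code. As a sketch, the hypothetical helper `attendanceRate` below turns one result entry into a percentage; the `_id` is a plain string here so the snippet runs outside mongosh.

```javascript
// Same shape as the aggregation result above.
const results = [
  {
    _id: "64a1f2c3e4b0a1b2c3d4e5f6",
    attendance: [
      { status: "present", count: 19 },
      { status: "absent", count: 3 }
    ]
  }
];

// Hypothetical helper: percentage of recorded days marked "present".
function attendanceRate(entry) {
  const total = entry.attendance.reduce((sum, a) => sum + a.count, 0);
  const present = entry.attendance
    .filter((a) => a.status === "present")
    .reduce((sum, a) => sum + a.count, 0);
  return total === 0 ? 0 : (present / total) * 100;
}

console.log(attendanceRate(results[0]).toFixed(1)); // 19 of 22 days -> "86.4"
```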

Time Series vs Manual Bucket Pattern

| Aspect             | Manual Bucket Pattern                              | Time Series Collection                        |
|--------------------|----------------------------------------------------|-----------------------------------------------|
| Setup              | You design the bucket document shape               | MongoDB handles it automatically              |
| Insert code        | updateOne with $push and upsert                    | Simple insertOne per measurement              |
| Query code         | Must unwrap bucket arrays                          | Normal queries, one doc per measurement       |
| Storage efficiency | Good, if designed well                             | Excellent, with MongoDB-optimized compression |
| Flexibility        | Full control over bucket shape                     | Less control; MongoDB decides bucketing       |
| Best for           | Custom aggregated summaries (e.g., monthly totals) | Raw measurement data, queried by time range   |

Time series collections are best for raw measurement data — one document per event, queried by time range. The manual bucket pattern is still useful when you want pre-aggregated summaries — like our gradeStats example from the Computed Pattern, where each document represents a calculated rollup, not a raw measurement.
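
The difference in insert code can be sketched side by side. Both functions are hypothetical helpers that only build the arguments; the bucket fields (`month`, `records`, `totalDays`, `presentDays`) follow the bucket document from the Schema Design section, and the collection name `attendance` in the comments is an assumption.

```javascript
// Manual bucket pattern: every measurement becomes an upsert into the
// student's monthly document, and the counters are maintained by hand.
function bucketUpsert(studentId, date, present) {
  const month = date.toISOString().slice(0, 7); // e.g. "2024-09"
  return {
    filter: { studentId, month },
    update: {
      $push: { records: { date, present } },
      $inc: { totalDays: 1, presentDays: present ? 1 : 0 }
    },
    options: { upsert: true }
  };
}

// Time series collection: the measurement itself is the document.
function timeSeriesDoc(studentId, date, present) {
  return { timestamp: date, studentId, status: present ? "present" : "absent" };
}

const day = new Date("2024-09-02T08:00:00Z");
const op = bucketUpsert("student-1", day, false);
const doc = timeSeriesDoc("student-1", day, false);
// In mongosh:
//   db.attendance.updateOne(op.filter, op.update, op.options)
//   db.attendance_ts.insertOne(doc)
```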


When to Use Time Series Collections

Use a time series collection when:

  • You are recording measurements or events tied to a timestamp
  • Each measurement is mostly write-once — you rarely update old records
  • You query primarily by time range
  • The data volume grows continuously — daily attendance, sensor logs, activity tracking

Do not use a time series collection when:

  • Documents are frequently updated after creation
  • The data is not fundamentally about a point in time
  • You need full control over document structure for complex embedded relationships

// Good fit — attendance records
db.attendance_ts.insertOne({ timestamp: new Date(), studentId, status: "present" })

// Not a good fit — student profile (not time-series data)
db.students.insertOne({ name: "Ali Hassan", grade: "10th", ... })

Converting Our Attendance System

In the Schema Design section, we built a manual bucket pattern for attendance — one document per student per month, with a records array. Let's compare it to a time series approach.

Manual Bucket Pattern (from Schema Design)

// One document per student per month
{
  studentId: ObjectId("..."),
  month: "2024-09",
  totalDays: 22,
  presentDays: 19,
  records: [
    { date: new Date("2024-09-01"), present: true },
    { date: new Date("2024-09-02"), present: false },
    // ... rest of month
  ]
}

Time Series Approach

// One document per student per day
db.attendance_ts.insertOne({
  timestamp: new Date("2024-09-01"),
  studentId: ObjectId("..."),
  status: "present"
})

When to Choose Which

Use the manual bucket pattern when you also want pre-calculated summaries stored alongside the raw data — totalDays, presentDays per month, ready to read instantly without aggregation.

Use a time series collection when you want raw, queryable daily records and are comfortable running aggregations for summaries — MongoDB's time series storage is highly optimized for exactly this kind of query.

Many real systems use both — a time series collection for raw attendance records, and a separate computed-pattern collection (like gradeStats from earlier) that periodically aggregates the time series data into summaries for fast dashboard reads.
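
One way to wire the two together is a periodic aggregation that writes summaries into a regular collection. The following is a sketch under stated assumptions: `monthlyRollupPipeline` is a hypothetical helper, and `attendance_summary` is an assumed name for a regular (non time series) collection, not one defined earlier in this guide.

```javascript
// Hypothetical rollup: produces one summary document per student per month
// and upserts it into a separate summary collection.
function monthlyRollupPipeline(start, end) {
  return [
    // Only read the raw measurements in the requested time range.
    { $match: { timestamp: { $gte: start, $lt: end } } },
    {
      $group: {
        _id: {
          studentId: "$studentId",
          month: { $dateToString: { format: "%Y-%m", date: "$timestamp" } }
        },
        totalDays: { $sum: 1 },
        // Count a day as present only when status equals "present".
        presentDays: { $sum: { $cond: [{ $eq: ["$status", "present"] }, 1, 0] } }
      }
    },
    // Replace an existing summary for the same _id, or insert a new one.
    { $merge: { into: "attendance_summary", whenMatched: "replace", whenNotMatched: "insert" } }
  ];
}

const pipeline = monthlyRollupPipeline(new Date("2024-09-01"), new Date("2024-10-01"));
// In mongosh, for example from a scheduled job:
//   db.attendance_ts.aggregate(pipeline)
```

Because `$merge` matches on `_id` by default, re-running the rollup for the same month overwrites the earlier summary instead of duplicating it.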


Indexes on Time Series Collections

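Starting in MongoDB 6.3, MongoDB automatically creates a compound index on metaField and timeField for new time series collections; on earlier versions, create that index yourself. You can add further indexes just as you would on a regular collection: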

// Additional index for querying by course
db.attendance_ts.createIndex({ courseId: 1, timestamp: 1 })

Quick Reference

// Create a time series collection
db.createCollection("attendance_ts", {
  timeseries: {
    timeField: "timestamp",
    metaField: "studentId",
    granularity: "hours"
  }
})

// Insert — same as any collection
db.attendance_ts.insertOne({
  timestamp: new Date(),
  studentId: ObjectId("..."),
  status: "present"
})

// Query — same as any collection
db.attendance_ts.find({
  studentId: ObjectId("..."),
  timestamp: { $gte: startDate, $lt: endDate }
})

If you are starting a new feature that records events over time (attendance, activity logs, sensor data, audit trails), consider creating it as a time series collection from day one. Inserts and queries use the same syntax as a normal collection, so your application code stays simple while you benefit from MongoDB's optimized storage and time-range query performance.
