Time Series Collections
Learn how MongoDB time series collections work, when to use them, and how they improve on the manual bucket pattern for storing time-stamped data.
In the Schema Design section, we built the bucket pattern manually — grouping attendance records by month into single documents to avoid millions of tiny documents. That pattern works, but you had to design and maintain it yourself.
MongoDB has a built-in collection type specifically for this — time series collections. They do the bucketing automatically, store data more efficiently, and are optimized for exactly the kind of data our attendance records represent — measurements tied to a timestamp.
What is Time-Series Data?
Time-series data is any data that consists of measurements over time:
- Daily attendance records — present/absent per student per day
- Sensor readings — temperature, humidity over time
- Stock prices — price at each point in time
- Server metrics — CPU usage, memory usage logged every minute
- Login activity — timestamped login events
The defining characteristics:
- Each document has a timestamp
- Documents are mostly inserted, rarely updated
- You usually query by time range
- The volume of documents grows continuously over time
Creating a Time Series Collection
db.createCollection("attendance_ts", {
timeseries: {
timeField: "timestamp", // required — the date field
metaField: "studentId", // optional — what the measurement is about
granularity: "hours" // optional — hint for how data is bucketed
}
})
Required and Optional Fields
| Option | Required? | What it does |
|---|---|---|
| timeField | Yes | The field containing the timestamp; must be a Date |
| metaField | No | A field that identifies what is being measured; stays constant for a series |
| granularity | No | "seconds", "minutes", or "hours"; hints how frequently data arrives |
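The options in the table can be assembled and checked in application code before the collection is created. The sketch below is a minimal illustration in plain JavaScript. buildTimeseriesOptions is a hypothetical helper, not part of MongoDB, and it enforces only the rules listed in the table.

```javascript
// Hypothetical helper (not a MongoDB API): builds the `timeseries` options
// object and rejects values the table above does not allow.
const GRANULARITIES = ["seconds", "minutes", "hours"];

function buildTimeseriesOptions({ timeField, metaField, granularity } = {}) {
  // timeField is the only required option.
  if (typeof timeField !== "string" || timeField.length === 0) {
    throw new Error("timeField is required");
  }
  const options = { timeField };
  if (metaField !== undefined) {
    options.metaField = metaField;
  }
  if (granularity !== undefined) {
    if (!GRANULARITIES.includes(granularity)) {
      throw new Error(`granularity must be one of: ${GRANULARITIES.join(", ")}`);
    }
    options.granularity = granularity;
  }
  return { timeseries: options };
}

const createOptions = buildTimeseriesOptions({
  timeField: "timestamp",
  metaField: "studentId",
  granularity: "hours"
});
console.log(JSON.stringify(createOptions));
// In mongosh, the result could then be passed along:
// db.createCollection("attendance_ts", createOptions)
```

Validating the options up front means a typo such as granularity: "days" fails in your own code with a readable message rather than at collection creation.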
Choosing metaField
metaField should be the value that stays the same across many measurements — in our case, studentId. Many attendance records share the same studentId but have different timestamp values. MongoDB groups documents with the same metaField value together internally for efficient storage.
Inserting Data
Inserting into a time series collection looks exactly like inserting into a normal collection:
db.attendance_ts.insertOne({
timestamp: new Date("2024-09-01T08:00:00Z"),
studentId: ObjectId("64a1f2c3e4b0a1b2c3d4e5f6"),
status: "present",
courseId: ObjectId("64a1f2c3e4b0a1b2c3d4e5f1")
})
// Insert many attendance records at once
db.attendance_ts.insertMany([
{
timestamp: new Date("2024-09-01T08:00:00Z"),
studentId: ObjectId("64a1f2c3e4b0a1b2c3d4e5f6"),
status: "present"
},
{
timestamp: new Date("2024-09-02T08:00:00Z"),
studentId: ObjectId("64a1f2c3e4b0a1b2c3d4e5f6"),
status: "absent"
},
{
timestamp: new Date("2024-09-01T08:00:00Z"),
studentId: ObjectId("64a1f2c3e4b0a1b2c3d4e5f7"),
status: "present"
}
])
You write one document per measurement: one document per student per day. MongoDB handles the internal bucketing automatically. You never see or manage the buckets directly.
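Because every measurement is its own document, preparing a day's inserts is a simple mapping from a roll call to an array of documents. The following sketch assumes a hypothetical toMeasurements helper and uses plain strings in place of the ObjectId values above so that it runs outside mongosh. It also checks that the timestamp is a real Date, since timeField must hold a Date.

```javascript
// Hypothetical helper: turns one day's roll call into one document per
// student, the shape insertMany expects. Plain strings stand in for the
// ObjectId values used in the article.
function toMeasurements(day, rollCall) {
  if (!(day instanceof Date) || Number.isNaN(day.getTime())) {
    throw new TypeError("day must be a valid Date");
  }
  return Object.entries(rollCall).map(([studentId, status]) => ({
    timestamp: new Date(day.getTime()), // copy so documents do not share one Date
    studentId,
    status
  }));
}

const docs = toMeasurements(new Date("2024-09-01T08:00:00Z"), {
  "student-a": "present",
  "student-b": "absent"
});
console.log(docs.length); // 2
// In mongosh: db.attendance_ts.insertMany(docs)
```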
What Happens Behind the Scenes
Your view — one document per measurement:
{ timestamp: "2024-09-01", studentId: "abc", status: "present" }
{ timestamp: "2024-09-02", studentId: "abc", status: "absent" }
{ timestamp: "2024-09-03", studentId: "abc", status: "present" }
MongoDB's internal storage — automatically bucketed:
{
meta: { studentId: "abc" },
data: {
timestamp: ["2024-09-01", "2024-09-02", "2024-09-03"],
status: ["present", "absent", "present"]
}
}
This is the same idea as the bucket pattern we built manually, but MongoDB does it for you, transparently. The bucket shown above is a simplified picture of the internal format. You insert and query one document per measurement; MongoDB stores those documents in buckets.
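To make the grouping concrete, here is a conceptual model of bucketing in plain JavaScript. It is a sketch of the idea only: bucketize and unbucket are hypothetical names, the code assumes every measurement has the same fields, and MongoDB's real bucket format differs in detail.

```javascript
// Conceptual model only: groups measurements by their meta value and stores
// every other field as a column (an array). Assumes all measurements share
// the same set of fields.
function bucketize(measurements, metaField) {
  const buckets = new Map();
  for (const doc of measurements) {
    const key = String(doc[metaField]);
    if (!buckets.has(key)) {
      buckets.set(key, { meta: { [metaField]: doc[metaField] }, data: {} });
    }
    const bucket = buckets.get(key);
    for (const [field, value] of Object.entries(doc)) {
      if (field === metaField) continue; // stored once, in meta
      if (!bucket.data[field]) bucket.data[field] = [];
      bucket.data[field].push(value);
    }
  }
  return [...buckets.values()];
}

// Rebuilds the one-document-per-measurement view from a single bucket.
function unbucket(bucket) {
  const fields = Object.keys(bucket.data);
  const length = fields.length ? bucket.data[fields[0]].length : 0;
  const docs = [];
  for (let i = 0; i < length; i++) {
    const doc = { ...bucket.meta };
    for (const field of fields) doc[field] = bucket.data[field][i];
    docs.push(doc);
  }
  return docs;
}

const measurements = [
  { timestamp: "2024-09-01", studentId: "abc", status: "present" },
  { timestamp: "2024-09-02", studentId: "abc", status: "absent" },
  { timestamp: "2024-09-01", studentId: "xyz", status: "present" }
];
const buckets = bucketize(measurements, "studentId");
console.log(buckets.length);         // 2
console.log(buckets[0].data.status); // [ 'present', 'absent' ]
console.log(unbucket(buckets[0]).length); // 2
```

The meta value is stored once per bucket instead of once per measurement, which is where much of the storage saving in this model comes from.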
Querying Time Series Collections
Queries work exactly like normal collections — no special syntax needed.
// All attendance for one student
db.attendance_ts.find({
studentId: ObjectId("64a1f2c3e4b0a1b2c3d4e5f6")
})
// Attendance in September 2024
db.attendance_ts.find({
timestamp: {
$gte: new Date("2024-09-01"),
$lt: new Date("2024-10-01")
}
})
// A student's attendance in a specific month
db.attendance_ts.find({
studentId: ObjectId("64a1f2c3e4b0a1b2c3d4e5f6"),
timestamp: {
$gte: new Date("2024-09-01"),
$lt: new Date("2024-10-01")
}
}).sort({ timestamp: 1 })
Aggregation on Time Series Data
// Count present vs absent days per student for September
db.attendance_ts.aggregate([
{
$match: {
timestamp: {
$gte: new Date("2024-09-01"),
$lt: new Date("2024-10-01")
}
}
},
{
$group: {
_id: { studentId: "$studentId", status: "$status" },
count: { $sum: 1 }
}
},
{
$group: {
_id: "$_id.studentId",
attendance: {
$push: { status: "$_id.status", count: "$count" }
}
}
}
])
Result:
[
{
_id: ObjectId("64a1f2c3e4b0a1b2c3d4e5f6"),
attendance: [
{ status: "present", count: 19 },
{ status: "absent", count: 3 }
]
}
]
Time Series vs Manual Bucket Pattern
| | Manual Bucket Pattern | Time Series Collection |
|---|---|---|
| Setup | You design the bucket document shape | MongoDB handles it automatically |
| Insert code | updateOne with $push and upsert | Simple insertOne per measurement |
| Query code | Must unwrap bucket arrays | Normal queries — one doc per measurement |
| Storage efficiency | Good — if designed well | Excellent — MongoDB-optimized compression |
| Flexibility | Full control over bucket shape | Less control — MongoDB decides bucketing |
| Best for | Custom aggregated summaries (e.g., monthly totals) | Raw measurement data, queried by time range |
Time series collections are best for raw measurement data — one document per event, queried by time range. The manual bucket pattern is still useful when you want pre-aggregated summaries — like our gradeStats example from the Computed Pattern, where each document represents a calculated rollup, not a raw measurement.
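The difference in insert code from the table can be sketched side by side. The helpers below are hypothetical and only build the arguments for each write. The manual version assumes the monthly bucket shape used in this course (studentId, month, totalDays, presentDays, records), and its exact update may differ from the one in the Schema Design section.

```javascript
// Hypothetical helper: arguments a manual-bucket write might pass to
// updateOne, assuming one bucket document per student per month.
function manualBucketWrite(studentId, date, present) {
  const month = date.toISOString().slice(0, 7); // e.g. "2024-09" (UTC)
  return {
    filter: { studentId, month },
    update: {
      $push: { records: { date, present } },
      $inc: { totalDays: 1, presentDays: present ? 1 : 0 }
    },
    options: { upsert: true }
  };
}

// The time series write is just the measurement itself.
function timeSeriesWrite(studentId, date, present) {
  return { timestamp: date, studentId, status: present ? "present" : "absent" };
}

const day = new Date("2024-09-02T08:00:00Z");
const manual = manualBucketWrite("student-a", day, false);
const measurement = timeSeriesWrite("student-a", day, false);
console.log(manual.filter.month); // "2024-09"
console.log(measurement.status);  // "absent"
// mongosh, with a hypothetical `attendance` collection for the manual buckets:
// db.attendance.updateOne(manual.filter, manual.update, manual.options)
// db.attendance_ts.insertOne(measurement)
```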
When to Use Time Series Collections
Use a time series collection when:
- You are recording measurements or events tied to a timestamp
- Each measurement is mostly write-once — you rarely update old records
- You query primarily by time range
- The data volume grows continuously — daily attendance, sensor logs, activity tracking
Do not use a time series collection when:
- Documents are frequently updated after creation
- The data is not fundamentally about a point in time
- You need full control over document structure for complex embedded relationships
// Good fit — attendance records
db.attendance_ts.insertOne({ timestamp: new Date(), studentId, status: "present" })
// Not a good fit — student profile (not time-series data)
db.students.insertOne({ name: "Ali Hassan", grade: "10th", ... })
Converting Our Attendance System
In the Schema Design section, we built a manual bucket pattern for attendance — one document per student per month, with a records array. Let's compare it to a time series approach.
Manual Bucket Pattern (from Schema Design)
// One document per student per month
{
studentId: ObjectId("..."),
month: "2024-09",
totalDays: 22,
presentDays: 19,
records: [
{ date: new Date("2024-09-01"), present: true },
{ date: new Date("2024-09-02"), present: false },
// ... rest of month
]
}
Time Series Approach
// One document per student per day
db.attendance_ts.insertOne({
timestamp: new Date("2024-09-01"),
studentId: ObjectId("..."),
status: "present"
})
When to Choose Which
Use the manual bucket pattern when you also want pre-calculated summaries stored alongside the raw data — totalDays, presentDays per month, ready to read instantly without aggregation.
Use a time series collection when you want raw, queryable daily records and are comfortable running aggregations for summaries — MongoDB's time series storage is highly optimized for exactly this kind of query.
Many real systems use both — a time series collection for raw attendance records, and a separate computed-pattern collection (like gradeStats from earlier) that periodically aggregates the time series data into summaries for fast dashboard reads.
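The periodic rollup described here can be illustrated in plain JavaScript. This is a sketch under stated assumptions: monthlySummaries is a hypothetical function, student IDs are plain strings, and months are computed in UTC. In a real deployment the same summary would more likely come from an aggregation pipeline run against the time series collection.

```javascript
// Hypothetical rollup: summarizes raw daily records into one summary per
// student per month, mirroring totalDays / presentDays from the manual bucket.
function monthlySummaries(records) {
  const summaries = new Map();
  for (const { timestamp, studentId, status } of records) {
    const month = timestamp.toISOString().slice(0, 7); // UTC month, e.g. "2024-09"
    const key = `${studentId}|${month}`;
    if (!summaries.has(key)) {
      summaries.set(key, { studentId, month, totalDays: 0, presentDays: 0 });
    }
    const summary = summaries.get(key);
    summary.totalDays += 1;
    if (status === "present") summary.presentDays += 1;
  }
  return [...summaries.values()];
}

const raw = [
  { timestamp: new Date("2024-09-01T08:00:00Z"), studentId: "student-a", status: "present" },
  { timestamp: new Date("2024-09-02T08:00:00Z"), studentId: "student-a", status: "absent" },
  { timestamp: new Date("2024-10-01T08:00:00Z"), studentId: "student-a", status: "present" }
];
const summaries = monthlySummaries(raw);
console.log(summaries);
// Two summaries: September (totalDays 2, presentDays 1) and October (1, 1)
```

A dashboard can then read these small summary documents directly, while the raw records stay available for any query the summaries do not cover.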
Indexes on Time Series Collections
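Starting in MongoDB 6.3, new time series collections automatically get a compound index on metaField and timeField. On earlier versions, create that index yourself if your queries need it. You can create additional indexes just like on a regular collection: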
// Additional index for querying by course
db.attendance_ts.createIndex({ courseId: 1, timestamp: 1 })
Quick Reference
// Create a time series collection
db.createCollection("attendance_ts", {
timeseries: {
timeField: "timestamp",
metaField: "studentId",
granularity: "hours"
}
})
// Insert — same as any collection
db.attendance_ts.insertOne({
timestamp: new Date(),
studentId: ObjectId("..."),
status: "present"
})
// Query — same as any collection
db.attendance_ts.find({
studentId: ObjectId("..."),
timestamp: { $gte: startDate, $lt: endDate }
})
If you are starting a new feature that records events over time (attendance, activity logs, sensor data, audit trails), consider creating it as a time series collection from day one. Inserts and queries use the same syntax as a normal collection, so your application code stays simple, and you get MongoDB's optimized storage and time-range query performance. If you expect to update records often after writing them, a regular collection remains the better fit.