Content Collections
Typed, validated markdown — like a DataFrame for your articles.
The Problem Collections Solve
Prereq: layouts. After this page you will want the page on markdown and code blocks, which describes how the body renders, and the page on build and deploy to ship it. For the typed-content philosophy, see the mental model.
Without collections, you’d have loose markdown files with no validation. Typo in frontmatter? Wrong date format? Missing required field? You’d only find out when the page looks broken.
Content Collections give you:
- Schema validation — catch errors at build time, not in production
- Type safety — autocomplete and type checking on frontmatter fields
- Querying — filter, sort, and group articles with a typed API
Without collections: a folder of CSV files with no schema. Every script that reads them has to handle missing columns, wrong types, inconsistent formats.
With collections: a Pandas DataFrame with defined dtypes + Pydantic validation. If a row doesn’t match the schema, you get an error before analysis runs.
Defining a Collection
In src/content.config.ts:
    import { defineCollection, z } from 'astro:content';
    import { glob } from 'astro/loaders';

    const kb = defineCollection({
      // Where to find the files
      loader: glob({ pattern: '**/*.md', base: '../../content/kb' }),

      // The schema: validates every file's frontmatter
      schema: z.object({
        title: z.string(),
        description: z.string().optional(),
        kind: z.enum(['snippet', 'note', 'guide', 'article', 'showcase']),
        tags: z.array(z.string()).optional(),
        maturity: z.enum(['seedling', 'budding', 'evergreen']).optional(),
        origin: z.enum(['human', 'ai-assisted', 'ai-drafted']).optional(),
        confidence: z.enum(['high', 'medium', 'low']).optional(),
      }),
    });

    export const collections = { kb };

Schema ↔ Pydantic Mapping
| Zod (Astro) | Pydantic (Python) | What It Does |
|---|---|---|
| z.string() | str | Must be a string |
| z.number() | int \| float | Must be a number |
| z.boolean() | bool | Must be true/false |
| z.enum(['a', 'b']) | Literal['a', 'b'] | Must be one of these values |
| z.string().optional() | Optional[str] = None | Can be omitted |
| z.coerce.date() | datetime with validator | Parses string to Date |
| z.array(z.string()) | list[str] | Array of strings |
| z.object({...}) | Nested Pydantic model | Nested structure |
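On the TypeScript side, each Zod row above also corresponds to a static type, which is where autocomplete on frontmatter fields comes from. The sketch below hand-writes roughly what would be inferred for one field per row. The `ExampleFrontmatter` name, the `draft` and `author` fields, and all sample values are hypothetical and not part of the kb schema.

```typescript
// Hand-written approximation of the types a Zod schema would infer.
// `draft` and `author` are hypothetical fields added to cover every table row.
interface ExampleFrontmatter {
  title: string;                                   // z.string()
  series_order: number;                            // z.number()
  draft: boolean;                                  // z.boolean()
  maturity: 'seedling' | 'budding' | 'evergreen';  // z.enum([...])
  description?: string;                            // z.string().optional()
  created: Date;                                   // z.coerce.date()
  tags: string[];                                  // z.array(z.string())
  author: { name: string };                        // z.object({...})
}

// Made-up sample values; `description` is omitted because it is optional.
const example: ExampleFrontmatter = {
  title: 'Content Collections',
  series_order: 3,
  draft: false,
  maturity: 'budding',
  created: new Date('2025-01-15'),
  tags: ['astro', 'zod'],
  author: { name: 'Ada' },
};

// The compiler would reject a typo such as `example.titel`
// or an out-of-enum value such as `maturity: 'expert'`.
const label = `${example.title} (${example.maturity}, ${example.tags.length} tags)`;
console.log(label);
```

The Pydantic column gives the same guarantees at runtime in Python; here the check happens in the editor and the compiler before any code runs.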
What Happens When Validation Fails
If a markdown file has invalid frontmatter:
    ---
    title: 42          # ERROR: expected string, got number
    kind: "expert"     # ERROR: not in enum
    ---

The build fails with a clear error:

    [ERROR] kb → bad-file.md frontmatter does not match schema
      title: Expected string, received number
      kind: Invalid enum value. Expected 'snippet' | 'note' | 'guide' | 'article' | 'showcase'

This is the same principle as unit tests for your data. Every npm run build validates every article against the schema, so invalid frontmatter fails the build before it reaches production.
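To make the "unit tests for your data" idea concrete without depending on Astro or Zod, here is a hand-rolled sketch of the kind of check that runs against each file's frontmatter. It is not how Astro implements validation (Zod does that work), and `validateFrontmatter`, `KINDS`, and the message wording are hypothetical.

```typescript
// Allowed values for `kind`, copied from the kb schema.
const KINDS = ['snippet', 'note', 'guide', 'article', 'showcase'] as const;

// Returns human-readable problems; an empty array means the frontmatter passed.
// Hypothetical stand-in for the schema check, covering three fields only.
function validateFrontmatter(fm: Record<string, unknown>): string[] {
  const { title, kind, tags } = fm;
  const errors: string[] = [];

  if (typeof title !== 'string') {
    errors.push(`title: expected string, received ${typeof title}`);
  }
  if (!KINDS.some((k) => k === kind)) {
    errors.push(`kind: expected one of ${KINDS.join(' | ')}`);
  }
  // `tags` is optional, so only check it when present.
  if (tags !== undefined) {
    const ok = Array.isArray(tags) && tags.every((t) => typeof t === 'string');
    if (!ok) errors.push('tags: expected an array of strings');
  }
  return errors;
}

// The invalid frontmatter from the example: numeric title, unknown kind.
const bad = validateFrontmatter({ title: 42, kind: 'expert' });
// A valid entry produces no errors.
const good = validateFrontmatter({ title: 'Content Collections', kind: 'guide', tags: ['astro'] });

console.log(bad);
console.log(good);
```

A build step that refuses to continue whenever the returned array is non-empty gives the same guarantee as the schema: nothing invalid gets published.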
Querying Collections
Once defined, you query collections with getCollection():
    ---
    import { getCollection } from 'astro:content';

    // Get all entries
    const allKB = await getCollection('kb');

    // Filter
    const guides = await getCollection('kb', ({ data }) => {
      return data.kind === 'guide';
    });

    // Sort (copy first, because Array.prototype.sort mutates in place)
    const sorted = [...allKB].sort((a, b) => {
      return (a.data.series_order ?? 99) - (b.data.series_order ?? 99);
    });

    // Group by kind
    const byKind = Object.groupBy(allKB, entry => entry.data.kind);
    ---

    # Pandas equivalent
    import pandas as pd

    df = pd.read_csv("kb_entries.csv")          # getCollection('kb')
    guides = df[df.kind == "guide"]             # filter by data.kind
    sorted_df = df.sort_values("series_order")  # .sort() by data.series_order
    grouped = df.groupby("kind")                # Object.groupBy by data.kind

Entry Shape
Each entry from getCollection() has:
    entry.id    // "astro-mental-model" (file name without extension)
    entry.data  // { title: "...", kind: "guide", ... } (validated frontmatter)

To get the rendered HTML:
    import { render } from 'astro:content';

    const { Content, headings } = await render(entry);
    // Content: an Astro component that outputs the rendered markdown
    // headings: array of { depth, slug, text } for ToC generation

krowdev’s Collection
    // kb: unified knowledge base
    const kb = defineCollection({
      loader: glob({ pattern: '**/*.md', base: '../../content/kb' }),
      schema: z.object({
        title: z.string(),
        description: z.string().optional(),
        kind: z.enum(['snippet', 'note', 'guide', 'article', 'showcase']),
        created: z.coerce.date(),
        tags: z.array(z.string()).optional(),
        maturity: z.enum(['seedling', 'budding', 'evergreen']).optional(),
        origin: z.enum(['human', 'ai-assisted', 'ai-drafted']).optional(),
        confidence: z.enum(['high', 'medium', 'low']).optional(),
        series: z.string().optional(),
        series_order: z.number().optional(),
      }),
    });

The kind field determines the URL prefix (/guide/, /article/, /snippet/, etc.), so one collection serves multiple content types.
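How a single collection fans out into several URL prefixes can be sketched with plain objects. This is an illustration under assumptions, not krowdev's actual routing code. The `KbEntry` type, the `entryUrl` and `urlsByKind` helpers, the exact `/<kind>/<id>` layout without a trailing slash, and the sample titles are all hypothetical.

```typescript
type Kind = 'snippet' | 'note' | 'guide' | 'article' | 'showcase';

// Minimal stand-in for the entries returned by getCollection('kb').
interface KbEntry {
  id: string;
  data: { title: string; kind: Kind };
}

// Assumed URL layout: the kind is the prefix, the entry id is the final segment.
function entryUrl(entry: KbEntry): string {
  return `/${entry.data.kind}/${entry.id}`;
}

// Collect the URLs of all entries, grouped by kind.
function urlsByKind(entries: KbEntry[]): Partial<Record<Kind, string[]>> {
  const groups: Partial<Record<Kind, string[]>> = {};
  for (const entry of entries) {
    const list = groups[entry.data.kind] ?? [];
    list.push(entryUrl(entry));
    groups[entry.data.kind] = list;
  }
  return groups;
}

// Made-up sample entries.
const sample: KbEntry[] = [
  { id: 'astro-mental-model', data: { title: 'Mental model', kind: 'guide' } },
  { id: 'content-collections', data: { title: 'Content Collections', kind: 'guide' } },
  { id: 'zod-enum', data: { title: 'Zod enums', kind: 'snippet' } },
];

const routes = urlsByKind(sample);
console.log(routes);
```

In an Astro project the same mapping would typically feed a dynamic route's path generation, with the real entries coming from getCollection('kb') instead of the sample array.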
Challenge: Break the schema on purpose
- Create a file content/kb/test-break.md
- Give it intentionally invalid frontmatter:

      ---
      title: 42
      kind: "expert"
      ---
      Test content.

- Run npm run build and read the error
- Fix the frontmatter and rebuild
- Delete the test file when done
This teaches you to read Zod validation errors — you’ll see them when you make typos in real articles.
Sources
- Astro Docs, Content collections
- Astro Docs, Content loader API
- Zod, Schema definition