
Markdown TOC Generator: Automatic Table of Contents for Your Docs
Photo: Christina Morillo / Pexels
Stop writing tables of contents by hand. Learn how to auto-generate navigation links for any Markdown document and keep your docs actually readable.
Every developer has written a long README or documentation file and then noticed, halfway through, that they need to add a table of contents. And then spent twenty minutes doing it by hand, carefully crafting each anchor link, making sure to lowercase everything, replace spaces with hyphens, strip the special characters. Then the document changes and half the links break.
It is one of those small but genuinely annoying problems that has a completely mechanical solution. The anchor ID generation algorithm for GitHub-Flavored Markdown is deterministic and well-documented. There is no reason to do this by hand.
The Markdown TOC Generator automates the entire thing. Paste your Markdown, get a properly indented, correctly linked table of contents you can drop right back into the document.
Why TOCs Matter More Than You Think
A table of contents is not decorative. On a long document (anything over a thousand words) it serves a real navigational function. Readers scan the TOC first to understand what a document covers and then jump directly to the section they care about. Without one, they either scroll manually or give up.
For README files on GitHub, this is especially true. When a developer lands on a repository they are evaluating for the first time, they scan the README. If the README is long and there is no table of contents, their mental model of the project is harder to build. A clear TOC communicates structure immediately.
Technical documentation has the same dynamic. API reference pages, architecture guides, onboarding docs: readers almost never read these linearly. They come looking for something specific. A TOC is their map.
The catch is that maintaining a TOC by hand is tedious and error-prone. Heading text changes. New sections get added. The TOC falls out of sync. Over time, broken anchor links and missing sections erode trust in the document. Regenerating automatically eliminates that maintenance burden entirely.
How the Anchor ID Algorithm Works
If you publish to multiple platforms, it is worth understanding the anchor generation algorithm, because the details matter.
GitHub (and most Markdown renderers that follow GitHub-Flavored Markdown) generates anchor IDs from heading text using a straightforward process:
- Strip all HTML tags from the heading text
- Lowercase the entire string
- Remove punctuation, keeping letters, digits, hyphens, underscores, and spaces
- Replace spaces with hyphens
- Deduplicate: if the same heading text appears more than once, append -1, -2, etc. to the later occurrences
So ## Getting Started becomes #getting-started. ## API Reference (v2) becomes #api-reference-v2. ## What is This? becomes #what-is-this.
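The first four steps can be sketched in a few lines of Python. This is a minimal approximation under stated assumptions, not GitHub's actual implementation: the function name `github_slug` is made up for illustration, and "punctuation" is taken to mean anything that is not a Unicode word character, a hyphen, or a space.

```python
import re


def github_slug(heading_text: str) -> str:
    """Approximate a GitHub-style anchor for one heading (no deduplication)."""
    text = re.sub(r"<[^>]+>", "", heading_text)  # strip HTML tags
    text = text.strip().lower()                   # lowercase the whole string
    # Assumption: keep Unicode word characters (letters, digits, underscore),
    # hyphens, and spaces; drop everything else.
    text = re.sub(r"[^\w\- ]", "", text)
    return text.replace(" ", "-")                 # spaces become hyphens


for heading in ["Getting Started", "API Reference (v2)", "What is This?"]:
    print(f"## {heading} -> #{github_slug(heading)}")
```

Each space is replaced individually, so a double space in a heading yields a double hyphen in this sketch.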
The deduplication step is where things get interesting. If your document has two sections both titled ## Installation (say, one for macOS and one for Linux), the first gets #installation and the second gets #installation-1. The generator handles this correctly by scanning all headings and tracking duplicates before generating the links.
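Duplicate tracking amounts to a counter per base slug. The sketch below is one plausible way to do it, not the generator's actual code; `slug` and `unique_anchors` are hypothetical names, and it ignores the corner case where a heading is literally titled something like "Installation 1". It assumes Python 3.9+ for the type hints.

```python
import re
from collections import defaultdict


def slug(text: str) -> str:
    """Simplified slug: lowercase, drop punctuation, hyphenate spaces."""
    text = re.sub(r"[^\w\- ]", "", text.strip().lower())
    return text.replace(" ", "-")


def unique_anchors(headings: list[str]) -> list[str]:
    """Return one anchor per heading, suffixing repeats with -1, -2, ..."""
    seen: dict[str, int] = defaultdict(int)
    anchors = []
    for heading in headings:
        base = slug(heading)
        count = seen[base]  # how many times this slug has appeared so far
        anchors.append(base if count == 0 else f"{base}-{count}")
        seen[base] += 1
    return anchors


print(unique_anchors(["Installation", "Usage", "Installation", "Installation"]))
# ['installation', 'usage', 'installation-1', 'installation-2']
```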
The GitHub vs GitLab Difference
GitHub and GitLab follow the same basic algorithm but diverge slightly on which characters get stripped vs. preserved. For most English documentation without unusual characters, the output is identical. The differences show up with:
- Non-ASCII Unicode characters: both platforms generally preserve them, but rendering can vary
- Certain punctuation like parentheses and periods: stripped by GitHub, but not always handled the same way by GitLab
- Emoji in headings: technically supported by both platforms, but anchor behavior is inconsistent
If you are writing documentation for both GitHub and GitLab, the practical advice is to keep heading text clean: plain words, minimal punctuation. The generated TOC will work consistently across both.
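One way to act on that advice is to flag headings that stray from plain text before you publish. The check below is a rough sketch: `risky_headings` is a hypothetical helper, and its definition of "clean" (ASCII letters, digits, spaces, hyphens, underscores) is an assumption chosen to be conservative, not a rule either platform documents. It also does not skip `#` lines inside fenced code blocks.

```python
import re

# Assumption: a "clean" heading uses only ASCII letters, digits,
# spaces, hyphens, and underscores.
CLEAN_HEADING = re.compile(r"^[A-Za-z0-9 _-]+$")


def risky_headings(markdown: str) -> list[str]:
    """Return heading texts whose anchors may differ between platforms."""
    risky = []
    for line in markdown.splitlines():
        match = re.match(r"^(#{1,6})\s+(.*\S)\s*$", line)
        if match and not CLEAN_HEADING.match(match.group(2)):
            risky.append(match.group(2))
    return risky


doc = "# Title\n## Quick Start\n## API Reference (v2)\n## Häufige Fragen\n"
print(risky_headings(doc))
# ['API Reference (v2)', 'Häufige Fragen']
```

A flagged heading is not necessarily broken; it is just one worth checking on each platform you publish to.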
Walking Through a Real Example
Here is a typical README structure:
# My Library
## Installation
### macOS
### Linux
### Windows
## Quick Start
## API Reference
### Configuration Options
### Methods
#### connect()
#### disconnect()
## Contributing
## License
Running this through the generator produces:
## Table of Contents
- [Installation](#installation)
  - [macOS](#macos)
  - [Linux](#linux)
  - [Windows](#windows)
- [Quick Start](#quick-start)
- [API Reference](#api-reference)
  - [Configuration Options](#configuration-options)
  - [Methods](#methods)
    - [connect()](#connect)
    - [disconnect()](#disconnect)
- [Contributing](#contributing)
- [License](#license)
Notice a few things. First, # My Library is excluded from the TOC, because H1 is the document title, not a section you navigate to. Second, the nesting reflects the heading hierarchy: H2 at the top level, H3 indented once, H4 indented twice. Third, #### connect() becomes #connect, because the parentheses are stripped per the anchor algorithm.
That last point is worth flagging as a real limitation. If you have methods named get() and set(), the anchor IDs are #get and #set. But two overloads written as get(id) and get(name) become #getid and #getname, and if both headings simply read get(), you get #get and #get-1. The generator handles deduplication correctly, but the resulting anchor name might look odd.
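The whole transformation fits in a short function. This is a sketch of the approach described above, not the generator's source: `build_toc` and `slug` are hypothetical names, the two-space indent is an assumption, and the heading regex does not skip `#` lines inside fenced code blocks.

```python
import re


def slug(text: str) -> str:
    """Simplified slug: lowercase, drop punctuation, hyphenate spaces."""
    text = re.sub(r"[^\w\- ]", "", text.strip().lower())
    return text.replace(" ", "-")


def build_toc(markdown: str, indent: str = "  ") -> str:
    """Build a nested TOC from ATX headings, leaving H1 out of the list."""
    lines = ["## Table of Contents"]
    seen: dict[str, int] = {}
    for line in markdown.splitlines():
        match = re.match(r"^(#{1,6})\s+(.+?)\s*$", line)
        if not match:
            continue
        level, title = len(match.group(1)), match.group(2)
        base = slug(title)
        count = seen.get(base, 0)
        seen[base] = count + 1  # H1 still counts toward duplicate tracking
        if level == 1:
            continue  # the document title is not a navigation target
        anchor = base if count == 0 else f"{base}-{count}"
        lines.append(f"{indent * (level - 2)}- [{title}](#{anchor})")
    return "\n".join(lines)


README = """# My Library
## Installation
### macOS
## API Reference
### Methods
#### connect()
"""
print(build_toc(README))
```

Indentation is derived from the heading level alone (H2 gets none, H3 one unit, H4 two), which is why a document that skips levels produces an oddly indented list.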
For API reference docs specifically, it is sometimes better to use a static site generator or a dedicated API docs tool (like Docusaurus or TypeDoc) rather than hand-written Markdown. For everything else, this generator works well.
Dealing With Special Characters in Headings
Let me be honest about the edge cases.
Emoji in headings are popular in README files. Something like ## 🚀 Getting Started. The anchor ID for this will vary by platform. GitHub strips the emoji but keeps the hyphen generated from the space after it, so ## 🚀 Getting Started becomes #-getting-started; other renderers may drop the leading hyphen or preserve the emoji. The safest advice: if your headings contain emoji, double-check the generated anchor links by viewing the rendered file on the actual platform.
Code spans in headings, like ## The `render()` Function, are handled by stripping the backticks, so you get #the-render-function. This is usually what you want.
Links in headings, which are technically valid Markdown, are handled by extracting just the link text. ## [Getting Started](https://example.com) becomes #getting-started. This matches how GitHub renders such headings.
Accented characters and non-Latin scripts: GitHub preserves these in anchor IDs. A heading like ## Häufige Fragen becomes #häufige-fragen. This is fine, and the links work correctly in most browsers, even though the URL looks unusual.
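These cases can be handled by reducing a heading to its visible text before slugging it. The sketch below assumes that approach; `heading_plain_text` and `slug` are hypothetical helpers, and the link regex only covers simple inline links, not reference-style links or nested brackets.

```python
import re


def heading_plain_text(raw: str) -> str:
    """Reduce inline Markdown in a heading to its visible text."""
    text = re.sub(r"\[([^\]]*)\]\([^)]*\)", r"\1", raw)  # [text](url) -> text
    return text.replace("`", "")                          # drop code-span backticks


def slug(text: str) -> str:
    """Simplified slug; \\w is Unicode-aware, so accented letters survive."""
    text = re.sub(r"[^\w\- ]", "", text.strip().lower())
    return text.replace(" ", "-")


for raw in [
    "The `render()` Function",
    "[Getting Started](https://example.com)",
    "Häufige Fragen",
]:
    print(f"## {raw} -> #{slug(heading_plain_text(raw))}")
```

Because emoji are not word characters, this sketch turns an emoji followed by a space into a leading hyphen, which is the GitHub-style result described above rather than a universal one.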
Integrating the TOC Into Your Workflow
The typical workflow is simple:
- Write your document first, without worrying about a TOC
- When the structure feels settled, paste the Markdown into the Markdown TOC Generator
- Copy the generated TOC
- Paste it at the top of your document, after the introductory paragraph
Where to put the TOC is a minor style question. Some people put it immediately after the H1 title, before any introductory text. Others prefer to write a brief introduction paragraph first, then the TOC. The latter tends to feel more natural for README files that are also a human-readable introduction to a project.
Keeping It in Sync
The main limitation of a static generator is that it does not automatically update when you change your document. If you rename a heading or add new sections, the TOC needs to be regenerated.
For small documents that change infrequently, this is no big deal. For large documentation sites or frequently updated README files, you might want a more automated approach. Some options:
- doctoc (npm package): a CLI tool that updates the TOC in place, designed to be run as a pre-commit hook
- VS Code extensions: several extensions auto-generate and update TOCs on save
- GitHub Actions: you can configure a workflow to run doctoc or a similar tool on push
The Markdown TOC Generator is ideal for one-off generation or when you want to see exactly what the output looks like before committing it. For ongoing maintenance of a large doc site, a CI integration is worth setting up.
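If you are not ready to adopt one of those tools, a lightweight middle ground is a check that reports TOC links pointing at anchors that no longer exist. The following is a sketch under the same simplified slug rules used by GitHub-style renderers; `broken_toc_links` is a hypothetical helper, not part of doctoc or any other tool.

```python
import re


def slug(text: str) -> str:
    """Simplified slug: lowercase, drop punctuation, hyphenate spaces."""
    text = re.sub(r"[^\w\- ]", "", text.strip().lower())
    return text.replace(" ", "-")


def broken_toc_links(markdown: str) -> list[str]:
    """Return in-page link targets that match no heading anchor."""
    anchors: set[str] = set()
    seen: dict[str, int] = {}
    for line in markdown.splitlines():
        match = re.match(r"^#{1,6}\s+(.+?)\s*$", line)
        if match:
            base = slug(match.group(1))
            count = seen.get(base, 0)
            seen[base] = count + 1
            anchors.add(base if count == 0 else f"{base}-{count}")
    # Collect every "](#target)" in the document, TOC entries included.
    targets = re.findall(r"\]\(#([^)]+)\)", markdown)
    return [target for target in targets if target not in anchors]


doc = """# Demo
## Table of Contents
- [Setup](#setup)
- [Usage](#usage)
## Installation
## Usage
"""
print(broken_toc_links(doc))
# ['setup']
```

Here the heading was renamed from Setup to Installation but the TOC was not regenerated, so the stale link is reported. A CI job could fail the build whenever the returned list is non-empty.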
Related Tools That Complement This Workflow
If you are working with Markdown extensively, a few other tools on this site are worth knowing about.
The Markdown Preview tool lets you render any Markdown document in real time, so you can visually confirm how the headings and TOC will look before copying it to your repo. It is useful for catching formatting issues before they end up in a published README.
The Markdown to HTML converter is useful when you need the actual HTML output, for example when pasting into a CMS that accepts HTML but not raw Markdown, or when building an email template from Markdown content.
The Markdown Table Generator is specifically for creating the pipe-delimited table syntax that is notoriously painful to write by hand. If your documentation includes data tables, it saves significant time.
Used together, these three tools cover most of the Markdown authoring workflow that does not require a full static site generator.
Practical Tips From Using This in the Wild
A few things worth knowing from actual use:
Start with a clean heading hierarchy. If your document mixes H2 and H3 without a clear structure (using H3 headings inside what should be H2 sections, for example), the generated TOC will reflect that confusion. The TOC generation process often reveals structural problems with a document that were invisible before.
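One structural problem that is easy to detect mechanically is a skipped level, such as an H4 directly under an H2. The sketch below covers only that case; `skipped_levels` is a hypothetical helper, and judging whether an H3 really belongs under a given H2 still needs a human.

```python
import re


def skipped_levels(markdown: str) -> list[str]:
    """Report headings that jump more than one level deeper than the previous one."""
    problems = []
    previous = 0  # 0 means no heading seen yet
    for line in markdown.splitlines():
        match = re.match(r"^(#{1,6})\s+(.+?)\s*$", line)
        if not match:
            continue
        level = len(match.group(1))
        if previous and level > previous + 1:
            problems.append(f"H{level} '{match.group(2)}' follows H{previous}")
        previous = level
    return problems


doc = "# Title\n## Setup\n#### Deep Detail\n## Usage\n### Options\n"
print(skipped_levels(doc))
# ["H4 'Deep Detail' follows H2"]
```

Moving back up the hierarchy (H4 to H2) is normal and is not flagged; only downward jumps of more than one level are.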
Flat structures are fine. Not every document needs nested sections. A README with only H2 headings produces a simple flat list TOC, and that is often exactly right. Do not add H3 sub-headings just to make the TOC look more elaborate.
Consider your audience's reading pattern. For technical reference docs that developers will search and jump through repeatedly, a comprehensive TOC with full nesting is helpful. For a README that a new user will read top-to-bottom the first time, a shorter TOC with just the major sections is often enough.
The generated TOC uses standard Markdown list syntax. It is fully portable β it will render correctly in GitHub, GitLab, VS Code, Obsidian, Notion (Markdown import), and anywhere else that renders GFM.
Writing good documentation is a skill, and table of contents generation is a minor but real part of that process. Automating the mechanical parts (anchor ID generation, link formatting, indentation) means you can focus on the actual content instead of the formatting. That is exactly what this tool is for.
Try the Markdown TOC Generator on your next README or docs file.