Sept. 30, 2025, 7:24 p.m.

Issue 45 - Markdown is Holding You Back

Explore why Markdown, despite its ubiquity, might not be the best fit for technical content.

Code, Content, and Career with Brian Hogan

I've used many content formats over the years, and while I love Markdown, I run into its limitations daily when I work on larger documentation projects.

In this issue, you'll look at Markdown and explore why it might not be the best fit for technical content, and what else might work instead.

Markdown Lacks the Structure You Need

Markdown is everywhere. It's human-readable, approachable, and has just enough syntax to make docs look good in GitHub or a static site. That ease of use is why it's become the default choice for developer documentation. I'm using Markdown right now to write this newsletter issue. I love it.

But Markdown's biggest advantage is its biggest drawback: it doesn't describe the content like other formats can.

Think about how your content gets consumed. Your content isn't just for human readers. Machines use it too. Search engines index it and LLMs parse it, and both work from the well-formed HTML your systems publish. Markdown's basic syntax only emits a small subset of the semantic tags HTML provides.

IDE integrations can use your docs, too. And AI agents rely on structure to answer developer questions. If you're only feeding them plain-text Markdown documents to reduce the number of tokens you send, you're not providing as much context as you could.

Worse, when you want to reuse your content or syndicate content into another system, you quickly find out that Markdown is more of the lowest common denominator than a source of truth, as not all Markdown flavors are the same.

There are other options you can use that give you more control. But first, let's look deeper into why you should move away from Markdown for serious work.

Markdown is "implicit typing" for content

If you're a developer, you know all about type systems in programming languages. Some languages use implicit typing, in which the compiler or interpreter infers the data type from the value. These languages give you flexibility, but no guarantees. That's why many developers prefer languages that use explicit typing, where you declare data types when writing the code. In those languages, the compiler doesn't just build your code; it guarantees specific rules are followed. That's the main reason for the rise of TypeScript over JavaScript: compile-time guarantees.

Markdown is implicit typing. It lets you write quickly, but without constraints or guarantees. There's no schema. No way to enforce consistency. A heading in one file might be a concept, in another it might be a step, and there's no machine-readable distinction between the two.

To complicate things further, there are multiple flavors of Markdown, each with its own features and markup. Here are just a few:

  • CommonMark
  • GitHub-Flavored Markdown
  • MyST
  • MultiMarkdown

You think you're writing "Markdown," but what works in one tool may not render in another. Some Markdown processors allow footnotes. Others ignore soft line breaks. And some even require different formatting for code blocks. Inconsistency makes Markdown a shaky foundation for anything beyond the most basic document.

And then there's MDX, which people often use to extend Markdown to support things it doesn't.

Here's a typical MDX snippet:

# Install

<Command>npm install my-library</Command>

That <Command> tag isn't Markdown at all; it's a React component. Instead of using a code block, the author chose to create a special component to standardize how all commands would display in the documentation.

It works beautifully on their site because their publishing system knows what <Command> means. But if they try to syndicate this content to another system, it breaks unless that system also implements the component. And even if the component were supported elsewhere, there's no guarantee that it would be implemented the same way.

MDX shows that even in Markdown-centric ecosystems, people instinctively add more expressive markup. They know plain Markdown isn't enough. They're reinventing semantic markup, but in a way that's custom, brittle, and not portable.

Why semantic markup matters

Semantic markup describes what content is, not just how it should look. It's the difference between saying "here's a bullet with some text" and "here's a step in a procedure." To a human, those may look the same on a page. To a machine or to a publishing pipeline, they are entirely different.

Web developers already went through all this with HTML. Prior to HTML5, you had <div> as a generic container that said nothing about what it held. But HTML5 introduced <section>, <article>, <aside>, and many other elements that describe the content.

Semantic markup matters for two important and related reasons:

  • Transformation and reuse. With semantic markup, you can publish the same content to HTML, PDF, ePub, or even plain Markdown. With Markdown as your source, you can't easily go to another format. You can't turn a bullet into a <step> or a paragraph into a <para> without guessing. You can't add context if it wasn't there to begin with, but you can strip out what you don't need when you transform the document, and you can choose how to present each thing in a consistent way.
  • Machine consumption. LLMs and agents can make better use of content that carries structure. A step marked as a <step> is unambiguous. A bullet point might be a step, or a note, or just a list item. The machine has to guess. This is why XML was a preferred mechanism for web services for a long time, and why JSON Schema exists.
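To make the "export downward" idea concrete, here's a minimal Python sketch using only the standard library. The sample document, the deliberately lossy to_markdown exporter, and the bullet_types helper are all hypothetical names invented for this illustration. The sketch shows that typed elements flatten into bullets easily, but a reader of the Markdown only sees list items.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical structured source: a task with a typed note and typed steps.
SOURCE = """
<task>
  <title>Installation</title>
  <note>This library requires Node.js 22 or later.</note>
  <steps>
    <step>Install the library</step>
    <step>Run the tests</step>
  </steps>
</task>
"""

def to_markdown(xml_text: str) -> str:
    """Export downward: every typed element becomes a generic bullet."""
    root = ET.fromstring(xml_text)
    lines = [f"# {root.findtext('title')}", ""]
    for note in root.iter("note"):
        lines.append(f"- {note.text}")
    for step in root.iter("step"):
        lines.append(f"- {step.text}")
    return "\n".join(lines)

def bullet_types(markdown: str) -> list[str]:
    """Going back up: a parser can only report list items, not steps or notes."""
    return ["list_item" for _ in re.finditer(r"^- ", markdown, flags=re.MULTILINE)]

markdown = to_markdown(SOURCE)

# The element types the structured source knows about, in document order.
source_types = [el.tag for el in ET.fromstring(SOURCE).iter() if el.tag in ("note", "step")]

print(source_types)            # types available in the source
print(bullet_types(markdown))  # types recoverable from the Markdown
```

The source knows it holds one note and two steps. The Markdown holds three bullets, and nothing in the file says which is which.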

Let's explore four formats that give you more control over structure than plain Markdown.

reStructuredText

reStructuredText is a plain-text markup language from the Python/Docutils ecosystem that supports directives, roles, and structural semantics. It is the foundational format used by Sphinx for generating documentation.

Installation
============

.. code-block:: bash

   npm install my-library

.. note::
   This library requires Node.js ≥ 22.

See also :ref:`usage-guide`.

Here you see a code-block directive, an admonition (note), and an explicit cross-reference via :ref:. You'll find support for images, figures, topics, sidebars, pull quotes, epigraphs, and citations as well.

All of those encode semantics, not just presentation.

AsciiDoc

AsciiDoc aims to be human-readable but semantically expressive. It has attributes, conditional content, include mechanisms, and more.

Here's an example of AsciiDoc:

= Installation
:revnumber: 1.2
:platform: linux
:prev_section: introduction
:next_section: create-project

[source,bash]
----
npm install my-library
----

NOTE: This library requires Node.js ≥ 22.

See <<usage,Usage Guide>> for examples.

AsciiDoc has native support for document front-matter. Attributes like :revnumber: or :platform: let you parameterize content.
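If parameterizing content sounds abstract, here's a rough Python sketch of the idea. AsciiDoc lets body text refer to a header attribute with a {name} reference; the sample document and the read_attributes and substitute helpers below are hypothetical and only mimic that behavior in a simplified way. This is not how Asciidoctor is implemented.

```python
import re

# A hypothetical AsciiDoc fragment: header attributes, then body text
# that refers to them with {name} references.
DOC = """\
= Installation
:revnumber: 1.2
:platform: linux

These instructions cover version {revnumber} on {platform}.
"""

def read_attributes(text: str) -> dict[str, str]:
    """Collect ':name: value' lines into a dictionary."""
    return dict(re.findall(r"^:([\w-]+):[ \t]*(.*)$", text, flags=re.MULTILINE))

def substitute(text: str, attributes: dict[str, str]) -> str:
    """Replace {name} references; leave unknown names untouched."""
    return re.sub(
        r"\{([\w-]+)\}",
        lambda match: attributes.get(match.group(1), match.group(0)),
        text,
    )

attributes = read_attributes(DOC)
rendered = substitute(DOC, attributes)
print(rendered)
```

Change :platform: in the header and every reference in the body follows, which is the kind of single-source control plain Markdown doesn't give you.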

<<usage,Usage Guide>> is a cross-reference syntax.

Like reStructuredText, AsciiDoc supports admonitions like NOTE and WARNING so you don't have to build your own custom renderer. It also has support for sidebars, and you can add line highlighting and callouts to your code blocks without additional extensions.

And if you're writing technical documentation, there's explicit support for marking up UI elements and keyboard shortcuts.

Using Asciidoctor, you can transform AsciiDoc into other formats, including HTML, PDF, ePub, and DocBook, which you'll look at next.

DocBook (XML)

DocBook is an XML-based document model explicitly designed for technical publishing. It expresses hierarchical and semantic structure in tags and attributes, enabling industrial-grade transformations.

Here's an example:

<article id="install-library">
  <title>Installation</title>
  <para><command>npm install my-library</command></para>
  <note><para>This library requires Node.js &gt;= 22.</para></note>
  <para>See <xref linkend="usage-chapter"/>.</para>
</article>

Every tag is meaningful: <command> vs <para>, <note> vs <xref>. You'll find predefined tags for function names, variables, application names, keyboard shortcuts, UI elements, and much more. Being able to mark up the specific product names and terminology you use makes it so much easier to create glossaries and indexes. And DocBook has tags for defining index terms, too.
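Here's a small Python sketch of why that markup pays off when you build an index. It uses the standard library's XML parser to pull every command and key name out of a DocBook-style fragment. The sample content and the collect helper are hypothetical; <command>, <guimenu>, and <keycap> are DocBook element names.

```python
import xml.etree.ElementTree as ET

# Hypothetical DocBook-style fragment with semantically tagged inline content.
DOCBOOK = """
<article id="install-library">
  <title>Installation</title>
  <para>Run <command>npm install my-library</command>, then
  <command>npm test</command>.</para>
  <para>Open <guimenu>File</guimenu> and press <keycap>F5</keycap>.</para>
</article>
"""

def collect(xml_text: str, tag: str) -> list[str]:
    """Return the text of every element with the given tag, in document order."""
    root = ET.fromstring(xml_text)
    return [element.text for element in root.iter(tag) if element.text]

commands = collect(DOCBOOK, "command")
keys = collect(DOCBOOK, "keycap")

print(commands)
print(keys)
```

With Markdown, the same query means guessing which inline code spans are commands and which are file names, variables, or something else.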

DocBook's rich ecosystem of XSLT stylesheets supports transforming to HTML, PDF, man pages, and even Markdown. Using DocBook ensures structure and validation at scale, as long as you use the tags it provides.

Then there's DITA.

DITA (Darwin Information Typing Architecture)

DITA is a standard for writing, managing, and publishing content. It's a topic-based XML architecture with built-in reuse, specialization, and modular content design. It's an open standard, and it's widely used in enterprises for multi-channel, structured content that needs standardization and reuse.

Here's an example:

<task id="install">
  <title>Installation</title>
  <taskbody>
    <prereq><note>This library requires Node.js &gt;= 22.</note></prereq>
    <steps>
      <step><cmd>npm install my-library</cmd></step>
    </steps>
  </taskbody>
</task>

DITA defines types like <task> and <step>, which cleanly map to procedural structure. You can compose topics, reuse via content references (conrefs), and specialize as your domain evolves.

One of the more interesting features DITA provides is the ability to filter content and create multiple versions from a single document.
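As a rough illustration of that filtering idea, here's a minimal Python sketch. It assumes steps carry DITA's platform attribute and drops any step that doesn't match the platform you're building for. The sample task, the my-tool commands, and the filter_platform helper are made up for this example; real projects typically do this with DITAVAL files and the DITA Open Toolkit rather than a script like this one.

```python
import xml.etree.ElementTree as ET

# Hypothetical task: two platform-specific steps and one shared step.
TASK = """
<task id="install">
  <title>Installation</title>
  <taskbody>
    <steps>
      <step platform="linux"><cmd>sudo apt install my-tool</cmd></step>
      <step platform="mac"><cmd>brew install my-tool</cmd></step>
      <step><cmd>my-tool --version</cmd></step>
    </steps>
  </taskbody>
</task>
"""

def filter_platform(xml_text: str, platform: str) -> ET.Element:
    """Remove elements whose platform attribute excludes the given platform."""
    root = ET.fromstring(xml_text)
    # Snapshot the tree first so removals don't disturb the traversal.
    for parent in list(root.iter()):
        for child in list(parent):
            value = child.get("platform")
            # Elements without the attribute apply to every platform.
            if value is not None and platform not in value.split():
                parent.remove(child)
    return root

linux = [cmd.text for cmd in filter_platform(TASK, "linux").iter("cmd")]
mac = [cmd.text for cmd in filter_platform(TASK, "mac").iter("cmd")]

print(linux)
print(mac)
```

One source document produces two platform-specific procedures, and the shared step is written only once.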

The DITA Open Toolkit and many enterprise tools handle rendering, transformation, and reuse pipelines.

Ew. XML.

Yes, XML. The syntax is more verbose than Markdown's. Its tooling is less ubiquitous than Markdown's. Migration requires effort, and your team may resist the learning curve. For small docs, Markdown's features are often enough.

But if you're already bolting semantics onto Markdown with MDX or plugins or custom scripts, you're paying that complexity cost anyway, and you don't get the benefits of standardization or portability. You're building a fragile, custom semantic layer instead of adopting one that already works.

So where does that leave you?

If you're writing a quick README or a short-lived doc, Markdown is fine. It's fast, approachable, and does the job. If you're building a developer documentation site that needs some structure, reStructuredText or AsciiDoc are better choices. They balance expressiveness with usability. And if you're managing a large doc set that needs syndication, reuse, and multi-channel publishing, DocBook and DITA give you the semantics and tooling to make that process more manageable.

The key is to start with the richest format you can manage and export downward. Markdown makes a great output for developers. It's approachable and familiar. But be careful not to lock yourself into it as your source of truth, because you can't add context back as easily as you can strip it out.

Things To Explore

  • I have a new book out. Check out Write Better with Vale. This book walks you through implementing Vale, the prose linter, on your next writing project to create consistent, quality content.
  • Tidewave.ai is a full-stack coding agent from the creators of the Elixir programming language. It supports Ruby on Rails, Phoenix, and React applications and has a free tier. You'll need an API key for OpenAI, Anthropic, or GitHub Copilot to use it.
  • Google's Chrome for Developers blog has a post on creating accessible carousels. It's worth the read if you have to implement one of these on your site.

Parting Thoughts

Before the next issue, here are a couple of things you should try to get some hands-on experience with a different format.

  • DocBook and DITA might be too much out of the gate, so explore AsciiDoc. The Asciidoctor toolchain makes it much less painful to migrate from Markdown. And there are many static site generators that support AsciiDoc, including Hugo.
  • Then try exporting AsciiDoc content to DocBook.

As always, thanks for reading. Share this issue with someone who you think would find this helpful.

I'd love to talk with you about this issue on BlueSky, Mastodon, Twitter, or LinkedIn. Let's connect!

Please support this newsletter and my work by encouraging others to subscribe and by buying a friend a copy of Write Better with Vale, tmux 3, Exercises for Programmers, Small, Sharp Software Tools, or any of my other books.

You just read issue #45 of Code, Content, and Career with Brian Hogan. You can also browse the full archives of this newsletter.
