I've got syntax highlighting now!

It was easier than I thought it would be. I searched the web for "mdx syntax highlighting" and found this helpful article in the mdxjs documentation. It lists two options:

  • composition via the MDXProvider
  • remark plugin

They say it's "typically preferred to take the compositional approach". I don't know why. Maybe they assume that people expect the highlighting to update dynamically if the code changes on the client side. I don't need any parsing to happen on the client, so I went for the second option. (Strictly speaking, the plugin I picked is a rehype plugin rather than a remark one, so it goes under rehypePlugins.) I installed it with pnpm add @mapbox/rehype-prism and added it to the mdx options parameter in my next.config.js file. This is what it looked like after:

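const path = require('path')
const rehypePrism = require('@mapbox/rehype-prism')
// Wrap the Next.js config so that .md and .mdx files are compiled by MDX
const withMDX = require('@next/mdx')({
  extension: /\.mdx?$/,
  options: {
    // Highlight code blocks with Prism while the MDX is being compiled
    rehypePlugins: [rehypePrism],
  },
})
// Treat both .mdx and .tsx files as pages
module.exports = withMDX({
  pageExtensions: ['mdx', 'tsx'],
})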

This should do all the expensive parsing while the markdown file is being loaded. What we end up with is a JavaScript file that exports a React component, which contains all the stuff it did before: headings, paragraphs, and so on. But when I insert a code snippet, it now renders a <pre> tag with a whole bunch of <span>s in it, each with one or more CSS classes attached, which lets me style them with a very sensible-looking CSS file.

I started by downloading a CSS file from prismjs.com after selecting the languages I thought I would need, and modified it to my liking. Mostly I just replaced the hard-coded hex color codes with my variables. An excerpt:

.token.comment,
.token.prolog,
.token.doctype,
.token.namespace,
.token.namespace > .token.punctuation,
.token.cdata {
  transition: var(--color-transition);
  color: var(--comment);
}

.language-tsx .token.tag > .token.script,
.token.operator,
.token.punctuation {
  transition: var(--color-transition);
  color: var(--fg);
}

.token.class-name,
.token.maybe-class-name,
.token.function,
.token.property,
.token.tag,
.token.constant,
.token.symbol,
.token.deleted {
  color: var(--blue);
}

.token.atrule,
.token.keyword {
  color: var(--green);
}

.token.builtin,
.token.class-name.known-class-name,
.token.attr-name {
  color: var(--yellow);
}

/* I'm skipping a few rules here */

.token.comment,
.token.italic {
  font-style: italic;
}
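To make the markup behind those selectors a bit more concrete, here is a toy sketch of how a regex-based highlighter can turn source text into <span>s with token classes. It is not Prism's actual grammar or API: the RULES table and the highlight and escapeHtml functions are made up for illustration, and the real plugin produces a syntax tree rather than an HTML string.

```javascript
// Toy illustration only. RULES, escapeHtml and highlight are hypothetical
// names, and these patterns are far simpler than Prism's real grammars.
// Each rule is tried in order at the current position (sticky "y" flag).
const RULES = [
  ['comment', /\/\/.*/y],
  ['string', /'[^']*'/y],
  ['keyword', /\b(?:const|let|return|function)\b/y],
  ['number', /\b\d+\b/y],
  ['operator', /[=+\-*/]/y],
  ['punctuation', /[{}()[\];,.]/y],
]

// Escape the characters that would otherwise be read as HTML.
function escapeHtml(text) {
  return text.replace(/&/g, '&amp;').replace(/</g, '&lt;').replace(/>/g, '&gt;')
}

function highlight(code) {
  let html = ''
  let pos = 0
  while (pos < code.length) {
    let matched = false
    for (const [type, pattern] of RULES) {
      pattern.lastIndex = pos
      const match = pattern.exec(code)
      if (match) {
        // A recognized token gets a span with "token" plus its type as classes.
        html += `<span class="token ${type}">${escapeHtml(match[0])}</span>`
        pos += match[0].length
        matched = true
        break
      }
    }
    if (!matched) {
      // Anything no rule matches (identifiers, whitespace) is emitted
      // without a span, so it keeps the default text color.
      html += escapeHtml(code[pos])
      pos += 1
    }
  }
  return html
}

console.log(highlight('const a = 1'))
// <span class="token keyword">const</span> a <span class="token operator">=</span> <span class="token number">1</span>
```

Note that the identifier a comes out without a span, which is why a rule like .token.keyword can color const without touching it.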

General identifiers are not wrapped in spans, so they get the default color. One exception is inside JSX tags. I think it switches to a parser originally made for HTML there, which would make sense, but when an attribute has a computed value with regular JavaScript (or TypeScript) code in it, the identifiers within turned blue because of the style applied to .token.tag.

<Link href={props.href} />

I had to add a special case: .language-tsx .token.tag > .token.script should have the normal foreground color. I ended up adding a few more tweaks like that.

<Link href={props.href} />

I have one gripe with the Prism TypeScript parser that I didn't manage to solve, though. The JavaScript parser applies both a keyword and a nil class to undefined and null alike, which lets you choose whether to style them like other keywords or give them a special style. The TypeScript parser doesn't, for whatever reason. While they technically are keywords, I prefer to see them as literals, and so I'd like their color to be consistent with string, number, and boolean literals. There are already so many keywords in the code; we don't need two more of them.

// javascript
const a = undefined
const b = null
const c = false
const d = ''
// typescript
const a: undefined = undefined
const b: null = null
const c: boolean = false
const d: string = ''

I also think it looks a bit strange when it highlights these keywords the same in type annotations as it does when they're in value position, but it's only a regex-based parser so I wasn't expecting miracles.
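One way the null/undefined gripe could possibly be worked around is a second, tiny rehype plugin that runs after the highlighter and adds the missing nil class. The sketch below is untested against the real plugin output and rests on assumptions: addNilClass is a hypothetical name, and it assumes the highlighter emits these words as span elements with a keyword class and a single text child.

```javascript
// Hypothetical plugin, not part of any library.
// Assumption: null/undefined arrive in the syntax tree as
//   <span class="token keyword">null</span>
// i.e. an element node whose className array includes "keyword".
const NIL_WORDS = new Set(['null', 'undefined'])

function isNilKeyword(node) {
  if (node.type !== 'element' || node.tagName !== 'span') return false
  const classes = node.properties && node.properties.className
  if (!Array.isArray(classes) || !classes.includes('keyword')) return false
  const children = node.children || []
  if (children.length !== 1) return false
  const only = children[0]
  return only.type === 'text' && NIL_WORDS.has(only.value)
}

// Depth-first walk over the tree, tagging matching spans in place.
function walk(node) {
  if (isNilKeyword(node) && !node.properties.className.includes('nil')) {
    node.properties.className.push('nil')
  }
  for (const child of node.children || []) {
    walk(child)
  }
}

// A rehype plugin is a function that returns a tree transformer.
function addNilClass() {
  return (tree) => {
    walk(tree)
  }
}
```

If those assumptions hold, it would be listed after the highlighter, as in rehypePlugins: [rehypePrism, addNilClass], and a .token.keyword.nil rule in the stylesheet could then give null and undefined the same color as the other literals.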

Overall it was a pleasant experience to implement this. I give this solution an 8/10 and would recommend.