How to build a Markdown editor with real-time preview

Note: As of 20th Apr 2024, I have re-implemented this in a very different way due to a roadblock with the undo/redo functionality that I couldn't overcome. I will update this article in due time.

When I first learned about Markdown and started writing with it, I found I constantly had to preview what my content would actually look like once the markup was rendered. Every Markdown editor I had tried required either manually switching the editor to a preview mode or toggling a pane beside the editor that updates automatically as you type. These approaches never clicked with me. Then, one day, I discovered Typora. I was immediately enthralled by how the markup could be rendered and edited in real time, and the coolest thing was that all of it happened in the same editor. I loved it.

A few years passed and, like many others before me, I decided to build a Markdown-based note-taking app, with the hard requirement that it be web-based. Right off the bat, I could picture how the whole thing was going to work: I would create a sidebar and two panes, one for the editor and another for the live preview. I would feed the editor's content to one of the open-source Markdown parsers out there, which would spit out an HTML version of it, and then simply append that HTML to the live preview element. I quickly discovered a performance ceiling with the live preview pane, though: it took seconds to update when working with a huge document. I also realized that with a split pane it can be tiring to constantly shift my focus left and right. So the split pane was a no-go for me, and I couldn't live with the toggled preview mode either. I knew I had to somehow combine preview and editing in one view, as Typora had done.

This time around, I was absolutely clueless. My first instinct was to google for solutions. But StackOverflow didn't turn up any answers; nobody had written a blog post about it; Typora is closed-source; the best I could find was Marktext, but I failed miserably trying to grasp how they did it. So my second instinct was to build on top of a rich-text editor framework. I'd had experience with Prosemirror, but I'd only used it to build a typical rich-text editor that kind of integrated Markdown. Prosemirror was (and still is) a formidable library for building something out of the ordinary. It was scary to even take the first step because it felt like stepping into a void without any guardrails, even more so when I wondered on several occasions whether I might as well build it all from scratch myself.

But there was really no way around it. So I spent one month going through Prosemirror's guide and experimenting with its API, hoping to gather the pieces that would pave the way to my final destination. This was followed by countless rounds of sheer trial and error. And over 10 months, the effect I had admired so much when I first used Typora years ago slowly emerged.

And now, I will show you how you can do it too.

Demo & Source Code

Try completing the bold markup below. (Note: Only bold and italic are implemented in this demo.)

Read on for a walk-through.

Overview

The idea is straightforward. The hardest part for me was putting the pieces together within the confines of my chosen libraries.

A flow chart showing how logic flows from detecting changes, styling markup, rendering markup when the cursor exits, and reverting rendered text when the cursor enters, to finally updating the view.

In the following walkthrough, we will slowly travel this flow chart from top to bottom, and along the way I will illustrate how each component is implemented with code snippets.

Outline of our plugin

Our code will live in a plugin. A plugin in Prosemirror adds extra functionality to the editor. Although Prosemirror offers a wide array of APIs, after countless console.log calls and a lot of trial and error, I identified a handful of them that can drive the entire live preview and editing behaviour.

Our plugin's scaffolding looks like this:

import { Plugin } from "prosemirror-state";

new Plugin({
  // a PluginKey instance that names the plugin; we will touch upon this later
  key: markdownHighlighterKey,
  props: {
    decorations(state) {
      // to style our markup
    },
  },
  state: {
    init() {
      // to set our initial state
    },
    apply(tr, set, oldState, newState) {
      return set; // to update our state
    },
  },
  view() {
    return {
      update(view, prevState) {
        // to update our editor view
      },
    };
  },
});

We will slowly reveal the purpose each of them serves throughout this post.

Detect changes made on a document

As the user writes, we want to know the positions at which they have used Markdown syntax, in order to:

Style the syntax characters, and
Track their start and end positions so we can render them (i.e. live preview) when the cursor exits either of them.

We will do this in the apply method of the plugin's state.

function changedNodes(transaction) {
  let ranges = [];
  let changedNodes = new Map();

  transaction.mapping.maps.forEach((stepMap) => {
    ranges = ranges.map(([start, end]) => [
      stepMap.map(start),
      stepMap.map(end),
    ]);

    stepMap.forEach((oldStart, oldEnd, newStart, newEnd) => {
      ranges.push([newStart, newEnd]);
    });
  });

  for (const [start, end] of ranges) {
    transaction.doc.nodesBetween(start, end, (node, pos) => {
      if (node.type.inlineContent) {
        changedNodes.set(pos + 1, node);

        return false;
      }
    });
  }

  return changedNodes;
}

Prosemirror creates a new transaction object for every change we make on the editor. This object has our latest document at transaction.doc.
With transaction.mapping.maps, we can get the positions of a particular change. For example, in our demo, the text starts at 1:

This is from an indispensable dev toolkit for Prosemirror: prosemirror-dev-toolkit

Now if we type one "h" character right before "hello", giving "hhello", and console.log the ranges variable, we will see [ [1, 2] ]: the change starts at position 1 and ends at position 2.

Next we traverse (nodesBetween) the latest document, transaction.doc, to find all the "nodes" (a paragraph is a node, a piece of text is another node) between the start and end positions.
Finally, we store only the paragraph node (node.type.inlineContent) that has been impacted by the change, keyed by pos + 1, which is the position at which its content starts in the context of the entire document.
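To make the first loop in changedNodes more concrete, here is a minimal sketch that uses plain objects instead of Prosemirror's step maps. The mapThrough and collectChangedRanges helpers and the shape of the change objects are stand-ins I made up for illustration; they only imitate how earlier ranges get carried forward through later changes, and they are not Prosemirror's implementation.

```javascript
// Toy stand-in for one step map: the range a change replaced in the old
// document and the range it occupies in the new one (assumed shape).
function mapThrough(change, pos) {
  const { oldStart, oldEnd, newEnd } = change;
  if (pos < oldStart) return pos; // before the change: unaffected
  if (pos >= oldEnd) return pos + (newEnd - oldEnd); // at or after it: shifted by the size difference
  return newEnd; // inside replaced content: clamp to the end of the replacement
}

function collectChangedRanges(changes) {
  let ranges = [];
  for (const change of changes) {
    // Ranges collected so far refer to an older document, so carry them forward.
    ranges = ranges.map(([start, end]) => [
      mapThrough(change, start),
      mapThrough(change, end),
    ]);
    ranges.push([change.newStart, change.newEnd]);
  }
  return ranges;
}

// Typing "h" at position 1 replaces the empty range 1..1 with the range 1..2.
const typedH = { oldStart: 1, oldEnd: 1, newStart: 1, newEnd: 2 };

const oneChange = collectChangedRanges([typedH]); // [[1, 2]]

// A second insertion at position 1 pushes the first range to [2, 3].
const twoChanges = collectChangedRanges([typedH, typedH]); // [[2, 3], [1, 2]]
```

The point of the re-mapping is that every range ends up expressed in the coordinates of the final document, which is the one we traverse with nodesBetween.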

Parse the text contents

Now that we have identified the changes, we want to see whether they contain any valid Markdown syntax. We can easily do that with a Markdown parser; in our case, it's the markdown-it library.

const highlights = [];
const decorations = [];

for (let [pos, node] of changedNodes(transaction)) {
  // account for any spaces in front of the text
  let currentPosition = node.textContent.search(/\S|$/) + 1 * pos; // multiplying by 1 coerces pos to a number

  // trim() to remove spaces around the input texts
  const tokens = mdTokenizer.parse(node.textContent.trim(), {});

  highlightDocument(
    highlights,
    decorations,
    tokens,
    newState,
    currentPosition,
    [],
  );
}

currentPosition is the sum of 1) the number of preceding spaces and 2) the pos position of the paragraph we obtained earlier. It is the starting position from which we increment as we loop through the tokens, as discussed in the next section. In this example, it starts at 1 and has advanced to 7 by the time we reach the opening **.
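As a quick illustration of that sum, here is a small sketch. startPosition is a hypothetical helper, not part of the plugin; it only isolates the arithmetic so the regular expression is easier to follow.

```javascript
// Hypothetical helper: number of leading whitespace characters plus the
// position at which the paragraph's content starts.
function startPosition(textContent, contentStart) {
  // /\S|$/ matches the first non-whitespace character, or the end of the
  // string when the text is empty or all whitespace, so search() never
  // returns -1 here.
  const leadingSpaces = textContent.search(/\S|$/);
  return leadingSpaces + contentStart;
}

const noIndent = startPosition("hello **world**", 1); // 1
const indented = startPosition("   hello", 1); // 4
const emptyText = startPosition("", 5); // 5
```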

Our paragraph node object has a property called textContent that gives us its content, which we will pass to Markdown-it's parse function to produce Markdown tokens.

Get positions

highlightDocument(
  highlights,
  decorations,
  tokens,
  newState,
  currentPosition,
  [],
);

Now that we have the tokens, we will loop through them with our highlightDocument function to get 4 key positions:

Start position of opening markup. This is on the extreme left end.
Start position of enclosed content.
End position of enclosed content.
End position of closing markup. This is on the extreme right end.

Start and end positions of text and markups.

If we completed the bold syntax **world** and logged the highlights variable, we would see this result:

[
  {
    from: 7,
    textStart: 9,
    textEnd: 14,
    to: 16,
    nodeType: "mark",
    tokenType: "strong",
  },
];

To illustrate the positions:
Start and end positions of markup and marked text

With these four particular positions, we are able to know 3 key things:

Where we should make the Markdown syntax look grey-ish, in this case between 7 and 9 and between 14 and 16,
Where we need to format the text in between the Markdown syntax accordingly, in this case making the word "world" look bold, and
When the cursor has exited either of the extreme ends, in this case 7 on the left and 16 on the right, so that we can render the text into a proper HTML element.
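To see how the four positions translate into the ranges we style, here is a tiny sketch. rangesToStyle and the className field are hypothetical names used only for this illustration; the plugin itself builds Prosemirror decorations instead of plain objects.

```javascript
// The highlight logged above for the completed **world** markup.
const sampleHighlight = {
  from: 7,
  textStart: 9,
  textEnd: 14,
  to: 16,
  tokenType: "strong",
};

// Hypothetical helper: split one highlight into the three ranges to style.
function rangesToStyle({ from, textStart, textEnd, to, tokenType }) {
  return [
    { from, to: textStart, className: "delimiter" }, // opening "**"
    { from: textStart, to: textEnd, className: tokenType }, // "world"
    { from: textEnd, to, className: "delimiter" }, // closing "**"
  ];
}

const styledRanges = rangesToStyle(sampleHighlight);
// styledRanges covers 7..9, 9..14 and 14..16 without gaps or overlaps
```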

Here is the highlightDocument function, with comments explaining how it works:

/**
 *
 *  What follows simply arose from me logging the
 *  "tokens" parameter and slowly see what's there
 *  I can work with in order to reach my objective
 *
 */
function highlightDocument(
  highlights,
  decorations,
  tokens,
  state,
  currentPosition,
  openPositions = [],
) {
  let nextPosition = currentPosition;

  for (const token of tokens) {
    // console.log('TOKEN', token)

    // paragraph has open and close tokens and we don't care about them
    if (SKIP_HIGHLIGHTING.includes(token.tag)) continue;

    /**
     *
     * value of "nesting" property indicates a token is opening or closing
     * Reference: https://markdown-it.github.io/markdown-it/#Token.prototype.nesting
     *
     * 1 means it's opening
     *
     */
    if (token.nesting === 1) {
      /**
       *
       * Let's go step by step to see how we got those 4 key positions.
       *
       * STEP-1
       *
       * If you log the token, you will see a property called "markup"
       * with value as "**".
       *
       * That's a string with a length of 2.
       *
       * currentPosition is 7. So 7 + 2 = 9
       *
       * nextPosition is now 9
       *
       */
      nextPosition = currentPosition + token.markup.length;

      /**
       *
       * When a tag closes, it always closes on its associated
       * opening tag. So I store the opening positions for when
       * they do close, we will have the range of the
       * Markdown text. This will become clear later.
       *
       */
      openPositions.push({
        openingTagStartPosition: currentPosition,
        openingTagEndPosition: nextPosition,
      });

      // now we update currentPosition to the value of nextPosition - 9
      currentPosition = nextPosition;
    } else if (token.nesting === -1) {
      // -1 means this is a closing tag

      /**
       *
       * STEP-3
       *
       * And now we see a token that closes the last opened tag.
       * To get the last opened tag, it means grabbing the last
       * element in the openPositions array. Hence, the "pop" array's
       * method is utilized here.
       *
       */
      const { openingTagStartPosition, openingTagEndPosition } =
        openPositions.pop();

      /**
       *
       * Again, we increment by the length of the closing markup "**",
       * length of 2.
       *
       * currentPosition is 14, hence 14 + 2 = 16
       *
       * nextPosition is now 16
       *
       */
      nextPosition = currentPosition + token.markup.length;

      /**
       *
       * OK, now we've got all the positions we needed to do
       * 2 things. The first thing: To know when cursor exited
       * the range of the marked text
       *
       * To that end, we gather all the info we are going to need in an object.
       *
       */
      const hl = {
        from: openingTagStartPosition, // extreme left end
        textStart: openingTagEndPosition, // position at which "world" starts
        textEnd: currentPosition, // position at which "world" ends
        to: nextPosition, // extreme right end
        nodeType: "mark", // headers are "node" type, so I needed a way to distinguish..
        tokenType: token.tag, // it will be "strong" in this case
      };

      /**
       *
       * And the second thing: To style the marked text.
       *
       * And Prosemirror provides a way to do that:
       * https://prosemirror.net/docs/ref/#view.Decoration%5Einline
       *
       */

      /**
       *
       * Equipped with those 4 key positions, now we know precisely
       * where to style
       *
       */

      // style the opening markup with CSS class called "delimiter"
      decorations.push(
        Decoration.inline(openingTagStartPosition, openingTagEndPosition, {
          class: "delimiter",
        }),
      );

      // style the enclosed text with CSS class as the tag of the token
      decorations.push(
        Decoration.inline(openingTagEndPosition, currentPosition, {
          class: token.tag,
        }),
      );

      // similarly, style the closing markup with "delimiter" CSS class
      decorations.push(
        Decoration.inline(currentPosition, nextPosition, {
          class: "delimiter",
        }),
      );

      // store it in an array cuz there can be nested marked texts
      highlights.push(hl);

      // finally, let's update currentPosition again - 16
      currentPosition = nextPosition;
    } else if (token.type === "text") {
      /**
       *
       * STEP-2
       *
       * This is a text token with a property called "content"
       * whose value would be "world" in this case. So now we
       * will increment currentPosition by the length of
       * "world" which is 5. Hence, nextPosition is 14 = 9 + 5
       *
       * And again, we update currentPosition to the value of
       * nextPosition - 14
       *
       */
      currentPosition = nextPosition = currentPosition + token.content.length;
    } else if (token.type === "softbreak") {
      currentPosition = nextPosition = currentPosition + 1;
    } else if (token.type === "inline") {
      /**
       *
       * blocks like paragraph and heading will have inline tokens
       * such as italic and bold, in which case, we want to
       * continue looping and incrementing the position.
       *
       */
      currentPosition = highlightDocument(
        highlights,
        decorations,
        token.children,
        state,
        currentPosition,
        openPositions,
      );
    }
  }

  return currentPosition;
}
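To show why openPositions has to behave like a stack, here is a condensed sketch of just the position bookkeeping, run on nested markup. The tokens are hand-written and only modelled on the fields used above (real markdown-it tokens carry more), and tracePositions is a hypothetical, simplified stand-in that leaves out decorations, softbreaks and recursion.

```javascript
// Hand-written tokens for the text "**a *b* c**" (assumed shape).
const nestedTokens = [
  { type: "strong_open", nesting: 1, markup: "**", tag: "strong" },
  { type: "text", nesting: 0, content: "a " },
  { type: "em_open", nesting: 1, markup: "*", tag: "em" },
  { type: "text", nesting: 0, content: "b" },
  { type: "em_close", nesting: -1, markup: "*", tag: "em" },
  { type: "text", nesting: 0, content: " c" },
  { type: "strong_close", nesting: -1, markup: "**", tag: "strong" },
];

function tracePositions(tokens, startPosition) {
  const highlights = [];
  const openPositions = [];
  let position = startPosition;

  for (const token of tokens) {
    if (token.nesting === 1) {
      // Remember where this markup opened until its closing token shows up.
      const textStart = position + token.markup.length;
      openPositions.push({ from: position, textStart });
      position = textStart;
    } else if (token.nesting === -1) {
      // A closing token always belongs to the most recently opened markup.
      const { from, textStart } = openPositions.pop();
      const to = position + token.markup.length;
      highlights.push({ from, textStart, textEnd: position, to, tokenType: token.tag });
      position = to;
    } else if (token.type === "text") {
      position += token.content.length;
    }
  }

  return highlights;
}

const traced = tracePositions(nestedTokens, 1);
// traced[0] is the inner "em":      { from: 5, textStart: 6, textEnd: 7,  to: 8  }
// traced[1] is the outer "strong":  { from: 1, textStart: 3, textEnd: 10, to: 12 }
```

The inner italic closes first, so it is pushed to highlights before the bold that encloses it.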

So far, we have been working in the apply method of the plugin, and we are done there. Next we will move our work to the view method, particularly its update function. Here is a snapshot of the scaffolding of our plugin again to reorient ourselves:

new Plugin({
  state: {
    apply(tr, set, oldState, newState) {
      return set; // to update our state
    },
  },
  view() {
    return {
      update(view, prevState) {
        // to update our editor view
      },
    };
  },
});

One key thing I realized after sprinkling console.log everywhere: when the user makes changes, apply runs first, followed by the update function, as stated in the docs:

[update function is] Called whenever the view's state is updated.

Get state from our plugin

view() {
      return {
        update(view, prevState) {
          // only proceed if it's an empty, single (blinking) cursor
          // if (!view.state.selection.empty) return false;

          const {
            boundsOfHighlights,
            highlights
          } = markdownHighlighterKey.getState(view.state);

          // const { $cursor } = view.state.selection;
          // const { pos } = $cursor;
          // let { tr } = view.state;

          // ...
        }
      }
}

The last thing we did was to apply new states. To access them elsewhere, we need 3 parts:

A PluginKey instance.

import { PluginKey } from "prosemirror-state";

const markdownHighlighterKey = new PluginKey("canBeAnyUniqueStringYouWant");

This PluginKey instance is passed to the key property of the Plugin class:

new Plugin({
  key: markdownHighlighterKey,
});

The PluginKey instance has a getState method.
We pass the entire state of our editor as its argument:

markdownHighlighterKey.getState(view.state);

Get cursor's position

view() {
      return {
        update(view, prevState) {
          // only proceed if it's an empty, single (blinking) cursor
          // if (!view.state.selection.empty) return false;

          // const {
          //   boundsOfHighlights,
          //   highlights
          // } = markdownHighlighterKey.getState(view.state);

          const { $cursor } = view.state.selection;
          const { pos } = $cursor;
          // let { tr } = view.state;

          // ...
        }
      }
}

Once we have obtained the positions of our Markdown texts, we need to know when to render them into their corresponding formatting, that is, when the user has moved their focus elsewhere. The thing that tells us this is the position of the cursor.

You can find the $cursor property inside the selection of the editor's state, view.state, which we just came across above:

const { $cursor } = view.state.selection;

And inside the $cursor object, you can get the cursor's position via the pos property:

const { pos } = $cursor;

Create a new transaction

view() {
      return {
        update(view, prevState) {
          // only proceed if it's an empty, single (blinking) cursor
          // if (!view.state.selection.empty) return false;

          // const {
          //   boundsOfHighlights,
          //   highlights
          // } = markdownHighlighterKey.getState(view.state);

          // const { $cursor } = view.state.selection;
          // const { pos } = $cursor;
          let { tr } = view.state;

          // ...
        }
      }
}

To make new changes to your editor's view, you need to create a new "transaction" onto which you will apply your changes (we will see how below). The way to do that is by getting the tr property from the editor's state:

let { tr } = view.state;

And this transaction object is chainable: "Most transforming methods return the Transform object itself, so that they can be chained."

Render Markdown texts

When the user has moved their focus away from an active Markdown text, we render it into its corresponding formatting.

The getExitedBoundsOfHighlights function simply compares the cursor's position with the extreme ends of all Markdown texts and returns those that have been exited:

function getExitedBoundsOfHighlights(boundsOfHighlights, cursorPos) {
  if (!boundsOfHighlights.length) return [];

  return boundsOfHighlights.filter(
    (hl) => cursorPos > hl.to || cursorPos < hl.from,
  );
}
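Here is a small usage sketch of that filter, mainly to show what happens at the boundaries. The function is repeated so the example runs on its own, and the { from, to } shape of each entry in boundsOfHighlights is my assumption based on the comparison it performs.

```javascript
// Same filter as above, repeated so this example is self-contained.
function getExitedBoundsOfHighlights(boundsOfHighlights, cursorPos) {
  if (!boundsOfHighlights.length) return [];

  return boundsOfHighlights.filter(
    (hl) => cursorPos > hl.to || cursorPos < hl.from,
  );
}

// Assumed shape: the extreme ends of the **world** markup from earlier.
const sampleBounds = [{ from: 7, to: 16 }];

const inside = getExitedBoundsOfHighlights(sampleBounds, 10); // []: cursor is inside the markup
const atRightEnd = getExitedBoundsOfHighlights(sampleBounds, 16); // []: cursor sits right after the closing "**"
const pastRightEnd = getExitedBoundsOfHighlights(sampleBounds, 17); // one entry: cursor moved past the right end
const beforeLeftEnd = getExitedBoundsOfHighlights(sampleBounds, 6); // one entry: cursor moved before the left end
```

Because the comparison is strict, a cursor resting exactly on either extreme end still counts as being inside, so the markup stays visible while you are typing right next to it.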

Then, if any were exited:

First, we get all marked texts that are within the extreme ends.
Second, we remove all of their Markdown syntax characters.
Third, we tell Prosemirror to represent the final text in the correct type. We do this with the addMark method of the transaction object. With this, when we update the view, Prosemirror will create a DOM node as specified by the toDOM method in the schema, such as in the Bold.js file.

if (boundsOfHighlightsExited.length) {
  for (const boundOfHighlightsExited of boundsOfHighlightsExited) {
    for (const highlight of highlights) {
      // STEP-1: we get all marked texts that are within the range.
      if (
        highlight.from >= boundOfHighlightsExited.from &&
        highlight.to <= boundOfHighlightsExited.to
      ) {
        const { from, to, tokenType, textStart, textEnd, attrs } = highlight;

        // STEP-2: remove markdown syntax
        tr.delete(tr.mapping.map(from, -1), tr.mapping.map(textStart, -1))
          // STEP-2: remove the other markdown syntax
          .delete(tr.mapping.map(textEnd), tr.mapping.map(to))
          .addMark(
            // STEP-3: represent the text with correct type in Prosemirror
            tr.mapping.map(textStart),
            tr.mapping.map(textEnd),
            view.state.schema.marks[tokenType].create(attrs),
          );
      }
    }
  }
}

Notice the prevalent usage of tr.mapping.map. This is another key to all this. It's used to shift our existing positions every time we apply a change that alters our document's length, in this case deleting some characters from our document. If we didn't do this, our second delete operation above would have deleted unintended characters because the document had shifted beneath it as a result of the first deletion action.
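To see the problem in isolation, here is a toy model that uses a plain string as the document. createDoc, deleteRange and mapPos are hypothetical helpers that only imitate the idea behind tr.mapping.map for deletions; they are not how Prosemirror implements it, and the offsets are 0-based string indices rather than document positions.

```javascript
// Toy model: the document is a string, and every deletion is remembered so
// that positions from the original document can be shifted later.
function createDoc(text) {
  return { text, deletions: [] };
}

function deleteRange(doc, from, to) {
  return {
    text: doc.text.slice(0, from) + doc.text.slice(to),
    deletions: [...doc.deletions, { from, to }],
  };
}

function mapPos(doc, pos) {
  let mapped = pos;
  for (const { from, to } of doc.deletions) {
    if (mapped >= to) mapped -= to - from; // after the deleted range: shift left
    else if (mapped > from) mapped = from; // inside the deleted range: collapse onto its start
  }
  return mapped;
}

// Offsets of the markup in the original string "**world**".
const offsets = { from: 0, textStart: 2, textEnd: 7, to: 9 };

// With mapping: the closing range 7..9 becomes 5..7 after the first deletion.
let mappedDoc = createDoc("**world**");
mappedDoc = deleteRange(mappedDoc, mapPos(mappedDoc, offsets.from), mapPos(mappedDoc, offsets.textStart));
mappedDoc = deleteRange(mappedDoc, mapPos(mappedDoc, offsets.textEnd), mapPos(mappedDoc, offsets.to));
// mappedDoc.text is "world"

// Without mapping: 7..9 now points past the end of "world**",
// so the closing "**" survives.
let staleDoc = createDoc("**world**");
staleDoc = deleteRange(staleDoc, offsets.from, offsets.textStart);
staleDoc = deleteRange(staleDoc, offsets.textEnd, offsets.to);
// staleDoc.text is "world**"
```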

Inline editing Markdown-formatted texts

This is the inverse of what we did in the last section. When the cursor enters rendered text that was formatted with Markdown syntax, we want to reveal the syntax as plain text so we can change it in the same editor without needing any toolbars.

The $cursor contains various information that will be instrumental to us in this task.

In the getMarks function, we first check whether there is rendered text in front of or behind the cursor:

// check any rendered text ahead of the cursor
$cursor.parent.childAfter($cursor.parentOffset);

// check any rendered text right behind the cursor
$cursor.parent.childBefore($cursor.parentOffset);

If the cursor has landed on rendered text, we then collect all nested rendered texts. In the parseChildForMarks function, there are two while loops: the first collects any nested texts ahead of the cursor, and the second does the same in the opposite direction.

We also need to shift the cursor's position. The reason is similar: the addition of syntax characters to the document would have pushed the cursor forward. That's the purpose of the cursorOffsetCount variable: it stores the number of times syntax characters have been added before the cursor.
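The cursor shift can be sketched on a plain string. revealMarkup is a hypothetical helper, not one of the functions in the source code; it assumes the cursor sits inside the rendered text, and it counts added characters rather than additions, which may differ from what the real cursorOffsetCount stores.

```javascript
// Hypothetical helper: wrap rendered text in its Markdown delimiters and
// report where a cursor inside that text ends up.
function revealMarkup(text, delimiters, cursorOffset) {
  let revealed = text;
  let cursorOffsetCount = 0;

  for (const delimiter of delimiters) {
    revealed = delimiter + revealed + delimiter;
    // Only the opening delimiter lands before the cursor.
    cursorOffsetCount += delimiter.length;
  }

  return { text: revealed, cursorOffset: cursorOffset + cursorOffsetCount };
}

// Cursor after "wor" in bold "world".
const boldOnly = revealMarkup("world", ["**"], 3);
// { text: "**world**", cursorOffset: 5 }

// Same cursor, but the text is both italic and bold.
const italicAndBold = revealMarkup("world", ["*", "**"], 3);
// { text: "***world***", cursorOffset: 6 }
```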

In the normalizeMarksAndCursorPos function, we build the full Markdown version of the rendered texts.

Once that's all done, we replace the rendered text with its Markdown equivalent and set a new position for the cursor:

// replace rendered text with its Markdown version
tr.replaceWith(
  tr.mapping.map(marks.start),
  tr.mapping.map(marks.end),
  view.state.schema.text(marks.text),
);

// set cursor to a new position
tr.setSelection(
  TextSelection.create(
    tr.doc, // tr.doc is the latest doc after the replaceWith step above
    cursorPos,
  ),
);

Updating view

We have been applying changes to tr. Now we want our editor to visually reflect all the changes we have made. To do that, we need to dispatch the transaction object, and that's what we do at the end of the update method:

view.dispatch(tr.setMeta(markdownHighlighterKey, { boundsOfHighlightsExited }));

setMeta can be used to attach custom data to a transaction. Here we use it to pass data to the clean-up step in the last section below.

Clean up

Once we have dispatched a transaction, the apply method of the plugin will run once again.

To get the custom data we stored previously using setMeta, we use getMeta:

// "this" object is the Plugin itself
const { boundsOfHighlightsExited } = transaction.getMeta(this);
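One thing to watch out for: getMeta returns undefined for transactions that never had this metadata set (ordinary typing, for example), and destructuring undefined throws. Here is a minimal sketch of a defensive read. FakeTransaction is a stand-in I wrote so the example runs without Prosemirror, and readExitedBounds is a hypothetical helper, not part of the source code.

```javascript
// Minimal stand-in for the metadata part of a transaction. A real
// Prosemirror transaction has setMeta/getMeta with the same basic behaviour.
class FakeTransaction {
  constructor() {
    this.meta = new Map();
  }
  setMeta(key, value) {
    this.meta.set(key, value);
    return this; // chainable, like the real thing
  }
  getMeta(key) {
    return this.meta.get(key); // undefined when nothing was stored
  }
}

// Hypothetical helper: fall back to an empty list when no metadata was set.
function readExitedBounds(transaction, key) {
  const meta = transaction.getMeta(key);
  return meta?.boundsOfHighlightsExited ?? [];
}

const metaKey = "markdownHighlighter"; // any key works for this stand-in
const plainTransaction = new FakeTransaction();
const taggedTransaction = new FakeTransaction().setMeta(metaKey, {
  boundsOfHighlightsExited: [{ from: 7, to: 16 }],
});

const fromPlain = readExitedBounds(plainTransaction, metaKey); // []
const fromTagged = readExitedBounds(taggedTransaction, metaKey); // [{ from: 7, to: 16 }]
```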

And finally! We are going to do some clean-up by resetting our state:

state.boundsOfHighlightsExited = [];
state.boundsOfHighlights = [];
state.highlights = [];
state.decorations = DecorationSet.empty;

Thank you for your attention and I hope this was clear enough.