Ash Berlin

Markdown's List Handling

Posted on Jan 23 2010

Jeff Atwood of codinghorror.com recently posted a rant/diatribe against Markdown, whilst at the same time declaring his love for it…

I agree 99%. The 1% where I don’t agree is point #3, which is a non-issue for me.

When evilstreak and I wrote markdown-js recently we spent hours ranting against the fact that the ‘specification’ for Markdown is the broken implementation of Markdown.pl. (As for why we wrote a new implementation, well, that’s a whole other story. The main reason was that we needed an intermediate representation to play with.) Take this particular gem:

  * foo
   * bar
* baz
 * fnord

It’s rather brilliantly ambiguous. It is of course not mentioned in the ‘specification’ (which is not a specification per se: it’s a usage guide). Markdown.pl treats it as if you’d written:

* foo
  * bar
  * baz
  * fnord

So of course this is the behaviour in most Markdown implementations – Babelmark is a good tool for this sort of checking. If you think that case is odd, try to work out what this example is treated as:

   * foo
  * bar
 * baz
* HATE
  * flibble
   * quxx
    * nest?
        * where
      * am
     * i?

Any guesses? No? Well, it’s as if you entered this:

* foo
    * bar
    * baz
    * HATE
    * flibble
* quxx
    * nest?
        * where
        * am
        * i?

If you don’t believe me, try it for yourself. This took me a number of hours just staring at the output and trying different permutations to work out just what was going on here. I guess I could have looked at the code, but I like my eyes, thank you very much.

I’m not sure if this is how other implementations manage it, but what we did to replicate this (brain-dead) behaviour was:

  • keep a ‘stack’ of previous lists and what indent they have, updated by pushing and popping whenever you create a new nested list or ‘un-nest’ a list item
  • if the next item is indented by 4 more spaces than the ‘indent level’ of the previous item, then it’s a nested list
  • if the number of indent chars (after tab expansion) exactly matches the indent of a list in the stack, place it at the same level as that list
  • failing that, treat it as if it was just indented to make it a sub list, even if it is actually indented less than the ‘parent’ list item
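
To make those rules concrete, here is a minimal sketch in Python. It is not the markdown-js code (which is JavaScript), and it fills in details the bullets leave open. The names `nest_depths` and `tab_width` are made up for illustration. "4 more spaces" is read as "at least 4 more than the previous item's actual indent". The fallback is read as "open a sub list only if we are still in the top-level list, otherwise stay in the innermost list". That last reading is an assumption, picked because it reproduces both examples above.

```python
def nest_depths(lines, tab_width=4):
    """Return (depth, text) pairs for a run of '* item' lines.

    Illustrative sketch only: depth 0 is the outermost list.
    """
    stack = []          # indent recorded for each open list, outermost first
    prev_indent = None  # actual indent of the previous item
    result = []
    for line in lines:
        expanded = line.expandtabs(tab_width)
        indent = len(expanded) - len(expanded.lstrip(" "))
        text = expanded.strip()[2:]  # drop the leading "* "

        if not stack:
            # First item: open the top-level list.
            stack.append(indent)
        elif indent >= prev_indent + 4:
            # Rule 2: at least 4 more spaces than the previous item -> nested list.
            stack.append(indent)
        elif indent in stack:
            # Rule 3: exact match with an open list -> pop back to that list.
            innermost_match = len(stack) - 1 - stack[::-1].index(indent)
            del stack[innermost_match + 1:]
        elif len(stack) == 1:
            # Rule 4 (assumed reading): no match while in the top-level list,
            # so pretend the item was indented to start a sub list.
            stack.append(indent)
        # Otherwise (assumed reading): no match inside a sub list,
        # so the item simply joins the innermost open list.

        prev_indent = indent
        result.append((len(stack) - 1, text))
    return result


example = [
    "   * foo",
    "  * bar",
    " * baz",
    "* HATE",
    "  * flibble",
    "   * quxx",
    "    * nest?",
    "        * where",
    "      * am",
    "     * i?",
]
for depth, text in nest_depths(example):
    print("    " * depth + "* " + text)
```

Run on the second example, this prints the same nesting shown earlier: quxx pops back to the top level because its three-space indent matches foo, and where is the only item that nests via the 4-space rule.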

The third point above explains why in case VII quxx appears at the same level as foo: both are indented by exactly three spaces. It also screams out a desperate need for some cases to be specified as errors.

So now you know what you get when you have people implement a project in different languages without a sufficiently spec-like spec, or even some regression tests they could run: chaos.