Below is a collection of extensions to the Markdown syntax that I use on my site.
These rules all run as a pre-processing step before handing off to the Markdown parser. As such, some of them transform alternate syntax into an equivalent Markdown syntax so that I don't necessarily have to generate HTML in this step.
Short Link Syntax
I find the Markdown link syntax somewhat cumbersome. I much prefer the MediaWiki style of wrapping the link in square brackets, which is easier to type and easier to remember.
[http://example.com]
renders as example.com, automatically removing the "http://" or "https://".
To specify the text of the link, include text after a space:
[http://example.com Example]
renders as Example.
// Short link syntax, wrap links like [http://example.com] into [example.com](http://example.com)
$body = preg_replace('|(?<![!\\\])\[(https?://([^\s\]]+))\]|', '[$2]($1)', $body);
// If the text is prefixed with a backslash, don't replace it, and instead, un-escape.
$body = preg_replace('|\\\(\[(https?://([^\s\]]+))\])|', '$1', $body);
// Replace mediawiki-style links with markdown style
// [http://example.com Example] transforms to [Example](http://example.com)
$body = preg_replace('/#{negative_lookbehind}\[(http[^\] ]+) ([^\]]+)\]/', '[$2]($1)', $body);
// If the text is prefixed with a backslash, don't replace it, and instead, un-escape.
$body = preg_replace('/\\\(\[http[^\] ]+ [^\]]+\])/', '$1', $body);
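To see the four replacements working together, here is a minimal sketch that wraps them in a single function. The function name expandShortLinks and the constant SHORT_LINK_GUARD are hypothetical, and I'm assuming the MediaWiki-style rule uses the same lookbehind as the first rule, since the code above only refers to it through a placeholder.

```php
<?php
// Assumption: the same "not preceded by ! or \" lookbehind guards both rules.
const SHORT_LINK_GUARD = '(?<![!\\\\])';

// Hypothetical wrapper around the four short-link replacements.
function expandShortLinks(string $body): string {
    // [http://example.com] -> [example.com](http://example.com)
    $body = preg_replace('|' . SHORT_LINK_GUARD . '\[(https?://([^\s\]]+))\]|', '[$2]($1)', $body);
    // \[http://example.com] -> [http://example.com]
    $body = preg_replace('|\\\\(\[https?://[^\s\]]+\])|', '$1', $body);
    // [http://example.com Example] -> [Example](http://example.com)
    $body = preg_replace('/' . SHORT_LINK_GUARD . '\[(http[^\] ]+) ([^\]]+)\]/', '[$2]($1)', $body);
    // \[http://example.com Example] -> [http://example.com Example]
    $body = preg_replace('/\\\\(\[http[^\] ]+ [^\]]+\])/', '$1', $body);
    return $body;
}
```

Each rule is followed directly by its own un-escape step, so an escaped link is never seen in un-escaped form by the rule that would rewrite it.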
Variables
Since each page can define arbitrary data in the YAML front matter, it is useful to be able to print values from the YAML data inline in the page.
Example YAML content
---
title: Some Enhancements to Markdown
slug: some-enhancements
...
Now, #{title}
renders as: "Some Enhancements to Markdown"
Don't forget to support escaping this sequence with a backslash, \#{title}, so that we can include references like this in documentation.
// Look for any variables on the page enclosed with #{} and replace with their $page->var values.
// Look for #{} not prefixed with a backslash so we can escape if needed.
while(preg_match_all('|#{negative_lookbehind}#{([a-z0-9_]+)}|i', $body, $matches, PREG_OFFSET_CAPTURE)) {
  $var = $matches[1][0][0];
  $offset = $matches[0][0][1];
  $varText = $matches[0][0][0];
  // $this->meta($var) retrieves the meta value from the page
  if($this->meta($var, NULL) !== NULL) {
    $value = $this->meta($var);
  } else {
    // If not found, we include the escaped version of the string so
    // it gets printed with no replacements in the final output
    $value = '\#{'.$var.'}';
  }
  $body = substr_replace($body, $value, $offset, strlen($varText));
}
// Replace any escaped sequences \#{stuff} with #{stuff}
// (case-insensitive, to match the variable search above)
$body = preg_replace('/\\\(#{[a-z0-9_]+})/i', '$1', $body);
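For comparison, here is a minimal single-pass sketch of the same idea using preg_replace_callback instead of the loop above. The function name replaceVariables is hypothetical, a plain array stands in for the page's YAML data, and the backslash lookbehind is my assumption for the placeholder in the regex above. Unlike the loop, substituted values are not scanned again for further variables.

```php
<?php
// Hypothetical helper: $meta stands in for the page's YAML front matter.
function replaceVariables(string $body, array $meta): string {
    // Match #{name} unless it is preceded by a backslash (assumed escape rule).
    $body = preg_replace_callback('/(?<!\\\\)#\{([a-z0-9_]+)\}/i', function (array $m) use ($meta) {
        // Unknown (or null) variables are left exactly as written.
        return isset($meta[$m[1]]) ? (string) $meta[$m[1]] : $m[0];
    }, $body);
    // Un-escape: \#{name} -> #{name}
    return preg_replace('/\\\\(#\{[a-z0-9_]+\})/i', '$1', $body);
}

$meta = ['title' => 'Some Enhancements to Markdown', 'slug' => 'some-enhancements'];
echo replaceVariables('Now, #{title} lives at /#{slug}', $meta), "\n";
// prints: Now, Some Enhancements to Markdown lives at /some-enhancements
```

Returning the value from a callback also means a "$" or backslash inside a YAML value is inserted literally rather than being read as a replacement reference.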
Macros
Macros are small helpers used to generate larger HTML blocks. Two examples I find useful embed a YouTube or Vimeo video without pasting the full embed code from the respective site inline.
Macros are defined in the format:
![:macro params](value)
This syntax is based on the Markdown image syntax, so there is little new to remember.
For example, the YouTube and Vimeo macros look like this:
![:vimeo 600x400](13697303)
![:youtube 600x400](G-M7ECt3-zY)
This makes it easy to define new macros, and each macro can decide what kind of parameters it takes if any.
// Parse ![:vimeo]() tags
$body = preg_replace('|#{negative_lookbehind}!\[:vimeo (\d+)x(\d+)\]\(([^\)]+)\)|',
'<object width="$1" height="$2"><param name="allowfullscreen" value="true" /><param name="allowscriptaccess" value="always" /><param name="movie" value="http://vimeo.com/moogaloop.swf?clip_id=$3&server=vimeo.com&show_title=1&show_byline=1&show_portrait=0&color=&fullscreen=1" /><embed src="http://vimeo.com/moogaloop.swf?clip_id=$3&server=vimeo.com&show_title=1&show_byline=1&show_portrait=0&color=&fullscreen=1" type="application/x-shockwave-flash" allowfullscreen="true" allowscriptaccess="always" width="$1" height="$2"></embed></object>', $body);
// Parse ![:youtube]() tags
$body = preg_replace('|#{negative_lookbehind}!\[:youtube (\d+)x(\d+)\]\(([^\)]+)\)|',
'<iframe width="$1" height="$2" src="http://www.youtube.com/embed/$3" frameborder="0" allowfullscreen></iframe>', $body);
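If the number of macros grows, one regex per macro can get repetitive. Below is a minimal sketch of a single dispatcher that looks up a handler by macro name. This is not how the code above works; the function expandMacros, the $macros registry, and the backslash lookbehind are all my assumptions for illustration, and the handlers are simplified.

```php
<?php
// Hypothetical dispatcher: one regex finds every ![:name params](value) tag
// and hands it to the handler registered under that name.
function expandMacros(string $body, array $macros): string {
    // Assumed escape rule: skip tags preceded by a backslash.
    $pattern = '/(?<!\\\\)!\[:([a-z0-9_]+)(?: ([^\]]+))?\]\(([^\)]+)\)/i';
    return preg_replace_callback($pattern, function (array $m) use ($macros) {
        $name = strtolower($m[1]);
        if (!isset($macros[$name])) {
            return $m[0]; // unknown macro: leave the text untouched
        }
        // $m[2] is an empty string when the macro was written without params.
        return $macros[$name]($m[2], $m[3]);
    }, $body);
}

// Hypothetical registry: macro name => function(params, value) returning HTML.
$macros = [
    'youtube' => function (string $params, string $value): string {
        // Params are expected as WIDTHxHEIGHT, e.g. 600x400.
        [$width, $height] = array_pad(explode('x', $params, 2), 2, '0');
        return sprintf(
            '<iframe width="%d" height="%d" src="https://www.youtube.com/embed/%s" frameborder="0" allowfullscreen></iframe>',
            $width, $height, htmlspecialchars($value)
        );
    },
    'github' => function (string $params, string $value): string {
        // Simplified: a plain link rather than the full ribbon markup.
        return '<a href="https://github.com/' . htmlspecialchars($value) . '">Fork me on GitHub</a>';
    },
];

echo expandMacros('![:youtube 600x400](G-M7ECt3-zY)', $macros), "\n";
```

Each handler decides for itself how to interpret the params string, which keeps the "each macro can decide what kind of parameters it takes" property.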
GitHub Ribbons
Using the macro syntax defined above, we can create a macro to easily display a GitHub ribbon in the top right corner.
![:github](aaronpk/IndieAuth)
This results in the appropriate HTML being generated to place a ribbon in the top right corner. The value component of the macro should be the relative path to the project on github.com.
// Replace ![:github](aaronpk/Example) tags
$body = preg_replace('|#{negative_lookbehind}!\[:github\]\(([^\)]+)\)|',
'<a href="https://github.com/$1"><img style="position: absolute; top: 0; right: 0; border: 0;" src="https://s3.amazonaws.com/github/ribbons/forkme_right_gray_6d6d6d.png" alt="Fork me on GitHub"></a>', $body);
Relative Image Links
While Markdown has a satisfactory way to create an image tag, I've found it easier not to specify the full image path when embedding images in a post.
For example, normally to embed an image in this post I would need to specify the full path to the image, like so:
![Alt Text](/2012/244/article/1/image.png)
Entering the full path each time quickly gets cumbersome, so I created a shortcut:
![Alt Text](image.png)
The pre-processor automatically expands the tag to include the full path so the Markdown processor handles it properly.
$body = preg_replace('/#{negative_lookbehind}\!\[([^\]]+)\]\(([^\/\)]+)\)/', '![$1]('.$imgPath.'/$2)', $body);
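Here is a small sketch of that expansion as a standalone function, so the behavior is easy to check. The function name expandRelativeImages is hypothetical, $imgPath is whatever directory holds the current post's images, and the backslash lookbehind is my assumption for the placeholder in the regex above.

```php
<?php
// Hypothetical helper; $imgPath is the directory that holds this post's images.
function expandRelativeImages(string $body, string $imgPath): string {
    // Assumed escape rule: skip image tags preceded by a backslash.
    // The target may not contain a slash, so full paths and URLs are left alone.
    return preg_replace(
        '/(?<!\\\\)!\[([^\]]+)\]\(([^\/\)]+)\)/',
        '![$1](' . $imgPath . '/$2)',
        $body
    );
}

echo expandRelativeImages('![Alt Text](image.png)', '/2012/244/article/1'), "\n";
// prints: ![Alt Text](/2012/244/article/1/image.png)
```

One thing to watch: a macro tag such as ![:vimeo 600x400](13697303) has the same shape as a relative image, so this step presumably has to run after the macros have been expanded.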
Unescape Macros and Image Links
Since I wanted to be able to include some examples in this page that the pre-processor would normally catch, I needed a way to escape these sequences. This is the negative lookbehind sequence found at the beginning of all the regexes: #{negative_lookbehind}
After replacing patterns that do not start with the escape character, \, we need to remove the escape character from the beginning so the output text looks normal. Most of the other PHP code examples include this as the last step, but since macros and images share a similar structure, the unescaping has to happen after all macros and images have been expanded.
// Un-escape macros and image links
$body = preg_replace('/\\\(\!\[[^\]]+\]\([^\)]+\))/',
'$1', $body);
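To illustrate why the order matters, here is a minimal sketch that runs the image step and the un-escape step in both orders on an escaped example. The function names and the backslash lookbehind are assumptions for illustration; only the un-escape regex is taken from the code above.

```php
<?php
// Hypothetical image step, with an assumed backslash lookbehind as the escape rule.
function imageStep(string $body, string $imgPath): string {
    return preg_replace('/(?<!\\\\)!\[([^\]]+)\]\(([^\/\)]+)\)/', '![$1](' . $imgPath . '/$2)', $body);
}

// Un-escape step: \![...](...) -> ![...](...)
function unescapeStep(string $body): string {
    return preg_replace('/\\\\(!\[[^\]]+\]\([^\)]+\))/', '$1', $body);
}

$source = 'Write \![Alt Text](image.png) to embed an image.';
$path = '/2012/244/article/1';

// Expand first, un-escape last: the escaped example survives as written.
$correct = unescapeStep(imageStep($source, $path));
// Un-escape too early: the example is then expanded like a real image.
$wrong = imageStep(unescapeStep($source), $path);

echo $correct, "\n"; // prints: Write ![Alt Text](image.png) to embed an image.
echo $wrong, "\n";   // prints: Write ![Alt Text](/2012/244/article/1/image.png) to embed an image.
```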
Short Posts
There are a few pre-processing steps I only use for short-format notes, not on full-length articles. In my experience, these were catching too many false positives in articles, and for articles I don't mind spending a few extra seconds to mark some text as a link and such.
Hashtags
Typing a standard #hashtag
transforms to:
[#hashtag](http://www.google.com/search?q=%23hashtag&as_sitesearch=aaronparecki.com&tbs=sbd:1,cdr:1,cd_min:1/1/1999)
and renders as: #hashtag
The link destination is a Google site search which searches my domain for the hashtag. I may update this in the future to link to a page on my site showing posts with this hashtag, or link to a global Google search for the hashtag.
// Link each #hashtag that follows whitespace to a site search for that tag
$body = preg_replace_callback('/(?<=\s)#([a-z0-9_]+)/i', function($matches){
  $hashtag = '#' . $matches[1];
  return '[' . $hashtag . '](http://www.google.com/search?q=' . urlencode($hashtag) . '&as_sitesearch=aaronparecki.com&tbs=sbd:1,cdr:1,cd_min:1/1/1999)';
}, $body);
// Replace any escaped sequences \#hashtag with #hashtag
$body = preg_replace('/\\\(#[a-z0-9_]+)/i', '$1', $body);
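Wrapped in a function, the two hashtag replacements can be tried on a few inputs. This is only a sketch; the function name linkHashtags is hypothetical, while the regexes and the search URL are the ones shown above.

```php
<?php
// Hypothetical wrapper around the two hashtag replacements.
function linkHashtags(string $body): string {
    // Only a # that directly follows whitespace is treated as a hashtag.
    $body = preg_replace_callback('/(?<=\s)#([a-z0-9_]+)/i', function (array $matches) {
        $hashtag = '#' . $matches[1];
        return '[' . $hashtag . '](http://www.google.com/search?q=' . urlencode($hashtag)
            . '&as_sitesearch=aaronparecki.com&tbs=sbd:1,cdr:1,cd_min:1/1/1999)';
    }, $body);
    // Un-escape: \#hashtag -> #hashtag
    return preg_replace('/\\\\(#[a-z0-9_]+)/i', '$1', $body);
}

echo linkHashtags('Reading about #indieweb today'), "\n";
```

Because the lookbehind requires a whitespace character, a hashtag at the very start of the text is left unlinked, and the "#" inside a URL fragment is ignored as well.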
Usernames
Like on Twitter, I want to be able to type @username and have it automatically create a link to that person's URL.
For people who have their own domains, I've created a mapping of @usernames to their domains in a file. If I mention someone who doesn't have an entry, I just link to Twitter instead.
Example:
@brennannovak
transforms to:
[@brennannovak](https://brennannovak.com)
and renders as: @brennannovak
@username
transforms to:
[@username](https://twitter.com/username)
and renders as: @username
// Assuming the function linkForUser() knows how to retrieve a URL for a given username
$body = preg_replace_callback('/(?<=\s)@([a-z0-9_]+)/i', function($matches) {
  return '[@' . $matches[1] . '](' . linkForUser($matches[1]) . ')';
}, $body);
// Replace any escaped sequences \@username with @username
$body = preg_replace('/\\\(@[a-z0-9_]+)/i', '$1', $body);
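The lookup itself is not shown above, so here is one way it could look. This is a sketch under assumptions: the mapping is passed in as an array rather than read from a file, the extra $domains parameter and the linkUsernames wrapper are hypothetical, and usernames are matched case-insensitively.

```php
<?php
// Hypothetical lookup: an array stands in for the username-to-domain file.
function linkForUser(string $username, array $domains): string {
    // Fall back to a Twitter profile link when there is no entry.
    return $domains[strtolower($username)] ?? ('https://twitter.com/' . $username);
}

// Hypothetical wrapper around the two username replacements.
function linkUsernames(string $body, array $domains): string {
    $body = preg_replace_callback('/(?<=\s)@([a-z0-9_]+)/i', function (array $matches) use ($domains) {
        return '[@' . $matches[1] . '](' . linkForUser($matches[1], $domains) . ')';
    }, $body);
    // Un-escape: \@username -> @username
    return preg_replace('/\\\\(@[a-z0-9_]+)/i', '$1', $body);
}

$domains = ['brennannovak' => 'https://brennannovak.com'];
echo linkUsernames('Thanks @brennannovak and @username', $domains), "\n";
// prints: Thanks [@brennannovak](https://brennannovak.com) and [@username](https://twitter.com/username)
```

Since the "@" must follow whitespace, email addresses such as me@example.com are not turned into links.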
Further Reading
On this site I'm using the PHP Markdown Extra parser, which adds a few features to a standard Markdown parser. One of my favorites is the markdown="1" attribute on HTML blocks, which tells the parser to go ahead and parse Markdown inside the HTML tag. Rather than explain the rest of the additions, I'll point you to the PHP Markdown Extra documentation.
Feel free to contribute versions of these extensions in other languages and I'll include a link to them here, or embed alternate language versions inline above.