Wednesday, September 30, 2015

Creating PDFs from Markdown with Pandoc and LaTeX

creating PDFs

If you've read some of my previous posts on SitePoint or elsewhere, you may know that I'm working on a board game. In the game, called Chip Shop, you get to run a computer company in 1980s America.

As part of the project, I'm attempting to open source the entire game as much as possible. After several false starts, I've decided on a basic framework of Markdown for most of the game components—especially cards and the manual.

As the game's website uses Jekyll, the website for the game is generated from the Markdown files. I intend to have premium pre-boxed and print-yourself versions of the game, and to achieve this I need to generate PDFs from the Markdown files.

What I'm Trying to Accomplish

My ideal workflow is to generate the PDF files at the same time as generating the website, rather than generate the files as visitors request them. This rules out my usual option for PDF generation, wkhtmltopdf, as it generates PDFs from already generated HTML. Another reason it's not an option is that I want the PDF card versions to look different from the HTML pages, and Jekyll lacks any kind of view mode feature to accomplish this without resorting to complex CSS rules.

The Markdown template file for cards in the Chip Shop game contains a lot of Markdown front matter fields for game mechanics. Not all are used on every card. For convenience during printing, I need to fit as many cards on an A4 page as possible—in this case, a 3x3 grid. Eventually the pages will need to be double-sided, but I haven't implemented that yet.

Enter Pandoc and LaTeX

[author_more]

Any internet search looking for solutions to generating PDFs from Markdown will lead you down the Pandoc path. Pandoc is an open-source, Swiss Army knife markup conversion tool that supports a wide and growing variety of input and output markup formats.

To generate PDFs with Pandoc, LaTeX is needed. LaTeX has its roots in the scientific research community, and is a document declaration and layout system. Combining Pandoc and LaTeX allows us to use variables, and thus to generate PDFs from a series of Markdown files and support Markdown front matter.

Despite the power of Pandoc and LaTeX, I couldn't find any way of combining multiple PDFs (cards) onto one page, especially when using variables from Markdown files. After much research, I settled on PDFJam, a simple command line tool for this requirement.

Installing Dependencies

Markdown

You need no extra software for Markdown, except maybe an editor and there are so many of those, I suggest you read a few SitePoint posts to make your choice.

Jekyll

I'll continue to use Jekyll in my examples taken from my game to illustrate the build process, but it isn't an essential part of PDF generation if you don't need a website.

Pandoc

On my Mac, I installed Pandoc with Homebrew, but there are options for all operating systems.

LaTeX

There are lots of opinions on the best way to install LaTeX, depending on what you need or intend to do with it. A full installation of its common tools and libraries can near 2GB, but for most purposes a minimal installation will be enough. Read the project's download page to find the best option for you.

For this tutorial, we'll be using the xelatex engine, as I use custom fonts. But you can select any engine that supplies specific features you require.

PDFJam

Depending on how you installed LaTeX, you may have PDFJam installed already. (Check by typing which pdfjam in the terminal.) If you haven't, then find details on installation here.

Continue reading %Creating PDFs from Markdown with Pandoc and LaTeX%


by Chris Ward via SitePoint

No comments:

Post a Comment