In pandoc, a lot of similar functionality can be added through short filters that are applied when the files are processed. In the next few posts, I'll outline a few examples of short filters I've recently put together to smooth the path for writing chemistry in pandoc. Here's the first one: pandoc-chem-struct.

Citations, for example, are processed with the pandoc-citeproc filter: `pandoc --filter pandoc-citeproc myinput.txt`. To use this feature, you need to specify a bibliography file, either with the bibliography metadata field in a YAML metadata section or with the `--bibliography` command-line argument.
I use Pandoc to create Reveal.js presentations from Markdown documents, having switched to it from Google Slides. From the beginning I decided that presentations for different events and lectures should be assembled automatically from multiple files, so that individual slides can be maintained separately without having to update the same content in several places.
The current solution combines Codekit (using its Kit language) and pandoc. Individual parts of a presentation are prepared in an editor and then referenced in a single `.kit` file with @import statements. The resulting Markdown file is then processed by pandoc, which creates an HTML presentation (this last step is done by a Sublime Text build system).
I'd like to simplify this process with some kind of script that pre-processes the combined Markdown file automatically every time pandoc processes it. There are posts on StackExchange that refer to Haskell filters, but a Haskell installation is far too big for my small system (800 MB at minimum).
Is there a way to include files using some other programming language or trick? I know, for example, that it's possible to join several files by listing their names in the pandoc command, but that doesn't make the workflow smoother or faster.
2 Answers
In principle, you can write pandoc filters in any language, though Haskell is particularly well suited. The pandocfilters library makes it easy to write them in Python.
Here's a tutorial on pandoc filters. It contains a sample Haskell filter for include files, which should be pretty easy to translate to a Python filter using pandocfilters.
See also the directory of examples in the pandocfilters repository.
John MacFarlane
I finally figured out some ways to do the task.
The first is to use a pandoc filter written in Python that does includes (it works the same way as the Haskell filter described in the pandoc docs). However, it is currently adapted to work only with included code blocks, not with general pieces of content.
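For illustration, here is a minimal sketch of such an include filter. It is not the filter referred to above: it uses only Python's standard library rather than pandocfilters, and the function name `include_code_blocks` is made up for this example. It assumes the convention from the Haskell sample, where a code block carries an `include` attribute naming the file whose contents should replace the block's text.

```python
#!/usr/bin/env python3
"""Sketch of an include filter for code blocks (standard library only)."""
import json
import sys


def include_code_blocks(node):
    """Fill every CodeBlock that has an `include` attribute from that file."""
    if isinstance(node, list):
        for child in node:
            include_code_blocks(child)
    elif isinstance(node, dict):
        if node.get("t") == "CodeBlock":
            # A CodeBlock is stored as [[id, classes, key-value pairs], text].
            (_ident, _classes, keyvals), _text = node["c"]
            path = dict(keyvals).get("include")
            if path is not None:
                with open(path, encoding="utf-8") as handle:
                    node["c"][1] = handle.read()
        else:
            for child in node.values():
                include_code_blocks(child)
    return node


def main():
    source = sys.stdin.read()
    if source.strip():  # nothing to do when no document was piped in
        json.dump(include_code_blocks(json.loads(source)), sys.stdout)


if __name__ == "__main__" and not sys.stdin.isatty():
    main()
```

Assuming the script is saved as an executable file named `include_filter.py` (a hypothetical name), it would be run with `pandoc --filter ./include_filter.py input.md -o output.html`. Relative paths in `include` are opened as given, so they resolve against the directory pandoc is started from.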
The second way is to use an inline Perl script that is prepended to the build command (first seen here). This approach has proved useful and I'll stick to it for some time, because
- I'm not really good at Python and
- it allows for some handy search-and-replace tasks to be done, such as on-the-go replacing parts of paths of the images and included files.
Below is the command I use to produce a slideshow in Reveal.js format. This one is meant to be uploaded to a web host; there are other build variants, for example one that builds a self-contained slideshow file with pandoc's `--self-contained` option and one that collects all files related to the slideshow in a folder on the Desktop:
perl -ne 's/^#\((.+)\).*/`cat '${project_path/\//\\\//g}$1'`/e;s/\((\/_common\/img)/(\/presentations$1/g;print' ${file_base_name}.md > result.md && pandoc -s -t revealjs --variable revealjs-url=http://www.site.com/presentations/_common/resources/revealjs --css=http://www.site.com/presentations/_common/resources/customcss_sky.css -H ${project_path}/_common/resources/customhtml.html --highlight-style haddock result.md -o index.html && trash result.md
This command:
- Replaces all #(path/to/include) expressions (paths must be relative to project folder) with includes' contents;
- Replaces paths in images (relative to project folder) with server path to images directory;
- Outputs the resulting Markdown to a temporary file;
- Creates HTML slideshow with pandoc;
- Trashes the temporary file with Ali Rantakari's `trash` utility.
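The first two steps of that list (expanding includes and rewriting image paths) can also be sketched in Python. This is only an illustration of the same idea, not a translation the author used; the function names and the default prefixes are assumptions based on the description above.

```python
import re
from pathlib import Path

# A line that starts with "#(" and contains ")" is treated as an include,
# mirroring the pattern used by the Perl one-liner.
INCLUDE = re.compile(r"^#\((.+)\).*$", re.MULTILINE)


def expand_includes(markdown, project_dir):
    """Replace each #(path/to/include) line with the contents of that file."""
    def load(match):
        # Paths are relative to the project folder, so drop any leading slash.
        relative = match.group(1).lstrip("/")
        return (Path(project_dir) / relative).read_text(encoding="utf-8")

    return INCLUDE.sub(load, markdown)


def rewrite_image_paths(markdown,
                        old_prefix="/_common/img",
                        new_prefix="/presentations/_common/img"):
    """Point links that start with the project image folder at the server path."""
    return markdown.replace("(" + old_prefix, "(" + new_prefix)
```

The result of `rewrite_image_paths(expand_includes(text, project_dir))` would then be written to the temporary Markdown file that pandoc converts.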
Pandoc provides an interface for users to write programs (known as filters) which act on the intermediate AST. For more info see the filter tutorial and the Lua filter tutorial.
This page collects together third party filters which can be used to add functionality to pandoc.
Writing Filters
Filters can be written in any programming language. Pandoc wrappers and interfaces are available in the following programming languages to facilitate modification of the AST:
language | link | description |
---|---|---|
Python | pandocfilters | a library for writing pandoc filters in Python. |
Python | panflute | a pythonic alternative to pandocfilters, with batteries included (@jgm recommended this in pandoc-discuss). |
PHP | pandocfilters-php | a port of the python pandocfilters module to PHP to make writing filters in PHP easier. |
Node.js | pandoc-filter-node | a Node.js module for writing pandoc filters in JavaScript. |
Perl | Pandoc::Elements | a CPAN module for writing pandoc filters in Perl. |
Groovy | groovy-pandoc | a library for writing Pandoc filters in Groovy. |
Ruby | paru | a Ruby gem to write pandoc filters in Ruby. |
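Whichever library is used, a JSON filter is a program that reads pandoc's AST as JSON on standard input and writes a modified AST to standard output. The following sketch shows that mechanism using only Python's standard library; `walk` and `shout` are illustrative names for this example and are not taken from any of the libraries in the table.

```python
#!/usr/bin/env python3
"""Sketch of a bare-bones JSON filter that upper-cases all text."""
import json
import sys


def walk(node, action):
    """Apply `action` to every AST element (a dict with a "t" key), bottom up."""
    if isinstance(node, list):
        return [walk(child, action) for child in node]
    if isinstance(node, dict):
        node = {key: walk(value, action) for key, value in node.items()}
        if "t" in node:
            return action(node)
    return node


def shout(element):
    """Example action: upper-case the text of every Str element."""
    if element["t"] == "Str":
        return {"t": "Str", "c": element["c"].upper()}
    return element


def main():
    source = sys.stdin.read()
    if source.strip():  # nothing to do when no document was piped in
        json.dump(walk(json.loads(source), shout), sys.stdout)


if __name__ == "__main__" and not sys.stdin.isatty():
    main()
```

Assuming the script is saved as an executable file named `shout.py` (a hypothetical name), it would be applied with `pandoc --filter ./shout.py input.md -o output.html`. The libraries above mostly save you from writing `walk` and the JSON handling yourself.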
Other tools:
- vimhl, a vim plugin that makes vim syntax highlighting engine available in pandoc.
- pandoc-jats, a Lua custom writer for Pandoc generating JATS XML.
- pandocmeta.lua, a simple Lua package that converts Pandoc metadata types into a, possibly multi-dimensional, table.
Written Filters
A number of filters written in Lua are collected at https://github.com/pandoc/lua-filters.
Some other known 3rd party filters:
- Document (DOCX/ODT) related:
- Because DOCX and ODT files cannot use templates, we are limited in how we can transform metadata into document content. Several paru filters can help to solve this, given a metadata format involving authors with affiliation/correspondence fields and institute information: README; individual filters: simplifyMetadata, prependInstitute, prependKeywords, prependAbstract, prependComments; filters combined: prependAll.
- Images related:
- pandoc-svg, a pandoc filter to convert svg files to pdf by Jerome Robert.
- diagrams-pandoc for inserting images expressed in the Haskell diagrams DSL.
- mermaid-pandoc for inserting images expressed in mermaid syntax
- r-pandoc for inserting plots expressed in the R language
- paru-screenshot.rb for automatically taking a screen shot of a web page and including that shot as an image in a markdown file.
- Numbering related:
- Numerical reference to sections, using a specified sign (by default `#`) in internal links. Metadata can configure the sign and whether links should be preserved or converted to plain text.
- pandoc-fignos, for numbering figures and figure references.
- pandoc-eqnos, for numbering equations and equation references.
- pandoc-tablenos, for numbering tables and table references.
- pandoc-crossref, for numbering and cross-referencing figures, equations and tables
- pandoc-numbering, for numbering and cross-referencing any kinds of things such as examples, theorems, exercises and so on
- pandoc-listof, for creating lists of any kinds (deprecated)
- pandoc-amsthm: a pandoc amsthm package to define the use of amsthm through YAML front matter, targeting HTML and LaTeX output. For HTML, CSS counters are used and defined in a template (by the YAML variables). For LaTeX, the amsthm package is used and defined in a template (by the YAML variables).
- Math related:
- pandoc-mathjax-filter rendering math to SVG using mathjax-node
- mathjax-pandoc-filter rendering math to SVG using mathjax-node
- asciimathml-pandocfilter: to add read support for AsciiMathML syntax through conversion into LaTeX
- pandoc-unicode-math replaces Unicode math symbols and Greek letters like ∀, ∈, →, λ, or Ω in math environments by equivalent LaTeX commands like `\forall`, `\in`, `\rightarrow`, `\lambda`, or `\Omega`.
- SugarTeX is a more readable LaTeX language extension and transcompiler to LaTeX. Fast Unicode autocomplete in Atom editor via SugarTeX Completions for Atom.
- LaTeX related:
- pandoc-latex-environment, for adding LaTeX environments on specific HTML `div` tags
- latexdivs.py: defines a syntax to turn any native pandoc Div into a LaTeX environment: if `latex='true'` is in the attributes of the Div, the first class is used to define the LaTeX environment.
- pandoc-latex-tip, for decorating specific `span`, `code`, `div` and `codeblock` elements with icons taken from popular icon collections.
- pandoc-latex-admonition, for decorating specific `div` and `codeblock` elements with admonitions
- pandoc-latex-barcode: inserts a barcode or a QR code into a LaTeX/PDF document.
- pandoc-latex-fontsize, for specifying LaTeX font size on `span`, `code`, `div` and `codeblock` elements
- pandoc-latex-color, for specifying X11 colors on `span` and `div` elements
- pandoc-latex-unlisted, for unlisting some specific headers in the table of contents
- pandoc-latex-newpage, for converting horizontal rules into new pages
- pandoc-latex-french-spaces, for dealing with French spaces around some punctuation marks
- pandoc-latex-margin, for setting left and right margins on `div` and `codeblock` elements
- pandoc-beamer-block, for using the `block`, `alertblock` and `exampleblock` environments defined in beamer.
- RAW related:
- Pandoc filter to insert arbitrary raw output markup as Code/CodeBlocks with an attribute `raw=<outputformat>`.
- Include Files: finds all the inline code blocks with attribute include, and replaces their contents with the contents of the file given
- pandoc-dot2tex-filter - a filter that converts dot notation to PGF/TikZ graphics for latex/pdf rendering.
- HTML comment to LaTeX comment: a filter that converts HTML comments to LaTeX comments
- Tables related:
- pandoc-csv2table for including referenced csv files in markdown as markdown rendered tables.
- pandoc-placetable, a lightweight implementation of the idea behind the above `pandoc-csv2table` (e.g. doesn't necessarily require pandoc as a cabal dependency)
- ickc/pantable: CSV Tables in Markdown: Pandoc Filter for CSV Tables: a Python alternative to the above 2 filters, using panflute, with some enhancements (e.g. auto-width, fractional width, etc.)
- Creating a link table at the end of your document.
- Text related:
- pandoc-abbreviations allows the use of arbitrary abbreviations, defined in an abbreviations file or in the source document's YAML header, which are replaced on processing. Useful for maintaining consistency of terminology etc.
- pandoc-lang automatically detects the (natural) language of text, as well as the programming language of code blocks
- pandoc-mustache replaces variables like `{{varname}}` in a pandoc document with their values, which are stored in a separate YAML file.
- pandoc-quotes.lua and the older pandoc-quotes replace non-typographic quotation marks with typographic ones for languages other than US English.
- Running Code related:
- R-pandoc for generating R plots
- filter_pandoc_run_py for executing Python code written in code blocks and embedding the printed output and pyplot figures
- pandoc-pyplot to generate and embed Matplotlib figures based on code blocks in documents. Easy integration with Haskell libraries (e.g. Hakyll)
- Knitty: is a Pandoc filter for reproducible reports via Jupyter and Pandoc (Stitch's fork that is a Knitr-RMarkdown-like lib). Insert Python code (or other Jupyter kernel code) to the Markdown document or write in plain Python/Julia/R/any-kernel-lang with block-commented Markdown and have code's results in the Pandoc output document.
- Others:
- Adding support for indexing with the syntax `(# term, subterm)` in HTML and LaTeX
- lablinkfix updates links to the Swedish Labour Movement Archives and Library catalogues.
- second-date changes `date` metadata to a different strftime format using Python's dateutil.
- pandoc_abnt allows specifying the source of images and tables, and automatically corrects Alineas according to the Brazilian standard for academic writing (ABNT NBR 14724:2011).
- pandoc-refheadstyle sets a custom style for the reference section header; there's a Lua version, too: pandoc-refheadstyle.lua
- nheengatu provides several resources for publishing multimedia content through formats such as LaTeX, HTML and EPUB.
- pandoc-zotxt.lua looks up sources for citations in Zotero.
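Many of the filters listed above follow the same pattern: find one kind of element in the AST and rewrite it. As a sketch of that pattern, the following standard-library Python filter does something similar in spirit to the pandoc-unicode-math entry. It is not that filter's implementation; the symbol table is limited to the five examples given in the entry, and the names `SYMBOLS`, `to_latex` and `convert_math` are made up for this illustration.

```python
#!/usr/bin/env python3
"""Sketch: replace a few Unicode symbols inside math by LaTeX commands."""
import json
import sys

# Deliberately tiny table; a real filter would cover far more symbols.
SYMBOLS = {
    "∀": r"\forall",
    "∈": r"\in",
    "→": r"\rightarrow",
    "λ": r"\lambda",
    "Ω": r"\Omega",
}


def to_latex(text):
    """Replace known symbols; the trailing space keeps commands from merging with letters."""
    for symbol, command in SYMBOLS.items():
        text = text.replace(symbol, command + " ")
    return text


def convert_math(node):
    """Rewrite the source of every Math element, leaving ordinary text alone."""
    if isinstance(node, list):
        for child in node:
            convert_math(child)
    elif isinstance(node, dict):
        if node.get("t") == "Math":
            # A Math element is stored as [math type, TeX source].
            node["c"][1] = to_latex(node["c"][1])
        else:
            for child in node.values():
                convert_math(child)
    return node


def main():
    source = sys.stdin.read()
    if source.strip():  # nothing to do when no document was piped in
        json.dump(convert_math(json.loads(source)), sys.stdout)


if __name__ == "__main__" and not sys.stdin.isatty():
    main()
```

Only `Math` elements are touched, so a λ in running text is left unchanged.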