Cool Emacs

Introduction to Literate programming

Abstract

In this article I give a short, theoretical introduction to the idea of literate programming.

Introduction

Literate programming is not a subject that comes up often, unless you are talking with an Emacs enthusiast. There is a significant chance that most programmers don’t even know the term. Let’s fix that with a quote:

Literate programming is an approach to programming which emphasises that programs should be written to be read by people as well as compilers. From a purist standpoint, a program could be considered a publishable-quality document that argues mathematically for its own correctness. A different approach is that a program could be a document that teaches programming to the reader through its own example. A more casual approach to literate programming would be that a program should be documented at least well enough that someone could maintain the code properly and make informed changes in a reasonable amount of time without direct help from the author. At the most casual level, a literate program should at least make its own workings plain to the author of the program so that at least the author can easily maintain the code over its lifetime.

– Christopher Lee [1]

The idea, therefore, is to write a single entity from which one can extract either code or documentation: a semi-abstract in-between called a “web” [2]. The process of extracting the code is called “tangling”, and generating the document is called “weaving”.

An example

Let’s say we want to show the reader how to install DWM. We can create a document in the following style:

DWM is a window manager that can be changed only via source code modification. Here, we will fetch and compile it. First, we need to download the tarball:

wget https://dl.suckless.org/dwm/dwm-6.4.tar.gz

Then, simply extract it:

tar -xvzf dwm-6.4.tar.gz

And then we compile it:

cd dwm-6.4
doas make clean install

After the compilation finishes, add the executable to your .xinitrc:

echo "exec dwm" >> ~/.xinitrc

So yeah, it’s a blog post. A blog post which one can execute. The example assumes shell, but the actual language can be anything. We can tangle C code without any problems.
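To make tangling and weaving a bit more concrete, here is a minimal sketch in shell. It assumes a made-up format in which code lines start with "> "; the `tangle` and `weave` functions and the `demo.web` file are hypothetical names used only for illustration. Real tools (WEB, noweb, Org Babel) have their own syntax and do far more than this.

```shell
#!/bin/sh
# Toy "web": prose lines plus code lines marked with a leading "> ".
# The "> " marker is an assumption made up for this sketch, not a real tool's syntax.
workdir=$(mktemp -d)

cat > "$workdir/demo.web" <<'EOF'
First we greet the reader.
> echo "hello"
Then we say goodbye.
> echo "bye"
EOF

# Tangle: keep only the marked lines and strip the marker, leaving pure code.
tangle() { sed -n 's/^> //p' "$1"; }

# Weave: keep every line, indenting code so it stands apart from the prose.
weave() { sed 's/^> /    /' "$1"; }

tangle "$workdir/demo.web" > "$workdir/demo.sh"
weave "$workdir/demo.web" > "$workdir/demo.txt"

# The tangled file is an ordinary script; running it prints "hello" then "bye".
sh "$workdir/demo.sh"
```

The point of the sketch is only that both outputs come from the same source file: `demo.sh` is what the machine runs, `demo.txt` is what a person reads.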

Literate programming

This is not the way we do programming. We smack spaghetti code together, add a random sentence here and there, commit it as “bug fix” and voilà! In a few months no one knows what’s going on. Success, up to the next JIRA task.

Very often code comments are treated as harmful or (at best) as a necessary evil. We think that code should be self-documenting. And this is completely valid. A developer needs to understand what given code does just by reading it. If your function is so convoluted, nested and complicated that it’s impossible to comprehend without a descriptive comment, then the function itself needs fixing.

But this is not the whole story. A function may be very simple, but there is always context in which it is used.

Literate programming promotes telling a story to the reader. You are free to narrate, giving all the extra information in one place. Since the code coexists with the documentation, the reader gets the whole picture.

Conclusion

Now, this is not a generic fix for all programs. We work on massive systems with hundreds of intertwined, moving parts. It is impossible to create a cohesive narrative when the program jumps all over the place.

Literate programming, however, found a different home. It is loved by scientists (just look at Jupyter Notebooks [3]) who use it for reproducible research. We, among the Emacs crowd, use it extensively for literate configuration of our environments. It could be used for scripts, runbooks, debugging logs and so on. Wherever one can see logical points A, B and C, we can explain how they connect.

You can learn more (including a much better example) by reading Knuth’s original paper.


  1. “Literate Programming – Propaganda and Tools”, Christopher Lee, 1997

  2. This name was chosen because, at the time, it was not yet in use in computing. We’re dealing with history here!

  3. I know that Jupyter is not strictly a literate program, but it’s close enough.

