Static site generator in Batch
[2014-04-23] [web][exotic]
Table of contents:
My affair with Batch
Well, what can I say. Around 2012-2014 I got fond of Batch. Maybe because it’s available on every Windows machine, honestly I don’t really know. Somehow I liked things that work out of the box so I started writing small scripts and later some more complicated ones.
Because of this interest, I soon created a guide to Batch for beginners, that you can read here. (The only problem is that it’s written in hungarian)
Motivation
Apparently I had just registered on GitHub and wanted to create my own Jekyll in… Batch. The specification is pretty simple. It has to parse raw text files and create one big html file filled with all the articles. Apparently I did not bother with creating different pages for different articles.
The actual generator
Many years have passed since I touched Batch (note that these articles are more like necrologs) so explaining how it works is going to be a little harder, especially because we’re talking about Batch. The grammar is just exceptionally alien to me.
The root path of the repository contains 4 files and 1 directory like so:
articles
_foot.html
_head.html
_sitebuild.bat
Here we can see that the only command we can issue is .\_sitebuild.bat
, the rest of the files are just content that the command reads and writes. Let’s inspect this dreaded script.
First 5 commands
@ECHO off
REM EVAL IN TWO ROW:
SETLOCAL EnableDelayedExpansion
REM SET UNICODE:
CHCP 65001
On the first 5 lines we initialize things.
It’s kind of a bootstrapping that you have to do at the beginning of a more complex script.
The first line, or more specifically ECHO off
tells Batch to turn off outputting every
command that is executed. This is not the same as > /dev/null
because we’re talking about
the commands not their output.
“Then why is there an @ symbol?” you might ask.
Because @ temporarily turns off the command output.
To be able to silence the command that silences the rest of the commands you need to use the
inline ‘silencer’ symbol. And ladies and gentlemen this is the moment
I gave up on not being sarcastic about explaining this script. Even the first row is such a joke. Anyway.
You know Unix shell scripts use commands to imitate conditions, loops, etc. Batch goes as far
so you can issue a command to create a comment. It’s called REM
and it literally
has no meaning, it’s just a placeholder command for comments. Of course, you can use
::
which is an actual comment feature in Batch, but for some reason I did not use it.
You know what? Let’s peek into the implementation of Batch and see how they implemented REM
.
After some digging I found the relevant section. Apparently the command is assigned to the NullHandler.
Here’s the implementation of the NullHandler:
/*
* Do nothing. Return TRUE so the the command will not
* be considered as bad.
*/
int NullHandler (char *lpParam)
{
return(TRUE);
}
Understandable, have a nice day.
The next command is SETLOCAL EnableDelayedExpansion
. Oh boy. Why did I even start this report?
So SETLOCAL
is supposed to create a new scope for environment variables, meaning, you can
manipulate %PATH%
until the execution reaches the ENDLOCAL
command.
(You can read about it at Microsoft
or at ss64.com. The latter one is a little bit more helpful)
You could say that it’s a nice round feature, time to ship it. But no. Bill Gates had other plans.
If you add the flag EnableDelayedExpansion
, all hell breaks lose and you acquire the power of evaluating variables multiple times inside commands!
Commands before the FOR loop
SET ARTICLES_DIR=articles
SET HEAD=_head.html
SET FOOT=_foot.html
SET GITHUB_USER=jbebe
TYPE %HEAD% > index.html
ECHO ^<div id="banner"^>%GITHUB_USER%/blog^</div^> >> index.html
The first 4 SET
s are self explanatory, but then we meet TYPE
.
Is it a file type tester? Does it type the arguments on the output?
You’re so far from the answer. TYPE
is the cat
of Batch.
So basically we cat the content of _head.html
into index.html.
Then on the next line we’re escaping some inequality symbols and append the blog title into the index file. Nothing too fancy.
The FOR loop
FOR /F %%x IN ('DIR /B /D /O-D %ARTICLES_DIR%') DO (
SET entry=%%x
SET entry_date=!entry:~0,10!
SET entry_title=!entry:~11!
SET entry_title=!entry_title:_= !
ECHO ^<div class="date"^>!entry_date!^</div^>^<div class="line"^>^</div^> >> index.html
ECHO ^<h1^>^<a href="#%%x" id="%%x"^>!entry_title!^</a^>^</h1^> >> index.html
ECHO. >> index.html
TYPE %ARTICLES_DIR%\%%x >> index.html
ECHO.^<br/^>>> index.html
)
Beware, we are reaching PhD grade Batch scripting territory.
For loops receive parameters.
If you want to iterate over files, you use FOR %%i IN (dir ...)
.
If you want to iterate over file contents, you need to add the /F switch too.
This way, %%i
will contain the file content instead of just the name.
Let’s see the “IN..” part of the for loop. You might be wondering why I had a stroke so late into writing this script. But fear not as those flags for the dir command are actually useful.
/B
removes the heading, file size and summary/D
aligns file names so each one is on a new line/O-D
sorts the files by creation date
We iterate over the articles directory and set the file content (entry
) and other properties.
This is the moment you have to ask what those exclamation marks are.
Why didn’t I use percent signs as before?
It’s because the whole for loop in Batch is a single command.
In a single command you can only evaluate a variable once.
Why? That’s why:
The SET command was first introduced with MS-DOS 2.0 in March 1983, at that time memory and CPU were very limited and the expansion of variables once per line was enough.
Moving on, !entry:~0,10!
means take the variable entry
, and substring it from 0 to 10.
entry_title=!entry_title:_= !
means replace all underscore characters with spaces.
We create another html fragment with the given information and then write it to the index.html with the content as well.
Closing Pandora’s box
TYPE %FOOT% >> index.html
For the last step, we cat the footer into the html and the blog is created.
Final output

Screenshot of the blog