How this blog works, or, embracing chaos

Posted Friday, May 13th, 2022

Once I decided to get back into blogging, I first went down a deep rabbit hole of researching what to use. It is not as if I didn’t have a website, it is just that I treated my websites as temporary. Every couple of years I would burn it all down and create a new one on the same domain. I wish I had kept the old data, but the trashing usually happened while migrating to new servers or services, and that is a story for another time. What I ended up finding was all the IndieWeb stuff, and it made me fall in love with blogging again. I was ready to be in control of my own platform, and what is better than being in control of your own CMS source code, am I right? No.

At the time I was completely obsessed with the Racket programming language and its offspring Pollen, a very clever static site generator used to write many amazing books such as Beautiful Racket. Pollen is amazing, but it is not geared towards blogging. To make it work, I collated code and techniques from other people who also use it to run their blogs into something that I kinda have some agency over, but not as much as I’d like. I understand all that is going on until we reach the Pollen source code itself, then it is all magic. So, let’s talk about how this blog actually works.

This blog is a static site generated using Pollen. A folder hosts all the files that make up the source of the blog, and a clever build script goes over it using Pollen and assembles the HTML and other text files.

Screenshot of a Finder window showing the source folder for this blog.

The selected file in the screenshot above is the post I made yesterday. An HTML file is created next to the original source file once Pollen processes it. Only the HTML files are published to the website; the source files remain outside of the web server root folder.

Assembling collection pages such as the index, the RSS feeds, and tag collections would be very slow if the blog had to traverse all text files every time to find their tags and dates. A clever trick is employed to solve that. Every time a post is rendered to HTML, metadata about it is inserted into a SQLite database. This database is not crucial and can be deleted at any time; it will simply be recreated from scratch once the build script runs again. All pages that need to assemble collections of posts query that database instead of traversing the filesystem. It is a bit fragile, and it can get out of sync with what is on disk if the render pipeline ends up rendering a post after rendering a collection.
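To make the idea concrete, here is a minimal Python sketch of such a disposable metadata index. It is not the Racket code this blog runs: the table layout and the function names (`open_index`, `record_post`, `latest_posts`, `posts_tagged`) are assumptions made up for illustration.

```python
import sqlite3


def open_index(path=":memory:"):
    """Open (or create) the throwaway index. The schema is hypothetical."""
    db = sqlite3.connect(path)
    db.executescript(
        """
        CREATE TABLE IF NOT EXISTS posts (
            path TEXT PRIMARY KEY, title TEXT, published TEXT);
        CREATE TABLE IF NOT EXISTS tags (
            path TEXT, tag TEXT, UNIQUE (path, tag));
        """
    )
    return db


def record_post(db, path, title, published, tags):
    """Called whenever a post is rendered; replaces any stale rows for it."""
    with db:  # one transaction: commit on success, roll back on error
        db.execute(
            "INSERT OR REPLACE INTO posts VALUES (?, ?, ?)",
            (path, title, published),
        )
        db.execute("DELETE FROM tags WHERE path = ?", (path,))
        db.executemany(
            "INSERT INTO tags VALUES (?, ?)", [(path, tag) for tag in tags]
        )


def latest_posts(db, limit=10):
    """What an index or feed page would ask for: newest posts first."""
    rows = db.execute(
        "SELECT path, title FROM posts ORDER BY published DESC LIMIT ?",
        (limit,),
    )
    return rows.fetchall()


def posts_tagged(db, tag):
    """What a tag collection page would ask for."""
    rows = db.execute(
        "SELECT p.path, p.title FROM posts p JOIN tags t ON t.path = p.path"
        " WHERE t.tag = ? ORDER BY p.published DESC",
        (tag,),
    )
    return rows.fetchall()
```

Because every row is derived from a rendered post, deleting the database file loses nothing: the next full build fills it again. The sketch assumes dates are stored as ISO strings (`2022-05-13`) so that sorting them as text also sorts them chronologically.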

That made me create an overly complicated makefile to rebuild all the collection pages if any post page is touched, etc. Still, mistakes creep in, especially if the rendering process somehow crashes and has to be restarted. I don’t mind it much anymore, I’ve learned to embrace a bit of chaos. It is only by accepting a bit of disorder in your life that you can have an effective blog if you’re a software developer. Without that you’ll keep working and reworking your CMS and system until it is perfect and will never actually write any posts. Just ask the developers you know how many posts they made in the last two months and how many changes they made to their blog source code in the same period… the answer will be a revelation to both of you.

If you’re a returning reader of this blog, you’ll be aware that I’m a bit old-school. I’m not on the bandwagon of CI/CD deployments and having a cloud system spin up I don’t know how many containers and tests just to render my grammar mistakes for the world to see. For most of the life of this current incarnation of my blog, it has lived quite happily on my computer. I’d create posts using Sublime Text and build the blog by issuing commands in a terminal. That workflow served me well for a long time. It even followed me as I moved from a Surface Pro 4, to a Surface Go, to a Surface Pro X, and finally to the current MacBook Air. It was a refreshingly simple setup.

And soon, it was not enough. As I moved back into the Apple ecosystem, I ended up using more and more mobile devices. Suddenly I had an iPhone and an iPad, and oh my, I love the iPad so much (really unexpected, I thought it would just be a tool to carry my notes). I found myself leaving the laptop at home and writing using either the iPad or my Freewrite Traveller. How could I post then? The source code for the site was locked inside my home computer.

To solve that, I migrated the source to the VPS that is hosting the blog. There is a copy of the source both on the VPS and on my home machine. Each copy is self-contained and enough to regenerate the whole site. They are synchronised using Git, but I don’t do any clever Git stuff: the script just regenerates all the site files, adds them to Git, and sends them over. Once Racket and Pollen were working well on the server, I could create posts by logging in over SSH and doing the exact same thing as I did on the MacBook: write a new text file and run a script. That is not my idea of fun.

I turned again towards the IndieWeb people and implemented a subset of Micropub. I wrote about that yesterday. It enabled me to post from my iPhone and iPad. Basically, all the Micropub server does is accept the h-entry creation request, write the corresponding text file, and run the same script I used to run by hand. Eventually, I got tired of Micropub. I’m much more familiar with the metaWeblog API, so I switched to implementing a server for that protocol instead.
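The "accept the request, write the file, run the script" flow is small enough to sketch in a few lines of Python. This is only an illustration, not the Racket server behind this blog: `handle_create`, the timestamp-based file name, and the `build_cmd` argument are all assumptions. The only parts taken from the Micropub protocol itself are the form-encoded `h=entry` and `content` fields.

```python
import subprocess
from datetime import datetime, timezone
from pathlib import Path
from urllib.parse import parse_qs


def handle_create(body, posts_dir, build_cmd):
    """Turn a form-encoded Micropub create request into a source file.

    body      -- the raw request body, e.g. "h=entry&content=Hello"
    posts_dir -- folder that holds the post sources (hypothetical layout)
    build_cmd -- the rebuild command as a list, e.g. ["make", "all"]
    """
    form = parse_qs(body)
    if form.get("h", [""])[0] != "entry":
        raise ValueError("only h=entry creation is supported")
    content = form.get("content", [""])[0]
    if not content:
        raise ValueError("missing content")

    # Hypothetical naming scheme: a UTC timestamp plus the extension of a
    # Pollen Markdown source that renders to HTML.
    stamp = datetime.now(timezone.utc).strftime("%Y%m%d-%H%M%S")
    target = Path(posts_dir) / f"{stamp}.html.pmd"
    target.write_text(content + "\n", encoding="utf-8")

    # Same script that used to be run by hand; raise if the build fails.
    subprocess.run(build_cmd, check=True)
    return target
```

A real endpoint would also have to check the access token and answer with `201 Created` plus a `Location` header pointing at the new post; both are left out to keep the sketch short.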

Now, the blog is still a statically generated site. It is assembled by Pollen with lots of bespoke code via a makefile and some shell scripts. There are two servers running, one accepting a subset of the Micropub protocol and the other supporting 99% of the metaWeblog API (I forgot to implement metaWeblog.getUsersBlogs). If I submit a post through either of those protocols, the server creates the corresponding text file in the correct location and invokes a massive shell script that amounts to “please, rebuild the whole site”, because any kind of more targeted incremental build script felt too fragile. Be aware that it skips files that didn’t change, so it is still incremental; it just attempts to build everything.
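"Attempt everything, skip what didn’t change" is the same timestamp comparison that make performs. A minimal Python sketch of it follows; the helper names (`output_for`, `is_stale`, `rebuild_all`) and the `render` callback are assumptions for illustration, not the actual makefile rules.

```python
from pathlib import Path


def output_for(source):
    """Map a source to its output by dropping the last extension,
    e.g. post.html.pmd -> post.html (the output lives next to the source)."""
    return Path(source).with_suffix("")


def is_stale(source, target):
    """True when the target is missing or older than its source."""
    target = Path(target)
    if not target.exists():
        return True
    return Path(source).stat().st_mtime > target.stat().st_mtime


def rebuild_all(sources, render):
    """Visit every source, but only call render(source, target) when needed.

    Returns the list of sources that were actually rendered.
    """
    rendered = []
    for source in sources:
        target = output_for(source)
        if is_stale(source, target):
            render(source, target)
            rendered.append(source)
    return rendered
```

The trade-off is that nothing tries to work out which pages a new post affects: every file gets its timestamp checked, and only that cheap check decides whether any rendering happens.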

And here we are, this is the stack running this blog:

Racket: the programming language that powers all the bespoke code running in the blog.
Pollen: the static site generator used to assemble all the text files (HTML, CSS, RSS, XML, etc).
A custom Micropub server built with Racket.
A custom metaWeblog API server built with Node.js.
Caddy web server.
Alpine Linux.

If you have any questions, just AMA on Mastodon or Twitter.

Those among you who know Racket and Pollen and have been reading my posts might be wondering how I am posting to the blog using metaWeblog, and how that can play well with Pollen markup?! Well, it kinda can’t. MetaWeblog really wants to work with HTML fragments, while Pollen has two markups: Pollen Markup and Pollen Markdown (which is not exactly Markdown). What I’m doing is posting either HTML fragments or plain text. In both cases, the file is saved in Pollen Markdown format and processed as such. That leaves all the basic HTML tags I use when editing rich text in MarsEdit in place, while still enabling me to switch to plain-text mode and write the correct markup if needed.

What I can’t do is edit the post as Pollen Markdown, mostly because MarsEdit is breaking the newlines, so if I need to edit a post, I’ll edit it as an HTML fragment. Initially the post is sent to the server as either an HTML fragment or Markdown text, and it is rendered into HTML. If I try to edit it, what I’ll edit is the rendered HTML fragment, which is then saved back to the original file. It is not ideal, but as I said before, you either embrace the chaos or you’ll use 100% of your time fixing bugs in your CMS instead of writing.
