How to Manage a Website, Destructively

Posted by Daniel Lyons on July 13, 2009

For a little while I was thinking about managing my website with this before I decided to use Webby instead. I changed my mind for three reasons:

  1. My system sucks at syntax highlighting. With Webby, you just use Ultraviolet.
  2. There isn’t a good way to do escape characters in m4, so some blog posts break spontaneously.
  3. Webby somehow manages to be faster.

I’m not sure I’m going to abandon this idea completely though. It might be a good project to do with Plan 9. If I remake the site in Clojure then I probably will go for something more dynamic though.

Anyway.

Conception

This project is a hate crime.

I hate websites. Especially I hate software for managing them which is too limited and doesn’t build on the fundamental flexibility of the platform. I also hate reinventing the wheel.

It occurred to me as I was playing with Webby that I hate Rake, because it’s slower than real make and less powerful and because I hate Ruby now too.

Why not use make? Make has a great advantage over other systems: you can write pattern rules that generate some file type from some other file type and now the whole thing knows how to transform all your files from one thing to another. That’s handy.

When I think about what everyone needs in a website, it’s really only a system to get some text into a giant ball of HTML and stuff that into their template somehow. Perhaps also doing some other template-like crap at the same time. There are lots of great tools for managing your HTML glorp. Unfortunately many of them are Ruby-specific, but there’s Textile, Markdown, Haml and others.

Interestingly, you only really want to deal with Haml for the template and Textile or Markdown for the structure. Textile looks more like plain text to me than Markdown but Markdown is better at dealing with code and nested stuff—I seem to have to resort to raw HTML a lot less, and it’s bi-directional. I like having choices.

So this seemed like my formula: glorp + template = HTML.

A Twist

How do you deal with things like the title? The title ostensibly appears in the glorp somewhere but it also needs to wind up at the top of the file, in the <title> tag.

This seems like metadata to me. So the new formula looks like this: glorp + metadata + template = HTML.

A Taste

  1. bin/build-rules.sh sites
  2. make

            [fusion@Anatolia Makesite] % make rsync
            haml template.haml > template.m4.pre
            cat header.m4 template.m4.pre > template.m4
            rm template.m4.pre
            redcloth site/about-me.textile > site/about-me.phtml
            ruby bin/metadata.rb site/about-me.phtml > site/about-me.meta.m4
            m4 -D __metadata__=site/about-me.meta.m4 -D __body__=site/about-me.phtml template.m4 -D __dir__=site > site/about-me.html
            ruby bin/webby2ms.rb m4 site/how-it-works.webby > site/how-it-works.meta.m4
            ruby bin/webby2ms.rb content site/how-it-works.webby > site/how-it-works.textile
            redcloth site/how-it-works.textile > site/how-it-works.phtml
            m4 -D __metadata__=site/how-it-works.meta.m4 -D __body__=site/how-it-works.phtml template.m4 -D __dir__=site > site/how-it-works.html
            perl bin/Markdown.pl --html4tags site/index.markdown > site/index.phtml
            ruby bin/metadata.rb site/index.phtml > site/index.meta.m4
            m4 -D __metadata__=site/index.meta.m4 -D __body__=site/index.phtml template.m4 -D __dir__=site > site/index.html
            cp -r static output
            bin/install.sh output  site/about-me.html site/how-it-works.html site/index.html
            site/about-me.html -> output//about-me.html
            site/how-it-works.html -> output//how-it-works.html
            site/index.html -> output//index.html
            rsync -vrzp -e ssh --chmod=u+rwX,go+rX output/* clanspum.net:sites/beta.storytotell.org
            sending incremental file list
            about-me.html
            how-it-works.html
            index.html
            css/style.css
            downloads/
            downloads/makesite.tar.gz
            images/.DS_Store
            images/accept.png
            images/accept50.png
            images/bg_body.png
            images/bullet_bottom.png
            images/bullet_right.png
            images/content_bg.png
            images/email_open.png
            images/email_open50.png
            images/h1_bubble_bg.png
            images/h2_bubble_bg.png
            images/meta.png
            images/page_edit.png
            images/page_edit50.png
            images/sidebar_section_bg.png
            images/sidebar_section_bg_over.png
            images/tag.png
            images/title_bg.png
            images/user.png
            images/user50.png
            images/world_link.png
            images/world_link50.png
            
            sent 94105 bytes  received 1851 bytes  38382.40 bytes/sec
            total size is 234097  speedup is 2.44
        

Templates

For the final substitution, I don’t need much; all I really need is a way to substitute some metadata variables and swap in the content into the big hole in the middle of the template. Suddenly I remembered M4, the ancient Unix macro processor. Yeah, it’s fugly and gross, but like make, it isn’t going anywhere and it’s definitely powerful enough.

Using make

If you’re going to use make effectively, you need to figure out how to encode your information in the actual filename of the file. Make is going to use the extension of your file to figure out what to do with it.

Since I’m using m4, I decided to make a separate file for the metadata and make it m4 format. This is the extension .meta.m4. Also, pre-processed HTML has the extension .phtml. Now all I had to do was come up with the m4 command line to process a template with the body and the metadata. I came up with this:

%.html: %.meta.m4 %.phtml template.m4
	m4 -D __metadata__=$< -D __body__=$(word 2,$+) $(lastword $+) -D __dir__=$(dir) > $@

I decided to use the form __foo__ for all my m4 variables to reduce the chances that one collides with text already in my files. __body__ is the filename of the HTML glorp, __metadata__ is the filename of the metadata and __dir__ is the path we’re at when we invoke m4 (important for calculating relative paths).

Template Structure

Our template must begin with this:

include(site/header.m4)

These two lines do two things: import the metadata file using the variable from the command line and define the relative path back to the root (handy for links to CSS and whatnot in the base of the site).

Later on I decided that template.m4 ought to be built out of parts, so I extracted this to header.m4 and defined template.m4 to be built out of Haml using this rule:

template.m4: template.haml
	haml $< > $@.pre
	cat header.m4 $@.pre > $@
	rm $@.pre

Now I can just use my m4 variables from within Haml instead of worrying about filling in the template with Ruby! :D

Making Glorp

To make the HTML glorp, I have two patterns for handling textile and markdown:

%.phtml: %.textile
	redcloth $< > $@

%.phtml: %.markdown bin/Markdown.pl
	perl bin/Markdown.pl --html4tags $< > $@

(I have the markdown depend on the Perl file because it’s in the distribution and may change.)

Both of these just render the text and produce the interior HTML.

Handling Webby

I had already written a utility to convert my Typo blog posts to Webby format. Webby’s format is interesting; basically it’s a YAML wrapper on top of plain text. The wrapper specifies the format of the text and the YAML contains the metadata. So at this point I created a utility just to deal with Webby files and renamed all of them to end in .webby.

The utility takes a file and a single command, which is either ‘m4’ or ‘content’ depending on whether it’s generating the content or the m4 metadata. To make the content, it just slices it out and prints it; to make the metadata it parses the YAML and converts it to a series of define() statements in m4.
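To make the two modes concrete, here is a minimal Ruby sketch of a converter along these lines. It is not the actual bin/webby2ms.rb. It assumes a Webby page opens with a YAML block fenced by lines of ---, and the names split_webby and to_m4 are made up for illustration. It also does nothing about values that contain m4 quote characters, which is exactly the escaping problem mentioned at the top of this post.

```ruby
require "yaml"
require "date"

# Split a Webby-style page into [metadata_hash, body_text].
# Assumption: the page starts with a YAML block delimited by "---" lines.
def split_webby(text)
  match = text.match(/\A---[ \t]*\n(.*?)\n---[ \t]*\n(.*)\z/m)
  return [{}, text] unless match

  metadata = YAML.safe_load(match[1], permitted_classes: [Date, Time]) || {}
  [metadata, match[2]]
end

# Render each metadata key as an m4 define(), using the __name__ convention.
# Values containing ` or ' would break m4's default quoting; not handled here.
def to_m4(metadata)
  metadata.map { |key, value| "define(`__#{key}__', `#{value}')dnl" }.join("\n")
end

# Hypothetical command line, mirroring: ruby bin/webby2ms.rb m4|content FILE
if __FILE__ == $0 && ARGV.length == 2
  mode, path = ARGV
  metadata, body = split_webby(File.read(path))
  puts(mode == "m4" ? to_m4(metadata) : body)
end
```

The dnl at the end of each define() keeps m4 from copying the trailing newline into the output, so the metadata file expands to nothing visible when it is included.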

Make was informed:

%.meta.m4: %.webby bin/webby2ms.rb
	ruby bin/webby2ms.rb m4 $< > $@

%.textile: %.webby bin/webby2ms.rb
	ruby bin/webby2ms.rb content $< > $@

In a future version, it might be able to determine the file type to create, but for right now it was convenient just to specify Textile since all my old blog posts are in Textile anyway.

Generic Metadata

It occurred to me that you could find some metadata just by inspecting the file, for example by taking the first h2, h3 or h4 as the title and the ctime of the file as a stand-in for the creation time. So I wrote a short script to create a .meta.m4 from a .phtml. It’s important to keep the Webby rules above this one, or make will use this rule to produce .meta.m4 from .webby files even though a better method is available.

%.meta.m4: %.phtml bin/metadata.rb
	ruby bin/metadata.rb $< > $@
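
A script like that might look roughly like the following Ruby sketch. This is an illustration, not the real bin/metadata.rb: the function names and the __title__ and __created__ variable names are assumptions, and the heading match is a simple regular expression rather than a real HTML parser.

```ruby
# Return the text of the first h2, h3 or h4 in the HTML, or nil if none exists.
def first_heading(html)
  match = html.match(%r{<(h[234])[^>]*>(.*?)</\1>}im)
  match && match[2].gsub(/<[^>]+>/, "").strip
end

# Build m4 define() lines from what can be inferred about the page.
# __title__ and __created__ are hypothetical variable names.
def metadata_m4(html, fallback_title, created_at)
  title = first_heading(html) || fallback_title
  date = created_at.strftime("%Y-%m-%d")
  [
    "define(`__title__', `#{title}')dnl",
    "define(`__created__', `#{date}')dnl"
  ].join("\n")
end

# Hypothetical command line, mirroring: ruby bin/metadata.rb FILE
if __FILE__ == $0 && ARGV[0]
  path = ARGV[0]
  # On Unix, ctime is the inode change time, not a true creation time.
  puts metadata_m4(File.read(path), File.basename(path, ".*"), File.ctime(path))
end
```

Falling back to the file's base name means a page with no heading still gets some kind of title.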

Non-Recursive Make Structure

At this point, I could generate whatever html file I wanted from my templates and my old content, but I’d have to ask for it by name. I needed some kind of make rules that would find all the old content and build it.

As it turns out, there’s still recursion in non-recursive make; it’s just the recursion of reading makefiles and processing their contents before doing all the work.

First of all we need to understand how to invoke make non-recursively. I ran into the famous paper Recursive Make Considered Harmful the other day, read it, and followed the link off to Emile van Bergen’s implementation.

I had to follow this in a very regimented fashion, so I wrote a script which recursively builds Rules.mk files for all of the subdirectories under some directory. This is very tailored to this app and will need to be extensively overhauled if I want to generate something other than HTML or use it in another app. But it’s also quite short and illustrates recursive calling in the shell.

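#!/bin/zsh

# bring in the header
function build_rules() {
	...
}

# write Rules.mk for one directory, then recurse into its immediate subdirectories
function build_rules_rec() {
	local dir
	build_rules $1 > $1/Rules.mk

	# build_rules_rec writes each Rules.mk itself, so no redirect is needed here
	for dir in $(find $1 -mindepth 1 -maxdepth 1 -type d); do
		build_rules_rec $dir
	done
}

build_rules_rec $1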

The essence of build_rules is that it looks in a directory and creates recursive loading of Rules.mk in each of the subdirectories, and adds all of the .webby files in the current directory to $(TGTS_HTML) with the extension changed. We’re being declarative here, folks.
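Since the body of build_rules is elided above, here is a rough Ruby sketch of the idea only, not a reconstruction of the real function. It assumes the generated file needs just include lines for subdirectories and a TGTS_HTML assignment, and it leaves out the directory-stack bookkeeping that van Bergen’s scheme uses. The function signature is hypothetical.

```ruby
# Produce the text of a Rules.mk for one directory.
# dir:         path of the directory, relative to the top-level makefile
# subdirs:     names of its immediate subdirectories
# webby_files: names of the .webby files it contains
def build_rules(dir, subdirs, webby_files)
  lines = subdirs.map { |sub| "include #{dir}/#{sub}/Rules.mk" }

  # Each foo.webby becomes a foo.html target; make works out the steps between.
  targets = webby_files.map { |file| "#{dir}/#{File.basename(file, '.webby')}.html" }
  lines << "TGTS_HTML += #{targets.join(' ')}" unless targets.empty?

  lines.join("\n") + "\n"
end

puts build_rules("site", ["blog"], ["how-it-works.webby"])
```

Nothing here says how to build an .html file. The generated file only declares which targets exist, and the pattern rules take care of the rest.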

Rock and Roll

The rest of the makefile is pretty much cake; define all to be targets, define targets to be $(TGTS_HTML), define clean to be rm -f $(CLEAN). Run make -j4 and watch as your website gets built with SMP!

Of course now you have the problem of installation. I didn’t want the installation method to be decoupled from the list of targets we’re building so I wound up writing a shell script which does some path nastiness to copy the target files preserving their path parts into the output directory. The output directory is first made by making a copy of a static folder which contains things that don’t need to be expanded like images and CSS files.
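The path handling is the only fiddly part. As a hedged illustration, sketched in Ruby rather than shell and not taken from the real bin/install.sh, the copy step could look like this. The function names are invented, and it assumes every target lives under a single source root such as site.

```ruby
require "fileutils"
require "pathname"

# Map a built file to its place in the output tree, keeping the path parts
# below the source root: site/blog/post.html -> output/blog/post.html
def destination_for(source, source_root, output_dir)
  relative = Pathname.new(source).relative_path_from(Pathname.new(source_root))
  File.join(output_dir, relative.to_s)
end

# Copy every target into the output directory, creating directories as needed.
def install(sources, source_root, output_dir)
  sources.each do |source|
    dest = destination_for(source, source_root, output_dir)
    FileUtils.mkdir_p(File.dirname(dest))
    FileUtils.cp(source, dest)
    puts "#{source} -> #{dest}"
  end
end
```

Because the list of sources is passed in, the makefile can hand over its own target list and the installer never has to go looking for files.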

Then I added an install target and an rsync target which copies it up to the website.

That’s it in a nutshell.

The Code

I’m sure you’re all quite desperate to give this a try. Download the source and hopefully it’ll still be working for you then.