Vincent Zoonekynd's Blog

Thu, 26 Jan 2006: Blosxom

Since I decided to revive my web page, to update it on a more regular basis and to resume a frequent web activity, I had a look at a couple of blog engines. I want one that allows me to write in a text editor, not in a web browser; that allows me to type text, not HTML; that allows me to use my own (pre-wiki) tagging; and that produces static files.

Most blog engines are dynamic (meaning that the contents are stored on the web server, as text files or inside a database, and that the actual HTML pages are only generated when requested), but Blosxom (and probably PyBlosxom, its Python cousin, as well) can also generate static web sites (i.e., we generate all the HTML pages at once and then only have to transfer them to the web server).

http://www.blosxom.com/
http://research.operationaldynamics.com/blogs/andrew/meta/blosxom/blosxom-colophon.html
http://groups.yahoo.com/group/blosxom/
http://blosxom.ookee.com/blog/
http://hail2u.net/archives/bsk.html
http://www.blosxom.com/plugins/
http://fletcher.freeshell.org/wiki/BlosxomPlugins

A Naked blog

Let us first check whether it works, and how it looks, with no configuration and no plugins whatsoever.

Installation

I download it:

wget http://www.blosxom.com/downloads/blosxom.zip
unzip blosxom.zip

I edit a few variables:

vi blosxom.cgi

namely,

$blog_title = "Vincent Zoonekynd's Weblog";
$blog_description = "Vincent Zoonekynd's Weblog";
$datadir = "/tmp/Blogs/Txt";         # Where the raw, unprocessed, *.txt articles are
$static_dir = "/tmp/Blogs/Result";   # Where to store the HTML and RSS files
$static_password = "noPassword";
$static_entries = 1;                 # I spent a LONG time trying to find
                                     # why the individual entries were not
                                     # created...
$url = "file:///home/zoonek/notes/blosxom/Result";   # This is just a test:
                                                     # the result will be
                                                     # accessed through the
                                                     # file:// protocol
                                                     # instead of http://

I write a Makefile:

all:
	rm -rf Result
	perl blosxom.cgi -password=noPassword

I put some *.txt files in the $datadir directory:

zoonek@gentoo /tmp/Blogs/Txt $ find
.
./Linux
./Linux/2005-09-01_Mandrake_10_1.txt
./Linux/2005-12-01_Mandrake_2006.txt
./Linux/2005-12-23-Suse.txt
./Linux/2005-12-31_Ubuntu.txt
./Linux/2006-01-01_Gentoo.txt
./Photo
./Photo/2005-10-09_Nikon_D50.txt
./Photo/2006-01-14_Book__Beyond__Visions_of_the_Interplanetary_Probes.txt

and I generate the files:

make

Results

I can then check the results in the $static_dir directory.

There are a few problems: it is ugly, since I did not provide a stylesheet; all the text is present on the main page, not only the beginning, as I was expecting (as I tend to write overly long pages, this is important); my text has not been converted into HTML (in particular, the blank lines have not been turned into paragraph breaks); the pictures were not included (and I have no idea where to put them); and I do not see how to put an article in several categories (e.g., the comments on a photo book I recently read should be both in the "Books" section and in the "Photography" section).

A more configurable blog: the simple way

The Blosxom Starter Kit may be a good place to start: it provides you with Blosxom and a few useful plugins. For instance, thanks to the Markdown plugin, you can type wiki-like text.

A more configurable blog: the hard way

Inside the box

Actually, Blosxom is a very simple, almost empty piece of software: it is just a loop over all the *.txt files in the data directory. For each of them, it calls the functions defined by the plugins in the plugins directory (they can modify the text, turn it into HTML, etc.; they can also create new variables to be used later), and then it creates the page by filling in the templates. The loop itself really is empty: the actual work is done by the plugins.

In short:

1. Find all the *.txt files in the data directory.
2. For each file to be generated (one file for each article, one for each
   date (the dates are simply the modification dates of the files), one for
   each month, one for each category (the categories are just the directories
   containing the files), one for each flavour (HTML, RSS, Atom, but you
   could also have outputs in PDF or whatever)):
   a. Read the file;
   b. Call all the plugins: they will define new variables or modify existing
      ones -- for instance, they can turn text into HTML;
   c. Take the templates (head, story, date, foot; for each flavour) and
      replace the variables they contain, with a simple
      s/(\$\w+(?:::)?\w*)/"defined $1 ? $1 : ''"/gee;
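
To make the last step more concrete, here is a minimal shell sketch of the same idea: the template is evaluated a second time, so that each $name is replaced by the value of the corresponding variable, and names that are not defined simply disappear. It is only an analogy, not Blosxom's code: the variable values are made up, the fill_template function is hypothetical, and namespaced names such as $blosxom::url are not handled.

```shell
#!/bin/sh
# Made-up values standing in for the variables Blosxom and its plugins define.
fn="2006-01-15_Blosxom"
title="Blosxom"
body="Some text."
unset ti    # deliberately undefined, to show that it expands to nothing

# A cut-down story template; single quotes keep the $names literal for now.
template='<a name="$fn"><b>$title</b></a>
$body
posted at: $ti'

# Hypothetical helper: expand the $names by evaluating the template as a
# here-document. Like /gee in Perl, this runs the template as code, so it
# must only be used on templates you trust.
fill_template() {
  eval "cat <<__TEMPLATE__
$1
__TEMPLATE__
"
}

page=$(fill_template "$template")
printf '%s\n' "$page"
```

The undefined $ti leaves nothing behind after "posted at:", which is what the "defined $1 ? $1 : ''" part of the Perl substitution does.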
 

Initial templates

The look of the pages is determined by the following template files, which should be in the data directory. These are their initial contents.

content_type.html

text/html

head.html

<html>
  <head>
    <link rel="alternate" type="application/rss+xml"
          title="RSS"
          href="$url/index.rss" />
    <title>$blog_title $path_info_da $path_info_mo $path_info_yr</title>
  </head>
  <body>
    <center>
      <font size="+3">$blog_title</font>
      <br />
      $path_info_da $path_info_mo $path_info_yr
    </center>
    <p />
     

story.html

<p>
  <a name="$fn"><b>$title</b></a>
  <br />
  $body
  <br /><br />
  posted at: $ti | 
  path: <a href="$url$path">$path</a> |
  <a href="$url/$yr/$mo_num/$da#$fn">permanent link to this entry</a>
</p>

date.html

<h3>$dw, $da $mo $yr</h3>

foot.html

 <p/>
 <center>
   <a href="http://www.blosxom.com/">
     <img src="http://www.blosxom.com/images/pb_blosxom.gif" border="0" />
   </a>
 </center>
 </body>
 </html>

The template I actually use

I just add a stylesheet and my name, e-mail address and web site at the bottom of each page.

head.html

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
"http://www.w3.org/TR/html4/loose.dtd">
<html>
    <head>
      <link rel="alternate" type="application/rss+xml"
            title="RSS"
            href="$url/index.rss" />
      <title>$blog_title $path_info_da $path_info_mo $path_info_yr</title>

      <style type="text/css">
  BODY {
    background-color: #FFFFFF;
    color: #000000;
  }
  H1 {
    background-color: #ffdb43;
    color: #000000;
    padding: 20pt;
    margin-left:  20%;
    margin-right: 20%;
    text-align: center;
  }
  H2, H2 A:link, H2 A:visited {
    background-color: #6D8ADA;
    color: #FFFFFF;
    font-weight: bold;
    font-size: medium;
    margin-left: 0pt;
    padding: 5pt;
  }
  H3 {
    font-weight: bold;
    font-size: medium;
  }
  PRE {
    background-color: #FFFFAA;
    color: #000000;
    border: thin solid;
    white-space: pre;
    margin-left:  20pt;
    margin-right: 20pt;
    padding-bottom: 10pt;
    padding-left: 10pt;
    padding-right: 10pt;
    padding-top: 10pt;
  }
  P {
    margin-left:  20pt;
    margin-right: 20pt;
  }
  LI P {
    margin-left:  0pt;
    margin-right: 0pt;
  }
      </style>
      <meta http-equiv="Content-Style-Type" content="text/css">
      <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
    </head>
    <body>
      <h1>$blog_title</h1>
       $path_info_da $path_info_mo $path_info_yr

story.html

<h2><a name="$fn">$dw, $da $mo $yr: $title</a></h2>
$body
<p>
  posted at: $ti |
  path: <a href="$url$path">$path</a> |
  <a href="$url/$path/$fn.html">permanent link to this entry</a>
</p>

foot.html

seemore.showmore.html

@<p><a href="$blosxom::url$path/$fn.html">More...</a></p>

HTML formatting

I do not type HTML but text, in a very simple pre-wiki, pre-POD format: the first line is the title of the document; lines starting with a star are section titles; lines starting with a plus sign are subsection titles; lines starting with @ are included as is (they are already HTML); lines starting with an equal sign contain the URL of an image to be included; lines starting with two spaces contain code, to be included verbatim (in a <pre> block). Everything else is ordinary text, with blank lines separating the paragraphs.
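
As an illustration of these conventions, the following shell sketch labels each line of a made-up article with the kind of markup it stands for. The classify function and the sample article are hypothetical; the real conversion is done by the Perl plugin, which also keeps track of whether it is inside a paragraph or a code block, something this sketch ignores.

```shell
#!/bin/sh
# Hypothetical helper: name the kind of markup a single line represents,
# testing the markers in the order described above.
classify() {
  case "$1" in
    *[![:space:]]*) ;;                 # some visible text: keep looking
    *) echo "blank"; return ;;
  esac
  case "$1" in
    '@'*)  echo "raw HTML" ;;
    '* '*) echo "section" ;;
    '+ '*) echo "subsection" ;;
    '='*)  echo "picture" ;;
    '  '*) echo "code" ;;
    *)     echo "paragraph" ;;
  esac
}

# A made-up article; its first line is the title.
article='Blosxom
* Installation
I download it:
  wget http://www.blosxom.com/downloads/blosxom.zip

+ Results
= screenshot.png
@<hr />'

# Skip the title, then label every remaining line.
kinds=$(printf '%s\n' "$article" | sed 1d |
  while IFS= read -r line; do classify "$line"; done)
printf '%s\n' "$kinds"
```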

I wrote the following plugin:

package VZ;
use strict;
use warnings;

use constant NONE => 0;
use constant PAR  => 1;
use constant CODE => 2;

use constant TRUE  => 0==0;
use constant FALSE => 0==1;

sub start { 1; }
sub story {
  my($pkg, $path, $filename, $story_ref, $title_ref, $body_ref) = @_;
  my $continued = FALSE;
  if (TRUE) {
    ($$body_ref, $continued) = split(m/\f/, $$body_ref, 2);
    $continued = defined($continued);
  }
  our $result = "";
  our $mode = NONE;
  #$result .= "<pre>pkg:" . substr($pkg,0,50) ."\npath:" . substr($path,0,50) . "\nfilename:" . substr(0,50) . "\nstory:" . substr($$story_ref,0,50) . "\ntitle:" . substr($$title_ref,0,50) . "\nbody:" . substr($$body_ref,0,50) . "\n</pre>\n";
  #$result .= "<pre>pagetype: $pagetype::pagetype</pre>\n";
  sub start_par  { $result .= "<p>";    $mode = PAR;  }# print STDERR "<PAR>\n";    }
  sub end_par    { $result .= "</p>";   $mode = NONE; }# print STDERR " </PAR>/n";  }
  sub start_code { $result .= "<pre>";  $mode = CODE; }# print STDERR "<CODE>\n";    }
  sub end_code   { $result .= "</pre>"; $mode = NONE; }# print STDERR " </CODE>\n"; }
  sub print_line { my $l = shift; $l =~ s/\&/\&amp;/g; $l =~ s/</\&lt;/g; $result .= $l }
  sub print_raw  { my $l = shift; $l =~ s/^\@//; $result .= $l; }
  sub print_section { my $l = shift; $l =~ s/^\*\s+//; $result .= "<h2>"; print_line($l); $result .="</h2>"; }
  sub print_subsection { my $l = shift; $l =~ s/^\+\s+//; $result .= "<h3>"; print_line($l); $result .="</h3>"; }
  sub print_picture { my $l = shift; $l =~ s/^\=\s+//; chomp($l); $l =~ s/\s*$//; $result .= "<p><img alt=\"$l\" src=\"$blosxom::url/$blosxom::path/$l\"/></p>"; }
  sub is_code    { my $l = shift; return( $l =~ m/^\s\s/ ); }
  sub is_empty   { my $l = shift; return( $l =~ m/^\s*$/  ); }
  sub is_section { my $l = shift; return( $l =~ m/^\*\s/ ); }
  sub is_subsection { my $l = shift; return( $l =~ m/^\+\s/ ); }
  sub is_picture { my $l = shift; return( $l =~ m/^\=\s*/ ); }
  sub is_raw     { my $l = shift; return( $l =~ m/^\@\s*/ ); }
  foreach my $line (split(m/^/m, $$body_ref)) {
    #print STDERR "Line: $line";
    if ($mode == NONE) {
      if (is_empty($line)) {
        print_line($line);
      } elsif (is_raw($line)) {
        print_raw($line);
      } elsif (is_section($line)) {
        print_section($line);
      } elsif (is_subsection($line)) {
        print_subsection($line);
      } elsif (is_picture($line)) {
        print_picture($line);
      } elsif (is_code($line)) {
        start_code();
        print_line($line);
      } else {
        start_par();
        print_line($line);
      }
    } elsif ($mode == PAR) {
      if (is_raw($line)) {
        print_raw($line);
      } elsif (is_section($line)) {
        end_par();
        print_section($line);
      } elsif (is_subsection($line)) {
        end_par();
        print_subsection($line);
      } elsif (is_picture($line)) {
        end_par();
        print_picture($line);   # the picture itself was silently dropped here
      } elsif (is_empty($line)) {
        end_par();
      } elsif (is_code($line)) {
        end_par();
        start_code();
        print_line($line);
      } else {
        print_line($line);
      }
    } elsif ($mode == CODE) {
      if (is_raw($line)) {
        print_raw($line);
      } elsif (is_section($line)) {
        end_code();
        print_section($line);
      } elsif (is_subsection($line)) {
        end_code();
        print_subsection($line);
      } elsif (is_picture($line)) {
        end_code();
        print_picture($line);   # the picture itself was silently dropped here
      } elsif (is_code($line) or is_empty($line)) {
        print_line($line);
      } else {
        end_code();
        start_par();
        print_line($line);
      }
    } else {
      die "Bug: Wrong mode";
    }
  }
  if    ($mode == CODE) { end_code(); }
  elsif ($mode == PAR)  { end_par(); }
  $$body_ref = $result;
}

1;

__END__

(Insert some POD documentation here...)

Images

I use the static_file plugin (formerly known as "binary"), which copies the non-txt files to the result directory.

Only displaying the beginning of each article

The seemore plugin allows me to display only the beginning of each article on the index page -- more precisely, everything up to a ^L (form feed) character.

It requires a seemore.showmore.html template.

I also have an empty seemore.divider.html template.
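
As a small illustration of the ^L convention, the following shell sketch writes a made-up article containing a form feed and extracts the part that would appear on the index page. The file name and its contents are hypothetical, and the awk one-liner only mimics the split: it is not how the seemore plugin is implemented.

```shell
#!/bin/sh
# Work in a scratch directory so that nothing real is touched.
dir=$(mktemp -d)
post="$dir/2006-01-15_Blosxom.txt"

# printf turns \f into the form feed character (^L, ASCII 12).
printf 'Blosxom\n\nThe short teaser.\n\f\nThe long remainder.\n' > "$post"

# Keep only what precedes the first form feed: with the record separator
# set to \f, the first awk record is exactly that part.
teaser=$(awk 'BEGIN { RS = "\f" } NR == 1 { printf "%s", $0; exit }' "$post")
printf '%s\n' "$teaser"

rm -rf "$dir"
```

When typing an article by hand, the character can be entered with Ctrl-V Ctrl-L in vi or C-q C-l in Emacs.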

Last remarks

Plugins actually used

The numbers in front of the plugin names determine the order in which the plugins are read, and therefore the order in which they are run.

00static_file
50seemore
95VZ
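
The effect of the prefixes can be checked from the shell. This sketch assumes, as far as I understand Blosxom's behaviour, that the plugin files are taken in sorted file-name order and that the leading digits are not part of the plugin name; the scratch directory and the empty files are made up.

```shell
#!/bin/sh
# Scratch plugin directory with empty stand-ins for the three plugins.
plugins=$(mktemp -d)
touch "$plugins/95VZ" "$plugins/00static_file" "$plugins/50seemore"

# Sorted file names give the order in which the plugins are read
# (LC_ALL=C makes the sort order independent of the locale).
order=$(LC_ALL=C ls "$plugins" | tr '\n' ' ')
echo "read order: $order"

# Dropping the leading digits leaves the plugin names themselves.
names=$(LC_ALL=C ls "$plugins" | sed 's/^[0-9]*//' | tr '\n' ' ')
echo "plugin names: $names"

rm -rf "$plugins"
```

With these prefixes, seemore cuts the article at the ^L before VZ turns the text into HTML.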

Makefile

all: local

# blosxom.cgi contains the placeholder TARGETURL, which is replaced by the
# actual URL before each run.
www:
	-rm -rf Result
	perl -p -e 's!TARGETURL!http://zoonek.free.fr/blosxom!' blosxom.cgi > blosxom_www.cgi
	perl blosxom_www.cgi -password=noPassword
	rm -f blosxom_www.cgi

local:
	-rm -rf Result
	perl -p -e 's!TARGETURL!file:///home/zoonek/notes/blosxom/Result!' blosxom.cgi > blosxom_local.cgi
	perl blosxom_local.cgi -password=noPassword
	rm -f blosxom_local.cgi

Other plugins

I have not looked at the following plugins:

toc                 For larger, structured stories
foreshortened       (designed for RSS feeds)
interpolate_fancy 
followsymlink       To put an article in several categories

RSS

I still have to add an RSS feed (actually, the RSS files are there, but there is no link to them -- I will also have to understand what RSS, Atom or whatever is used for).

Tweaking the dates

You might want to preserve the date of a file while editing it -- when you correct a typo, for instance.

You can get the date with the "ls -l" or the "stat" command.

zoonek@gentoo ~/notes/blosxom/Txt/Linux $ ls -l
total 100
-rw-r--r--  1 zoonek users 19942 Sep 24 06:36 2005-09-01_Mandrake_10_1.txt
-rw-r--r--  1 zoonek users 19615 Dec 23 20:46 2005-12-01_Mandrake_2006.txt
-rw-r--r--  1 zoonek users 12040 Dec 24 11:07 2005-12-23-Suse.txt
-rw-r--r--  1 zoonek users  9624 Jan  1 11:30 2005-12-31_Ubuntu.txt
-rw-r--r--  1 zoonek users 24395 Jan 23 21:24 2006-01-01_Gentoo.txt
-rw-r--r--  1 zoonek users  8997 Jan 25 21:24 2006-01-15_Blosxom.txt

zoonek@gentoo ~/notes/blosxom/Txt/Linux $ stat 2006-01-15_Blosxom.txt
  File: `2006-01-15_Blosxom.txt'
  Size: 8997            Blocks: 24         IO Block: 4096   regular file
Device: 304h/772d       Inode: 1433325     Links: 1
Access: (0644/-rw-r--r--)  Uid: ( 1000/  zoonek)   Gid: (  100/   users)
Access: 2006-01-25 21:24:49.000000000 +0000
Modify: 2006-01-25 21:24:49.000000000 +0000
Change: 2006-01-25 21:24:49.000000000 +0000
  

The "touch -d" command allows you to change the date.

zoonek@gentoo ~/notes/blosxom/Txt/Linux $ touch -d "2006-01-15 21:13" 2006-01-15_Blosxom.txt
zoonek@gentoo ~/notes/blosxom/Txt/Linux $ ls -l 2006-01-15_Blosxom.txt
-rw-r--r--  1 zoonek users 8997 Jan 15 21:13 2006-01-15_Blosxom.txt
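
If you do not want to read the date and type it back by hand, "touch -r" can copy it from one file to another. The following sketch, which assumes GNU coreutils and uses a made-up article in a scratch directory, saves the date in an empty reference file, corrects a typo, and puts the date back.

```shell
#!/bin/sh
# Assumes GNU coreutils (touch -d with a free-form date, date -r FILE).
dir=$(mktemp -d)
post="$dir/2006-01-15_Blosxom.txt"
printf 'Blosxom\n\nA smal typo.\n' > "$post"
touch -d "2006-01-15 21:13" "$post"      # the date used above

# Remember the modification date in an empty reference file.
touch -r "$post" "$dir/stamp"

# Correct the typo; rewriting the file resets its date to now.
sed 's/smal/small/' "$post" > "$post.new" && mv "$post.new" "$post"

# Copy the remembered date back onto the article.
touch -r "$dir/stamp" "$post"

restored=$(date -r "$post" '+%Y-%m-%d %H:%M')
echo "date after the edit: $restored"
rm -rf "$dir"
```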

posted at: 06:51 | path: /Linux