Graham Marlow

Cool Rails concerns

09 Nov, 2024 til

There's something super elegant about Writebook's use of concerns. I especially like Book::Sluggable:

module Book::Sluggable
  extend ActiveSupport::Concern

  included do
    before_save :generate_slug, if: -> { slug.blank? }
  end

  def generate_slug
    self.slug = title.parameterize
  end
end

Here are a few reasons:

  • Nesting concerns in a model folder is neat when that concern is an encapsulation of model-specific functionality: app/models/book/sluggable.rb.
  • Concerns don't have to be big. They do have to be single-purpose.
  • Reminds me of a great article by Jorge Manrubia: Vanilla Rails is plenty. Down with service objects!

Why I Still Use Apple Notes

30 Oct, 2024 blog

My list of reasons for using Apple Notes isn't very long. In many ways I think Apple Notes is an inferior app, especially when compared to alternatives like Notion and Obsidian. There is one thing that keeps me using it, however. Simplicity.

Organization is a death sentence for spontaneity. Tools like Notion have me questioning my note placement before I've written a single word, killing the idea before it's had a moment to begin. I've lost track of ideas due to a distracting home screen ("Oh, I should check my email"), or navigating a Notion sidebar ("Where's the archive again?"), or generating the title of the very note I'm starting to write.

Therefore, I value flow over all else. Nothing is more important than getting the idea into some storage mechanism as fast as humanly possible. Expedience is the same reason I don't carry a paper notebook and a pen, despite preferring writing by hand to tapping on a piece of glass. When inspiration strikes and I need to jot down an idea right now, nothing beats Apple Notes. Pop open the app, hit the lower-right corner, type type type.

Every month or so I'll peruse my notes and organize them into better, more permanent places. 90% of the time that means deleting the note. I'm not very sentimental[1]. The remaining 10% is moved into a GitHub README if it's a project idea or drafted into a proper blog post (markdown + git). This whole categorization phase is a moment of catharsis and feels wholly productive.

I've always been turned off by the term "second brain". Despite what productivity gurus say I don't think my simple notes will ever amount to some magnum opus of material, studied by historians in the distant future. Nor do I think there's much value in the nitpicky categorization that makes up a zettelkasten. I believe those who spend the majority of their time organizing notes and staring at their Obsidian graph view are fooling themselves into thinking they're more productive. Sure, it's pretty. But is my writing any better?

Productivity software preys upon novelty. New apps tout new approaches that will always catch me in the allure of efficiency. Let this ode to Apple Notes serve as a reminder that sometimes simple is better.


  1. Part of my notetaking philosophy is that a note has served its purpose through the mechanism of its creation. Field Notes puts it well: "I'm not writing it down to remember it later, I'm writing it down to remember it now." So when I say 90% I mean it. ↩︎

Kafka on the Shore

26 Oct, 2024 til

On the inside cover of Kafka on the Shore Murakami explains how his idea for the book started with its title. That's the opposite of how anything I've written has started, though I recognize there's a notable difference between fiction and technical writing. But what a powerful idea: a simple phrase shapes the entire story.

I dug up this quote from an interview:

When I start to write, I don’t have any plan at all. I just wait for the story to come. I don’t choose what kind of story it is or what’s going to happen. I just wait.

I think that's pretty cool.

Exploring the Writebook Source Code

13 Oct, 2024 blog, ruby

Earlier this year 37signals released Writebook, a self-hosted book publishing platform. It's offering number two from their ONCE series, pitched as the antithesis of SaaS. Buy it once and own it for life, but run it on your own infrastructure.

Unlike the other ONCE offering, Writebook is totally free. When you "purchase" it through the ONCE checkout, they hook you up with the source code and a convenient means of installing the software on a remote server. Since the software is free (but not open source), I thought it fair game to read through it and write a little post about its implementation. It's not every day that we can study a production Rails application made by the same folks behind Rails itself.

Note: I'll often omit code for the sake of brevity with a "--snip" marker. I encourage you to download Writebook yourself and follow along so you can discover the complete context.

Run the thing

A good place to start is the application entrypoint: Procfile. I think Procfile is a holdover from the Heroku era, when everyone was hosting their Rails applications on free-tier dynos (RIP). Either way, it describes the top-level processes that make up the server:

web: bundle exec thrust bin/start-app
redis: redis-server config/redis.conf
workers: FORK_PER_JOB=false INTERVAL=0.1 bundle exec resque-pool

Nice and simple. There are three main components:

  • web, Writebook's web and application server
  • redis, the backing database for the application cache and asynchronous workers
  • workers, the actual process that executes asynchronous tasks

The only other infrastructure of note is the application database, which is running as a single file via SQLite3.

bundle exec thrust bin/start-app might be surprising for folks expecting bin/rails server as the main Rails process. thrust is the command invocation for thruster, a fairly recent HTTP proxy developed by 37signals specifically for ONCE projects. It fills a role similar to nginx: a web server that sits in front of the main Rails process to handle static file caching and TLS. The thrust command takes a single argument, bin/start-app, which contains your standard bin/rails s invocation, booting up the application server.
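
For reference, here's a sketch of what bin/start-app boils down to (an assumption based on the description above, not Writebook's exact contents; the real script may do a bit more setup):

#!/bin/sh
# Boot the Rails application server behind thruster
exec bin/rails server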

Redis and workers fill out the rest of the stack. Redis serves a few different purposes for Writebook, acting as both the application cache and the task queue for asynchronous work. I'm a little surprised Solid Queue and Solid Cache don't make an appearance, swapping out Redis for the primary data store (SQLite in this case). But then again, perhaps it's more cost-efficient to run Redis here, since Writebook probably wants to be self-hosted on minimal hardware (and not have particular SSD requirements).

You can run the application locally with foreman (note you'll need Redis installed, as well as libvips for image processing):

foreman start

Pages that render markdown

When it comes to the textual content of books created with Writebook, everything boils down to the Page model and its fancy has_markdown :body invocation:

class Page < ApplicationRecord
  # --snip
  has_markdown :body
end

That single line of code sets up an ActionText association with Page under the attribute name body. All textual content in Writebook is stored in the respective ActionText table, saved as raw markdown. Take a look at this Rails console query for an example:

writebook(dev)> Page.first.body.content
=> "# Welcome to Writebook\n\nThanks for downloading Writebook...

To my surprise, has_markdown is not actually a Rails ActionText built-in. It's manually extended into Rails by Writebook in lib/rails_ext/action_text_has_markdown.rb, along with a couple other files that integrate ActionText with the third-party gem redcarpet:

module ActionText
  module HasMarkdown
    extend ActiveSupport::Concern

    class_methods do
      def has_markdown(name, strict_loading: strict_loading_by_default)
        # --snip

        has_one :"markdown_#{name}", -> { where(name: name) },
          class_name: "ActionText::Markdown", as: :record, inverse_of: :record, autosave: true, dependent: :destroy,
          strict_loading: strict_loading

        # --snip
      end
    end
  end
end

# ...

module ActionText
  class Markdown < Record
    # --snip
    mattr_accessor :renderer, default: Redcarpet::Markdown.new(
      Redcarpet::Render::HTML.new(DEFAULT_RENDERER_OPTIONS), DEFAULT_MARKDOWN_EXTENSIONS)

    belongs_to :record, polymorphic: true, touch: true

    def to_html
      (renderer.try(:call) || renderer).render(content).html_safe
    end
  end
end

lib/rails_ext/ as the folder name is very intentional. The code belongs in lib/ and not app/lib/ because it's completely agnostic to the application. It's good ol' reusable Ruby code for any Rails application that has ActionText. rails_ext/ stands for "Rails extension", a common naming convention for vendor monkey patches that might live in a Rails application. This code re-opens an existing namespace (the ActionText module, in this case) and adds new functionality (ActionText::Markdown). Within the application, developers can use ActionText::Markdown without ever knowing it's not a Rails built-in.

This is a neat little implementation for adding markdown support to ActionText, which is normally just a rich text format coupled to the Trix editor.

Beyond pages

Page is certainly the most important data model when it comes to the core functionality of Writebook: writing and rendering markdown. The platform supports a couple of other fundamental data types, namely Section and Picture, that can be assembled alongside Pages to make up an entire Book.

The model hierarchy of a Book looks something like this:

Book = Leaf[], where Leaf = Page | Section | Picture

In other words, a Book is made up of many Leaf instances (leaves), where a Leaf is either a Page (markdown content), a Section (basically a page break with a title), or a Picture (a full-height image).

Writebook book detail screenshot

You can see the three different Leaf kinds near the center of the image, representing the three different types of content that can be added to a Book. This relationship is clearly represented by the Rails associations in the respective models:

# app/models/book.rb
class Book < ApplicationRecord
  # --snip
  has_many :leaves, dependent: :destroy
end

# app/models/leaf.rb
class Leaf < ApplicationRecord
  # --snip
  belongs_to :book, touch: true
  delegated_type :leafable, types: Leafable::TYPES, dependent: :destroy
  positioned_within :book, association: :leaves, filter: :active
end

Well, maybe not completely "clearly". One thing that's interesting about this implementation is the use of a Rails concern and delegated_type to represent the three kinds of leaves:

module Leafable
  extend ActiveSupport::Concern

  TYPES = %w[ Page Section Picture ]

  included do
    has_one :leaf, as: :leafable, inverse_of: :leafable, touch: true
    has_one :book, through: :leaf

    delegate :title, to: :leaf
  end
end

There are three kinds of Leaf that Writebook supports: Page, Section, and Picture. Each Leaf contains different attributes according to its kind. A Page has ActionText::Markdown content, a Section has plaintext, and a Picture has an image upload and a caption. However, despite their difference in schema, each of the three Leaf kinds is used in the exact same way by Book. In other words, Book doesn't care which kind of Leaf it holds a reference to.
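
Each of the three "subclass" models presumably pulls the concern in with a bare include. Here's a sketch based on the Page snippet from earlier (the real model has more going on):

class Page < ApplicationRecord
  include Leafable

  has_markdown :body
end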

This is where delegated_type comes into play. With delegated_type, all of the shared attributes among our three Leaf kinds live on the "superclass" record, Leaf. Alongside those shared attributes is a leafable_type, denoting which "subclass" the Leaf falls into, one of "Page", "Section", or "Picture". When we call Leaf#leafable, we fetch data from the matching "subclass" table to pull the non-shared attributes for that Leaf.

The pattern is made clear when querying in the Rails console:

writebook(dev)> Leaf.first.leafable
SELECT "leaves".* FROM "leaves" ORDER BY "leaves"."id" ASC LIMIT 1
SELECT "pages".* FROM "pages" WHERE "pages"."id" = ?

Rails knows from leafable_type that Leaf.first is a Page. To read the rest of that Leaf's attributes, we need to fetch the Page from the pages table associated with the leafable_id on the record. Same deal for Section and Picture.

Another thing that's interesting about Writebook's use of delegated_type is that the Leaf model isn't exposed on a route:

  resources :books, except: %i[ index show ] do
    # --snip
    resources :sections
    resources :pictures
    resources :pages
  end

This makes a ton of sense because the concept of Leaf isn't exactly "user-facing". It's more of an implementation detail. The relation between the three different Leafable types is exposed by some smart inheritance in each of the "subclasses". Take SectionsController as an example:

class SectionsController < LeafablesController
  private
    def new_leafable
      Section.new leafable_params
    end

    def leafable_params
      params.fetch(:section, {}).permit(:body, :theme)
        .with_defaults(body: default_body)
    end

    def default_body
      params.fetch(:leaf, {})[:title]
    end
end

All of the public controller handlers are implemented in LeafablesController, presumably because each Leafable is roughly handled in the same way. The only difference is the params object sent along in the request to create a new Leaf.

class LeafablesController < ApplicationController
  # --snip
  def create
    @leaf = @book.press new_leafable, leaf_params
    position_new_leaf @leaf
  end
end

I appreciate the nomenclature of Book#press for adding a new Leaf to a Book instance. Very clever.

Authentication and users

My go-to when setting up authentication with Rails is devise since it's an easy drop-in component. Writebook instead implements its own lightweight authentication around the built-in has_secure_password:

class User < ApplicationRecord
  include Role, Transferable

  has_many :sessions, dependent: :destroy
  has_secure_password validations: false

  has_many :accesses, dependent: :destroy
  has_many :books, through: :accesses
  # --snip
end

The authentication domain in Writebook is surprisingly complicated because the application supports multiple users with different roles and access permissions, but most of it is revealed through the User model.

The first time you visit a Writebook instance, you're asked to provide an email and password to create the first Account and User. This is represented via a non-ActiveRecord model class, FirstRun:

class FirstRun
  ACCOUNT_NAME = "Writebook"

  def self.create!(user_params)
    account = Account.create!(name: ACCOUNT_NAME)

    User.create!(user_params.merge(role: :administrator)).tap do |user|
      DemoContent.create_manual(user)
    end
  end
end

Whether or not a user can access or edit a book is determined by the Book::Accessable concern. Basically, a Book has many Access objects associated with it, each representing a user and a permission. Here's the Access created for the DemoContent referenced in FirstRun:

#<Access:0x00007f06efac0538
  id: 1,
  user_id: 1,
  book_id: 1,
  level: "editor"
  #--snip>

Likewise, when new users are invited to a book, they are assigned an Access level that matches their permissions (reader or editor). Note that all of this access machinery is for books that have not yet been published to the web for public viewing. Writebook allows you to invite early readers or editors for feedback before you go live.
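
To make the relationship concrete, here's a minimal sketch of how I imagine Book::Accessable wires this up (an assumption about the shape, not Writebook's exact code):

module Book::Accessable
  extend ActiveSupport::Concern

  included do
    has_many :accesses, dependent: :destroy
    has_many :users, through: :accesses
  end

  # Hypothetical helper: does this user hold editor-level access?
  def editable_by?(user)
    accesses.exists?(user: user, level: :editor)
  end
end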

Whoa, whoa, whoa. What is this rate_limit on the SessionsController?

class SessionsController < ApplicationController
  allow_unauthenticated_access only: %i[ new create ]
  rate_limit to: 10,
             within: 3.minutes,
             only: :create,
             with: -> { render_rejection :too_many_requests }

Rails 8 comes with built-in rate limiting support? That's awesome.

Style notes

I like the occasional nesting of concerns under model classes, e.g. Book::Sluggable. These concerns aren't reusable (hence the nesting), but they nicely encapsulate a particular piece of functionality with a callback and a method.

# app/models/book/sluggable.rb
module Book::Sluggable
  extend ActiveSupport::Concern

  included do
    before_save :generate_slug, if: -> { slug.blank? }
  end

  def generate_slug
    self.slug = title.parameterize
  end
end

Over on the HTML side, Writebook doesn't depend on a CSS framework. All of the classes are hand-written and applied in a very flexible, atomic manner:

<div class="page-toolbar fill-selected align-center gap-half ..."></div>

These classes are grouped together in a single file, utilities.css. Who needs Tailwind?

.justify-end {
  justify-content: end;
}
.justify-start {
  justify-content: start;
}
.justify-center {
  justify-content: center;
}
.justify-space-between {
  justify-content: space-between;
}
/* --snip */

I'm also surprised at how little JavaScript is necessary for Writebook. There are only a handful of StimulusJS controllers, each of which encompasses a tiny amount of code suited to a generic purpose. The AutosaveController is probably my favorite:

import { Controller } from '@hotwired/stimulus'
import { submitForm } from 'helpers/form_helpers'

const AUTOSAVE_INTERVAL = 3000

export default class extends Controller {
  static classes = ['clean', 'dirty', 'saving']

  #timer

  // Lifecycle

  disconnect() {
    this.submit()
  }

  // Actions

  async submit() {
    if (this.#dirty) {
      await this.#save()
    }
  }

  change(event) {
    if (event.target.form === this.element && !this.#dirty) {
      this.#scheduleSave()
      this.#updateAppearance()
    }
  }

  // Private

  async #save() {
    this.#updateAppearance(true)
    this.#resetTimer()
    await submitForm(this.element)
    this.#updateAppearance()
  }

  #updateAppearance(saving = false) {
    this.element.classList.toggle(this.cleanClass, !this.#dirty)
    this.element.classList.toggle(this.dirtyClass, this.#dirty)
    this.element.classList.toggle(this.savingClass, saving)
  }

  #scheduleSave() {
    this.#timer = setTimeout(() => this.#save(), AUTOSAVE_INTERVAL)
  }

  #resetTimer() {
    clearTimeout(this.#timer)
    this.#timer = null
  }

  get #dirty() {
    return !!this.#timer
  }
}

When you're editing markdown content with Writebook, this handy controller automatically saves your work. I especially appreciate the disconnect handler that ensures your work is always persisted, even when you navigate out of the form to another area of the application.

Closing thoughts

There's more to explore here, particularly on the HTML side of things where Hotwire does a lot of the heavy lifting. Unfortunately I'm not a good steward for that exploration since most of my Rails experience involves some sort of API/React split. The nuances of HTML-over-the-wire are over my head.

That said, I'm impressed with Writebook's data model; it's easy to grok thanks to some thoughtful naming and strong application of lesser-known Rails features (e.g. delegated_type). I hope this code exploration was helpful and inspires the practice of reading code for fun.

HEY to Fastmail and Back Again

05 Oct, 2024 blog

I've read a few stories about folks moving their email from HEY to Fastmail, but have not seen any in the reverse direction. After two years of Fastmail, I'm moving back to HEY. Here are my thoughts.

For those unacquainted with HEY, the main pitch is screening unknown senders into one of three locations: "Imbox", "The Feed", and "Paper Trail". Senders that are "screened out" are completely blocked; you won't be notified again from that address. For those "screened in", the split inbox offers more than just filters and labels. "The Feed", for example, aggregates emails into a continuous reader view that's nice for browsing on a weekend morning. There are many more features, but these two are probably the most important ones.

In my first HEY adventure, I had an @hey.com address for $99/yr. My primary motivation was moving away from Gmail and freeing some of my dependence on Google products, which I still maintain is worthwhile. HEY pulled me in with the marketing, but at $99 I wasn't convinced I was receiving enough value for the price tag. When I saw that Fastmail supported Masked Email, my mind was made up. Added privacy at half the cost? Yes please.

So I migrated, eating the cost of cycling yet another email address but setting up a custom email domain along the way to future-proof my erratic email exploration tendencies. I followed this guide from Franco Correa to emulate some of the HEY functionality in Fastmail, attempting to hold on to some of the principles that improved my workflow.

Two years later and I'm moving back to HEY.

Why switch back? The decision mostly comes down to the difference in user experience between the two apps. Fastmail feels like a chore to use, especially on iOS where most of my email (and newsletter) reading happens. Here are my two biggest problems:

  • I'd often need to close and reopen the Fastmail app because it was stuck on a black screen. Particularly frustrating when on a slow connection because it means going through the whole SPA-style loading animation that can take 10-20 seconds.
  • Using contacts + groups as substitutes for "The Feed" and "Paper Trail" is tedious. Email addresses that go into either bucket must first be added to contacts, then edited to include the appropriate filtering group. I honestly can't remember how to do this in the mobile app.

There were also a handful of workflows that I was missing from HEY:

  • The ability to merge threads and create collections is incredible when dealing with travel plans. Rather than juggling a bunch of labels for different trips, email threads are neatly organized into one spot for each.
  • "Send me push notifications" on an email thread, which will notify me when that thread and only that thread receives replies, is genius.
  • I created a "Set Aside" folder in Fastmail but eventually found myself missing the nice little stack of email threads that are bundled up in a corner in the HEY app.
  • Bundling email from certain senders into a single thread is an excellent solution for notification streams from GitHub or Amazon, where I want to be alerted with updates but don't want to have a bunch of separate email threads taking up space in my inbox.
  • I really like clips as an alternative to slapping on a label so I know to revisit an email for some buried content.

Don't get me wrong, Fastmail is a great service. If I hadn't found out that masked email could be replaced by DuckDuckGo Email Protection, I would probably still be using it[1]. I'm especially fond of their investment in JMAP and their attempts to make the technical ecosystem around email better. Also, if you want to have multiple custom domains routing to the same email platform, Fastmail is way more cost-effective.

But, having moved back to HEY, I've discovered that I'm easily swayed by software that can please and delight. Many of HEY's features are UX oddities that don't exactly nail down ways to make email better, but make the experience of using it more enjoyable. I think HEY gets it right most of the time.

The calendar is a new addition to HEY, released in the time that I've been away, and it's interesting. I'm not hugely opinionated when it comes to calendars; I hardly use them outside of work, where my company dictates the platform. The HEY calendar feels split between innovating for the sake of novelty and innovating for the sake of good ideas.

For one, there's no monthly view. Only day and week. Instead of viewing a complete month you view an endless scroll of weeks, with about three and a half fitting on the screen at any given time. The daily/weekly focus of HEY Calendar seems catered to daily activities: journaling, photography, and habit tracking. Not so much complicated scheduling workflows.

HEY's email offering still has some rough spots as well:

  • No import from an existing email account.
  • Adding additional custom domains is prohibitively expensive for a single user.
  • Feature rollout is asymmetrical: web and Android often outpace iOS.
  • Two separate apps for calendar and email (minor, but kind of annoying).
  • Journal integration with the calendar is interesting, but I'm hesitant to use it because there's no export.
  • Can't use HEY with an external app (e.g. Thunderbird).
  • Still can't configure swipe actions on iOS.

Some of these (like swipe actions and import) are longtime issues that will probably never be addressed. It's probably also worth noting that the HEY workflow is rather opinionated and isn't guaranteed to hit. But hey, give it a try and see if it works for you.

Moral of the story: use custom email domains. It protects you from email vendor lock-in so you're free to experiment as you see fit.


  1. On that topic, masked email is such a critical privacy feature for email that I can't believe HEY doesn't offer it. I suppose the screener is meant to alleviate that concern (since unwanted emails must be manually screened-in) but it's not quite the same. I'd rather rest easy knowing that only a randomly-generated email winds up in marketing garbage lists. ↩︎

Crafting Interpreters, Ruby Style

18 Aug, 2024 blog, ruby

I've finally started working through Crafting Interpreters, a wonderful book about compilers by Robert Nystrom. The book steps through two interpreter implementations, one in Java and one in C, that ramp up in complexity.

Now I don't know about you, but I hate Java. I can hardly stand to read it, let alone write it. That's why I decided to write my first Lox interpreter in Ruby, following along with the book as I can but converting bits and pieces into Rubyisms as I see fit.

In general, the Java code can be ported 1-1 to Ruby with few changes. Of course there's some obvious stuff, like how the lack of types means I need fewer methods and no coercions, or certain stdlib method namespaces that are updated to match Ruby idioms (while vs. until, anyone?). However, lots of code I just accept as-is and allow Nystrom to guide me through.

I've only worked through the first 7 chapters, but I did note down a few things in the Ruby conversion that I found interesting.

Avoiding switch statement fallthrough with regular expressions

Admittedly this difference is just a tiny syntactical detail, but one that plays to Ruby's strengths. Take the book's implementation of scanToken:

private void scanToken() {
  char c = advance();
  switch (c) {
    case '(': addToken(LEFT_PAREN); break;
    // ...
    default:
      if (isDigit(c)) {
        number();
      } else if (isAlpha(c)) {
        identifier();
      } else {
        Lox.error(line, "Unexpected character.");
      }
  }
}

private boolean isDigit(char c) {
  return c >= '0' && c <= '9';
}

// private boolean isAlpha...

Due to limitations in the Java switch statement, the author adds some post-fallthrough checks to the default case. This removes the need to check every number and letter individually (0-9, a-z, A-Z as separate cases) because the check is deferred to the default case, where an additional conditional statement is applied. Aesthetically it's not an ideal solution, since it breaks up the otherwise regular pattern of case ... handler that holds for the other tokens. I don't know, it's just kinda ugly.

With Ruby, I can instead employ regular expressions directly in my case statement:

def scan_token
  case advance
  when "("
    add_token(:left_paren)
  # ...
  when /[[:digit:]]/
    number
  when /[[:alpha:]]/
    identifier
  else
    Lox.error(@line, "unexpected character")
  end
end

No default fallthrough needed! These tiny details are what keep me programming in Ruby.

Metaprogramming the easy way

The largest deviation between the Java and Ruby implementation is definitely the metaprogramming. In Implementing Syntax Trees the author employs metaprogramming through an independent build step.

First, a new package is created (com.craftinginterpreters.tool) with a couple of classes that themselves generate Java classes by writing strings to a file:

  private static void defineType(
      PrintWriter writer, String baseName,
      String className, String fieldList) {
    writer.println("  static class " + className + " extends " +
        baseName + " {");

    // Constructor.
    writer.println("    " + className + "(" + fieldList + ") {");

    // Store parameters in fields.
    String[] fields = fieldList.split(", ");
    for (String field : fields) {
      String name = field.split(" ")[1];
      writer.println("      this." + name + " = " + name + ";");
    }

    writer.println("    }");

    // Fields.
    writer.println();
    for (String field : fields) {
      writer.println("    final " + field + ";");
    }

    writer.println("  }");
  }

These string builders are hooked up to a separate entrypoint (made for the tool Java package) and are compiled separately. The result spits out a bunch of .java files into the com.craftinginterpreters.lox package, which the programmer then checks into the project.

It's not a bad solution by any means, but requiring a separate build step and metaprogramming by concatenating strings is a little rough. The Ruby solution is totally different thanks to a bunch of built-in metaprogramming utilities (and the fact that Ruby is an interpreted language).

Here's how I wired up the expression generation:

module Rlox
  module Expr
    EXPRESSIONS = [
      ["Binary", [:left, :operator, :right]],
      ["Grouping", [:expression]],
      ["Literal", [:value]],
      ["Unary", [:operator, :right]]
    ]

    EXPRESSIONS.each do |expression|
      classname, names = expression

      klass = Rlox::Expr.const_set(classname, Class.new)
      klass.class_eval do
        attr_accessor(*names)

        define_method(:initialize) do |*values|
          names.each_with_index do |name, i|
            instance_variable_set(:"@#{name}", values[i])
          end
        end

        define_method(:accept) do |visitor|
          visitor.public_send(:"visit_#{classname.downcase}_expr", self)
        end
      end
    end
  end
end

When this file is included into rlox.rb (the main entrypoint to the interpreter), Ruby goes ahead and builds all of the expression classes dynamically. No build step needed, just good ol' Ruby metaprogramming. Rlox::Expr.const_set adds the class to the scope of the Rlox::Expr module, re-opening it on the next line via class_eval to add in the automatically-generated methods.

To close the loop, here's what one of the generated classes looks like if it were to be written out by hand (while also avoiding the dynamic instance variable setter):

module Rlox
  module Expr
    class Binary
      attr_accessor :left, :operator, :right

      def initialize(left, operator, right)
        @left = left
        @operator = operator
        @right = right
      end

      def accept(visitor)
        visitor.visit_binary_expr(self)
      end
    end
  end
end

Comparing the Ruby and Java implementations is interesting because it highlights some higher-level advantages and disadvantages between the two languages. With the Ruby version, adding new types is trivial and does not require an additional compile + check-in step. Just add a name-argument pair to the EXPRESSIONS constant and you're done!
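
For example, the book's chapter 6 challenge adds a C-style ternary expression. In my setup that would be a single (hypothetical) new entry, with the generated accept method coming along for free:

["Conditional", [:condition, :then_branch, :else_branch]]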

The flip side of this is that the class is not easily inspectable. Although I wrote Rlox::Expr::Binary above this paragraph as regular Ruby code, that code doesn't exist anywhere in the application where a programmer's eyes can read it. Instead, developers have to read the metaprogramming code in expr.rb to understand how the classes work.

I think this implementation leans idiomatic Ruby: metaprogramming is part of the toolkit so it's expected for developers to learn how to deal with it. If you're interested in learning how the class works and can't understand the metaprogramming code, you can always boot up the console and poke around with an instance of the class. It kind of coincides with the Ruby ethos that a REPL should be close at hand so you can explore code concepts that you might otherwise misunderstand by reading the code.
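
For example, assuming rlox.rb loads expr.rb:

irb> require "./rlox"
=> true
irb> Rlox::Expr::Binary.instance_methods(false).sort
=> [:accept, :left, :left=, :operator, :operator=, :right, :right=]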

That said, I still have respect for the Java implementation because Ruby metaprogramming can really end up biting you in the ass.

TDD (well, not really)

I'm sure Nystrom omitted tests from the book because they would add a ton of implementation noise to the project, and not in a way that benefits the explanation. For my purposes, I wanted to add tests with each chapter to make sure my implementation wasn't drifting from the expectation.

It's not perfect by any means, but it definitely gives me a ton of confidence that I'm following along with the material and exercising some of the trickier edge cases. I was also impressed that Nystrom's implementation is really easy to test. Here's an example from the parser:

class TestParser < Minitest::Test
  def test_it_handles_comparison
    got = parse("2 > 3")

    assert_instance_of Rlox::Expr::Binary, got
    assert_equal :greater, got.operator.type
    assert_equal 2.0, got.left.value
    assert_equal 3.0, got.right.value

    got = parse("2 >= 3")

    assert_instance_of Rlox::Expr::Binary, got
    assert_equal :greater_equal, got.operator.type
    assert_equal 2.0, got.left.value
    assert_equal 3.0, got.right.value

    got = parse("2 < 3")

    assert_instance_of Rlox::Expr::Binary, got
    assert_equal :less, got.operator.type
    assert_equal 2.0, got.left.value
    assert_equal 3.0, got.right.value

    got = parse("2 <= 3")

    assert_instance_of Rlox::Expr::Binary, got
    assert_equal :less_equal, got.operator.type
    assert_equal 2.0, got.left.value
    assert_equal 3.0, got.right.value
  end

  def parse(str)
    scanner = Rlox::Scanner.new(str)
    tokens = scanner.scan_tokens
    parser = Rlox::Parser.new(tokens)
    # Call private method to bubble up exception that is caught by #parse
    parser.send(:expression)
  end
end

Astute readers might recognize that the parse helper function defined within the test also calls into the Rlox::Scanner class. That's one item where I've taken the quick and easy approach: rather than ensure test isolation by writing out the AST with the Rlox::Expr/Rlox::Statement classes (which are incredibly verbose), I use Rlox::Scanner so I can write my tests as string expressions that read like the code I'm testing. Unfortunately, that does mean that if I write a bug into the Rlox::Scanner class, that bug is propagated into the Rlox::Parser tests, but in my head it's better than the alternative of tripling the lines of code in my test files. What can you do?

Next steps

There might be a part two for this post as I work my way further through the first Lox interpreter. If you're interested in following along with the code, check it out on GitHub.

New stuff in Emacs 30

28 Jul, 2024 blog, emacs

Emacs 30.1 is on the horizon, with the most recent pretest (30.0.93) made available in late December. This post highlights some new features in the upcoming release that I find especially compelling.

If you want to know everything that's upcoming, check out the full release notes.

Native compilation enabled by default

This is huge!

Native compilation was introduced in Emacs 28 behind a configuration flag, so even though it's been around for a little while, you probably aren't using it unless you compile your Emacs from source (or use a port that explicitly had it enabled). Enabling it by default brings it to more users.

This feature compiles Emacs Lisp functions to native code, offering 2-5x faster performance over their byte-compiled counterparts. The downside is an additional dependency (libgccjit) and a little extra compilation overhead when installing a package for the first time. Those downsides are so minor that enabling it by default is a no-brainer.

Native (and faster) JSON support

You no longer need an external library (libjansson) to work with JSON in Emacs. On top of that, JSON parsing performance in Emacs is significantly improved (the author reports that parsing is up to 8x faster). This is all thanks to Géza Herman's contribution: I created a faster JSON parser. He summarizes his changes later in that thread:

My parser creates Lisp objects during parsing, there is no intermediate step as Emacs has with jansson. With jansson, there are a lot of allocations, which my parser doesn't have (my parser has only two buffers, which exponentially grow. There are no other allocations). But even ignoring performance loss because of mallocs (on my dataset, 40% of CPU time goes into malloc/free), I think parsing should be faster, so maybe jansson is not a fast parser in the first place.

Great stuff.

use-package version control support

You can now install packages directly from version-controlled repositories (for those packages that aren't yet in GNU ELPA, NonGNU ELPA, or MELPA).

For example:

(use-package bbdb
  :vc (:url "https://git.savannah.nongnu.org/git/bbdb.git"
       :rev :newest))

This also means that you can opt into package updates based on commit instead of latest release (e.g. :rev :newest). I think this is actually a sleeper feature of :vc, since the default Emacs package release/update cycle can be a little wonky at times.

If you want all of your :vc packages to prefer the latest commit (instead of the latest release), you can set use-package-vc-prefer-newest to t.
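
In other words:

(setq use-package-vc-prefer-newest t)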

Tree-sitter modes are declared as submodes

I had to read this change a few times before I grokked what it was saying. Tree-sitter modes, e.g. js-ts-mode, are now submodes of their non-tree-sitter counterpart, e.g. js-mode. That means any configuration applied to the non-tree-sitter mode also applies to the tree-sitter mode.

In other words, my .dir-locals.el settings for js-mode simply apply to js-ts-mode as well, without needing to write it explicitly. A nice quality of life change to help pare down Emacs configurations that rely on both modes (which is more common than you might think, given that non-tree-sitter modes are typically more featureful).
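
For example, a .dir-locals.el entry like this now covers both modes (js-indent-level is just a stand-in for whatever you actually set):

((js-mode . ((js-indent-level . 2))))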

Minibuffer QOL improvements

Some nice quality-of-life improvements for the default Emacs completions:

  • You can now use the arrow keys to navigate the completion buffer vertically (in addition to the M-<up|down> keybindings).

  • Previous minibuffer completion selections are deselected when you begin typing again (to avoid accidentally hitting a previous selection).

  • completions-sort has a new value: historical. Completion candidates will be sorted by their order in minibuffer history so that recent candidates appear first.
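
That last one is a single setting away, assuming you want recency-first candidates:

(setq completions-sort 'historical)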

Customize interface for dir-locals

There's now a customize interface for Directory Variables:

M-x customize-dirlocals

I always find myself forgetting the .dir-locals.el syntax (even though they're just lists!) so this is a surprisingly handy feature for me.

New mode: visual-wrap-prefix-mode

Now this one is cool. I'm the kind of guy who uses auto-fill-mode for everything because I haven't bothered to figure out how Emacs line wrapping works. Everything I write gets hard-broken into newlines after 80 characters.

The new mode visual-wrap-prefix-mode is like auto-fill-mode, except that the breaks are for display purposes only. I think this is incredibly useful when editing text that might be reviewed using a diffing tool, since long lines tend to produce more useful diffs than a paragraph broken up with hard breaks. I'm actually pretty excited about this change; maybe it will get me to stop using (markdown-mode . ((mode . auto-fill))) everywhere.
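
If it works the way I expect, enabling it everywhere I write prose should be a one-liner (my assumed setup, untested):

(add-hook 'text-mode-hook #'visual-wrap-prefix-mode)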

New command: replace-regexp-as-diff

You can now visualize regular expression replacements as diffs before they're accepted. This is actually incredible.

New package: which-key

Previously a package in GNU ELPA, which-key-mode is now built-in. With which-key-mode enabled, after you begin a new command (e.g. C-x) and wait a few seconds, a window pops up with a list of possible keybinding completions. It's a super handy tool for remembering some of the more esoteric bindings.

New customizations

  • Show the current project (via project.el) in your modeline with project-mode-line.

  • Add right-aligned modeline elements via mode-line-format-right-align.

  • You can now customize the venerable yes-or-no-p function with yes-or-no-prompt.
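
Each of these is an ordinary user option; the first, for example, is just:

(setq project-mode-line t)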

A few Emacs Lisp changes

There are a few small, yet impactful changes around help buffers and Emacs Lisp types that I think are worth noting.

  • describe-function shows the function's inferred type when available:
C-h f concat RET

(concat &rest SEQUENCES)
  Type: (function (&rest sequence) string)
  • Built-in types show their related classes:
C-h o integer RET

integer is a type (of kind ‘built-in-class’).
 Inherits from ‘number’, ‘integer-or-marker’.
 Children ‘fixnum’, ‘bignum’.
  • The byte compiler warns if a file is missing the lexical binding directive. Lexical binding has been part of ELisp for a while now, so it's nice to see more effort being made towards making it the default.
;;; Foo mode  -*- lexical-binding: t -*-

Read the full details

That wraps up my highlights. There's a ton more stuff included in Emacs 30.1, so I encourage you to check out the NEWS yourself.

PS. Interested in trying out Emacs but don't know where to start? Check out my MIT-licensed configuration guide: Start Emacs.

Recently

19 Jun, 2024 blog

Recently I've been deep down a crossword puzzle rabbit hole. I started a new side project that has taken most of my writing energy: them's crossing words, a blog where I post daily crossword puzzle reviews and articles about the craft of puzzle construction. Thus far there are about fifty crossword puzzles featured and discussed: a sizable number of grids, with over 10k words dedicated to crossing them.

When I started the project I thought I might burn out quickly on the idea. Writing a daily review was actually far from my original intent. The thing is, there's just so much to talk about when it comes to the art of crossword construction (and puzzles in general, by extension). Every crossword is nuanced and interesting, built by constructors who bring their own voice into the grid with clever clues and themes.

The idea of a puzzle blog has been bouncing around in my head for a long time now, and my motivation to start one was largely influenced by the release of Braid, Anniversary Edition. When I was a kid playing through Braid on the Xbox Live Arcade, I didn't actually care that much for puzzle games. They were too slow and plodding for my high school brain.

Since then I've come to really appreciate the genre, with games like The Witness and Outer Wilds completely blowing my mind as to what's possible in the medium of video games. When Braid re-released this year with loads of developer commentary, I was in.

Now that I'm playing through it as an adult I have a newfound appreciation for its narrative and design. There are loads of spots where the narrative of the tale is paralleled by the mechanics of the gameplay and the design of the puzzles, a genius combination of factors seen in few games. What really drew me in, however, was the developer commentary, discussing the minutiae of game, sound, narrative, and art design behind every level and artistic motif.

The body of commentary in Braid, Anniversary Edition is staggering. The amount of thought that bleeds into every ounce of that game is an incredible achievement showing just how artistic the medium of video games can be. It inspired me to start writing about puzzles because I think they're more than just a method of wasting time. They're little worlds of simulation where ideas mesh with action, a creative landscape of human ingenuity clashing with constraints.

Needless to say I've been noodling on puzzles for the last few months, looking back on some of my old Puzzlescript prototypes and some of the things I learned about game design when I hacked them together. Building a level for a puzzle game is kind of like mentoring new engineers. You never really know how well you understand an idea until you need to teach it to someone. Same goes for a mechanic in a puzzle game: what is the truth that you're trying to expose to the person playing your game? Why is it interesting?

Anyway, this is less of a Recently and more of a ramble.

For this blog, I've also been interested in exploring systems languages now that I have a year or so of Rust experience under my belt. I'm curious about manual memory management and how various languages handle it, especially with the recent uptick in C alternatives like Zig, Odin, and Jai. I've also been wanting to do a post on "learning systems languages as a webdev" for a while now, exploring the Rust ecosystem as a Ruby on Rails/JavaScript developer. I think I need a stronger baseline in systems performance before I try to tackle that subject.

Right now I'm reading through A Tour of C++ and Understanding Software Dynamics, testing my knowledge of performance programming and how computers actually work under the hood. It's been a humbling experience working in manual memory scenarios after so many years of garbage-collected languages. It's a different landscape when you have to deal with certain hot-paths in a game loop, for example, where allocations can lead to undesirable performance characteristics.

Markdown Rendering with Awk

23 Mar, 2024 blog

I can't believe I'm writing another post about Awk, but I'm just having too much fun throwing together tiny Awk scripts. This time around that tiny Awk script is a markdown renderer, converting markdown to good ol' HTML.

Markdown is interesting because 80% of the language is rather easy to implement by walking through a file and applying a regular expression to each line. Most implementations seem to start this way. However, once you hit that final 20%, some aspects of the language start to show their warts, exposing the not-so-happy path that eventually leads to lexical analysis.

To list a few examples,

  • there are many elements that can span more than one line, like paragraph emphasis, links, or bolded text

  • elements can encompass sub-elements like a Russian doll, e.g. headers that include emphasized text that itself is bolded

  • elements can defy existing behavior, like code blocks that can themselves contain unrendered markdown

Each of these conditions complicates the simple line-based approach.

The renderer that I'm building doesn't aim to be comprehensive, so most of these edge cases are not handled. For my toy renderer, I'm assuming that the markdown is written in accordance with a general style guide, with friendly linebreaks between paragraphs.

I am also being careful to call this project a markdown "renderer" and not a "parser" because it's not really parsing the markdown file. Instead, we're immediately replacing markdown with HTML. The difference may seem nitpicky, but it implies that there's no intermediate format between the markdown and the HTML output, a nuance that makes this implementation less powerful but also much simpler.

Let's get cracking.

Initial approach

Headers are a natural first step. The solution emphasizes Awk's strengths when it comes to handling line-based input:

/^# / { print "<h1>" substr($0, 3) "</h1>" }

This reads, "On every line, match # followed by a space. Replace that line with header tags and the text of that line beginning at index 3 (one-based indexing)." Since we're piping awk's output into another file, print statements are effectively writing our rendered HTML.

Line replacements are the name of the game in Awk, where the simplicity of the syntax really shines. The call to substr is less elegant than the usual space-delimited Awk fields ($1, $2, etc.), but it's necessary since we want to preserve the entire line sans the first two characters (the leading header hashtag).

The remaining headers follow the same pattern:

/^# /   { print "<h1>" substr($0, 3) "</h1>" }
/^## /  { print "<h2>" substr($0, 4) "</h2>" }
/^### / { print "<h3>" substr($0, 5) "</h3>" }
# ...

For something a little trickier, let's move on to block quotes. Block quotes in Markdown look like the following, leading each line with a greater-than sign:

> Deep in the human unconscious is a pervasive need for a logical universe that
> makes sense. But the real universe is always one step beyond logic. - Frank
> Herbert

Finding block quote lines is easy; we just use the same approach as our headers:

/^> / { print "<blockquote>"; print substr($0, 3); print "</blockquote>"}

But as you have probably guessed, this simplification isn't quite what we want. Instead of wrapping each line with a block quote tag, we want to wrap the entire block (three lines in this case) with one set of tags. This will require us to keep track of some intermediate state between line-reads:

/^> / { if (!inquote) print "<blockquote>"; inquote = 1; print substr($0, 3) }

If we match a blockquote character and we're not yet inquote, we write the opening tag and set inquote. Otherwise, we simply write the content of the line. We need an extra rule to write the closing tag:

inquote && !/^> / { print "</blockquote>"; inquote = 0 }

If our program state says we're in a quote but we reach a line that doesn't lead with a block quote character, it's time to close the block. In practice this matches the blank lines that separate paragraphs in Markdown documents.

This same strategy can be applied to the other block-style markdown elements: code blocks and lists. Each requires its own variable to keep track of the block content, but the approach is the same.
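
Here's the same open/close trick sketched out for unordered lists (assuming dash-style bullets):

/^- /            { if (!inlist) print "<ul>"; inlist = 1; print "<li>" substr($0, 3) "</li>" }
inlist && !/^- / { print "</ul>"; inlist = 0 }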

Paragraphs are tricky

It is very tempting to implement inline paragraph elements in the same way as we've handled other, single-line markdown syntax. However, paragraphs are special in that they often span more than a single line, especially so if you use hard-wrapping in your text editor at some column. For example, it's very common for links to span multiple lines:

A link that spans [multiple lines](https://definitely-a-valid-link-here.com)

This breaks the nice, single-line worldview that we've been operating under, requiring some special handling that will end up leaking into other aspects of our rendering engine.

My approach is to collect multiple paragraph lines into a single string, rendering it altogether on paragraph breaks. This allows me to search the entire string for inline elements (links, bold, italics), effectively matching against multiple lines of input.

/./  { for (i=1; i<=NF; i++) collect($i) }
/^$/ { flushp() }

# Concatenate our multi-line string
function collect(v) {
  line = line sep v
  sep = " "
}

# Flush the string, rendering any inline HTML elements
function flushp() {
  if (line) {
    print "<p>" render(line) "</p>"
    line = sep = ""
  }
}

Each line of text is collected into a variable, line, that is persisted between line-reads. When a paragraph break is hit (a line that contains no text, /^$/) we render that line, wrapping it in paragraph tags and replacing any inline elements with their respective HTML tags.

I'll point out that the technique of collecting fields into a string or array is a very common pattern in Awk, hence the utility variable NF for "number of fields". The Awk book uses this pattern in quite a few places.

For completeness, here's what that render function looks like:

function render(line) {
    if (match(line, /_(.*)_/)) {
        gsub(/_(.*)_/, sprintf("<em>%s</em>", substr(line, RSTART+1, RLENGTH-2)), line)
    }

    if (match(line, /\*(.*)\*/)) {
        gsub(/\*(.*)\*/, sprintf("<strong>%s</strong>", substr(line, RSTART+1, RLENGTH-2)), line)
    }

    if (match(line, /\[.+\]\(.+\)/)) {
        inner = substr(line, RSTART+1, RLENGTH-2)
        split(inner, spl, /\]\(/)
        gsub(/\[.+\]\(.+\)/, sprintf("<a href=\"%s\">%s</a>", spl[2], spl[1]), line)
    }

    return line
}

This code is noticeably less clean than our earlier HTML rendering, an unfortunate consequence of handling multi-line paragraphs. I won't go into too much detail here since there's a lot of Awk-specific regular expression matching stuff going on, but the gist is a standard regexp-replace of the paragraph text with HTML tags for matching elements.

Another problem that we run into when collecting multiple lines into the line variable is accidentally collecting text from previous match rules. Awk's expression syntax is like a switch statement that lacks a break: a line will match as many expressions as it can before moving on to the next. That means that all of our previous rules for headers, blockquotes, and so on are now also included in our paragraph text. That's no good!

# I match a header here:
/^# /  { print "<h1>" substr($0, 3) "</h1>" }

# But I also match "any text" here, so I'm collected:
/./  { for (i=1; i<=NF; i++) collect($i) }

Each of our previous matchers now has to include a call to next to immediately stop processing and move on to the next line. This prevents them from being included in paragraph collection.

/^# /  { print "<h1>" substr($0, 3) "</h1>"; next }
/^## / { print "<h2>" substr($0, 4) "</h2>"; next }
# ...

Styling for HTML exports

The last piece of this Markdown renderer is adding the boilerplate HTML that wraps our document:

BEGIN {
    print "<!doctype html><html>"
    print "<head>"
    print "  <meta charset=\"utf-8\">"
    print "  <meta name=\"viewport\" content=\"width=device-width, initial-scale=1\">"
    if (head) print head
    print "</head>"
    print "<body>"
}

# ... all of our rules go here

END {
    print "</body>"
    print "</html>"
}

Unlike other Awk matchers, the special BEGIN and END blocks are executed only once.

As a nice bonus, we can add an optional head variable to inject a stylesheet into our rendered markdown, which can be added via the Awk CLI. The following adds the Simple CSS stylesheet to our rendered output:

awk -v head='  <link rel="stylesheet" href="https://cdn.simplecss.org/simple.min.css">' \
    -f awkdown.awk README.md > docs/index.html

The full source code is available here: https://github.com/mgmarlow/awkdown.

DM Tools with Awk

27 Feb, 2024 blog

I picked up Awk on a whim and am blown away by how generally useful it is. What I thought was a quick and dirty tool for parsing tabulated files turns out to be a fully-featured scripting language.

Before I started reading the second edition of The Awk Programming Language, my only exposure to Awk was from better-minded folk on Stack Overflow. After copy-pasting a short script here or there, I was befuddled by the need for explicit BEGIN and END statements in Awk one-liners. Shouldn't a program know when it begins and ends? Why the redundancy?

Oh how wrong I was. Once you understand how Awk works, the syntax of BEGIN and END makes a ton of sense; it's actually a consequence of Awk's coolest feature. BEGIN and END are necessary because the default mode of an Awk script isn't top-to-bottom execution, like other scripting languages. Instead, Awk programs are executed repeatedly by default, either on the lines of a file or an input stream.

To demonstrate, say I have a file where each line contains a location:

Forest
Hills
Desert
...

I can use Awk to turn that list of locations into one that is numbered with a single statement, no loops required:

$ awk '{ print NR ". " $0 }' locations.txt
1. Forest
2. Hills
3. Desert
4. ...

Without the BEGIN or END markers (which denote "run this before" and "run this after"), Awk runs statements on every line of its input. In this case, that means re-printing each location in the file locations.txt with some minor modifications.

Awk provides a bunch of built-ins that make it easy to work within this execution model. NR stands for "number of records", keeping track of the current line of input being processed. This generates our numbered list.

The dollar-sign variables refer to fields on an individual line. $0 is the entire line, unmodified. $1, $2, and so on refer to subsets of the line, broken up by a delimiter (e.g. space, tab, or comma) and read from left to right.

And statements are just the tip of the Awk iceberg! You can assign each statement a "matcher" that runs the action only on lines where the pattern is truthy. Here are a few examples:

# Print every row but the first
NR != 1 { print $0 }

# Only print a row if the first field matches "cat"
$1 ~ /cat/ { print "not a dog" }

# Maybe your second field is a number?
$2 >= 12 && $2 < 18 { print "teenager" }

Now the BEGIN and END statements are starting to make more sense.

DMing with Awk

Now for something a little more complicated. As I mentioned before, Awk is a fully-featured scripting language. You can write functions, generate random numbers, build arrays, and do everything that you'd expect a normal language to do (mostly, anyway). I ran across an example in the Awk book that demonstrates the use of rand() via dice rolling and it sparked an idea: how useful can a tool like Awk be for a DM running a Dungeons and Dragons game?

Since Awk is great at reading files, I figured it would also be great for dealing with random tables. Given the locations file that appears earlier in this post, here's how you can select a single location at random:

awk '{data[NR] = $0} END {srand(); print data[int(rand()*length(data))]}' locations.txt

It's easier to read with some annotations:

# Add every line in the file to an array, indexed by the line number
{ data[NR] = $0 }

# After reading the file,
END {
  # Seed randomness
  srand()

  # Pick a random index from the data array and print its respective value
  print data[int(rand() * length(data))]
}

I really like how { data[NR] = $0 } is all that Awk needs to build an array with the contents of a file. It comes in handy in cases like this where we need the file contents in memory before we can do something useful.

Now, you might be thinking that this isn't that cool because sort can already do it better. And you'd be right!

$ cat locations.txt | sort -R | head -1
Plains

So how about moving on to the next step instead: character generation. The next script implements the character creation rules from Knave, a game based on old-school Dungeons and Dragons.

The first thing we need to do is generate some attribute scores. Each score can be simulated by rolling three 6-sided dice (d6) and taking the lowest result.

BEGIN {
    srand()

    map[1] = "str"
    map[2] = "dex"
    map[3] = "con"
    map[4] = "int"
    map[5] = "wis"
    map[6] = "cha"

    print "hp " roll(8)
    for (i = 1; i <= 6; i++) {
        print map[i] " " lowest_3d6()
    }
}

function roll(n) {
    return int(rand() * n) + 1
}

function lowest_3d6(_i, _tmp, _min) {
    _min = roll(6)
    for (_i = 1; _i <= 2; _i++) {
        _tmp = roll(6)
        if (_tmp < _min) {
            _min = _tmp
        }
    }
    return _min
}

The output looks like:

$ awk -f knave.awk
hp 6
str 1
dex 2
con 2
int 1
wis 1
cha 4

Since this Awk program is not reading from a file (yet), everything is run in a BEGIN block. This allows us to execute Awk without passing in a file or input stream. Within that BEGIN block we build a map of integers to attribute names, making it easy to loop over them to roll for scores. Arrays in Awk are associative, so they work well for this use-case.

The strange thing about this code is the use of parameters as local variables in the function lowest_3d6. The only way to make a variable local in Awk is to include it in the parameter list when declaring a function; all other variables are global. Idiomatic Awk flags this strangeness by prefixing the extra parameter names with an underscore, as I have done, or by inserting a bunch of spaces before their place in the function definition.

Next up is to make these characters more interesting by assigning them careers and starting items. A career describes the character's origin, tying their initial loot to their backstory. These careers are taken from Knave second edition.

First, a new data file:

acolyte: candlestick, censer, incense
jailer: padlock, 10’ chain, wine jug
acrobat: flash powder, balls, lamp oil
jester: scepter, donkey head, motley
actor: wig, makeup, costume
jeweler: pliers, loupe, tweezers
...

Now that our Awk program is reading lines from a file, we can add a new block that stores careers into an array so we can make a random selection for the player.

# ...snip

{ careers[NR] = $0 }

END {
    print "\nCareer & items:"
    print careers[roll(100)];
}

When the program is executed with the list of careers, the output looks like this:

$ awk -f knave.awk careers.txt
hp 3
str 1
dex 3
con 3
int 2
wis 3
cha 4

Career & items:
falconer: bird cage, gloves, whistle

Not bad!

I doubt these tools will come in handy for your next DnD campaign, but I hope that this post has inspired you to pick up Awk and give it a go on some unconventional problems.
