My list of reasons for using Apple Notes isn't very long. In many ways I think
Apple Notes is an inferior app, especially when comparing it to alternatives
like Notion, Obsidian, and the like. There is one thing that keeps me using it,
however. Simplicity.
Organization is a death sentence for spontaneity. Tools like Notion have me
questioning my note placement before I've written a single word, killing the
idea before it's had a moment to begin. I've lost track of ideas due to a
distracting home screen ("Oh, I should check my email"), or navigating a Notion
sidebar ("Where's the archive again?"), or generating the title of the very note
I'm starting to write.
Therefore, I value flow over all else. Nothing is more important than getting
the idea into some storage mechanism as fast as humanly possible. Expedience is
the same reason I don't carry a paper notebook and a pen, despite preferring
writing by hand to tapping on a piece of glass. When inspiration strikes and I
need to jot down an idea right now, nothing beats Apple Notes. Pop open the
app, hit the lower-right corner, type type type.
Every month or so I'll peruse my notes and organize them into better, more
permanent places. 90% of the time that means deleting the note. I'm not very
sentimental[1]. The remaining 10% is moved into a GitHub README if it's a
project idea or drafted into a proper blog post (markdown + git). This whole
categorization phase is a moment of catharsis and feels wholly productive.
I've always been turned off by the term "second brain". Despite what
productivity gurus say I don't think my simple notes will ever amount to some
magnum opus of material, studied by historians in the distant future. Nor do I
think there's much value in the nitpicky categorization that makes up a
zettelkasten. I believe those who spend the majority of their time organizing
notes and staring at their Obsidian graph view are fooling themselves into
thinking they're more productive. Sure, it's pretty. But is my writing any
better?
Productivity software preys upon novelty. New apps tout new approaches that will
always catch me in the allure of efficiency. Let this ode to Apple Notes serve
as a reminder that sometimes simple is better.
Part of my notetaking philosophy is that a note has served its purpose
through the mechanism of its creation. Field Notes puts it well: "I'm not
writing it down to remember it later, I'm writing it down to remember it
now." So when I say 90% I mean it. ↩︎
On the inside cover of Kafka on the Shore Murakami explains how his idea for
the book started with its title. This approach is the opposite of anything I've ever
written, though I recognize there's a notable difference between fiction and
technical writing. But what a powerful idea: a simple phrase shapes the entire
story.
I dug up this quote from an interview:
When I start to write, I don’t have any plan at all. I just wait for the story
to come. I don’t choose what kind of story it is or what’s going to happen. I
just wait.
Earlier this year 37signals released Writebook, a
self-hosted book publishing platform. It's offering number two from their
ONCE series, pitched as the antithesis of SaaS. Buy it once
and own it for life, but run it on your own infrastructure.
Unlike the other ONCE offering, Writebook is totally free. When you "purchase"
it through the ONCE checkout, they hook you up with the source code and a
convenient means of installing the software on a remote server. Since the
software is free (but not open source) I thought it's fair game to read through
it and write a little post about its implementation. It's not every day that we
can study a production Rails application made by the same folks behind Rails
itself.
Note: I'll often omit code for the sake of brevity with a "--snip" marker. I
encourage you to download Writebook yourself and follow along so you can
discover the complete context.
Run the thing
A good place to start is the application entrypoint: Procfile. I think
Procfile is a holdover from the Heroku era, when everyone was hosting their
Rails applications on free-tier dynos (RIP). Either way, it describes the
top-level processes that make up the server:
the Rails application itself, booted via bundle exec thrust bin/start-app (more on that below)
redis, the backing database for the application cache and asynchronous
workers
workers, the actual process that executes asynchronous tasks
The only other infrastructure of note is the application database, which is
running as a single file via SQLite3.
bundle exec thrust bin/start-app might be surprising for folks expecting
bin/rails server as the main Rails process. thrust is the command invocation
for thruster, a fairly recent
HTTP proxy developed by 37signals specifically for ONCE projects. It serves a
similar role to nginx, a web server that sits in front of the main Rails process
to handle static file caching and TLS. The thrust command takes a single
argument, bin/start-app, which contains your standard bin/rails s
invocation, booting up the application server.
Redis and workers fill out the rest of the stack. Redis serves a few different
purposes for Writebook, acting as the application cache and the task queue for
asynchronous work. I'm a little surprised
Solid Queue and
Solid Cache don't make an appearance,
swapping out Redis for the primary data store (SQLite in this case). But then
again, perhaps it's more cost-efficient to run Redis in this case, since
Writebook probably wants to be self-hosted on minimal hardware (and not have
particular SSD requirements).
You can run the application locally with
foreman (note you'll need Redis installed,
as well as libvips for image processing):
foreman start
Pages that render markdown
When it comes to the textual content of books created with Writebook, everything
boils down to the Page model and its fancy has_markdown :body invocation.
That single line of code sets up an
ActionText
association with Page under the attribute name body. All textual content in
Writebook is stored in the respective ActionText table, saved as raw markdown.
Take a look at this Rails console query for an example:
writebook(dev)> Page.first.body.content
=> "# Welcome to Writebook\n\nThanks for downloading Writebook...
To my surprise, has_markdown is not actually a Rails ActionText built-in. It's
manually extended into Rails by Writebook in
lib/rails_ext/action_text_has_markdown.rb, along with a couple other files
that integrate ActionText with the third-party gem
redcarpet.
lib/rails_ext/ as the folder name is very intentional. The code belongs in
lib/ and not app/lib/ because it's completely agnostic to the application.
It's good ol' reusable Ruby code for any Rails application that has ActionText.
rails_ext/ stands for "Rails extension", a common naming convention for vendor
monkey patches that might live in a Rails application. This code re-opens an
existing namespace (the ActionText module, in this case) and adds new
functionality (ActionText::Markdown). Within the application, users can use
ActionText::Markdown without ever knowing it's not a Rails built-in.
This is a neat little implementation for adding markdown support to ActionText,
which is normally just a rich text format coupled to the
Trix editor.
Beyond pages
Page is certainly the most important data model when it comes to the core
functionality of Writebook: writing and rendering markdown. The platform
supports a couple other fundamental data types, namely Section and Picture,
which can be assembled alongside Pages to make up an entire Book.
The model hierarchy of a Book looks something like this:
Book = Leaf[], where Leaf = Page | Section | Picture
In other words, a Book is made up of many Leaf instances (leaves), where a
Leaf is either a Page (markdown content), a Section (basically a page
break with a title), or a Picture (a full-height image).
You can see the three different Leaf kinds near the center of the image,
representing the three different types of content that can be added to a Book.
This relationship is clearly represented by the Rails associations in the
respective models.
Well, maybe not completely "clearly". One thing that's interesting about this
implementation is the use of a Rails concern and delegated_type to represent
the three kinds of leaves:
module Leafable
  extend ActiveSupport::Concern

  TYPES = %w[ Page Section Picture ]

  included do
    has_one :leaf, as: :leafable, inverse_of: :leafable, touch: true
    has_one :book, through: :leaf

    delegate :title, to: :leaf
  end
end
There are three kinds of Leaf that Writebook supports: Page, Section, and
Picture. Each Leaf contains different attributes according to its kind. A
Page has ActionText::Markdown content, a Section has plaintext, and a
Picture has an image upload and a caption. However, despite their difference
in schema, each of the three Leaf kinds is used in the exact same way by
Book. In other words, Book doesn't care which kind of Leaf it holds a
reference to.
This is where delegated_type comes into play. With delegated_type, all of
the shared attributes among our three Leaf kinds live on the "superclass"
record, Leaf. Alongside those shared attributes is a leafable_type, denoting
which "subclass" the Leaf falls into, one of "Page", "Section", or
"Picture". When we call Leaf#leafable, we fetch data from the matching
"subclass" table to pull the non-shared attributes for that Leaf.
The pattern is made clear when querying in the Rails console:
writebook(dev)> Leaf.first.leafable
SELECT "leaves".* FROM "leaves" ORDER BY "leaves"."id" ASC LIMIT 1
SELECT "pages".* FROM "pages" WHERE "pages"."id" = ?
Rails knows from leafable_type that Leaf.first is a Page. To read the rest
of that Leaf's attributes, we need to fetch the Page from the pages table
associated to the leafable_id on the record. Same deal for Section and
Picture.
Another thing that's interesting about Writebook's use of delegated_type is
that the Leaf model isn't exposed on a route:
resources :books, except: %i[ index show ] do
  # --snip
  resources :sections
  resources :pictures
  resources :pages
end
This makes a ton of sense because the concept of Leaf isn't exactly
"user-facing". It's more of an implementation detail. The relation between the
three different Leafable types is exposed by some smart inheritance in each of
the "subclasses". Take SectionsController as an example:
All of the public controller handlers are implemented in LeafablesController,
presumably because each Leafable is roughly handled in the same way. The only
difference is the params object sent along in the request to create a new
Leaf.
I appreciate the nomenclature of Book#press for adding a new Leaf to a
Book instance. Very clever.
Authentication and users
My go-to when setting up authentication with Rails is
devise since it's an easy drop-in
component. Writebook instead implements its own lightweight authentication
around the built-in has_secure_password.
The authentication domain in Writebook is surprisingly complicated because the
application supports multiple users with different roles and access permissions,
but most of it is revealed through the User model.
The first time you visit a Writebook instance, you're asked to provide an email
and password to create the first Account and User. This is represented via a
non-ActiveRecord model class, FirstRun.
Whether or not a user can access or edit a book is determined by the
Book::Accessable concern. Basically, a Book has many Access objects
associated with it, each representing a user and a permission. An Access is
also created for the DemoContent referenced in FirstRun, tying the demo book to
that first user.
Likewise, when new users are invited to a book, they are assigned an Access
level that matches their permissions (reader or editor). Note that all of this
access stuff is for books that have not yet been published to the web for public
viewing. Writebook allows you to invite early readers or editors for feedback
before you go live.
Whoa, whoa, whoa. What is this rate_limit on the SessionsController? It turns
out rate_limit is a relatively new Rails built-in: a controller-level rate
limiter (backed by the Rails cache) that Writebook uses here to throttle
session creation.
I like the occasional nesting of concerns under model classes, e.g.
Book::Sluggable. These concerns aren't reusable (hence the nesting), but they
nicely encapsulate a particular piece of functionality with a callback and a
method.
# app/models/book/sluggable.rb
module Book::Sluggable
  extend ActiveSupport::Concern

  included do
    before_save :generate_slug, if: -> { slug.blank? }
  end

  def generate_slug
    self.slug = title.parameterize
  end
end
Over on the HTML-side, Writebook doesn't depend on a CSS framework. All of the
classes are hand-written and applied in a very flexible,
atomic manner.
I'm also surprised at how little JavaScript is necessary for Writebook. There
are only a handful of StimulusJS controllers, each of which encompasses a tiny
amount of code suited to a generic purpose. The AutosaveController is probably
my favorite.
When you're editing markdown content with Writebook, this handy controller
automatically saves your work. I especially appreciate the disconnect handler
that ensures your work is always persisted, even when you navigate out of the
form to another area of the application.
Closing thoughts
There's more to explore here, particularly on the HTML side of things where
Hotwire does a lot of the heavy lifting. Unfortunately I'm not a good steward
for that exploration since most of my Rails experience involves some sort of
API/React split. The nuances of HTML-over-the-wire are over my head.
That said, I'm impressed with Writebook's data model; it's easy to grok thanks to
some thoughtful naming and strong application of lesser-known Rails features
(e.g. delegated_type). I hope this code exploration was helpful and inspires
the practice of reading code for fun.
I've read a few stories about folks moving their email from HEY to Fastmail, but
have not seen any in the reverse direction. After two years of Fastmail, I'm
moving back to HEY. Here are my thoughts.
For those unacquainted with HEY, the main pitch is to (a) screen unknown senders
and (b) sort what's screened in into one of three locations: "Imbox", "The Feed",
and "Paper Trail". Senders that are "screened out" are completely blocked; you
won't be notified again from that address. For those "screened in", the split
inbox offers more than just filters and labels. "The Feed", for example, aggregates
emails into a continuous reader view that's nice for browsing on a weekend morning. There are many
more features but these two are probably the
most important ones.
In my first HEY adventure, I had an @hey.com address for $99/yr. My primary
motivation was moving away from Gmail and freeing some of my dependence on
Google products, which I still maintain is worthwhile. HEY pulled me in with the
marketing, but at $99 I wasn't convinced I was receiving enough value for the
price tag. When I saw that Fastmail supported
Masked Email, my mind was made up. Added
privacy at half the cost? Yes please.
So I migrated, eating the cost of cycling yet another email address but setting
up a custom email domain along the way to future-proof my erratic email
exploration tendencies. I followed this
guide from Franco Correa
to emulate some of the HEY functionality in Fastmail, attempting to hold on to
some of the principles that improved my workflow.
Two years later and I'm moving back to HEY.
Why switch back? The decision mostly comes down to the difference in user
experience between the two apps. Fastmail feels like a chore to use, especially
on iOS where most of my email (and newsletter) reading happens. Here are my two
biggest problems:
I'd often need to close and reopen the Fastmail app because it was stuck on a
black screen. Particularly frustrating when on a slow connection because it
means going through the whole SPA-style loading animation that can take 10-20
seconds.
Using contacts + groups as substitutes for "The Feed" and "Paper Trail" is
tedious. Email addresses that go into either bucket must first be added to
contacts, then edited to include the appropriate filtering group. I honestly
can't remember how to do this in the mobile app.
There were also a handful of workflows that I was missing from HEY:
The ability to merge threads and create collections is incredible when dealing
with travel plans. Rather than juggling a bunch of labels for different trips,
email threads are neatly organized into one spot for each.
"Send me push notifications" on an email thread, which will notify me when
that thread and only that thread receives replies, is genius.
I created a "Set Aside" folder in Fastmail but eventually found myself missing
the nice little stack of email threads that are bundled up in a corner in the
HEY app.
Bundling email from certain senders into a single thread
is an excellent solution for notification streams from Github or Amazon, where
I want to be alerted with updates but don't want to have a bunch of separate
email threads taking up space in my inbox.
I really like clips as an
alternative to slapping on a label so I know to revisit an email for some
buried content.
Don't get me wrong, Fastmail is a great service. If I didn't find out that
masked email could be replaced by
DuckDuckGo Email Protection I would probably
still be using it[1]. I'm especially fond of their investment in
JMAP and attempts
to make the technical ecosystem around email better. Also, if you want to have
multiple custom domains routing to the same email platform, Fastmail is way more
cost effective.
But, having moved back to HEY, I've discovered that I'm easily swayed by
software that can please and delight. Many of HEY's features are UX oddities
that don't exactly nail down ways to make email better, but make the experience
of using it more enjoyable. I think HEY gets it right most of the time.
The calendar is a new addition to HEY in the time that I've been away and it's
interesting. I'm not hugely opinionated when it comes to calendars; I hardly use
them outside of work, where my company dictates the platform. The HEY calendar
feels split between innovating for the sake of novelty and innovating for the
sake of good ideas.
For one, there's no monthly view. Only day and week. Instead of viewing a
complete month you view an endless scroll of weeks, with about three and a half
fitting on the screen at any given time. The daily/weekly focus of HEY Calendar
seems catered to daily activities: journaling, photography, and habit tracking.
Not so much complicated scheduling workflows.
HEY's email offering still has some rough spots as well:
No import from an existing email account.
Adding additional custom domains is prohibitively expensive for a single user.
Feature rollout is asymmetrical; web and Android often outpace iOS.
Two separate apps for calendar and email (minor, but kind of annoying).
Journal integration with the calendar is interesting, but I'm hesitant to use
it because there's no export.
Can't use HEY with an external app (e.g. Thunderbird).
Still can't configure swipe actions on iOS.
Some of these (like swipe actions and import) are longtime issues that will
probably never be addressed. It's probably also worth noting that the HEY
workflow is rather opinionated and isn't guaranteed to hit. But hey, give it a
try and see if it works for you.
Moral of the story: use custom email domains. It protects you from email vendor
lock-in so you're free to experiment as you see fit.
On that topic, masked email is such a critical privacy feature for email
that I can't believe HEY doesn't offer it. I suppose the screener is meant
to alleviate that concern (since unwanted emails must be manually
screened-in) but it's not quite the same. I'd rather rest easy knowing that
only a randomly-generated email winds up in marketing garbage lists. ↩︎
I've finally started working through
Crafting Interpreters, a wonderful book
about compilers by Robert Nystrom. The book steps through two interpreter
implementations, one in Java and one in C, that ramp in complexity.
Now I don't know about you, but I hate Java. I can hardly stand to read it, let
alone write it. That's why I decided to write my first Lox interpreter in Ruby,
following along with the book as I can but converting bits and pieces into
Rubyisms as I see fit.
In general, the Java code can be ported 1-1 to Ruby with no changes. Of course
there's some obvious stuff, like how the lack of types means I need fewer methods
and no coercions, or certain stdlib method namespaces that are updated to match Ruby
idioms (while vs. until, anyone?). However, lots of code I just accept as-is
and allow Nystrom to guide me through.
I've only worked through the first 7 chapters, but I did note down a few things
in the Ruby conversion that I found interesting.
Avoiding switch statement fallthrough with regular expressions
Admittedly this difference is just a tiny syntactical detail, but one that plays
to Ruby's strengths. Take the book's implementation of scanToken:
private void scanToken() {
  char c = advance();
  switch (c) {
    case '(': addToken(LEFT_PAREN); break;
    // ...
    default:
      if (isDigit(c)) {
        number();
      } else if (isAlpha(c)) {
        identifier();
      } else {
        Lox.error(line, "Unexpected character.");
      }
  }
}

private boolean isDigit(char c) {
  return c >= '0' && c <= '9';
}

// private boolean isAlpha...
Due to limitations in the Java switch statement, the author adds some
post-fallthrough checks to the default case. This removes the need to check
every number and letter individually (0-9, a-z, A-Z as separate cases) because
the check is deferred into the default case, where an additional conditional
statement is applied. Aesthetically it's not an ideal solution since it breaks
up the otherwise regular pattern of case ... handler that holds for the other
tokens. I don't know, it's just kinda ugly.
With Ruby, I can instead employ regular expressions directly as when clauses in
my case statement, matching digits and identifier characters inline alongside the
single-character tokens. No default fallthrough needed! These tiny details are what keep me programming
in Ruby.
Metaprogramming the easy way
The largest deviation between the Java and Ruby implementation is definitely the
metaprogramming. In
Implementing Syntax Trees
the author employs metaprogramming through an independent build step.
First, a new package is created (com.craftinginterpreters.tool) with a couple
of classes that themselves generate Java classes by writing strings to a file:
private static void defineType(PrintWriter writer, String baseName,
                               String className, String fieldList) {
  writer.println("  static class " + className + " extends " +
      baseName + " {");

  // Constructor.
  writer.println("    " + className + "(" + fieldList + ") {");

  // Store parameters in fields.
  String[] fields = fieldList.split(", ");
  for (String field : fields) {
    String name = field.split(" ")[1];
    writer.println("      this." + name + " = " + name + ";");
  }

  writer.println("    }");

  // Fields.
  writer.println();
  for (String field : fields) {
    writer.println("    final " + field + ";");
  }

  writer.println("  }");
}
These string builders are hooked up to a separate entrypoint (made for the
tool Java package) and are compiled separately. The result spits out a bunch
of .java files into the com.craftinginterpreters.lox package, after which the
programmer checks them into the project.
It's not a bad solution by any means, but requiring a separate build step and
metaprogramming by concatenating strings is a little rough. The Ruby solution is
totally different thanks to a bunch of built-in metaprogramming utilities (and
the fact that Ruby is an interpreted language).
Here's how I wired up the expression generation:
module Rlox
  module Expr
    EXPRESSIONS = [
      ["Binary",   [:left, :operator, :right]],
      ["Grouping", [:expression]],
      ["Literal",  [:value]],
      ["Unary",    [:operator, :right]]
    ]

    EXPRESSIONS.each do |expression|
      classname, names = expression

      klass = Rlox::Expr.const_set(classname, Class.new)
      klass.class_eval do
        attr_accessor(*names)

        define_method(:initialize) do |*values|
          names.each_with_index do |name, i|
            instance_variable_set(:"@#{name}", values[i])
          end
        end

        define_method(:accept) do |visitor|
          visitor.public_send(:"visit_#{classname.downcase}_expr", self)
        end
      end
    end
  end
end
When this file is included into rlox.rb (the main entrypoint to the
interpreter), Ruby goes ahead and builds all of the expression classes
dynamically. No build step needed, just good ol' Ruby metaprogramming.
Rlox::Expr.const_set adds the class to the scope of the Rlox::Expr module,
re-opening it on the next line via class_eval to add in the
automatically-generated methods.
To close the loop, here's what one of the generated classes looks like if it
were to be written out by hand (while also avoiding the dynamic instance
variable setter):
module Rlox
  module Expr
    class Binary
      attr_accessor :left, :operator, :right

      def initialize(left, operator, right)
        @left = left
        @operator = operator
        @right = right
      end

      def accept(visitor)
        visitor.visit_binary_expr(self)
      end
    end
  end
end
Comparing the Ruby and Java implementation is interesting because it highlights
some higher-level advantages and disadvantages between the two languages. With
the Ruby version, adding new types is trivial and does not require an additional
compile + check-in step. Just add a name-argument pair to the EXPRESSIONS
constant and you're done!
The flip side of this is that the class is not easily inspectable. Although I wrote
Rlox::Expr::Binary above this paragraph as regular Ruby code, that code
doesn't exist anywhere in the application where a programmer's eyes can read it.
Instead, developers have to read the metaprogramming code in expr.rb to
understand how the classes work.
I think this implementation leans idiomatic Ruby: metaprogramming is part of the
toolkit so it's expected for developers to learn how to deal with it. If you're
interested in learning how the class works and can't understand the
metaprogramming code, you can always boot up the console and poke around with an
instance of the class. It kind of coincides with the Ruby ethos that a REPL
should be close at hand so you can explore code concepts that you might
otherwise misunderstand by reading the code.
That said, I still have respect for the Java implementation because Ruby
metaprogramming can really end up biting you in the ass.
TDD (well, not really)
I'm sure Nystrom omitted tests from the book because it would add a ton of
implementation noise to the project, and not in a way that benefited the
explanation. For my purposes, I wanted to add tests with each chapter to make
sure my implementation wasn't drifting from the expectation.
It's not perfect by any means, but it definitely gives me a ton of confidence
that I'm following along with the material and exercising some of the trickier
edge cases. I was also impressed by how easy Nystrom's implementation is to
test. My parser tests rely on a small parse helper defined within the test
file, and astute readers might recognize that this helper is also calling into
the Rlox::Scanner class. That's one item that
I've taken the quick and easy approach towards: rather than ensure test
isolation by writing out the AST with the Rlox::Expr/Rlox::Statement classes
(which are incredibly verbose), I use Rlox::Scanner so I can write my tests as
string expressions that read like the code I'm testing. Unfortunately, that does
mean that if I write a bug into the Rlox::Scanner class, that bug is
propagated into the Rlox::Parser tests, but in my head it's better than the
alternative of tripling the lines of code for my test files. What can you do?
Next steps
There might be a part two for this post as I work my way further through the
first Lox interpreter. If you're interested in following along with the code,
check it out on GitHub.
Emacs 30.1 is on the horizon, with the most recent pretest (30.0.93) made
available in late December.
This post highlights some new features in the upcoming release that I find
especially compelling.
Native compilation
was introduced in Emacs 28 behind a configuration flag, so even though it's been
around for a little while, you probably aren't using it unless you compile your
Emacs from source (or use a port that explicitly enabled it). Enabling it by
default brings it to more users.
This feature compiles Emacs Lisp functions to native code, offering 2-5x faster
performance over the byte-compiled counterpart. The downside is an additional
dependency (libgccjit) and a little extra compilation overhead when installing
a package for the first time. The downsides are so minor that enabling it by
default is a no-brainer.
Native (and faster) JSON support
You no longer need an external library (libjansson) to work with JSON in
Emacs. On top of that, JSON parsing performance in Emacs is significantly
improved (the author reports that parsing is up to 8x faster). This is all
thanks to Géza Herman's contribution:
I created a faster JSON parser.
He summarizes his changes later in that thread:
My parser creates Lisp objects during parsing, there is no intermediate step
as Emacs has with jansson. With jansson, there are a lot of allocations, which
my parser doesn't have (my parser has only two buffers, which exponentially
grow. There are no other allocations). But even ignoring performance loss
because of mallocs (on my dataset, 40% of CPU time goes into malloc/free), I
think parsing should be faster, so maybe jansson is not a fast parser in the
first place.
Great stuff.
use-package version control support
You can now install packages directly from version-controlled repositories (for
those packages that aren't yet in GNU ELPA, NonGNU ELPA, or MELPA).
This also means that you can opt into package updates based on commit instead of
latest release (e.g. :rev :newest). I think this is actually a sleeper feature
of :vc, since the default Emacs package release/update cycle can be a little
wonky at times.
If you want all of your :vc packages to prefer the latest commit (instead of
the latest release), you can set use-package-vc-prefer-newest to t.
Tree-sitter modes are declared as submodes
I had to read this change a few times before I grokked what it was saying.
Tree-sitter modes, e.g. js-ts-mode, are now submodes of their non-tree-sitter
counterpart, e.g. js-mode. That means any configuration applied to the
non-tree-sitter mode also applies to the tree-sitter mode.
In other words, my .dir-locals.el settings for js-mode simply apply to
js-ts-mode as well, without needing to write it explicitly. A nice quality of
life change to help pare down Emacs configurations that rely on both modes
(which is more common than you might think, given that non-tree-sitter modes are
typically more featureful).
Minibuffer QOL improvements
Some nice quality-of-life improvements for the default Emacs completions:
You can now use the arrow keys to navigate the completion buffer vertically
(in addition to the M-<up|down> keybindings).
Previous minibuffer completion selections are deselected when you begin typing
again (to avoid accidentally hitting a previous selection).
completions-sort has a new value: historical. Completion candidates will
be sorted by their order in minibuffer history so that recent candidates
appear first.
I always find myself forgetting the .dir-locals.el syntax (even though they're
just lists!) so this is a surprisingly handy feature for me.
New mode: visual-wrap-prefix-mode
Now this one is cool. I'm the kind of guy who uses auto-fill-mode for everything
because I haven't bothered to figure out how Emacs line wrapping works.
Everything I write hard breaks into newlines after 80 characters.
The new mode visual-wrap-prefix-mode is like auto-fill-mode, except that the
breaks are for display purposes only. I think this is incredibly useful when
editing text that might be reviewed using a diffing tool, since long lines tend
to display more useful diffs than a paragraph broken up with hard breaks. I'm
actually pretty excited about this change; maybe it will get me to stop using
(markdown-mode . ((mode . auto-fill))) everywhere.
New command: replace-regexp-as-diff
You can now visualize regular expression replacements as diffs before they're
accepted. This is actually incredible.
New package: which-key
Previously a package in GNU ELPA, which-key-mode is now built-in. With
which-key-mode enabled, after you begin a new command (e.g. C-x) and wait a
few seconds, a minibuffer will pop up with a list of possible keybinding
completions. It's a super handy tool for remembering some of the more esoteric
modes.
New customizations
Show the current project (via project.el) in your modeline with
project-mode-line.
Add right-aligned modeline elements via mode-line-format-right-align.
You can now customize the venerable yes-or-no-p function with
yes-or-no-prompt.
A few Emacs Lisp changes
There are a few small, yet impactful changes around help buffers and Emacs Lisp
types that I think are worth noting.
describe-function shows the function's inferred type when available:
C-h f concat RET
(concat &rest SEQUENCES)
Type: (function (&rest sequence) string)
Built-in types show their related classes:
C-h o integer RET
integer is a type (of kind ‘built-in-class’).
Inherits from ‘number’, ‘integer-or-marker’.
Children ‘fixnum’, ‘bignum’.
The byte compiler warns if a file is missing the lexical binding directive.
Lexical bindings have been included in ELisp for a while now, so it's nice to
see more effort being made towards making it the default.
;;; Foo mode -*- lexical-binding: t -*-
Read the full details
That wraps up my highlights. There's a ton more stuff included in Emacs 30.1 so
I encourage you to
check out the NEWS yourself.
PS. Interested in trying out Emacs but don't know where to start? Check out my
MIT-licensed configuration guide:
Start Emacs.
Recently I've been deep down a crossword puzzle rabbit hole. I started a new
side project that has taken most of my writing energy:
them's crossing words, a blog where I post daily
crossword puzzle reviews and articles about the craft of puzzle construction.
Thus far there are about fifty crossword puzzles featured and discussed: a
sizable number of grids, with over 10k words dedicated to crossing them.
When I started the project I thought I might burn out quickly on the idea.
Writing a daily review was actually far from my original intent. The thing is,
there's just so much to talk about when it comes to the art of crossword
construction (and puzzles in general, by extension). Every crossword is nuanced
and interesting, built by constructors that bring their own voice into the grid
with interesting clues and clever themes.
The idea of a puzzle blog has been bouncing around in my head for a long time
now, and my motivation to start one was largely influenced by the release of
Braid, Anniversary Edition.
When I was a kid playing through Braid on the Xbox Live Arcade, I didn't
actually care that much for puzzle games. They were too slow and plodding for my
high school brain.
Since then I've come to really appreciate the genre, with games like The Witness
and Outer Wilds completely blowing my mind as to what's possible in the medium
of video games. When Braid re-released this year with loads of developer
commentary, I was in.
Now that I'm playing through it as an adult I have a newfound appreciation for
its narrative and design. There are loads of spots where the narrative of the
tale is paralleled by the mechanics of the gameplay and the design of the
puzzles, a genius combination of factors seen in few games. What really drew me
in, however, was the developer commentary, discussing the minutiae of game,
sound, narrative, and art design behind every level and artistic motif.
The body of commentary in Braid, Anniversary Edition is staggering. The amount
of thought that bleeds into every ounce of that game is an incredible
achievement showing just how artistic the medium of video games can be. It
inspired me to start writing about puzzles because I think they're more than
just a method of wasting time. They're little worlds of simulation where ideas
mesh with action, a creative landscape of human ingenuity clashing with
constraints.
Needless to say I've been noodling on puzzles for the last few months, looking
back on some of my old Puzzlescript prototypes and
some of the things I learned about game design when I hacked them together.
Building a level for a puzzle game is kind of like mentoring new engineers. You
never really know how well you understand an idea until you need to teach it to
someone. Same goes for a mechanic in a puzzle game: what is the truth that
you're trying to expose to the person playing your game? Why is it interesting?
Anyway, this is less of a Recently and more of a ramble.
For this blog, I've also been interested in exploring systems languages now that
I have a year or so of Rust experience under my belt. I'm curious about the idea
of manual memory management and how it is handled by various different
languages, especially with the recent uptick in C-alternatives, like Zig, Odin,
Jai, among others. I've also been wanting to do a post on "learning systems
languages as a webdev" for awhile now, exploring the Rust ecosystem as a Ruby on
Rails/JavaScript developer. I think I need a stronger baseline in systems
performance before I try to tackle that subject.
Right now I'm reading through A Tour of C++ and Understanding Software
Dynamics, testing my knowledge of performance programming and how computers
actually work under the hood. It's been a humbling experience working in manual
memory scenarios after so many years of garbage-collected languages. It's a
different landscape when you have to deal with certain hot-paths in a game loop,
for example, where allocations can lead to undesirable performance
characteristics.
I can't believe I'm writing
another post about Awk but I'm just having too
much fun throwing together tiny Awk scripts. This time around that tiny Awk
script is a markdown renderer, converting markdown to good ol' HTML.
Markdown is interesting because 80% of the language is rather easy to implement
by walking through a file and applying a regular expression to each line. Most
implementations seem to start this way. However, once that final 20% is hit, some
aspects of the language start to show their warts, exposing the not-so-happy
path that eventually leads to lexical analysis.
To list a few examples,
there are many elements that can span more than one line, like paragraph
emphasis, links, or bolded text
elements can encompass sub-elements like a Russian doll, e.g. headers that
include emphasized text that itself is bolded
elements can defy existing behavior, like code blocks that can themselves
contain unrendered markdown
Each of these conditions complicates the simple line-based approach.
The renderer that I'm building doesn't aim to be comprehensive, so most of these
edge cases are not handled. For my toy renderer, I'm assuming that the markdown
is written in accordance with a general style guide, with friendly line breaks
between paragraphs.
I am also being careful to call this project a markdown "renderer" and not a
"parser" because it's not really parsing the markdown file. Instead, we're
immediately replacing markdown with HTML. The difference may seem nitpicky but
implies that there's no intermediate format between the markdown and the HTML
output, a nuance that makes this implementation less powerful but also much
simpler.
Let's get cracking.
Initial approach
Headers are a natural first step. The solution emphasizes Awk's strengths when
it comes to handling line-based input:
/^# / { print "<h1>" substr($0, 3) "</h1>" }
This reads, "On every line, match # followed by a space. Replace that line
with header tags and the text of that line beginning at index 3 (one-based
indexing)." Since we're piping awk into another file, print statements are
effectively writing our rendered HTML.
Line replacements are the name of the game in Awk, where the simplicity of the
syntax really shines. The call to substr is less elegant than the usual
space-delimited Awk fields ($1, $2, etc.), but it's necessary since we want
to preserve the entire line sans the first two characters (the leading header
hashtag).
For something a little trickier, let's move on to block quotes. Block quotes in
Markdown look like the following, leading each line with a greater-than sign:
> Deep in the human unconscious is a pervasive need for a logical universe that
> makes sense. But the real universe is always one step beyond logic. - Frank
> Herbert
Finding block quote lines is easy; to start, we can use the same approach as our headers.
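A naive rule along those lines (wrapping each matching line individually) would be:

/^> / { print "<blockquote>" substr($0, 3) "</blockquote>" }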
But as you have probably guessed, this simplification isn't quite what we want.
Instead of wrapping each line with a block quote tag, we want to wrap the entire
block (three lines in this case) with one set of tags. This will require us to
keep track of some intermediate state between line-reads.
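Here's a sketch of that rule, using an inquote flag that persists across line-reads:

/^> / {
  if (!inquote) {
    print "<blockquote>"
    inquote = 1
  }

  print substr($0, 3)
}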
If we match a blockquote character and we're not yet inquote, we write the
opening tag and set inquote. Otherwise, we simply write the content of the
line. We need an extra rule to write the closing tag.
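A rule like this handles it:

!/^> / && inquote {
  print "</blockquote>"
  inquote = 0
}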
If our program state says we're in a quote but we reach a line that doesn't lead
with a block quote character, it's time to close the block. This matches against
paragraph breaks which are normally used to separate paragraphs in Markdown
documents.
This same strategy can be applied to the other block-style markdown elements:
code blocks and lists. Each requires its own variable to keep track of the block
content, but the approach is the same.
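For example, fenced code blocks can toggle an incode flag of their own. A
simplified sketch (note that these lines would still fall through to the other
rules, a wrinkle we'll come back to shortly):

/^```/ {
  if (incode) {
    print "</code></pre>"
    incode = 0
  } else {
    print "<pre><code>"
    incode = 1
  }
}

# Lines inside the fence are printed untouched, so any markdown stays unrendered
incode && !/^```/ { print }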
Paragraphs are tricky
It is very tempting to implement inline paragraph elements in the same way as
we've handled other, single-line markdown syntax. However, paragraphs are
special in that they often span more than a single line, especially so if you
use hard-wrapping in your text editor at some column. For example, it's very
common for links to span multiple lines:
A link that spans [multiple lines](https://definitely-a-valid-link-here.com)
This breaks the nice, single-line worldview that we've been operating under,
requiring some special handling that will end up leaking into other aspects of
our rendering engine.
My approach is to collect multiple paragraph lines into a single string,
rendering it altogether on paragraph breaks. This allows me to search the entire
string for inline elements (links, bold, italics), effectively matching against
multiple lines of input.
/./  { for (i = 1; i <= NF; i++) collect($i) }
/^$/ { flushp() }

# Concatenate our multi-line string
function collect(v) {
  line = line sep v
  sep = " "
}

# Flush the string, rendering any inline HTML elements
function flushp() {
  if (line) {
    print "<p>" render(line) "</p>"
    line = sep = ""
  }
}
Each line of text is collected into a variable, line, that is persisted
between line-reads. When a paragraph break is hit (a line that contains no text,
/^$/) we render that line, wrapping it in paragraph tags and replacing any
inline elements with their respective HTML tags.
I'll point out that the technique of collecting fields into a string or array is
a very common pattern in Awk, hence the utility variable NF for "number of
fields". The Awk book uses this pattern in quite a few
places.
For completeness, let's look at the render function itself.
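I'll sketch one way to write it, using match, RSTART, and RLENGTH to splice HTML
tags over links, bold, and italics (simplified, and certainly not the only way
to do it):

# Replace inline markdown elements in a collected paragraph with HTML tags
function render(s,    text, href) {
  # [text](url) -> <a href="url">text</a>
  while (match(s, /\[[^]]+\]\([^)]+\)/)) {
    text = substr(s, RSTART + 1, index(s, "](") - RSTART - 1)
    href = substr(s, index(s, "](") + 2, RSTART + RLENGTH - index(s, "](") - 3)
    s = substr(s, 1, RSTART - 1) "<a href=\"" href "\">" text "</a>" substr(s, RSTART + RLENGTH)
  }

  # **text** -> <strong>text</strong>
  while (match(s, /\*\*[^*]+\*\*/)) {
    s = substr(s, 1, RSTART - 1) "<strong>" substr(s, RSTART + 2, RLENGTH - 4) "</strong>" substr(s, RSTART + RLENGTH)
  }

  # *text* -> <em>text</em>
  while (match(s, /\*[^*]+\*/)) {
    s = substr(s, 1, RSTART - 1) "<em>" substr(s, RSTART + 1, RLENGTH - 2) "</em>" substr(s, RSTART + RLENGTH)
  }

  return s
}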
This code is noticeably less clean than our earlier HTML rendering, an
unfortunate consequence of handling multi-line paragraphs. I won't go into too
much detail here since there's a lot of Awk-specific regular expression matching
stuff going on, but the gist is a standard regexp-replace of the paragraph text
with HTML tags for matching elements.
Another problem that we run into when collecting multiple lines into the line
variable is accidentally collecting text from previous match rules. Awk's
expression syntax is like a switch statement that lacks a break: a line will
match as many expressions as it can before moving onto the next. That means that
all of our previous rules for headers, blockquotes, and so on are now also
included in our paragraph text. That's no good!
# I match a header here:
/^# / { print "<h1>" substr($0, 3) "</h1>" }

# But I also match "any text" here, so I'm collected:
/./ { for (i = 1; i <= NF; i++) collect($i) }
Each of our previous matchers now has to include a call to next to immediately
stop processing and move on to the next line. This prevents them from being
included in paragraph collection.
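With that change in place, the header rule (for example) becomes:

/^# / { print "<h1>" substr($0, 3) "</h1>"; next }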
The last piece of this Markdown renderer is adding the boilerplate HTML that
wraps our document:
BEGIN{print"<!doctype html><html>"print"<head>"print" <meta charset=\"utf-8\">"print" <meta name=\"viewport\" content=\"width=device-width, initial-scale=1\">"if(head)print head
print"</head>"print"<body>"}# ... all of our rules go hereEND{print"</body>"print"</html>"}
Unlike other Awk matchers, the special BEGIN and END keywords are only
executed once.
As a nice bonus, we can add an optional head variable to inject a stylesheet
into our rendered markdown, which can be added via the Awk CLI. The following
adds the Simple CSS stylesheet to our rendered output.
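Assuming the script is saved as markdown.awk (the file names here are
placeholders of my choosing):

awk -v head='<link rel="stylesheet" href="https://cdn.simplecss.org/simple.min.css">' -f markdown.awk post.md > post.html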
I picked up Awk on a whim and am blown away by how generally useful it is. What
I thought was a quick and dirty tool for parsing tabulated files turns out to be
a fully-featured scripting language.
Before I started reading the second edition of
The Awk Programming Language, my only exposure to Awk was
from better-minded folk on Stack Overflow. After copy-pasting a short script
here or there, I was befuddled by the need for explicit BEGIN and END
statements in Awk one-liners. Shouldn't a program know when it begins and ends?
Why the redundancy?
Oh how wrong I was. Once you understand how Awk works, the syntax of BEGIN and
END makes a ton of sense; it's actually a consequence of Awk's coolest
feature. BEGIN and END are necessary because the default mode of an Awk
script isn't top-to-bottom execution, like other scripting languages. Instead,
Awk programs are executed repeatedly by default, either on the lines of a file
or an input stream.
To demonstrate, say I have a file where each line contains a location:
Forest
Hills
Desert
...
I can use Awk to turn that list of locations into one that is numbered with a
single statement, no loops required.
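That statement looks something like this (the numbered format is my own choice):

$ awk '{ print NR ". " $0 }' locations.txt
1. Forest
2. Hills
3. Desert
...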
Without the BEGIN or END markers (which denote "run this before" and "run
this after"), Awk runs statements on every line of its input. In this case, that
means re-printing each location in the file locations.txt with some minor
modifications.
Awk provides a bunch of built-ins that make it easy to work within this
execution model. NR refers to the "number of records", keeping track of the current line of
input that is being processed. This generates our numbered list.
The dollar-sign variables refer to fields on an individual line. $0 is the
entire line, unmodified. $1, $2, and so on refer to subsets of the line,
broken up by a delimiter (e.g. space, tab, or comma) and read from left to
right.
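A quick demonstration:

$ echo "Forest Hills Gardens" | awk '{ print $2 }'
Hills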
And statements are just the tip of the Awk iceberg! You can assign each
statement a "matcher" that only runs the expression on lines that are truthy.
Here are a few examples:
# Print every row but the first
NR != 1 { print $0 }

# Only print a row if the first field matches "cat"
$1 ~ /cat/ { print "not a dog" }

# Maybe your second field is a number?
$2 >= 12 && $2 < 18 { print "teenager" }
Now the BEGIN and END statements are starting to make more sense.
DMing with Awk
Now for something a little more complicated. As I mentioned before, Awk is a
fully-featured scripting language. You can write functions, generate random
numbers, build arrays, and do everything that you'd expect a normal language to
do (mostly, anyway). I ran across an example in the Awk book that demonstrates
the use of rand() via dice rolling and it sparked an idea: how useful can a
tool like Awk be for a DM running a Dungeons and Dragons game?
Since Awk is great at reading files, I figured it would also be great for
dealing with random tables. Given the locations file that appears earlier in
this post, here's how you can select a single location at random:
awk '{ data[NR] = $0 } END { srand(); print data[int(rand() * length(data)) + 1] }' locations.txt
It's easier to read with some annotations:
# Add every line in the file to an array, indexed by the line number
{ data[NR] = $0 }

# After reading the file,
END {
  # Seed randomness
  srand()

  # Pick a random index from the data array and print its respective value
  print data[int(rand() * length(data)) + 1]
}
I really like how { data[NR] = $0 } is all that Awk needs to build an array
with the contents of a file. It comes in handy in cases like this where we need
the file contents in memory before we can do something useful.
Now, you might be thinking that this isn't that cool because sort can already
do it better. And you'd be right!
$ cat locations.txt | sort -R | head -1
Plains
So how about moving on to the next step instead: character generation. The next
script implements the character creation rules from
Knave, a game based on old-school
Dungeons and Dragons.
The first thing we need to do is generate some attribute scores. Each score can
be simulated by rolling three 6-sided dice (d6) and taking the lowest result.
BEGIN {
  srand()

  map[1] = "str"
  map[2] = "dex"
  map[3] = "con"
  map[4] = "int"
  map[5] = "wis"
  map[6] = "cha"

  print "hp " roll(8)

  for (i = 1; i <= 6; i++) {
    print map[i] " " lowest_3d6()
  }
}

function roll(n) {
  return int(rand() * n) + 1
}

function lowest_3d6(_i, _tmp) {
  min = roll(6)

  for (_i = 1; _i <= 2; _i++) {
    _tmp = roll(6)

    if (_tmp < min) {
      min = _tmp
    }
  }

  return min
}
The output looks like:
$ awk -f knave.awk
hp 6
str 1
dex 2
con 2
int 1
wis 1
cha 4
Since this Awk program is not reading from a file (yet), everything is run in a
BEGIN block. This allows us to execute Awk without passing in a file or input
stream. Within that BEGIN block we build a map of integers to attribute names,
making it easy to loop over them to roll for scores. Arrays in Awk are
associative, so they work well for this use case.
The strange thing about this code is the use of parameters as local variables in
the function lowest_3d6. The only way in Awk to make a variable local is to
provide it to the parameter list when declaring a function, as all other
variables are global. Idiomatic Awk attempts to reveal this strangeness by
adding an underscore to the parameter names, as I have done, or by inserting a
bunch of spaces before their place in the function definition.
Next up is to make these characters more interesting by assigning them careers
and starting items. A career describes the character's origin, explaining their
initial loot as fitting to their backstory. These careers are taken from Knave
second edition.
Now that our Awk program is reading lines from a file, we can add a new block
that stores careers into an array so we can make a random selection for the
player.
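Here's a sketch of that block, assuming careers.txt holds one career per line
(the name followed by its starting items):

# Store each career line from the file passed on the command line
{ careers[NR] = $0 }

END {
  print "Career & items:"
  print careers[int(rand() * NR) + 1]
}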
When the program is executed with the list of careers, the output looks like
this:
$ awk -f knave.awk careers.txt
hp 3
str 1
dex 3
con 3
int 2
wis 3
cha 4
Career & items:
falconer: bird cage, gloves, whistle
Not bad!
I doubt these tools will come in handy for your next DnD campaign, but I hope
that this post has inspired you to pick up Awk and give it a go on some
unconventional problems.