Zod refinements are complicated

26 Feb, 2025 by Graham Marlow in til

Today I found myself at the bottom of a rabbit hole, exploring how Zod's refine method interacts with form validations. As with most things in programming, reality is never as clear-cut as the types make it out to be.

Today's issue concerns zod/issues/479, where refine validations aren't executed until all fields in the associated object are present. Here's a reframing of the problem:

The setup:

  • I have a form with fields A and B. Both are required fields, say required_a and required_b.
  • I have a validation that depends on the values of both A and B, say complex_a_b.

The problem:

If one of A or B is not filled out, the form parses with errors: [required_a], not [required_a, complex_a_b]. In other words, complex_a_b only pops up as an error when both A and B are filled out.

Here's an example schema that demonstrates the problem:

const schema = z
  .object({
    a: z.string(),
    b: z.string(),
  })
  .refine((values) => !complexValidation(values.a, values.b), {
    message: 'complex_a_b error',
  })

This creates an experience where a user fills in A, submits, sees a validation error pointing at B, fills in B, and sees another validation error pointing at complex_a_b. The user has to play whack-a-mole with the form inputs to make sure all of the fields pass validation.
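To make the failure mode concrete, here's a rough, library-free model of the behavior. It's a sketch under my own assumptions, not Zod's actual implementation: `FormInput`, `validateForm`, and the body of `complexValidation` (which treats equal values as invalid) are made up for illustration. The only point is that the cross-field check is gated behind the per-field checks.

```typescript
// Library-free model of the behavior; NOT Zod's implementation.
// FormInput, validateForm, and this complexValidation body are hypothetical.
type FormInput = { a?: unknown; b?: unknown };

// Hypothetical cross-field rule: returns true when the pair is INVALID.
function complexValidation(a: string, b: string): boolean {
  return a === b;
}

function validateForm(input: FormInput): string[] {
  const errors: string[] = [];

  // Phase 1: per-field checks, analogous to z.string() on each key.
  if (typeof input.a !== "string") errors.push("required_a");
  if (typeof input.b !== "string") errors.push("required_b");

  // Phase 2: the refinement only runs when phase 1 found no errors.
  if (errors.length === 0) {
    if (complexValidation(input.a as string, input.b as string)) {
      errors.push("complex_a_b");
    }
  }
  return errors;
}

// First submit: only A is filled in.
validateForm({ a: "x" }); // ["required_b"]
// Second submit: B is filled in, and only now does the cross-field error show up.
validateForm({ a: "x", b: "x" }); // ["complex_a_b"]
```

The two calls at the bottom are the whack-a-mole sequence: each submit surfaces one new error instead of all of them at once.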

As programmers, we're well-acquainted with error messages that work like this. And we hate them! Imagine a compiler that suppresses certain errors until prerequisite ones are fixed.

If you dig deep into the aforementioned issue thread, you'll come across the following solution (credit to jedwards1211):

const base = z.object({
  a: z.string(),
  b: z.string(),
})

const schema = z.preprocess((input, ctx) => {
  const parsed = base.pick({ a: true, b: true }).safeParse(input)
  if (parsed.success) {
    const { a, b } = parsed.data
    if (complexValidation(a, b)) {
      ctx.addIssue({
        code: z.ZodIssueCode.custom,
        path: ['a'],
        message: 'complex_a_b error',
      })
    }
  }
  return input
}, base)

Look at all of that extra logic! Tragic.

From a type perspective, I understand why Zod doesn't endeavor to fix this particular issue. How can we assert the types of A or B when running the complex_a_b validation, if types A or B are implicitly optional? To evaluate them optionally in complex_a_b would defeat the type, z.string(), that asserts that the field is required.

How did I fix it for my app? I didn't. I instead turned to the form library, applying my special validation via the form API instead of the Zod API. I concede defeat.
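For the curious, the general shape of a form-level check looks something like the sketch below. It assumes no particular form library, and every name in it is hypothetical, including the third field `c`, which I added only to show an unrelated field failing. The cross-field check takes partial input and narrows by hand, which is the optionality that the schema types rule out.

```typescript
// Sketch only: every name here is hypothetical, and no form library is assumed.
type Values = { a: string; b: string; c: string };

// Hypothetical rule: returns true when the pair is INVALID.
function complexValidation(a: string, b: string): boolean {
  return a === b;
}

// Stand-in for the schema pass: each field is required.
function fieldErrors(values: Partial<Values>): string[] {
  const keys: (keyof Values)[] = ["a", "b", "c"];
  return keys
    .filter((key) => typeof values[key] !== "string")
    .map((key) => `required_${key}`);
}

// The cross-field check accepts partial input and narrows by hand.
function crossFieldErrors(values: Partial<Values>): string[] {
  const { a, b } = values;
  if (typeof a === "string" && typeof b === "string" && complexValidation(a, b)) {
    return ["complex_a_b"];
  }
  return [];
}

// Both passes always run, so their errors are reported together.
function validate(values: Partial<Values>): string[] {
  return [...fieldErrors(values), ...crossFieldErrors(values)];
}

validate({ a: "x", b: "x" }); // ["required_c", "complex_a_b"]
validate({ a: "x" }); // ["required_b", "required_c"]
```

The trade-off is that the rule now lives outside the schema, so anything else that parses the same data has to remember to run it too.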

Modularizing Start Emacs

24 Feb, 2025 by Graham Marlow in emacs

Some folks don't want their entire Emacs configuration to live in a single, thousand-line file. Instead, they break their config into separate modules that each describe a small slice of functionality. Here's how you can achieve this with Start Emacs.

Step one: load your custom lisp directory

Emacs searches for Emacs Lisp code in the Emacs load path. By default, Emacs only looks in two places:

  • /path/to/emacs/<version>/lisp/, which contains the standard modules that ship with Emacs
  • ~/.emacs.d/elpa/, which contains packages installed via package-install

Neither of these places is suitable for your custom lisp code.

I prefer to have my custom lisp code live within ~/.emacs.d/, since I version control my entire Emacs configuration as a single repository. Start Emacs adds ~/.emacs.d/lisp/ to the load path with this line in init.el (the Init File):

(add-to-list 'load-path (expand-file-name "lisp" user-emacs-directory))

Where user-emacs-directory points to ~/.emacs.d/, or wherever it may live on your machine.

The rest of this guide assumes your load path includes ~/.emacs.d/lisp/, but feel free to swap out this path for your preferred location.

Step two: write your module

Next we'll create a module file that adds evil-mode with a few configurations and extensions.

Create the file evil-module.el in your ~/.emacs.d/lisp/ directory. Open it up in Emacs and use M-x auto-insert to fill in a bunch of boilerplate Emacs Lisp content. You can either quickly RET through the prompts or fill them out. Note: to end the "Keywords" prompt, use M-RET instead of RET to signal the end of the multiple selection.

Your evil-module.el file should now look something like this:

;;; evil-module.el ---      -*- lexical-binding: t; -*-

;; Copyright (C) 2025

;; Author:
;; Keywords:

;; This program is free software; you can redistribute it and/or modify
;; it under the terms of the GNU General Public License as published by
;; the Free Software Foundation, either version 3 of the License, or
;; (at your option) any later version.

;; This program is distributed in the hope that it will be useful,
;; but WITHOUT ANY WARRANTY; without even the implied warranty of
;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
;; GNU General Public License for more details.

;; You should have received a copy of the GNU General Public License
;; along with this program.  If not, see <https://www.gnu.org/licenses/>.

;;; Commentary:

;;

;;; Code:

(provide 'evil-module)
;;; evil-module.el ends here

Most of these comments aren't relevant for your custom lisp module but they're good to have in case you ever want to share your code as an Emacs Lisp package. The single line of Emacs Lisp code, (provide 'evil-module), is the most important part of the template. It denotes 'evil-module as a named feature, allowing us to import it into our Init File.

Since we're building an evil-mode module, I'll add my preferred Evil defaults to the file:

;;; Commentary:

;; Extensions for evil-mode

;;; Code:

(use-package evil
  :ensure t
  :init
  (setq evil-want-integration t)
  (setq evil-want-keybinding nil)
  :config
  (evil-mode))

(use-package evil-collection
  :ensure t
  :after evil
  :config
  (evil-collection-init))

(use-package evil-escape
  :ensure t
  :after evil
  :config
  (setq evil-escape-key-sequence "jj")
  (setq evil-escape-delay 0.2)
  ;; Prevent "jj" from escaping any mode other than insert-mode.
  (setq evil-escape-inhibit-functions
        (list (lambda () (not (evil-insert-state-p)))))
  (evil-escape-mode))

(provide 'evil-module)
;;; evil-module.el ends here

Step three: require your module

Back in our Init File, we need to tell Emacs to load our new module automatically. After the spot where we amended the Emacs load path, go ahead and require 'evil-module:

;; init.el
;; ...
(add-to-list 'load-path (expand-file-name "lisp" user-emacs-directory))

(require 'evil-module)

Restart Emacs and your module is ready to go!

Async IO in Emacs

16 Feb, 2025 by Graham Marlow in til

Stumbled on the emacs-aio library today and its introduction post. What a great exploration into how async/await works under the hood! I'm not sure I totally grok the details, but I'm excited to dive more into Emacs generators and different concurrent programming techniques.

The article brings to mind Wiegley's async library, which is probably the more canonical library for handling async in Emacs. From a brief look at the README, async looks like it actually spawns independent processes, whereas emacs-aio is really just a construct for handling non-blocking I/O more conveniently.

Karthink on reddit comments on the usability of generators in Emacs:

I've written small-medium sized packages -- 400 to 2400 lines of elisp -- that use generators and emacs-aio (async/await library built on generator.el) for their async capabilities. I've regretted it each time: generators in their current form in elisp are obfuscated, opaque and not introspectable -- you can't debug/edebug generator calls. Backtraces are impossible to read because of the continuation-passing macro code. Their memory overhead is large compared to using simple callbacks. I'm not sure about the CPU overhead.

That said, the simplicity of emacs-aio promises is very appealing:

(defun aio-promise ()
  "Create a new promise object."
  (record 'aio-promise nil ()))

(defsubst aio-promise-p (object)
  (and (eq 'aio-promise (type-of object))
       (= 3 (length object))))

(defsubst aio-result (promise)
  (aref promise 1))

Pulling Puzzles from Lichess

03 Feb, 2025 by Graham Marlow in til

Lichess is an awesome website, made even more awesome by the fact that it is free and open source. Perhaps lesser known is that the entire Lichess puzzle database is available for free download under the Creative Commons CC0 license. Every puzzle that you normally find under lichess.org/training is available for your perusal.

This is a quick guide for pulling that CSV and seeding a SQLite database so you can do something cool with it. You will need zstd.

First, wget the file from the Lichess.org open database and save it into a temporary directory. Run zstd to uncompress it into a CSV that we can read via Ruby.

wget https://database.lichess.org/lichess_db_puzzle.csv.zst -P tmp/
zstd -d tmp/lichess_db_puzzle.csv.zst

With the CSV pulled down and uncompressed, it's time to read it into the application. I'm using Ruby on Rails, so I generate a database model like so:

bin/rails g model Puzzle \
  puzzle_id:string fen:string moves:string rating:integer \
  rating_deviation:integer popularity:integer nb_plays:integer \
  themes:string game_url:string opening_tags:string

Which creates the following migration:

class CreatePuzzles < ActiveRecord::Migration
  def change
    create_table :puzzles do |t|
      t.string :puzzle_id
      t.string :fen
      t.string :moves
      t.integer :rating
      t.integer :rating_deviation
      t.integer :popularity
      t.integer :nb_plays
      t.string :themes
      t.string :game_url
      t.string :opening_tags

      t.timestamps
    end
  end
end

A separate seed script pulls items from the CSV and bulk-inserts them into SQLite. I have the following in my db/seeds.rb, with a few omitted additions that check whether or not the puzzles have already been migrated.

require "csv" # CSV is not always loaded automatically

csv_path = Rails.root.join("tmp", "lichess_db_puzzle.csv")
raise "CSV not found" unless File.exist?(csv_path)

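# Buffer rows and insert them in batches rather than one at a time.
buffer = []
buffer_size = 500
flush = ->() do
  # Skip the insert when the buffer is empty.
  Puzzle.insert_all(buffer) unless buffer.empty?
  buffer.clear
end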

CSV.foreach(csv_path, headers: true) do |row|
  buffer << {
    puzzle_id: row["PuzzleId"],
    fen: row["FEN"],
    moves: row["Moves"],
    rating: row["Rating"],
    rating_deviation: row["RatingDeviation"],
    popularity: row["Popularity"],
    nb_plays: row["NbPlays"],
    themes: row["Themes"],
    game_url: row["GameUrl"],
    opening_tags: row["OpeningTags"]
  }

  if buffer.count >= buffer_size
    flush.()
  end
end

flush.()

And with that you have the entire Lichess puzzle database available at your fingertips. The whole process takes less than a minute.

Puzzle.where("rating < 1700").count
# => 3035233

Logseq Has Perfected Note Organization

01 Feb, 2025 by Graham Marlow

A little while ago Apple Notes left me with quite the scare. I booted up the app to jot down an idea and found my entire collection of notes erased. I re-synced iCloud, nothing. Just the blank welcome screen.

Luckily my notes were still backed up to iCloud, even though they weren't displaying in the app (I checked via the web interface). After 40 minutes of debugging and toggling a series of obtuse settings, my notes were back on my phone. Yet the burn remained.

Since then I've been looking at alternatives for my long-term document/note storage. Apple Notes was never meant to be a formal archive of my written work; it just came out that way due to laziness in moving my notes somewhere permanent. I investigated the usual suspects: Notion, Obsidian, Bear, Org mode, good ol' git and markdown. Nothing stuck. Then I found Logseq and was immediately smitten.

The truth is, I don't actually use Logseq. I use Obsidian. You see, Logseq is an outliner. Every piece of text is attached to some kind of bulleted list, whether you're writing a code sample or attaching an image. Bulleted lists are great for notes, but not so great for blog posts or longform writing. I need a tool that can easily handle standard markdown for this blog, for example.

But despite not actually using Logseq, I've structured my Obsidian identically to Logseq. The Logseq method of organization is just so good. Everything boils down to three folders:

  • journal/: the place for daily notes.
  • pages/: high-level concepts that link between other pages or entries from the journal.
  • assets/: storage for images pasted from clipboard.

That's it! Just three folders, each containing a ton of flat files. All of my actual writing happens in journal pages, titled with the current day in YYYY-MM-DD format. I never need to think about file organization, nor do I struggle to find information.

Looking at a long list of YYYY-MM-DD files sounds difficult to navigate, but the key is that they're tagged with links to relevant pages (like [[disco-elysium]]) that attach the journal entry to a concept. When I want to view my notes on a concept, I navigate to the concept page (disco-elysium) and read through the linked mentions. I don't need to worry about placing a particular thought in a particular place because the link doesn't care.

I got hooked on this workflow because Logseq is incredible at linked mentions. Just take a look at this example page:

Logseq linked mentions example

All of the linked mentions (journal entries containing the tag [[disco-elysium]]) are directly embedded into the concept page. Logseq will even embed images, code samples, to-do items, you name it. It works incredibly well.

The Obsidian equivalent isn't quite as nice, but it gets the job done. Obsidian mentions are briefer, lack context, and are stripped of formatting:

Obsidian linked mentions example

The flip-side is that I don't need to write notes in an outline form and can more easily handle moving my Obsidian notes into plain markdown files for my blog.

If you're like me and you want to use Logseq-style features in Obsidian there are a few configuration settings that are worth knowing about:

  • In your Core plugins/Daily notes settings, set the New file location to journal/ and turn on "Open daily note on startup".
  • In Core plugins/Backlinks, toggle "Show backlinks at the bottom of notes".
  • In Files and links, set the "Default location for new attachments" path to assets/.

These three settings changes will get you most of the way there. That said, before messing with those settings I encourage you to give Logseq a try. It's free and open source, it's built in Clojure, and it has an excellent community forum. Although I don't use it for my longform/personal writing, I use it at work where outlining fits my workflow better.