Stream of Thoughts

Momentary fascinations & inklings of ideas

Hello!
This is my place for messy drafts and other early-stage writing. Feel free to poke around, or go to the home page.

Recent Notes

TIL: Hyparquet

I came across a nice dependency-free library today, hyparquet, for parsing Parquet files in JavaScript.

It supports many compression formats including Snappy and Zstd, and reads Parquet files in chunks using range requests rather than loading the entire file. You can read just the metadata, or particular rows and columns.

It’s a good companion to Jeff Heer’s flechette, a lightweight library for reading and writing Arrow files. (API reference)

See also: fzstd, a small library for decompressing zstd files.
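
Here's a rough sketch of what reading a remote file looks like. This is from memory of the README, so the export names, option shapes, URL, and column names below are assumptions; check the hyparquet docs before copying.

// Sketch only: export names and option shapes are from memory of the
// hyparquet README and may not match the current API exactly.
import { asyncBufferFromUrl, parquetMetadataAsync, parquetRead } from 'hyparquet'

// An AsyncBuffer fetches byte ranges on demand instead of downloading the whole file.
const file = await asyncBufferFromUrl({ url: 'https://example.com/data.parquet' })

// Read just the footer metadata (schema, row counts, row group info)...
const metadata = await parquetMetadataAsync(file)
console.log(metadata)

// ...or read a slice of particular columns.
await parquetRead({
  file,
  columns: ['name', 'score'], // hypothetical column names
  rowStart: 0,
  rowEnd: 100,
  onComplete: rows => console.log(rows),
})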


Advanced Essay

In 1998, just after computers began to compete at the highest levels of chess, Garry Kasparov introduced a new variant of the game called Advanced Chess. The idea was that instead of a human player competing against a computer, the human and the machine would play together on the same side.

Thinking about this made me wonder about other forms of human–AI collaboration. In particular, I became curious about how the existence of AI models on the reader’s side might fundamentally change the essay form.

First, what is an essay?

Whenever you start thinking about how to cleave apart knowledge into its constituent components and dynamically reassemble them, there’s a risk of blurring the lines between the various forms of media entirely. So, for the purposes of this discussion, I’m going to say that an essay consists of a sequence of words and other elements intentionally arranged by its author in a particular order.1 This includes computational essays, in which the essay itself is formed from a computational medium in which text and images can exist alongside dynamic illustrations, simulations, and other forms of interactive media.2

Constraints on essays

Traditionally, essays have been written for humans to read, which imposes a constraint on the essay form (and on other art forms): human attention is finite and valuable. Every second, you are gifted with exactly one second of the reader’s attention, and you had better make good use of it.3

I’ve sometimes wished for an “extended” version of an essay, or for a briefer one. But today, essays are typically provided in a one-size-fits-all fashion that does not adapt to the reader, and that the reader cannot easily adapt for themselves.

How does this change in the presence of reader-side AI models?

Compared to human attention, AI attention is potentially much, much cheaper. This lowers the cost of adding extra information to an essay so long as it does not interfere with the “main track” designed for people to read.

Imagine if the essay came bundled with a whole bunch of extra material:

Curating a context for a particular piece of writing seems like a very powerful idea.

So, what can the AI do with all of this extra information?

It can take advantage of not only the material provided by the author, but also the context it understands about the reader. The writer knows things the reader doesn’t, and the reader knows things the writer cannot.4

For example, the AI can…

The User Interface

How would this look from the reader’s perspective? A few quick points here. I’d love to experiment with this, but I don’t currently have the time!

Existing examples

There’s a lot of prior art, though of course none of it was designed with LLM use in mind, since LLMs only appeared on the scene after these works were published.

Foot-sidenotes

So, that’s it for now. If I had had more time, I would have written a shorter essay. But in fact, there may be an advantage to this style of writing, and a future AI can condense the ideas context-specifically for individual readers based on their interests. 🙃

In service of that future, here are some extended side-footnotes (or foot-sidenotes) that I (or my AI) might return to if I decide to tinker with this concept later.


1

This is probably only roughly right, but the key idea I’m thinking about here is how we can augment a traditional essay, given the new possibilities and constraints of AI assistance on the reader side.

2

To date, the best instantiations of the computational notebook idea are found in Mathematica, which introduced the format, and Observable notebooks, which are an innovative browser-based take on some of the same ideas.

3

I spend a lot of my time thinking about data visualizations, which are not entirely unlike the written word. In that context, one consequence of the finiteness of human attention is that all of the visual elements on the screen are competing for the finite, precious resource of the reader’s attention, which makes it necessary to be very intentional about choosing what to show, emphasize, or hide. But there is no best visualization – it’s a function of not only the data, but also the audience and their interests and goals. Presenting an expert with a visualization designed for a beginner will often leave them dissatisfied, since the simplifications that were necessary for basic comprehension preclude some of the more advanced insights, or omit some important controls. Fortunately, there’s more of a practice of providing alternate views of the same data so that you can meet the reader (or user) where they are.

4

This feels a bit like late binding in dynamic programming languages, and like Julia’s just-in-time-ahead-of-time compilation strategy. The idea is that the author of the library has written the logic, but does not know what concrete types their function will be called with. The user of the library has the values in hand at the time they go to call the function, so the compiler specializes the code just in time, compiling an optimized version of the function for the exact types of the values being passed. This is analogous to a writer who knows some things but lacks the full context of the reader’s experience level and interests, and an AI model that can dynamically adapt the work to the interests and capabilities of the reader.
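
As a loose sketch of that division of knowledge (written in TypeScript rather than Julia, and purely illustrative: TypeScript erases its generics rather than specializing on them, so this only captures who knows what, not the compilation trick):

// The "library author" writes the logic without knowing the concrete type T.
function summarize<T>(items: T[], describe: (item: T) => string): string {
  return items.map(describe).join(', ')
}

// The "reader side" only supplies the concrete values and types at the call site.
const readout = summarize([1, 2, 3], n => `#${n}`)
console.log(readout) // "#1, #2, #3"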


TIL: Rust-analyzer can expand macros

If you use an editor with Rust’s LSP integration, you can put your cursor on a particular derive macro, such as Clone in #[derive(Clone)], or on the name of a macro invocation, such as matches! in matches!(foo, bar), and select “Expand macro recursively”, which will open a side buffer showing the expanded code from that macro.

This is extraordinarily useful when trying to understand what code is generated from a derive, or when debugging an issue with your own macros.

I discovered this through a comment made by David Barsky in the Rust Zulip.


Formatting Code with a Git Hook

Here’s a git pre-commit hook to auto-format code on commit. It’s useful if you’re working with code whose formatting guidelines differ from the ones you’ve configured in your code editor.


#!/bin/sh
# .git/hooks/pre-commit
# (Remember to make the hook executable: chmod +x .git/hooks/pre-commit)

# Store list of staged files that match your target pattern.
# The `git diff` command returns a list of staged files.
files=$(git diff --cached --name-only | grep -E '\.(js|ts|svelte)$')

if [ -n "$files" ]; then
    pnpm format
    
    # Add all of the previously staged files back to the staging area
    # to check in the formatted files.
    git add $files
fi

# Exit with an error when there are no longer any staged changes,
# since otherwise git will create an empty commit.
if git diff --cached --quiet; then
    echo "Error: After formatting, there are no longer any files are staged for commit" >&2
    exit 1
fi

Quicker Netlify Deploys

I use Netlify to host small static sites like this one. It does its job well and is very convenient to use.

But I noticed that since the standard way of setting things up is to link Netlify directly with your Git repo, deploys could take a while. When you push your code, Netlify builds the site on their own infrastructure before deploying it.

I recently came across a very neat alternative where you can build your site locally and deploy it to Netlify directly with a single HTTP request. This is really convenient when you don’t need the overhead that comes with more careful deploy management.

The link above does a good job of describing the process, but the upshot is that the following command is all you need to have a deploy running live just a few seconds later.

curl -H "Content-Type: application/zip" \
     -H "Authorization: Bearer PERSONAL_ACCESS_TOKEN_FROM_STEP_1" \
     --data-binary "@FILE_NAME.zip" \
     https://api.netlify.com/api/v1/sites/SITE_NAME.netlify.com/deploys

Right now I use it like this (these build commands go in a justfile) to ensure that there’s a corresponding commit for every deployed version:

pub: no-uncommitted-changes build
  @rm -f dist.zip
  @zip -q -r dist.zip dist;
  @curl -H "Content-Type: application/zip" \
        -H "Authorization: Bearer $(cat .netlify-token)" \
        --data-binary "@dist.zip" \
        https://api.netlify.com/api/v1/sites/what.yuri.is/deploys | jq
  @rm -f dist.zip
  @echo ""
  @echo "Deployed!"

# Halts with an error if the repository contains uncommitted changes
no-uncommitted-changes:
  @git diff --exit-code > /dev/null || (echo "Please commit changes to the following files before proceeding:" && git status --short && exit 1)

build:
  zola build