Seasonal Spirals

Note: This was originally published as an Observable notebook.

Human behavior is seasonal. People act differently on weekdays in winter than they do on weekends in summer, and these differences often show up even when you're not expecting them.

Let's take view counts of Wikipedia pages as an example. Here's a chart of four years of daily view counts for the Wikipedia page on whiskey sours:

The big spikes happen on New Year's Day and the small wobbles are weekly cycles. There's clearly a lot of structure, but the line chart makes it hard to see what's going on.

Let's take the same data and paint a new picture designed to bring seasonality into focus:

This chart is basically what you'd get if you took a calendar, colored each day by the number of visits that day, then wrapped it around a spiral.

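How to read it: each cell is one day, colored by that day's view count. Each row is one week, running Sunday to Saturday, and each full revolution of the spiral is one year.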

The seasonal patterns are much easier to see here. There's a strong weekly pattern (people like their whiskey sours on weekends), and the buzz around New Year's is still clearly visible.

Variations on a theme

The word seasonality evokes smooth transitions, from summer to fall to winter to spring, but seasonality in real data is often richer than that. Seasonal patterns interact with each other and shift over time. And sometimes a pandemic comes along and changes everything.

Interesting decisions

A few specific design decisions make the spiral an effective way to communicate information.

Optimize for seasonality. The spiral does not try to replace the line chart; it just tries to be good at visualizing seasonal patterns. It's often better to use multiple pictures side by side, each revealing a different aspect of the same object, than it is to attempt one perfect picture.

Turn cognition into perception. The spiral is designed to make looking for seasonality an exercise in visual pattern matching. Unlike the line chart, where it was hard to see if a pattern was the same from year to year, the spiral feeds this information directly to your visual system. It's feature engineering for humans.

Piggyback on something familiar. The spiral adopts the conventions of a standard calendar layout—one week per row, Sunday to Saturday—which lets us reuse people's existing knowledge and makes the chart easier to understand and explain.

Expect yearly and weekly patterns. Yearly and weekly patterns are easy to see in the spiral because they're hardwired into the design: each week is a row, each year a circular revolution.
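As a rough sketch of that layout, here is one way a date could be mapped to a position on the spiral. This is not the code behind the original charts: the function name `spiral_cell`, the `inner_radius` and `band_width` parameters, and the choice to put Sunday on the inside of each band are all assumptions made for illustration.

```python
import math
from datetime import date, timedelta

# Average number of weeks in a Gregorian year.
WEEKS_PER_YEAR = 365.2425 / 7


def spiral_cell(day, start, inner_radius=1.0, band_width=1.0):
    """Return (angle_in_radians, radius) for the cell representing `day`.

    Hypothetical layout: each week is a radial row of seven cells,
    and roughly one year of weeks makes one full revolution.
    """
    # Python's weekday() has Monday=0, so shift to Sunday=0 ... Saturday=6.
    weekday = (day.weekday() + 1) % 7

    # Snap the start back to a Sunday so that every row is a whole week.
    first_sunday = start - timedelta(days=(start.weekday() + 1) % 7)
    week_index = (day - first_sunday).days // 7

    # All seven days of a week share one angle; a year is about one turn.
    turns = week_index / WEEKS_PER_YEAR
    angle = 2 * math.pi * turns

    # The band moves outward by one band_width per revolution, and the
    # weekday picks the cell's position across the band.
    radius = inner_radius + band_width * (turns + weekday / 7)
    return angle, radius
```

Under these assumptions, a pattern that repeats every week shows up as a ring at a fixed position across the band, and a pattern that repeats every year shows up at the same angle on every revolution.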

It's fun to look at the rare page with an offbeat pattern, one that follows neither the week nor the year, because it still looks quite pretty. Offbeat pages often have to do with celestial mechanics, such as this chart for the Wikipedia page on the full moon:

Use a spiral. The spiral has high data density, enables easy comparison across years, and looks nice, which, aside from being valuable for its own sake, encourages interaction. One alternative would be to use a long rectangular grid that unwraps the spiral, or a stack of rectangular grids, one per year. These can be reasonable choices depending on what you're trying to accomplish.

The spiral devotes more space to recent data, which makes sense when recent data is more important. In contrast, rectangles give equal weight to every day.
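To see why, assume (hypothetically) that the band has a constant width and that a week takes up the same angle on every revolution. The arc length of a week is then proportional to its radius, so weeks on the outer, more recent revolutions get wider cells. The names and default values below are illustrative only.

```python
import math

# Average number of weeks in a Gregorian year.
WEEKS_PER_YEAR = 365.2425 / 7


def week_arc_length(years_from_start, inner_radius=1.0, band_width=1.0):
    """Arc length of one week's row, `years_from_start` revolutions out.

    Assumes the band moves outward by `band_width` per revolution and
    that every week spans the same angle.
    """
    radius = inner_radius + band_width * years_from_start
    angle_per_week = 2 * math.pi / WEEKS_PER_YEAR
    return radius * angle_per_week
```

With these defaults, a week three revolutions out is four times as wide as a week at the start, whereas a rectangular grid would give both the same width.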

The spiral represents the continuous unbroken flow of time, while stacked yearly rectangles artificially divide it into yearly chunks. This would be a disadvantage for us but might be useful in other contexts. One advantage of a stack of rectangles is that they scale better to showing more years at a time.

Specialize the color scale to the data distribution. A lot of Wikipedia traffic is driven by current events, and pages often experience several days with an extraordinary number of views on top of their ordinary traffic.

To accommodate this mixed distribution in the color scale, the spiral combines a linear segment for ordinary days together with a log segment for the extraordinary ones. The linear segment goes from light green to dark blue, and the log segment from dark blue to deep red.
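A minimal sketch of such a two-segment scale might look like the following. The RGB values, the function names, and the `low`, `cutoff`, and `high` parameters are assumptions for illustration, not the values used in the actual charts.

```python
import math

# Illustrative RGB anchors; the real palette may differ.
LIGHT_GREEN = (199, 233, 192)
DARK_BLUE = (8, 48, 107)
DEEP_RED = (165, 15, 21)


def lerp_color(a, b, t):
    """Linearly interpolate between two RGB tuples, with t in [0, 1]."""
    return tuple(round(x + (y - x) * t) for x, y in zip(a, b))


def linear_log_color(views, low, cutoff, high):
    """Map a daily view count to a color.

    Ordinary days (views <= cutoff) use a linear segment from light
    green to dark blue. Extraordinary days (views > cutoff) use a log
    segment from dark blue to deep red.
    Assumes low < cutoff < high and cutoff > 0.
    """
    if views <= cutoff:
        t = (views - low) / (cutoff - low)
        return lerp_color(LIGHT_GREEN, DARK_BLUE, min(max(t, 0.0), 1.0))
    # Position on the log segment: 0 at the cutoff, 1 at `high`.
    t = math.log(views / cutoff) / math.log(high / cutoff)
    return lerp_color(DARK_BLUE, DEEP_RED, min(t, 1.0))
```

Both segments meet at dark blue, so the scale stays continuous at the cutoff. Counts above `high` are clamped to deep red.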

This approach tackles the problem that histogram equalization is usually used for: most of the color range goes to the ordinary days, which keeps the subtle seasonal shifts visible, while days with extreme values still stand out.

The caveat with this approach is that you must pick a precise cutoff between ordinary and extraordinary. Right now we use a heuristic (n times the mth percentile), which occasionally runs into trouble.
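A heuristic of that shape could be sketched as below. The post does not say what n and m are, so both are left as required parameters, and the nearest-rank percentile is just one possible definition chosen for this sketch.

```python
import math


def percentile(values, m):
    """Nearest-rank m-th percentile of `values`, for 0 < m <= 100."""
    ordered = sorted(values)
    rank = max(1, math.ceil(m * len(ordered) / 100))
    return ordered[rank - 1]


def extraordinary_cutoff(views, n, m):
    """Cutoff between ordinary and extraordinary days.

    Days with more than n times the m-th percentile of daily views
    count as extraordinary. n and m are caller-supplied.
    """
    return n * percentile(views, m)
```

For example, with nine days at 10 views and one day at 1000, the made-up settings `n=3, m=50` give a cutoff of 30, so only the 1000-view day is treated as extraordinary. It is also easy to see how this can run into trouble: a page whose traffic jumps to a new baseline would have most of its recent days classified as extraordinary.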

If you enjoyed this post, you might enjoy WikiPulse, my side project to draw spirals like this for any page on English Wikipedia.

Thanks to Andrew Lin, Zora Killpack, Yuriy Rusko, Boris Vishnevsky, Dillon Shook, and Dan Luu for reading drafts of this post. Credit for the linear-log color scale idea is due to Andrew Lin.