<?xml version="1.0" encoding="utf-8"?><feed xmlns="http://www.w3.org/2005/Atom" ><generator uri="https://jekyllrb.com/" version="3.10.0">Jekyll</generator><link href="https://miketalbot.io/feed.xml" rel="self" type="application/atom+xml" /><link href="https://miketalbot.io/" rel="alternate" type="text/html" /><updated>2026-03-06T19:15:12+00:00</updated><id>https://miketalbot.io/feed.xml</id><title type="html">Michael T. Talbot</title><subtitle>personal description</subtitle><author><name>Mike Talbot</name></author><entry><title type="html">Out of Bounds, On Purpose</title><link href="https://miketalbot.io/posts/2026-01-21-out-of-bounds-on-purpose-legendry/" rel="alternate" type="text/html" title="Out of Bounds, On Purpose" /><published>2026-01-21T00:00:00+00:00</published><updated>2026-01-21T00:00:00+00:00</updated><id>https://miketalbot.io/posts/out-of-bounds-on-purpose-legendry</id><content type="html" xml:base="https://miketalbot.io/posts/2026-01-21-out-of-bounds-on-purpose-legendry/"><![CDATA[<h2>Data visualization as an art</h2>
<p>I felt the need to include this brief preamble for one reason:<sup id="fnref:1" role="doc-noteref"><a href="#fn:1" class="footnote" rel="footnote">1</a></sup> to acknowledge that data visualization is often (but not always) more art than science. This will actually be my first in a series of posts on this subject, and we’ll cover some broader topics later.</p>

<p>A perfect and ubiquitous example of where data visualization can go wrong is with <em>color scales</em>: we can easily churn through a long list of palettes to find one that makes a figure “pop”—but if the encoding implies a numeric story the data don’t support, the plot becomes persuasive in the wrong way. My rule of thumb is that the figure itself should carry the burden of correct interpretation: captions should add context, not rescue an ambiguous encoding (“let the plot do the work”). So it behooves me to be constantly on the lookout for small, concrete tricks that make a visual more intuitive. This post is about one such trick.</p>

<p><img src="https://imgs.xkcd.com/comics/painbow_award.png" alt="XKCD Painbow Award" /></p>

<p><small>Source: <a href="https://xkcd.com/2537/" target="_blank">xkcd</a>, of course</small></p>

<h2>Visualizing out-of-bounds (OOB) data using R</h2>
<p>Let’s say you want to plot some data, and let’s say those data have a relatively “long-tailed” distribution. Such data need not be non-normally distributed, but they often are. There are many ways to visualize this kind of data and, obviously, the best choice depends on the context. But instead of talking in circles, let me provide an example.</p>

<p>Let’s look at some streamflow data. Without getting into details (because they are irrelevant to the substance of this post), I’ve extracted the largest observed streamflow values between 1981 and 2020 for 494 streams and rivers across the contiguous US.<sup id="fnref:2" role="doc-noteref"><a href="#fn:2" class="footnote" rel="footnote">2</a></sup> Here’s what the distribution of these values looks like:</p>

<p><img src="../../images/oob_histogram.png" alt="OOB Histogram" /></p>

<p>We can see that these data roughly resemble some kind of heavily<sup id="fnref:3" role="doc-noteref"><a href="#fn:3" class="footnote" rel="footnote">3</a></sup> right-skewed distribution. Now, let’s say we’re interested in mapping these data. We can do this quite easily using R:</p>

<pre style="font-size: 0.6em;">
library(tidyverse)
library(terra)
library(tidyterra)
library(colorspace)

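# State outlines for the contiguous US, projected to CONUS Albers (EPSG:5070)
us_states &lt;- vect(rnaturalearth::ne_states(country = "United States of America", returnclass = "sf")) %&gt;%
    project("EPSG:5070") %&gt;%
    dplyr::select(postal) %&gt;%
    dplyr::rename("state" = "postal") %&gt;%
    dplyr::filter(!(state %in% c("AK", "HI")))

# `data` holds the station points, with peak flows in `largest_observation`
ggplot() +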
    geom_spatvector(data = us_states,
        alpha = 1, 
        linewidth = 0.5, 
        fill = "white") +
    geom_spatvector(data = data, 
        aes(color = largest_observation),
        size = 1) +
    scale_color_continuous_sequential(
        name = "Peak Q (m^3/s)", 
        palette = "Viridis", 
        rev = TRUE,
        limits = c(0, NA)
    ) +
    theme_void() +
    theme(
        legend.key.height = grid::unit(0.4, "in"),
        legend.key.width  = grid::unit(0.2,  "in")
    )
</pre>

<p><img src="../../images/oob_map_complete.png" alt="OOB Map Complete" /></p>

<p>What might immediately jump out to you from this map is that our skewed distribution leads to a large number of values showing up in yellow (the bottom of our color scale), and progressively fewer showing up in green, blue, or purple. What this effectively does is cluster the majority of values in a narrow band of our color scale, which makes it difficult to differentiate among the stations clustered at the low end.</p>

<p>A common goal is to use the color scale efficiently so that the bulk of observations spans more of the available range, which improves visual discrimination among data points. In practice, real datasets rarely fill a color scale evenly, so this is a common issue.</p>

<p>I should note here that there are many different ways we could deal with this issue, including transforming the distribution of the data or using a more complex color mapping, and in some situations these are fine. However, I often avoid these strategies because they can reduce interpretability—especially on maps, where I prefer legends in the original units and a linear scale so equal numeric differences remain visually comparable. For this example, let’s prioritize distinguishing among the bulk of the data rather than among the largest values—in other words, the only thing we really need to know about the largest values is that they are large. Looking back at our histogram, the bulk of our values appear to lie between about 0 and 1,000 m^3/s.</p>
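
<p>If you would rather not eyeball the histogram, one option is to check how much of the data a candidate cap would cover. This is only a sketch: the simulated log-normal values below are a hypothetical stand-in for the real peak flows, and the distribution parameters are arbitrary assumptions, so the numbers it prints will not match the map.</p>

<pre style="font-size: 0.6em;">
set.seed(1)

# Hypothetical stand-in for `largest_observation` (494 right-skewed values)
peak_q &lt;- rlnorm(494, meanlog = 5.5, sdlog = 1.2)

# Share of stations at or below a candidate upper limit
cap &lt;- 1000
mean(peak_q &lt;= cap)

# Or work the other way: which value covers, say, 90% of stations?
quantile(peak_q, probs = 0.9)
</pre>

<p>Either number is just a starting point; the cap is still a judgment call about which part of the distribution the map should resolve.</p>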

<p>An obvious solution here is to cap the color scale so the bulk of values use more of the available range, improving visual discrimination where most observations live. In <code class="language-plaintext highlighter-rouge">ggplot2</code>, you can do this by setting an upper <code class="language-plaintext highlighter-rouge">limits</code> value. The catch is that, by default, values outside the <code class="language-plaintext highlighter-rouge">limits</code> become <code class="language-plaintext highlighter-rouge">NA</code> for the scale and are drawn with <code class="language-plaintext highlighter-rouge">na.value</code> (often grey), which could be mistaken for missing data. Instead of dropping those values from the mapping, you can <em>squish</em> them: use <code class="language-plaintext highlighter-rouge">oob = scales::oob_squish</code> (or its alias <code class="language-plaintext highlighter-rouge">scales::squish</code>) so anything above the upper limit is mapped to the maximum color, and anything below the lower limit is mapped to the minimum color:</p>

<pre style="font-size: 0.6em;">
ggplot() +
    geom_spatvector(data = us_states,
        alpha = 1, 
        linewidth = 0.5, 
        fill = "white") +
    geom_spatvector(data = data, 
        aes(color = largest_observation),
        size = 1) +
    scale_color_continuous_sequential(
        name = "Peak Q (m^3/s)", 
        palette = "Viridis", 
        rev = TRUE,
        limits = c(0, 1000),
        oob = scales::oob_squish
    ) +
    theme_void() +
    theme(
        legend.key.height = grid::unit(0.4, "in"),
        legend.key.width  = grid::unit(0.2,  "in")
    )
</pre>

<p><img src="../../images/oob_map_clipped.png" alt="OOB Map Clipped" /></p>

<p>This clearly increases interpretability by spreading out our common values across the color scale. However, it also creates a new problem: once you squish out-of-range values into the <code class="language-plaintext highlighter-rouge">limits</code>, the legend no longer communicates that those colors include values beyond the endpoints. Without an explicit indicator, a reader can reasonably interpret the darkest color as “≈ 1,000” rather than “≥ 1,000” (up to 4,842, our largest value), collapsing the extremes into an indistinguishable category.</p>
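
<p>To see what squishing does to the values themselves, here is a minimal sketch that applies <code class="language-plaintext highlighter-rouge">scales::oob_censor</code> (the default behavior) and <code class="language-plaintext highlighter-rouge">scales::oob_squish</code> to a small vector. The first three values are made up for illustration; 4,842 is the largest value mentioned above.</p>

<pre style="font-size: 0.6em;">
library(scales)

# Illustrative values only (4842 is the largest observation in the data)
q &lt;- c(50, 500, 1000, 4842)

# Default: out-of-range values become NA and are drawn with na.value
oob_censor(q, range = c(0, 1000))  # 50 500 1000 NA

# Squish: out-of-range values are clamped to the nearest limit
oob_squish(q, range = c(0, 1000))  # 50 500 1000 1000
</pre>

<p>After squishing, 4,842 is indistinguishable from 1,000 as far as the color mapping is concerned, so the legend has to carry that information some other way.</p>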

<p>There’s an admittedly neat thing that Python’s <code class="language-plaintext highlighter-rouge">matplotlib</code> can do when you cap a color scale (i.e., you set limits but still want to signal out-of-range values). With the <code class="language-plaintext highlighter-rouge">extend</code> argument of <code class="language-plaintext highlighter-rouge">plt.colorbar</code>, you can draw “extended” colorbars (triangular end caps) to indicate out-of-range values at one or both ends (see <a href="https://content.cld.iop.org/journals/1748-9326/18/10/104047/revision2/erlacfdbcf4_hr.jpg" target="_blank">Figure 4</a> of <a href="https://iopscience.iop.org/article/10.1088/1748-9326/acfdbc" target="_blank">this article</a> for a real-world example). That would solve our problem elegantly: we keep better contrast for the bulk of the data while still communicating that some stations exceed the plotted maximum. But, sadly, this is not a native feature of <code class="language-plaintext highlighter-rouge">ggplot2</code> :-(</p>

<p>Yes, we could switch to Python.<sup id="fnref:4" role="doc-noteref"><a href="#fn:4" class="footnote" rel="footnote">4</a></sup> But here’s the hard truth: I simply do not like <code class="language-plaintext highlighter-rouge">matplotlib</code>. I do not like it in the rain. I would not, could not, on a train. I will not use it here or there. I do not like it anywhere!</p>

<p>I know I could make equivalent plots using <code class="language-plaintext highlighter-rouge">matplotlib</code>, but I strongly prefer the syntax of <code class="language-plaintext highlighter-rouge">ggplot2</code>, and as such R will likely remain my default for visualizations. So, I wanted to find a way to do this using R. Luckily, it’s quite easy. Enter the <a href="https://teunbrand.github.io/legendry/" target="_blank"><code class="language-plaintext highlighter-rouge">legendry</code></a> package: with <code class="language-plaintext highlighter-rouge">show = NA</code>, <code class="language-plaintext highlighter-rouge">guide_colbar()</code><sup id="fnref:5" role="doc-noteref"><a href="#fn:5" class="footnote" rel="footnote">5</a></sup> adds end caps automagically when (and only when) the data exceed your scale limits:</p>

<pre style="font-size: 0.6em;">
# install.packages("legendry")

ggplot() +
    geom_spatvector(data = us_states,
        alpha = 1, 
        linewidth = 0.5, 
        fill = "white") +
    geom_spatvector(data = data, 
        aes(color = largest_observation),
        size = 1) +
    scale_color_continuous_sequential(
        name = "Peak Q (m^3/s)", 
        palette = "Viridis", 
        rev = TRUE,
        limits = c(0, 1000),
        oob = scales::oob_squish,
        guide  = legendry::guide_colbar(
          shape = "triangle",
          show  = NA,
          oob   = "squish",
          theme = theme(
            legend.key.height = grid::unit(2, "in"),
            legend.key.width  = grid::unit(0.2,  "in")
            )
          )
    ) +
    theme_void()
</pre>

<p><img src="../../images/oob_map_clipped_with_endcaps.png" alt="OOB Map Clipped with End Caps" /></p>

<p>Voila—out of bounds, on purpose. Now your plot can focus on the bulk of the data yet stay honest about the extremes.</p>

<p>Big thanks to the developer of <code class="language-plaintext highlighter-rouge">legendry</code>. You made my day.</p>

<hr />

<p><small>I do this for fun, but if you enjoyed reading this (and without ads!), consider <a href="https://buymeacoffee.com/talbotmichu">buying me a coffee</a> to help fuel my next post ☕</small></p>

<hr />

<div class="footnotes" role="doc-endnotes">
  <ol>
    <li id="fn:1" role="doc-endnote">
      <p>Ok, two reasons… because every good blog post starts with an <a href="https://xkcd.com/2537/" target="_blank" rel="noopener">xkcd comic</a>. <a href="#fnref:1" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:2" role="doc-endnote">
      <p>Here’s a <a href="https://gist.github.com/realmiketalbot/0bd0af38c5b0f74c0d1fe16f895fe80d" target="_blank">reprex</a> you can play with since I haven’t provided you with my data. <a href="#fnref:2" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:3" role="doc-endnote">
      <p>As I’m using this data to illustrate a visualization method, I’m intentionally not normalizing streamflow by watershed area, which would of course reduce the skewness of the distribution considerably. <a href="#fnref:3" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:4" role="doc-endnote">
      <p>I use Python for a significant portion of my research (e.g., my machine learning model code), so this isn’t about “R vs Python,” which I believe to be an <a href="https://nonstandarddev.com/posts/r-vs-python/" target="_blank">utterly pointless debate</a> that nonetheless seems to persist. Part of why I often default to R is that I generally need to write significantly fewer lines of code in R than I would in Python to complete exactly the same task. But that’s not the only reason. If you’re interested in a deeper dive on why I believe R is <em>objectively</em> better than Python <strong>for data visualization</strong>, check out <a href="https://edwinth.github.io/blog/nse/" target="_blank">Edwin Thoen’s blog post</a> on one of the features of R that makes it so powerful: non-standard evaluation (NSE). <a href="#fnref:4" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:5" role="doc-endnote">
      <p>Note that while <code class="language-plaintext highlighter-rouge">oob = scales::oob_squish</code> controls how out-of-range data are mapped to colors, <code class="language-plaintext highlighter-rouge">guide_colbar(oob = "squish", show = NA)</code> controls how the legend signals (and colors) the out-of-range end caps. You’ll typically want to use both. <a href="#fnref:5" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
  </ol>
</div>]]></content><author><name>Mike Talbot</name></author><category term="data-visualization" /><category term="science-communication" /><summary type="html"><![CDATA[Data visualization as an art I felt the need to include this brief preamble for one reason:1 to acknowledge that data visualization is often (but not always) more art than science. This will actually be my first in a series of posts on this subject, and we’ll cover some broader topics later. Ok, two reasons… because every good blog post starts with an xkcd comic. &#8617;]]></summary></entry><entry><title type="html">In Defense of the Acre-Foot</title><link href="https://miketalbot.io/posts/2024-12-31-in-defense-of-acre-feet/" rel="alternate" type="text/html" title="In Defense of the Acre-Foot" /><published>2024-12-31T00:00:00+00:00</published><updated>2024-12-31T00:00:00+00:00</updated><id>https://miketalbot.io/posts/in-defense-of-acre-feet</id><content type="html" xml:base="https://miketalbot.io/posts/2024-12-31-in-defense-of-acre-feet/"><![CDATA[<p>Let me start by asking a simple question: is it easier for you to visualize, in your mind’s eye, a million gallons of water or an Olympic swimming pool? Or for you cynics out there: is it easier to visualize a million milliliters of water or a refrigerator?</p>

<p>For most of us, there’s an obvious answer. Numbers alone are rarely intuitive, but a well-chosen analogy can transform the abstract into the tangible. This is at the heart of one of the challenges we face in science communication: how do we make complex ideas resonate with different audiences?</p>

<p>In this post, I’ll use the acre-foot – a peculiar and often-criticized unit of measurement – as a case study to explore (a) why the choice of units matters, (b) why it’s so hard to make numbers make sense, and (c) what we can learn from the acre-foot to improve how we share science.</p>

<p>Let’s dive in.</p>

<h2>The case against the acre-foot</h2>
<p>One <a href="https://en.wikipedia.org/wiki/Acre-foot" target="_blank">acre-foot</a> (abbreviated <em>ac-ft</em>) is very simply defined as the volume that is equal to one foot of depth over one acre of land and, as far as I know, it is exclusively used to measure a volume of water. Over the years, I’ve seen and heard many criticisms of the use of the acre-foot and the <a href="https://www.govinfo.gov/content/pkg/GOVPUB-C13-691a9b38e29a85d0925f4db586b60735/pdf/GOVPUB-C13-691a9b38e29a85d0925f4db586b60735.pdf" target="_blank">US customary system of measure</a> more broadly, of course. Since I primarily engage with the international scientific community, this is where I most commonly hear it, but I do have a sense that this criticism (…contempt, derision, disdain…) is more widespread than that. As a scientist, it doesn’t bother me because I use the metric system in my research, and as an engineer it doesn’t bother me because I am quite comfortable performing unit conversion. But as a red-blooded American… well, you know how defensive we can get in the name of “patriotism” (or something).</p>

<p>I’m clearly joking, as you’ll see. And anyway, despite the title, this post isn’t meant to be a defense of US customary units (or “English units”, or “the imperial system”, both of which are incorrect, by the by). Rather, this is a call to those critics to consider a more nuanced and pragmatic view of the issue of unit selection. In other words, let’s not throw the proverbial baby out with the swimming pool water just yet. But I’m getting ahead of myself.</p>

<p>Here’s the thing: on the surface, of course people are right to criticize the acre-foot. Neither acres nor feet make any sense in our modern scientific age, really: the <a href="https://en.wikipedia.org/wiki/Acre" target="_blank">acre</a> originated as the approximate unit of land that a yoke of oxen could plow in a day and, as we all know from the name, the <a href="https://en.wikipedia.org/wiki/Foot_(unit)#Historical_origin" target="_blank">foot</a> has varied, ancient, <a href="https://en.wikipedia.org/wiki/Anthropometry" target="_blank">anthropometric</a> origins… but this isn’t a history lesson, and if you’re reading this I trust you can hold your own against me in a “who reads more Wikipedia” contest. The point is that we’ve diligently and meticulously <a href="https://gallica.bnf.fr/ark:/12148/bpt6k3098j.image.f1082.langFR" target="_blank">defined</a> and <a href="https://opg.optica.org/ao/abstract.cfm?uri=ao-2-5-455" target="_blank">redefined</a> and <a href="https://www.npl.co.uk/si-units/metre" target="_blank">redefined</a> units in the international system of units (“SI units” or “the metric system”), which includes units of measure such as the meter and the gram, and we should use them for science.</p>

<p>I agree.</p>

<p>I grew up in the US, so as hard as it is for me to intuitively grasp how much I weigh in kilograms, or how hot (or cold?) I’ll be at 20 degrees Celsius, or what a stream looks like when it’s flowing at a cubic meter per second, I still agree that these are the units we should use for science and I use them in my own research. This, however, is the entirety of the argument against the use of the acre-foot: it’s simply not well-known to the vast majority of humans on planet earth, and it isn’t perhaps as standardized as the alternative (i.e., the cubic meter). While it is certainly a valid argument from a scientific perspective, let’s remind ourselves that most humans aren’t scientists and in doing so explore some other critical perspectives.</p>

<h2>The case for the acre-foot</h2>
<p>Allow me to briefly remind you that an acre is defined as 43,560 square feet (which, by the way, makes an acre-foot equal to 43,560 cubic feet) and that this is about the size of an American football field without its end zones, or somewhere near one half the size of a professional soccer pitch. A US survey foot is roughly 0.3048 meters, so an acre-foot is roughly equivalent to 1233.48 cubic meters. This is not close to a nice round number, to be sure, and I can understand the frustration that someone who is familiar with cubic meters might have at being forced to do the mental unit-conversion-math during a presentation at a scientific conference… just for, uh, sake of example (ok, yes, I may be subtweeting an AGU24 talk or two, but no shade).</p>

<p><img src="../../images/acre-size-comparison.png" alt="Acre Comparison" /></p>

<p><small>A visual comparison of an acre to an American football field and a standard soccer pitch (to scale) [original content].</small></p>
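
<p>For anyone who wants to check the arithmetic behind that conversion, here it is as a short R sketch, using the rounded 0.3048 m figure quoted above for the foot (the variable names are just for illustration):</p>

<pre style="font-size: 0.6em;">
sq_ft_per_acre &lt;- 43560   # square feet in one acre
m_per_ft       &lt;- 0.3048  # meters per foot (rounded, as quoted above)

# One acre-foot is 43,560 cubic feet; convert cubic feet to cubic meters
acre_foot_m3 &lt;- sq_ft_per_acre * m_per_ft^3
round(acre_foot_m3, 2)  # 1233.48
</pre>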

<p>Admittedly, the acre-foot is a unique unit. I’ve been racking my brain and perusing the <a href="https://en.wikipedia.org/wiki/List_of_unusual_units_of_measurement">list of unusual units of measurement</a> to think of another unit that combines two different units of measure for the same dimension<sup id="fnref:1" role="doc-noteref"><a href="#fn:1" class="footnote" rel="footnote">1</a></sup> – length in this case – and I’ve only come up with two: the <a href="https://en.wikipedia.org/wiki/Board_foot">board foot</a> and the hectare-meter – the latter of which I have seen in the scientific literature as, I believe, merely an SI version of the acre-foot. So, carpentry aside, it appears that this may be unique to the field of hydrology where, perhaps, we simply have a need for such a unit – e.g., to convey how much water there is when it floods in a way that humans understand. This is really the fundamental utility of such a unit: someone without scientific training can visualize what you mean (with a modicum of explanation, if necessary) when you talk about an acre-foot of water.</p>

<p>Here is also where the criticism breaks down a bit, because an acre can be somewhat<sup id="fnref:2" role="doc-noteref"><a href="#fn:2" class="footnote" rel="footnote">2</a></sup> neatly defined in square feet, so multiplying an acre by a foot simply yields cubic feet. If we discount the spurious origins of the foot and consider that it, too, has since been <a href="https://oceanservice.noaa.gov/geodesy/international-foot.html">redefined and standardized</a> (the international foot is exactly 0.3048 meters), then the ease of precise conversion to SI units means that, depending on the context, we shouldn’t worry too much about error propagation.</p>

<p>My point is this: while there are strong and sound arguments for using the metric system, and unit conversions have historically been the root cause of both minor oopsies and <a href="https://www.nist.gov/pml/owm/metrication-errors-and-mishaps" target="_blank">major catastrophes</a> alike, in many cases it actually isn’t a huge issue if we use something like US customary units in practice. Any engineer worth their salt should be capable of performing these conversions as necessary, triple-checking their math, and (these days) effectively using software to avoid needing to do either of these things in the first place. Like the cubic foot, the acre-foot is commonly used in the US engineering industry, and since <a href="https://www.npr.org/sections/thetwo-way/2017/12/28/574044232/how-pirates-of-the-caribbean-hijacked-americas-metric-system" target="_blank">pirates prevented metrification in the US</a>, we’re unlikely to see that change any time soon.</p>

<p>Briefly, it is also worth mentioning that there are some other practical considerations we need to make here. Perhaps the most obvious among these is the use of units like acre-feet in regulatory language like the <a href="https://en.wikipedia.org/wiki/Colorado_River_Compact">Colorado River Compact</a>. Historical and institutional momentum play a large role here, and in all likelihood this is just something that we’ll have to continue dealing with whether we like it or not. Moving on.</p>

<h2>Context is everything</h2>

<p>You may have already noticed, but we’re actually circling around a central theme here, which is this: <em>different units of measure are contextually intuitive</em> – i.e., they are intuitive based on where you grew up (the US vs <a href="https://sites.isucomm.iastate.edu/timothyl/imperial-in-a-metric-world/" target="_blank">nearly anywhere else</a>) or, more generally, they are intuitive based on your life experiences. If you’ve never seen an Olympic swimming pool or an American football field, then neither the pool nor the football field analogy is going to be an effective way to bring things into perspective, regardless of the units against which we’re trying to compare.</p>

<p>And this isn’t unique to the US, of course. If you grew up in England, you may still be measuring things in stone, pounds, and ounces, despite the <a href="https://en.wikipedia.org/wiki/Metrication_in_the_United_Kingdom" target="_blank">formal beginnings of metrification</a> some six decades ago. While the metric system did either kill off or standardize many traditional units of measure,<sup id="fnref:3" role="doc-noteref"><a href="#fn:3" class="footnote" rel="footnote">3</a></sup> there are some that have stubbornly stuck around, particularly in agriculture and trade, including the <a href="https://en.wikipedia.org/wiki/Maund" target="_blank">maund</a> (a unit of mass in Southern and Central Asia), the <a href="https://en.wikipedia.org/wiki/Hand_(unit)" target="_blank">hand</a> (used universally and exclusively for measuring equines) and the <a href="https://en.wikipedia.org/wiki/Cuerda" target="_blank">cuerda</a> (used in some Spanish speaking countries as, confusingly, either a unit of length, area, or volume).</p>

<p>On top of this, the <em>magnitude</em> of the values we use also matters, and this is intrinsically tied to the units we choose. Let’s explore this a bit more in the context of human psychology.</p>

<h2>The psychology of numbers</h2>
<p>So far we’ve established that, on one hand, standards are objectively good for science, and on the other that we live in a world where certain things are the way that they are and <a href="https://knowyourmeme.com/memes/old-man-yells-at-cloud" target="_blank">yelling at clouds</a> doesn’t really get us anywhere. Let’s turn our attention now to some more fundamental truths about the way that humans perceive both numbers and scales, because I think this is where things get really interesting.<sup id="fnref:4" role="doc-noteref"><a href="#fn:4" class="footnote" rel="footnote">4</a></sup></p>

<p>If you haven’t heard it in a while (or ever), do yourself a favor and go listen to the classic Radiolab episode from 2009 called <a href="https://radiolab.org/podcast/91697-numbers" target="_blank">Numbers</a>. Apart from being an example of a perfect podcast episode (IMHO, anyway), it features a fascinating discussion about how humans naturally relate to numbers logarithmically. As it turns out, this is <a href="https://www.science.org/doi/abs/10.1126/science.1156540" target="_blank">especially true</a><sup id="fnref:5" role="doc-noteref"><a href="#fn:5" class="footnote" rel="footnote">5</a></sup> for those without formal math training, though I can’t help but wonder if even trained minds are constantly battling that underlying “lizard brain” math.</p>

<p>As a practical example of this idea, consider how we describe earthquakes using the Richter scale. The scale is logarithmic, so an earthquake with a magnitude of 6 produces ten times the measured shaking amplitude of a magnitude 5 (and releases roughly 32 times the energy), and a magnitude 7 does the same relative to a 6. One could argue that the Richter scale works in part because it mirrors how humans instinctively process numbers: proportionally rather than absolutely. The proportional difference between two numbers is far more innately obvious to us than the absolute difference, such that the gap between 1 and 2 <em>feels</em> larger than the gap between 8 and 9. Extrapolate this into the realm of thousands, millions, or billions, and the absolute difference between two adjacent numbers effectively becomes meaningless to us. If you don’t believe me, try looking around and estimating, to the nearest person, the number of people in attendance the next time you’re at a concert venue or a sporting event. I guarantee you won’t be debating between 12,304 and 12,305; rather, your answer will be something to the effect of “more than a thousand but less than a million”.</p>

<p>This brings us to the next issue humans have with numbers, which is slightly different but highly related: <a href="https://towardsdatascience.com/the-small-problem-with-big-numbers-4f3dad23ce01" target="_blank">we lack the ability to truly conceive of large numbers</a>.<sup id="fnref:6" role="doc-noteref"><a href="#fn:6" class="footnote" rel="footnote">6</a></sup> Try for a moment to imagine one million of something. For example, what would one million humans look like, all in one place? As it turns out, it’s difficult to find photos of so many people gathered together, as even the largest crowds top out at around 250,000 by many accounts (and it gets increasingly difficult to verify such claims), but the <a href="https://s.abcnews.com/images/Politics/GTY-obama-2009-inauguration-11am-jef-170120.jpg" target="_blank">crowd at Obama’s inauguration address</a> is <a href="https://en.wikipedia.org/wiki/First_inauguration_of_Barack_Obama#Crowds_and_general_ticket_holders" target="_blank">likely one of the exceptions</a>. And if I asked you how many people were in that photo and gave you three choices – (a) 100,000, (b) 500,000, or (c) 1,000,000 – would you feel confident in your answer? I, for one, would not, and this illustrates my point, which is that after a certain point a large number is no longer useful for conveying anything other than the simple fact that <em>it is a large number</em>.<sup id="fnref:7" role="doc-noteref"><a href="#fn:7" class="footnote" rel="footnote">7</a></sup> This is also somewhat related to the so-called <a href="https://mathworld.wolfram.com/LawofTrulyLargeNumbers.html">law of truly large numbers</a>, which to me is the reason people continue to play the lottery even after knowing the odds. They may <em>know</em> the odds, but they don’t <em>understand</em> them (or as Penn Jillette put it, “Million-to-one odds happen eight times a day in New York”).</p>

<p>Before we move on, there is one more related point of discussion that I can’t skip over, and that is how scale impacts the ability of humans to conceive of quantities. In short, <a href="https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0285423" target="_blank">we can process information better</a> when it aligns with familiar, human-scale experiences. Analogies like “Olympic swimming pools” and “dump truck loads” make abstract volumes easier to grasp because they provide a tangible, proportional reference point. Similarly, just as logarithmic scales like the Richter scale mirror how we instinctively process proportional differences, human-scale analogies bridge the gap between abstract numbers and our everyday experiences.</p>

<p>Naturally, this discussion about how we perceive numbers leads us right into a discussion about how we might use this knowledge for good… or perhaps for evil.</p>

<h2>The ethics of choosing the right units</h2>
<p>I read <a href="https://www.nature.com/articles/s41586-024-07299-y" target="_blank">an article in <em>Nature</em></a> recently that contains a figure conveying changes in streamflow timing, the units of which are “months per year” (month yr<sup>-1</sup>). These units alone are not necessarily problematic, but when you see that the scale ranges from -0.02 to 0.02 month yr<sup>-1</sup>, you might be tempted to agree with me (based on all the above) that this was probably not the best choice of units in this case. Setting aside the fact that a month is not even a consistent measure of time, using another analogous unit in this example would yield a nice integer value (for example, using days per decade instead would yield a range of about -6 to 6, taking a month as roughly 30 days). Now, while I think this example is more careless than unethical, things are not always so.</p>

<p>There are myriad <a href="https://wpdatatables.com/misleading-statistics/" target="_blank">ways to bias the presentation of numbers</a>, whether purposely or accidentally, but one of the more annoying ones to me personally is simply choosing units that make your numbers look bigger or smaller to meet your aims. This can range from the relatively innocuous, such as reporting <a href="https://madeblue.org/en/17-billion-litres-on-world-water-day-2024/" target="_blank">clean water produced in billions of liters</a> or <a href="https://www.researchgate.net/publication/257984531_Managing_Materials_for_a_Twenty-first_Century_Military" target="_blank">mineral stocks in trillions of grams</a>, to the more nefarious, such as <a href="https://www.huffpost.com/entry/using-misleading-science_b_585666" target="_blank">comparing bottled water consumption to total groundwater withdrawals</a> (i.e., using a very large denominator) to make said consumption seem insignificant.</p>

<p><img src="../../images/wonka-volume.png" alt="Wonka Volume" /></p>

<p>Even the intentional <em>exclusion</em> of numbers can be highly unethical. A perfect example of this is the way in which conservative US politicians have swayed public opinion by repeatedly saying that we spend “too much” on foreign aid, which has led to the widespread belief that US foreign aid represents 25% of the federal budget when <a href="https://www.brookings.edu/articles/what-every-american-should-know-about-us-foreign-aid/" target="_blank">it is actually less than 1%</a>.<sup id="fnref:8" role="doc-noteref"><a href="#fn:8" class="footnote" rel="footnote">8</a></sup></p>

<p>Some of this can be “justified” in the context of marketing or <a href="https://www.intereconomics.eu/contents/year/2018/number/1/article/nudging-in-public-policy-application-opportunities-and-challenges.html" target="_blank">nudge theory</a>, or as an innocent attempt to put things into perspective (as we’ve discussed at length), but I would argue that a lot of it is simply unethical, if to varying degrees. If you’re in the business of marketing or policy-making, my guess is you’ll continue doing what you want no matter what I say, but assuming I’ve convinced you to keep ethics in mind when presenting your numbers (and I hope it didn’t take much convincing), let’s take a moment to think more generally about effective science communication.</p>

<h2>What we can learn from science communication experts</h2>
<p>My high school physics teacher <em>hated</em> Bill Nye (his words). I have to believe this was because he was a legitimate physicist who retired to pursue his other passion of teaching (and who, by the way, gained my eternal gratitude by introducing me to the writings of Richard Feynman). To him, I can imagine that Nye was oversimplifying concepts, downplaying the rigor of physics, and simply “dumbing things down” to an unacceptable degree. But to me, who grew up not just watching but captivated by <em>Bill Nye the Science Guy</em>, Bill Nye was (in hindsight) simply another in a line of high-profile science communicators who truly understood how to get around the <a href="https://web.archive.org/web/20160410233551/http://www.cwsei.ubc.ca/resources/files/Wieman_APS_News_Back_Page_with_refs_Nov_2007.pdf" target="_blank">burden of knowledge</a> – a tidy explanation for the “disparity between good intentions and bad results” when it comes to science education. Regardless of what my teacher thought about him, as a protégé of Carl Sagan and a predecessor to Neil deGrasse Tyson, Bill Nye impacted an entire generation of would-be scientists and continues to <a href="https://sustainability.wisc.edu/on-bill-nye-and-our-search-for-a-savior-against-climate-change/" target="_blank">lead the fight against climate science misinformation</a>.</p>

<p>But my views on science communication haven’t only been influenced by the likes of Bill Nye and Richard Feynman. I’ve learned a lot by simply paying attention to what works and what doesn’t. My engineering consulting career in particular taught me many things, but perhaps above all it helped me refine my ability to convey complex technical information to general audiences. As a result, I’d like to think of myself as a decent science communicator, but I’ll still readily admit that I’m not an expert and have a lot to learn.</p>

<p>There is an abundance of good information out there on the subject of science communication, including <a href="https://www.ascb.org/science-policy-public-outreach/science-outreach/communication-toolkits/best-practices-in-effective-science-communication/" target="_blank">this list</a> from the American Society for Cell Biology (ASCB) Communication Toolkit. The first tip on that page is effectively “Know your audience”, which seems pretty on point here. In my experience, we could probably rephrase this as “Consider your audience”, since I’ve listened to talks where it seemed pretty clear that I, the Audience Member, was not even <em>on</em> the list of priorities when the slides were thrown together. Either way, if you attempt to understand how your audience will perceive whatever it is you’re presenting to them, in all likelihood your key points will be far better received.</p>

<p>Consider the American farmer as an audience, for example, simply because it’s one with which I have some familiarity. It’s true that <a href="https://www.fb.org/market-intel/over-140-000-farms-lost-in-5-years" target="_blank">small farms are disappearing</a> but, despite what many people seem to think, there are still a lot of family farms out there. Trust me: I’ve spent countless hours traveling around farm country<sup id="fnref:10" role="doc-noteref"><a href="#fn:10" class="footnote" rel="footnote">9</a></sup> – whether knocking on doors in an attempt to get land access for field work,<sup id="fnref:11" role="doc-noteref"><a href="#fn:11" class="footnote" rel="footnote">10</a></sup> delivering presentations at multi-day workshops, or helping lead discussions at community open house meetings – and if at any time I started throwing “hectares” and “meters” around, I’d run the risk of losing the room (and perhaps my job), because farmers in the US are familiar with acres and feet. That’s just the reality. Put aside whatever you might think about industrial agriculture and the <a href="https://www.apmreports.org/story/2024/12/16/iowa-farm-field-nitrates-saturated-buffers" target="_blank">completely ineffective ways</a> we are trying to reduce its environmental impact,<sup id="fnref:12" role="doc-noteref"><a href="#fn:12" class="footnote" rel="footnote">11</a></sup> because to communicate with your audience (whoever they are) in a relatable way is to put your best foot forward and set yourself up to make the biggest impact. This is as true for farmers as it is for policymakers, schoolchildren, or the international scientific audience at your AGU meeting presentation.</p>

<p>I certainly learned a lot the hard way by making mistakes of my own and, at times, alienating entire segments of my audience, but I learned the most by observing how more experienced communicators dealt with these crowds. I’m thinking of those working in extension or outreach, or those who have simply had years of experience running public meetings. I can only describe what they do as a form of adept <a href="https://www.unitedlanguagegroup.com/learn/linguistic-code-switching" target="_blank">code-switching</a>, and I think it’s something more scientists and engineers really need to work on. While code-switching can be a double-edged sword and is <a href="https://lsa.umich.edu/lsa/news-events/lsa-magazine/Summer-2022/the-burden-of-code--switching.html" target="_blank">tantamount to pandering</a> if done poorly, doing it well is truly an invaluable skill. Just remember: it’s not about “dumbing down” your message, but rather meeting your audience where they are. And doing the opposite – for example, <a href="https://www.plainlanguage.gov/guidelines/words/avoid-jargon/" target="_blank">using jargon</a> because you think it makes you sound smart or authoritative – <a href="https://www.nature.com/articles/d41586-020-00580-w" target="_blank">can be equally alienating</a>.</p>

<p>I may be a little off track here, but I think these are incredibly important things to think about. To summarize, here are what I think are the five most important things you can do as a scientist to help your audience digest quantitative results:</p>

<ol>
  <li><strong>Consider your audience</strong> and tailor your units and analogies to the intellectual and cultural context</li>
  <li><strong>Use visual aids and contextualize abstract quantities</strong> – for example, putting the size of an acre in context</li>
  <li><strong>Use dual units</strong> when appropriate – for example, “1 acre-foot ≈ 1,200 cubic meters” – to help bridge contextual gaps</li>
  <li><strong>Prioritize proportionality and small magnitudes</strong> in presenting large quantities and changes over time</li>
  <li><strong>Simplify without oversimplifying</strong> to avoid distorting the science or coming off as pandering</li>
</ol>

<p>If I’m talking about flooding and I’m in the US, I believe that using the acre-foot instead of the cubic meter checks all these boxes. My audience is generally going to be familiar with these units ☑, I can easily use a football field to conceptualize an acre ☑, I can equate an acre-foot to cubic feet or cubic meters for any international outliers in the crowd ☑, it allows me to use smaller numbers than I would with something like gallons ☑, and since it’s a common unit, I don’t come off as pandering or condescending like I might if I tried to use something like “grain bins” or “corn fields” ☑.</p>
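<p>If you report volumes often, the dual-unit habit is easy to automate. The snippet below is a minimal sketch, not a recommendation of any particular tool: the helper name <code>dual_unit_label</code> and the choice of two significant figures are my own assumptions, and the conversion assumes the international foot (0.3048 m) and an acre of 43,560 square feet.</p>

```python
# Assumption: 1 acre = 43,560 sq ft and 1 ft = 0.3048 m (international foot).
CUBIC_METERS_PER_ACRE_FOOT = 43_560 * 0.3048 ** 3  # about 1,233.5


def dual_unit_label(acre_feet, sig_figs=2):
    """Format a volume in acre-feet alongside a rounded metric equivalent."""
    cubic_meters = acre_feet * CUBIC_METERS_PER_ACRE_FOOT
    # Round the secondary unit to a few significant figures so it stays readable.
    rounded = float(f"{cubic_meters:.{sig_figs}g}")
    return f"{acre_feet:,} acre-ft (about {rounded:,.0f} cubic meters)"


print(dual_unit_label(1))      # 1 acre-ft (about 1,200 cubic meters)
print(dual_unit_label(2500))   # 2,500 acre-ft (about 3,100,000 cubic meters)
```

<p>Rounding the secondary unit is deliberate: the point of the second number is orientation, not precision.</p>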

<h2>What in God's holy name are you blathering about?</h2>
<p>I’ll tell you what I’m blathering about! My point is not that the acre-foot is a “good” unit, nor is it that you need to use two completely different languages or sets of math in your research. As scientists, we necessarily suffer from the burden of knowledge, so we present numbers with which we are all too familiar without considering whether they actually make sense to our audience. We also have an ethical responsibility to communicate our findings in an understandable, replicable way. We must not forget that publishing papers is not about increasing our H-index; that presenting at a conference is not about padding the stats on our CVs; and that holding a public forum is not about meeting the requirements of a contract. If you work in science (and teaching and/or research in particular), chances are you are <a href="https://www.jyi.org/2015-november/2017/3/22/the-tough-choice-of-a-life-scientist-industry-vs-academia" target="_blank">more passionate about your work than you are about your paycheck</a> – but to what end?</p>

<p>As an advocate of <a href="https://www.cos.io/" target="_blank">open science</a>, I believe that it is our responsibility to make science <em>accessible</em> in every sense of the word. This obviously means continuing the move toward open access publications, sharing our data, using visual abstracts, and all the rest. But more fundamentally, it means giving thoughtful consideration to how we present our messages in order to achieve balance between accuracy, completeness, integrity, and comprehension. It’s a tough balance to strike, for sure, but I hope doing something as simple as thinking carefully about our choice of units can help us get a little closer.</p>

<p>So, the next time you’re presenting your results, consider putting them in context. If you’re debating between using “28 million acre-feet”, “35 billion cubic meters”, and “9 trillion gallons”, well, you’re simply having the wrong debate. Instead: about how many Lake Meads is that?<sup id="fnref:13" role="doc-noteref"><a href="#fn:13" class="footnote" rel="footnote">12</a></sup></p>
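<p>In case you want to check that those three figures really describe roughly the same volume, here is a rough sketch. The function name is hypothetical, and I’m assuming the international foot (0.3048 m) and the US liquid gallon (231 cubic inches).</p>

```python
# Assumptions: 1 acre = 43,560 sq ft, 1 ft = 0.3048 m, 1 US gallon = 231 cubic inches.
CUBIC_METERS_PER_ACRE_FOOT = 43_560 * 0.3048 ** 3   # about 1,233.5
GALLONS_PER_ACRE_FOOT = 43_560 * 12 ** 3 / 231      # about 325,851


def describe_volume(acre_feet):
    """Express one volume in three unit systems, scaled to readable magnitudes."""
    return {
        "million acre-feet": acre_feet / 1e6,
        "billion cubic meters": acre_feet * CUBIC_METERS_PER_ACRE_FOOT / 1e9,
        "trillion US gallons": acre_feet * GALLONS_PER_ACRE_FOOT / 1e12,
    }


for unit, value in describe_volume(28e6).items():
    print(f"{value:.1f} {unit}")
```

<p>Same water, three very different-looking numbers, and none of them is as easy to picture as one reservoir.</p>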

<p>And let’s embrace the swimming pool, the dump truck, and the acre-foot for what they are: flawed, imprecise, arbitrary, yet highly intuitive ways to convey the idea of “volume”.</p>

<hr />

<p><small>I do this for fun, but if you enjoyed reading this (and without ads!), consider <a href="https://buymeacoffee.com/talbotmichu">buying me a coffee</a> to help fuel my next post ☕</small></p>

<hr />

<div class="footnotes" role="doc-endnotes">
  <ol>
    <li id="fn:1" role="doc-endnote">
      <p>Something like the kilowatt-hour doesn’t count because kilowatts and hours are two different dimensions (power and time). Also, I’m not including the oft-used acre-inch because it’s not fundamentally different from an acre-foot. <a href="#fnref:1" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:2" role="doc-endnote">
      <p>I say “somewhat” because the square root of 43,560 is not an integer – i.e., an acre is not “neatly” defined as a square with sides <em>x</em> feet in length, as is a hectare (100 m × 100 m). <a href="#fnref:2" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:3" role="doc-endnote">
      <p>Let’s have a moment of silence for my personal favorite traditional units of measure that were killed by the metric system: the <a href="https://en.wikipedia.org/wiki/Pood" target="_blank">pood</a> and the <a href="https://en.wikipedia.org/wiki/Batman_(unit)" target="_blank">batman</a>. <a href="#fnref:3" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:4" role="doc-endnote">
      <p>I couldn’t find a natural place for these, but somewhere in here I had to drop links to my <a href="https://xkcd.com/2319/">two</a> <a href="https://what-if.xkcd.com/4/">favorite</a> XKCD comics about large numbers. <a href="#fnref:4" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:5" role="doc-endnote">
      <p>Here’s a <a href="https://www.scientificamerican.com/article/a-natural-log/" target="_blank">pop-science version</a> in case you hit a paywall. <a href="#fnref:5" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:6" role="doc-endnote">
      <p>Compounding this difficulty is <em>anchoring bias</em>, where our initial reference point – or “anchor” – skews our perception of a number’s true magnitude. For example, if I tell you that the crowd at Obama’s inauguration was reported as “more than 200,000,” you’re likely to unconsciously use that figure as a baseline for any further guesses. Whether the actual count was closer to 500,000 or 1,000,000 becomes harder to conceptualize because your mind has already latched onto the first number it encountered. This underscores the importance of carefully choosing reference points in science communication, as they can significantly shape how audiences perceive and interpret data. <a href="#fnref:6" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:7" role="doc-endnote">
      <p>The same, I believe, goes for very small numbers as well. <a href="#fnref:7" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:8" role="doc-endnote">
      <p>This is obviously big if true, because it means the intentionally deceptive reporting of numbers by politicians (and the subsequent failure in critical thinking by voters) likely plays a role in who gets elected to office. <a href="#fnref:8" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:10" role="doc-endnote">
      <p>Mostly in southern Minnesota, northern Iowa, western Wisconsin, and southern Ontario, but also a bit in North and South Dakota, Kansas, Nebraska, Colorado, California, Quebec… and even a bit in places like Ecuador and Slovakia. I’d consider myself a casual agritourist. <a href="#fnref:10" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:11" role="doc-endnote">
      <p>If you happen to be someone who in any way oversees hydrologic and hydraulic (H/H) modeling projects, I beg of you: send your modelers into the field. Even if only for one day, seeing things on the ground is the most valuable perspective a modeler can have. You’ll end up with better models that are more accurate and have the necessary details in the right places. <a href="#fnref:11" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:12" role="doc-endnote">
      <p>Just for the record: in my opinion, (small) farmers are no more responsible for being caught in a broken system than are doctors or teachers. We all need to set our sights higher and follow the money to the sources of these problems. <a href="#fnref:12" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:13" role="doc-endnote">
      <p>Answer: that’s about the volume of one Lake Mead… at least when it was full (too soon?). <a href="#fnref:13" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
  </ol>
</div>]]></content><author><name>Mike Talbot</name></author><category term="science-communication" /><summary type="html"><![CDATA[Let me start by asking a simple question: is it easier for you to visualize, in your mind’s eye, a million gallons of water or an Olympic swimming pool? Or for you cynics out there: is it easier to visualize a million milliliters of water or a refrigerator?]]></summary></entry></feed>