Chandler Squires
@chandlersquires.bsky.social
CMU postdoc, previously MIT PhD. Causality, pragmatism, representation learning, and AI for biology / science more broadly. Proud rat dad.
Tons of interesting questions related to these topics, and tons of technical perspectives to explore. I'm keen to see where this line of thinking might lead; please link to any references that might be related 😀
February 2, 2025 at 10:00 PM
Ultimately, I think the distinction between "interpolation" and "extrapolation" needs to be made in terms of these levels. In some sense, it seems like being able to "extrapolate at level L" requires some machinery for moving up to level L+1 (or higher), interpolating, and moving back down.
February 2, 2025 at 10:00 PM
This hierarchical structure appears over and over again.

In Bayesian statistics, there are priors, hyperpriors, hyperhyperpriors, and so on. In topology, there are paths, homotopies (paths between paths), and so on. In geometric algebra, we have vectors, bivectors, trivectors, and so on...
February 2, 2025 at 10:00 PM
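For the Bayesian case, here is a minimal two-level sketch of what I mean (the Gaussian choices are purely illustrative, not from the thread): each level's parameters get their own prior one level up.

```latex
\begin{align*}
  y_i \mid \theta_i &\sim \mathcal{N}(\theta_i, \sigma^2) && \text{observations}\\
  \theta_i \mid \mu &\sim \mathcal{N}(\mu, \tau^2)        && \text{prior}\\
  \mu &\sim \mathcal{N}(m_0, s_0^2)                       && \text{hyperprior (and one could keep going)}
\end{align*}
```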
In the LLM example, the base level might be concepts like "pizza" and "Beowulf", the next higher level might be relationships between these concepts, then we can think about relationships between these relationships, and so on...
February 2, 2025 at 10:00 PM
Ultimately, we need some inductive biases to favor certain parameterizations over others. We can't escape the need for "domain knowledge", we just outsource it to a higher level and might call it something else.
February 2, 2025 at 10:00 PM
Now, parameterization-sensitivity is not inherently bad; it's just important to know what we're committing to - to identify our implicit assumptions and make them explicit.

If we know the "right" parameterization, then we should use it. But what if we don't? And how do we even define "right"?
February 2, 2025 at 10:00 PM
This instability of convexity-based definitions has been discussed a lot in the philosophical and cognitive science literature on natural kinds and categorization.

I'm currently reading "Conceptual Spaces: The Geometry of Thought", which covers this in great detail - would recommend.
February 2, 2025 at 10:00 PM
However, beyond 1D, our "nice" transformations T don't preserve convexity. So this distinction between interpolation and extrapolation is not parameterization-invariant.

Thus, a convexity-based distinction presupposes that we already (to some extent) know the "right" parameterization.
February 2, 2025 at 10:00 PM
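To make that concrete, here is a toy counterexample (the specific map and points are my own choices, not from the thread): a continuous, invertible map in 2D under which a point inside the convex hull of the "seen" points lands outside the hull of their images.

```latex
\begin{align*}
  T(x, y) &= (x,\ y + 3x^2), & T^{-1}(x', y') &= (x',\ y' - 3x'^2)\\
  (x_1, x_2, x_3) &= \bigl((-1,0),\,(1,0),\,(0,2)\bigr), & x^* &= \tfrac{1}{3}(x_1 + x_2 + x_3) = (0, \tfrac{2}{3}) \in \operatorname{conv}\{x_1, x_2, x_3\}\\
  \bigl(T(x_1), T(x_2), T(x_3)\bigr) &= \bigl((-1,3),\,(1,3),\,(0,2)\bigr), & T(x^*) &= (0, \tfrac{2}{3}) \notin \operatorname{conv}\{T(x_1), T(x_2), T(x_3)\}
\end{align*}
```

The last step holds because every point in the hull of the three image points has second coordinate at least 2, while T(x*) has second coordinate 2/3. So the same prediction problem flips from "interpolation" to "extrapolation" under a perfectly nice reparameterization.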
How does this extend to higher dimensions? A natural generalization of the interval is the convex hull of the x's that we've already seen; call this set S. Then we might think that predicting for a new x* in S is "interpolating", and predicting for an x* outside of S is "extrapolating".
February 2, 2025 at 10:00 PM
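As a sketch of this definition in code (not from the thread; scipy's Delaunay triangulation is just one convenient way to test hull membership):

```python
import numpy as np
from scipy.spatial import Delaunay

def is_interpolation(X_seen, x_star):
    """True if x_star lies inside the convex hull S of the rows of X_seen."""
    return Delaunay(X_seen).find_simplex(np.atleast_2d(x_star))[0] >= 0

X_seen = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])  # toy "seen" points
print(is_interpolation(X_seen, [0.5, 0.5]))   # True: inside S, "interpolating"
print(is_interpolation(X_seen, [2.0, 2.0]))   # False: outside S, "extrapolating"
```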
What if we reparameterize, letting x'=T(x) for some "nice" (continuous and invertible) transformation T? In 1D, such a transformation T must be monotonic, so intervals map to intervals, and the distinction between interpolation/extrapolation remains the same.
February 2, 2025 at 10:00 PM
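A quick numeric check of the 1D claim (toy numbers, with np.cbrt standing in for a monotone T):

```python
import numpy as np

x_seen = np.array([0.2, 1.0, 3.5, 7.0])
queries = np.array([2.0, 9.0])               # one inside the seen interval, one outside

def in_interval(x_seen, x_star):
    return (x_seen.min() <= x_star) & (x_star <= x_seen.max())

T = np.cbrt                                  # continuous, invertible, monotone in 1D
print(in_interval(x_seen, queries))          # [ True False]
print(in_interval(T(x_seen), T(queries)))    # [ True False] -- the labels don't change
```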
On the mathematical side, I think that 1D intuitions lead us astray.

For y=f(x), with x and y both scalars, we typically think of "interpolation" as predicting f(x) within the interval of x's that we've already seen, and "extrapolation" as predicting f(x) outside of that interval.
February 2, 2025 at 10:00 PM
How does an LLM "know" how to combine two concepts which it hasn’t seen combined before? Well, maybe it’s seen similar pairs of concepts combined.

Then, the LLM is "extrapolating" at the base level of these concepts, but "interpolating" at the level of relationships between concepts.
February 2, 2025 at 10:00 PM
LLMs highlight the difficulty of making these distinctions. If I prompt an LLM to give me a recipe for deep dish pizza in the style of Beowulf, then is it extrapolating (since that’s not in its training set) or interpolating (since pizza recipes and Beowulf are both in its training set)?
February 2, 2025 at 10:00 PM
I think “extrapolation” is in that dangerous category of words that make us feel like we know what we’re talking about, even when we aren’t.

The distinction between interpolation and extrapolation isn't a given; it's heavily theory-laden.
February 2, 2025 at 10:00 PM
I think a ton of others would have interesting thoughts on this - @jhartford.bsky.social, @moberst.bsky.social, @smaglia.bsky.social, to name a few
January 29, 2025 at 12:00 AM
Not sure whether this all fits better into the “variable-centric” or “mechanism-centric” perspective. It reminds me of a lot of other conceptual dualities, e.g. the literal duality between a vector space and its dual, or the categorical distinction between objects and morphisms.
January 29, 2025 at 12:00 AM
We usually talk about interventions in black and white - a mechanism was changed, or it wasn't. I think the grey area (how much a mechanism was changed) is woefully unexplored, and is going to be key to many applications.
January 29, 2025 at 12:00 AM
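As a minimal sketch of what that grey area could look like (a made-up linear-Gaussian mechanism with a soft, shift-style intervention; the "strength" here is just the size of the shift):

```python
import numpy as np

rng = np.random.default_rng(0)
parent = rng.normal(size=10_000)

def mechanism(parent, shift=0.0):
    """Toy mechanism for X given its parent; shift=0.0 recovers the observational mechanism."""
    return 2.0 * parent + shift + rng.normal(size=parent.shape)

x_obs = mechanism(parent)
for shift in [0.0, 0.5, 2.0, 5.0]:           # increasingly strong soft interventions on the same mechanism
    x_int = mechanism(parent, shift=shift)
    print(f"shift={shift}: mean change = {x_int.mean() - x_obs.mean():+.2f}")
```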
That metadata can be thought of as a variable/vector, e.g. a molecular embedding when the interventions are drugs. Then we can encode priors, like similar drugs should have similar intervention targets.
January 29, 2025 at 12:00 AM
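One hypothetical way to write such a prior down (all names and numbers are illustrative, not from any of my papers): couple per-drug, per-gene target logits through an RBF kernel on the drug embeddings, so drugs that are close in embedding space get correlated intervention targets.

```python
import numpy as np

rng = np.random.default_rng(0)
n_drugs, emb_dim, n_genes = 5, 8, 20
E = rng.normal(size=(n_drugs, emb_dim))      # drug embeddings (e.g. molecular fingerprints)

# RBF similarity between drugs in embedding space.
sq_dists = ((E[:, None, :] - E[None, :, :]) ** 2).sum(-1)
K = np.exp(-sq_dists / (2 * 1.0**2)) + 1e-6 * np.eye(n_drugs)

# Prior draw: per-gene logits across drugs from a GP with covariance K,
# so similar drugs get similar probabilities of targeting each gene.
logits = np.linalg.cholesky(K) @ rng.normal(size=(n_drugs, n_genes))
target_probs = 1 / (1 + np.exp(-logits))     # P(drug d has gene g among its targets)
```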
In my opinion, this should be one of the first things we teach. It also naturally suggests a lot of extensions which are perhaps less obvious from other perspectives. For example, in the unknown-target interventions setting, we might have some metadata about each intervention.
January 29, 2025 at 12:00 AM
You can find this kind of trick in a lot of works: Joint Causal Inference, the selection diagrams of Bareinboim and his collaborators, the decision-theoretic framework of Dawid and his collaborators, and my work on structure learning and experimental design. Super useful!
January 29, 2025 at 12:00 AM
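For a concrete (if very stripped-down) version of the trick, in the spirit of Joint Causal Inference rather than a faithful reconstruction of any one framework: pool data across regimes and treat the intervention indicator itself as one more variable, so standard dependence-based reasoning can tell which mechanisms an intervention touched.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)

def sample_regime(n, shift_on_x2=0.0):
    """Toy chain X1 -> X2 -> X3; the intervention softly shifts X2's mechanism."""
    x1 = rng.normal(size=n)
    x2 = 0.8 * x1 + shift_on_x2 + rng.normal(size=n)
    x3 = -0.5 * x2 + rng.normal(size=n)
    return pd.DataFrame({"X1": x1, "X2": x2, "X3": x3})

frames = []
for regime, shift in [(0, 0.0), (1, 2.0)]:   # regime 1 intervenes on X2
    df = sample_regime(2000, shift_on_x2=shift)
    df["I"] = regime                         # the intervention indicator becomes a variable
    frames.append(df)
data = pd.concat(frames, ignore_index=True)

# I is (marginally) independent of X1, dependent on X2, and dependent on X3
# only through X2 -- exactly the pattern that points at X2 as the target.
print(data.corr().round(2)["I"])
```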