The One Huge Oversight in Project Hail Mary
What do Eridians know about music theory?
Warning: This post contains spoilers. A lot of them. Do not read any further if you haven’t seen the film yet.
I loved Project Hail Mary. The film that is. I loved it so much that I went and bought the book and read it in two days. Andy Weir is a master of the softer side of the hard sci fi genre. He tells a great story and keeps it (mostly) within the bounds of our physics. Liu Cixin by contrast tells a story and stretches physics to its breaking point, but his stories have a galactic scope and you lose track of the characters over the vast distances and timespans. Weir’s stories are so great because the science is just a vehicle for a story about a person, or in this case a guy pairing up with an Eridian pal.
One thing that I really love about PHM is that Weir constantly asks himself “what would it be like if life evolved nearby in our galactic neighborhood?” What might life look like under different atmospheric, gravitational, and mineral conditions? And he plays out the thought experiment to the fullest. Not only biology, but language and culture. And under what circumstances might two civilizations of similar technological attainment meet? So many other sci fi stories pit humans against aliens. Modern ones like Three Body assume vast technological gulfs between different species and adopt a fundamentally antagonistic approach that prohibits collaboration. But Weir asks: what if humans and aliens worked together? I’m sure this has been done elsewhere, but I’ve never encountered a story like this where the author attempts to give us scientifically plausible answers as to how this might work, while also making it charming and even cute.
In the end we get Rocky and the Eridians, strange but relatable creatures that evolved under conditions of extreme pressure and heat and developed no vision whatsoever, communicating entirely through sound waves. Despite these limitations, Dr. Grace is able to reverse-engineer their language and learn to communicate with Rocky so they can collaborate to tackle the existential crisis both species face.
One of the things I really loved about the book versus the film is how Weir took the time to explain what the biology of a creature might be if an organism evolved under conditions of 29x Earth pressure, an ambient temperature of 100 degrees C, an atmosphere made mostly of ammonia, and permanent darkness. Result: a race of self-contained, tough-shelled crab-like creatures with a carapace made of heavy metals, with ammonia blood and a crystalline brain and steam-powered joints. Living in permanent darkness, the Eridians never bothered to evolve sight.
These form the building blocks with which Weir asks the more interesting questions of what language, culture, norms and even habits might look like based on that biology.
The thing that I absolutely love about the book is Weir constantly asking “what might we have in common with these creatures, despite our colossal differences?” This is kind of his overarching premise: even a race of blind steam-powered metallic spider crabs with ammonia blood and photographic memories could not only communicate with us, but actually build common ground and form friendships. For instance, we use base 10 because we have 10 fingers, but they use base 6 because they have two main three-fingered claws. We have our own specialties: humans are better at science, Eridians are better at engineering. They are very good at memorization and mental math, but somehow never figured out relativity or radiation. And of course we have better visual sensing (they don’t have it at all) while they can do echolocation and sonar with their sensitive ears.
Big big spoiler here: towards the end of the book (and this isn’t even mentioned in the film), we realize that the Tau Ceti star system where astrophage originates and where we go to seek answers is actually the common ancestor of life in the entire star cluster – including Erid and Earth. That’s why astrophage has mitochondria just like human cells. We, the Eridians, and the astrophage (and the taumoeba) all share a common ancestor 4 billion years ago. So really, this is a story about reconciliation with our long-lost brethren (as we unite to fight a distant cousin).
The question of how we might relate to the Eridians is asked and answered in many different ways in the book (the film doesn’t have much time to go into this detail).
For instance, in one of the more 4th-wall-breaking moments of the book, Rocky and Grace spend a few moments wondering about why the two civilizations met at relatively similar stages of technological development, an obvious puzzle, given the vastness of the universe and the relative rapidity of technological development (think about how different the Spaniards were from the Aztecs – now scale that up to a galactic level). Answer: both species were sufficiently advanced to launch interstellar voyages (but only just, in both cases), but not so advanced they could trivially defeat the astrophage threat. (Presumably earthlings in 2100 would be able to do this easily). Hence, both notice the unusual signature of Tau Ceti and in their desperation dispatch relatively primitive missions to the star system.
There’s another incredible passage where Grace and Rocky surmise that the reason they think at roughly the same speed is because they are subjected to about the same gravity, and that thought speed is an evolutionary function of prey-predator dynamics, and that gravity determines how quickly critters can scamper around. The book is full of speculative xeno-evo-bio theories like this. Basically: why is this feature of our world that we take for granted the way that it is?
As for alien language, this is a common theme in the sci fi genre now. Call it xenophilology. This is the whole point of Arrival, an amazing film, though not hard sci fi by any means, as well as Interstellar, District 9, and Contact. Close Encounters is the closest encounter with the grammar of PHM, in that its aliens use musical tones to communicate.
Eridians as Musical Instruments
The Eridian language is audio and frequency-based, combining elements of both echolocation and whalesong. Again, these life-forms are totally blind and can’t even conceive of what vision-based sensing is like. Rocky communicates in notes and chords and even short musical sequences.
Eridians have five vocal pipes and can therefore create polyphonic tones. Rocky is basically whistling by pushing Eridian atmosphere gas (their air) made of ammonia and CO2 back and forth past vibrating membranes between internal bladders. Weir in interviews describes these as “vocal chords” but it really isn’t similar to humans at all because Rocky is under turbo amounts of pressure and doesn’t have soft fleshy tissues. Some people might think Rocky is like a pipe organ (blowing air through pipes to produce different tones), but that doesn’t work, because he only has 5 pipes, and in an organ they have fixed frequencies, and he has to cover the whole frequency range. (At the end of the book, Grace actually does use an organ to communicate with the Eridians.)
I spent a lot of time trying to figure out what kind of “instrument” Rocky’s pipes are. Based on the book, he’s most akin to a polyphonic whale. Baleen whales push air from their lungs, past a vibrating larynx into their laryngeal sac, and then they start it over. If we had to analogize it to a human instrument, it would actually be a bagpipe. So Rocky, being a musical instrument himself, is kind of like 5 sets of bagpipes inside a resonant rocky carapace.
There was no pronunciation or inflection of the sounds. Just notes. Like whale song. Except not quite like whale song, because there were several at once. Whale chords, I guess.
Unfortunately, the film mostly discards the musical nature of his language and very quickly gives Rocky a translated human voice. In the film, for the few moments that we listen to him directly, Rocky sounds like a robotic cat purring. (In the audiobook, they try to replicate Rocky’s speech more faithfully and actually use the specific musical tones described in the book.) Apparently for the film the sound designers used a “jug, ocarina, didgeridoo, contralto flute, and contralto clarinet” to craft Rocky’s voice. This is problematic from a music theory perspective because they are mixing different taxonomies of instruments as far as harmonics are concerned. I’ll get into this more later.
In the book, Rocky’s first few words are described explicitly by Weir as musical notes. So we can assume he is creating relatively clean tones in the 20 Hz – 20 kHz frequency band that humans can hear. (Rocky admits he can hear and produce tones outside those frequencies. To him, earthlings must seem practically deaf and mute.)
Eridian Musical Grammar
So we’ve established that Rocky is five bagpipes crammed inside an ammonia rock spider alien. But how does the Eridian musical language convey meaning?
Rocky speaks in polyphony and can create sequences of notes and chords. He has five distinct voices which is enough to play Bach and Chopin, but not Rachmaninoff. He can convey emotion with frequency, dropping a whole octave to show sadness. He conveys emphasis by tripling certain words (“bad bad bad” or “amaze amaze amaze”). Beyond that we don’t get a ton of insight into the Eridian grammar. Cadence can convey humor (jokes have a different rhythm). More basic concepts generally have fewer notes in the chords.
Eridian speech is far more compressed in terms of bits per unit time than human speech, I would estimate by a full order of magnitude, because Eridians have many different variables they can rely on to create meaning. These are the ones that are mentioned or suggested in the book:
Pitch (frequency)
Number of tones
Rhythm
Duration
Chord or tonal sequencing (melody, harmony)
Relative pitch (intervals, chords)
Amplitude (loudness)
Timbre/harmonics (other pitches that are nested within the base frequency, giving the instrument its characteristic sound)
As any music producer will know, these aren’t the only musical variables available to convey semantic content. We also have envelope (attack, decay), modulation (vibrato, tremolo), and directional control (mono, stereo, probably more in the case of the Eridians).
Now to get to the point.
Music is a great choice for Weir’s book because the overarching theme is “universal languages”. Ways that biologically incompatible civilizations lightyears apart could find a way to collaborate. When Grace meets Rocky, he tries to build up a shared semantic grammar by conveying concepts of time, units, and math. These are based on hard laws of physics. It’s plausible that the Eridians built up their own counting system (base 6), their own notion of time and standard units. But what about music?
Music is “universal” in the sense that good music sounds good to most cultures. Without any cultural training, babies recognize rhythm and prefer regular beats to irregular ones. Humans generally prefer consonance (perfect fifth or octave intervals, etc) to dissonance, but this seems to vary with training (musicians prefer it a lot more than average folks, and westerners prefer it more than uncontacted tribes). And music is universal in the sense that most people regardless of national origin will interpret a bass drop as a “release of tension”. It’s totally plausible that the Eridians, who communicate through sound, would be strongly attuned to melodies and harmonies and have an inherent musical sense, thus creating a shared grammar with humans.
So this brings me to a question which has been bothering me nonstop since I read the book:
Why didn’t Andy Weir explain how harmony actually derives from universal laws of physics?
The entire point of the book is to explain how despite our differences, we could develop an understanding with an alien civilization, and to specifically ground these cultural similarities and differences in physical, chemical, and biological realities.
Music is a perfect choice for this because it’s a domain in which meaning is directly derived from physics! But Weir only vaguely hints at it, and doesn’t flesh it out, or even really explain it to readers who are unfamiliar with psychoacoustics (which is almost everyone). In the book, Rocky employs fifths, octaves, and even minor seventh chords. A smart generalist might say “well this totally breaks immersion. Why on earth would Eridians know about western music theory concepts?” What Weir should have done is explain how these are built from basic physics primitives, but he never does. In my opinion, this is a huge oversight, because Weir’s entire schtick is to avoid being handwavy and ground all his claims in mathematical theory.
The one time he sort of does this is in this passage:
My mouth hangs open.
“You’re the only person on that huge ship?!”
He’s quiet for a moment, then says, “♫♩♪♫♩♪♫ ♫♪ ♩ ♪ ♫ ♫♪♫♪♩ ♫♪♩♪ ♫♩ ♪ ♫♩♪ ♫ ♩♪♫♩♪ ♫♩♪ ♫.”
Complete nonsense. Did my kludged-together translation software fail? I check it out. No, it’s working fine. I examine the waveforms. They seem similar to the ones I’d seen before. But they’re lower. Come to think of it, that whole sentence seemed lower in pitch than anything Rocky has ever said before. I select the whole segment in the software’s recording history and bump it up an octave. The octave is a universal thing, not specific to humans. It means doubling the frequency of every note.
This is the only part of the book where Weir hints at the universality of music. He is right. Quick refresher. What we think of as pitch is just frequency. Conventionally, the A above middle C is 440 hertz. This actually means “the sound we think of as A above middle C is defined as a sound wave vibrating 440 times per second”. More vibrations equals higher pitch. When you get to low enough frequencies, you go from hearing pitch (100 Hz) to a very low buzz (80 Hz) to literally “feeling” the bass as it vibrates your chest cavity (40 Hz) to hearing distinct taps (10 Hz).
Now intervals. An 88-key piano contains a range of 27.5 to 4186 Hz over seven and a quarter octaves. Those octaves are subdivided, in western music, into 12 semitones (but you can divide them however you like). In case you can’t imagine what an octave sounds like, think of the first two notes (some…..WHERE) in Somewhere Over The Rainbow.
All an octave is, as Weir says, is a doubling of frequency. On a piano, the As are 27.5 Hz, 55 Hz, 110 Hz, 220 Hz, 440 Hz, 880 Hz and so on. But why do octaves sound “good”? Why might humans and Eridians be interested in octaves specifically as opposed to any other interval?
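As a quick sanity check on the doubling rule, here is a minimal Python sketch. It assumes the standard modern convention (nothing from the book) that piano key 49 is the A at 440 Hz and that each key is 2^(1/12) times the frequency of the one below it. The function name key_frequency is my own.

```python
# Sketch: piano key frequencies, assuming key 49 = A at 440 Hz and
# equal-tempered semitones (each key is 2**(1/12) times the previous one).

def key_frequency(key: int) -> float:
    """Frequency in Hz of piano key `key` (1 = lowest A, 88 = highest C)."""
    return 440.0 * 2.0 ** ((key - 49) / 12)

# Every 12th key starting from key 1 is an A, one octave above the last.
a_keys = range(1, 89, 12)
a_frequencies = [key_frequency(k) for k in a_keys]

# The same list built purely by doubling, starting from the lowest A.
doubled = [27.5 * 2 ** i for i in range(len(a_frequencies))]

for key, freq in zip(a_keys, a_frequencies):
    print(f"key {key:2d}: {freq:8.2f} Hz")

print("lowest key :", round(key_frequency(1), 2), "Hz")
print("highest key:", round(key_frequency(88), 2), "Hz")
```

Both lists come out the same: stepping up twelve keys and doubling the frequency are the same operation.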
People will give you a lot of second-order answers to this question – octaves minimize “beating”, octaves are the second harmonic in the series – but the fundamental reason is simply that a 2:1 frequency ratio is perfectly periodic. The wave cycles of the two notes line up perfectly, with the shorter cycles of the higher note amplifying the slower cycles of the lower half the time. A 2:1 ratio of frequencies is the least confusing auditory pattern our brains can receive. So we find octaves pleasant. They feel “stable” or “simple”.
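One way to make “perfectly periodic” concrete: if two notes stand in a frequency ratio p:q (in lowest terms), the combined waveform repeats after q cycles of the lower note. The sketch below just does that arithmetic. The function name and the 45:32 tritone (one common tuning of that interval) are my own illustrative choices.

```python
from fractions import Fraction

def cycles_until_repeat(ratio: Fraction) -> int:
    """Cycles of the LOWER note before the two-note pattern repeats.

    For a frequency ratio p/q in lowest terms, the higher note completes
    p cycles in exactly the time the lower note completes q cycles.
    """
    return ratio.denominator

for name, ratio in [
    ("octave", Fraction(2, 1)),
    ("perfect fifth", Fraction(3, 2)),
    ("perfect fourth", Fraction(4, 3)),
    ("tritone (one common tuning)", Fraction(45, 32)),
]:
    print(f"{name:28s} {ratio.numerator}:{ratio.denominator} "
          f"repeats every {cycles_until_repeat(ratio)} low-note cycle(s)")
```

The octave repeats on every single cycle of the lower note, which is as simple as a two-note pattern can get short of unison.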
That one’s easy enough. How about all of the other intervals? And how does Rocky know what a minor seventh is?
“Okay,” he finally says. “Name is ♫♩♪♫.”
I don’t need the frequency analyzer anymore. That was an A-below-middle-C major fifth, followed by an E-flat octave, and then a G-minor seventh.
I enter it into my spreadsheet. Though I don’t know why. I haven’t had to look at that thing in days.
“What does it mean?”
“It is name of my mate.”
This is where in reading the book I started to feel pained.
There is actually a good answer as to why Rocky might know what major fifths and sevenths are, and might prefer them in his chordal language.
Remember how human brains like 2:1 frequency ratios, or octaves? If you keep incrementing ratios you get some pretty interesting intervals that you have probably heard of:
2:1: Octave
3:2: Perfect fifth
4:3: Perfect fourth
5:4: Major third
6:5: Minor third
You’ll notice that these intervals are getting closer because the frequency ratios are getting smaller. Why are fourths and fifths “perfect”? Because they are the next simplest frequency ratios after the octave.
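To see the steps shrinking, the ratios above can be converted into cents (1200 per octave, so 100 per piano semitone). This is a small sketch. The A at 220 Hz is just a convenient starting note, and the helper name cents is mine.

```python
import math

INTERVALS = [
    ("octave", 2, 1),
    ("perfect fifth", 3, 2),
    ("perfect fourth", 4, 3),
    ("major third", 5, 4),
    ("minor third", 6, 5),
]

def cents(p: int, q: int) -> float:
    """Size of the interval p:q in cents (1200 cents = one octave)."""
    return 1200 * math.log2(p / q)

root = 220.0  # Hz; an arbitrary reference note (the A below middle C)
for name, p, q in INTERVALS:
    print(f"{p}:{q}  {name:14s} {cents(p, q):7.1f} cents  "
          f"{root:.0f} Hz -> {root * p / q:.1f} Hz")
```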
Grace seems to have perfect pitch, instantly recognizing an “A-below-middle-C major fifth, followed by an E-flat octave, and then a G-minor seventh”. This would be a feat for even a trained musician.
If we strip away the music theory abstraction and just use frequencies, the sequence is as follows:
A major fifth: 220 Hz to 330 Hz, or a 3:2 ratio
E flat octave: 311 Hz to 622 Hz, or a 2:1 ratio
G minor 7th: 196 Hz, 235.2 Hz, 294 Hz, 352.8 Hz, or a 10:12:15:18 ratio
So the name of Rocky’s partner, Adrian, in Eridianese is simply:
3:2 frequency ratio, 2:1 frequency ratio, and then 10:12:15:18 frequency ratio.
No human western music theory overlay required.
The actual musical sequence isn’t … that nice sounding. Here’s my interpretation on the keyboard:
It sounds a little discordant because you are jumping from A down to E flat, and then resolving to G minor 7th. These three chords are not in a shared key, and the jump from A to E flat is a tritone, which is considered one of the most discordant intervals in western music.
But who am I to begrudge Eridian linguistic taste?
Why do integer ratios lead to consonance?
So I think we can accept that frequency ratios could be the basic building blocks of a universal harmonic language shared by both humans and Eridians. I think I have done a reasonable job at convincing you that we didn’t pick the intervals that provide the building blocks for western music out of thin air, but rather that they are grounded in physics.
In fact, these ideas originate with the ancient Greeks going all the way back to Pythagoras. He figured out that when you cut a string in half and pluck it, the frequency is exactly an octave higher – QED. For him, numbers were the underlying substance of reality, and harmony was part of that since it was reducible to simple ratios. The Greeks felt that the planets even moved according to mathematical ratios that formed a kind of cosmic music.
But this still leaves something unanswered. Discarding the metaphysical claims, why is it that we feel that frequency ratios of 2:1, 3:2, and 4:3 sound good?
What is it about whole integer ratios of vibration frequencies that is pleasing to the ear? The Pythagorean view that “well it should just be the case that simpler is better and more harmonious” isn’t sufficiently explanatory to me.
Enter psychoacoustics.
If you’ve ever sung in a choir, you probably know that when two people sing notes that are close but not identical, you can hear a kind of wavering in the tone. It sounds like a wub wub wub. It goes slower if the notes are very close together and faster if they are a little further apart. (See here for a quick demo.)
This is called beating. Singers actually use it to carefully tune themselves a few cents sharp or flat. Once you stop hearing the beating you know that you are in perfect unison with your co-choir member. If you sing two A notes, one at 440 Hz and one a little sharp at 442 Hz, you will hear a slow beating at 2 Hz, or two cycles per second.
Formally, people like Galileo and Mersenne figured out that beating is interference between two close frequencies, causing variable amplitude due to alternating cycles of destructive and constructive interference. So two frequencies that are close but not identical combine to create a slow pulsation, which cycles at the rate of the higher frequency minus the lower one. When frequencies are very close together you can hear clear slow beating, but as they get further apart, beating moves into the 10-20 Hz range and decomposes into “roughness” or general dissonance.
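The arithmetic behind beating is a standard trigonometric identity: adding two sine waves at f1 and f2 gives a tone at their average frequency whose loudness swells and fades |f1 - f2| times per second. Here is a minimal sketch that checks this numerically for the 440 Hz and 442 Hz example. The function names are mine, and the two tones are assumed to be pure sine waves of equal loudness.

```python
import math

def beat_frequency(f1: float, f2: float) -> float:
    """Beats per second heard when two close frequencies sound together."""
    return abs(f1 - f2)

def two_tones(f1: float, f2: float, t: float) -> float:
    """Sum of two equal-amplitude sine waves at time t (seconds)."""
    return math.sin(2 * math.pi * f1 * t) + math.sin(2 * math.pi * f2 * t)

def carrier_times_envelope(f1: float, f2: float, t: float) -> float:
    """The same signal written as a fast tone times a slow envelope.

    sin(a) + sin(b) = 2 * cos((a - b) / 2) * sin((a + b) / 2)
    """
    envelope = 2 * math.cos(math.pi * (f1 - f2) * t)  # slow swell and fade
    carrier = math.sin(math.pi * (f1 + f2) * t)       # tone at the average pitch
    return envelope * carrier

print(beat_frequency(440, 442), "beats per second")
for t in (0.0, 0.1, 0.25, 0.37):
    print(t, round(two_tones(440, 442, t), 6),
          round(carrier_times_envelope(440, 442, t), 6))
```

The loudness follows the absolute value of the envelope, which is why a 2 Hz difference is heard as two pulses per second.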
In 1877, Hermann von Helmholtz decided to test whether people’s own subjective view of the consonance or dissonance between two notes matched the Pythagorean theory. He had people listen to pairs of frequencies and asked them whether the intervals sounded consonant or dissonant. His observations produced this graph.
You can see that there are peaks and troughs exactly where we’d expect them to be following the integer ratio theory. The unison is highly consonant, and then nearby tones are highly dissonant as we get beating. Then you have troughs at the round number ratios like the perfect fifth and the octave.
Helmholtz’ theory, now generally accepted, was actually that the simple ratios worked because they minimized the beating between the notes’ overtones.
Wait, what are overtones?
Well, as it turns out, when you play a note on virtually any instrument – especially strings or air columns like the flute or the human voice – you get a whole stack of notes, nested inside the one note. That’s right, unless you are creating a pure sine wave in a synthesizer, every note you have ever played contains the frequencies of a whole bunch of other notes. And those additional frequencies, called overtones or harmonics, are what give the instrument its specific character, called timbre (annoyingly pronounced tam-ber).
If a note has frequency f its overtones are 2f, 3f, 4f, and so on.
So when you play C2 on the piano, you get (in theory):
65.41 Hz (C2), plus 130.81 Hz (C – octave), 196.22 Hz (G – fifth), 261.63 Hz (C – octave), 327.03 Hz (E – major third), and so on
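The stack is easy to generate. A sketch, assuming C2 sits 33 equal-tempered semitones below the A at 440 Hz (which gives about 65.41 Hz). The note names in the labels are the nearest piano notes, not exact matches, and the function name is mine.

```python
def harmonic_series(fundamental: float, count: int) -> list[float]:
    """First `count` partials of an ideal harmonic tone: f, 2f, 3f, ..."""
    return [fundamental * n for n in range(1, count + 1)]

# Assumption: equal temperament with A = 440 Hz; C2 is 33 semitones below it.
c2 = 440.0 * 2.0 ** (-33 / 12)

labels = ["C2 (fundamental)", "C3 (octave)", "G3 (fifth)",
          "C4 (octave)", "E4 (major third)"]
for label, freq in zip(labels, harmonic_series(c2, 5)):
    print(f"{freq:7.2f} Hz  {label}")
```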
Unless you are listening carefully you won’t hear these additional notes, but they are there creating the shape or the form of the sound. The overtones are the “pianoness” or “fluteness” of the sound. Overtones determine whether an instrument sounds bright or dark, warm or metallic, or smooth or buzzy.
And overtones, as it turns out, are the mechanism through which integer ratios create consonance.
This is what Helmholtz figured out. Every note is a stack of frequencies, defined by the harmonic series. When frequencies are close, they pulse unpleasantly (beating). Consonance is simply a function of how much beating occurs between the overtones of two notes. Pythagoras didn’t have it quite right. Harmony is not a function of the frequency ratios of the fundamental tones in an interval. It’s a function of the absence of dissonance between their overtones.
Putting it all together, you get this chart, a more precise version of the dissonance chart from above. It maps the perceived dissonance as you play one tone C and sweep up an octave with a second tone. You can see the dissonance troughs at E (major third), F (perfect fourth), G (perfect fifth), and C again.
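For readers who want to reproduce the shape of a curve like that, here is a rough sketch. It does not use Helmholtz’s own calculation. It swaps in a parameterised roughness curve in the style published by William Sethares (a fit to Plomp and Levelt’s listening data), and it assumes a made-up instrument with six harmonic partials whose amplitudes fall off by a factor of 0.88 each. All of those choices are my assumptions, and the absolute numbers mean little. Only the location of the dips matters.

```python
import math

def pair_roughness(f1, a1, f2, a2):
    """Roughness of two pure tones (Sethares-style fit to Plomp-Levelt data)."""
    low, high = min(f1, f2), max(f1, f2)
    s = 0.24 / (0.0207 * low + 18.96)      # scales the curve with register
    x = s * (high - low)
    return a1 * a2 * (math.exp(-3.51 * x) - math.exp(-5.75 * x))

def partials(fundamental, count=6, rolloff=0.88):
    """(frequency, amplitude) pairs for an idealised harmonic tone."""
    return [(fundamental * n, rolloff ** (n - 1)) for n in range(1, count + 1)]

def dissonance(ratio, base=261.63):
    """Total roughness when `base` and `base * ratio` sound together."""
    tones = partials(base) + partials(base * ratio)
    return sum(pair_roughness(f1, a1, f2, a2)
               for i, (f1, a1) in enumerate(tones)
               for (f2, a2) in tones[i + 1:])

# Sweep the second note from unison to the octave and report the dips.
ratios = [1 + i / 1000 for i in range(1001)]
curve = [dissonance(r) for r in ratios]
dips = [ratios[i] for i in range(1, 1000)
        if curve[i] < curve[i - 1] and curve[i] < curve[i + 1]]
print("local minima near ratios:", [round(r, 3) for r in dips])
```

Changing count or rolloff in partials changes the timbre, and with it how deep each dip is, which is the overtone theory in miniature.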
You can actually verify the overtones theory as opposed to the simpler Pythagorean approach for yourself with this quick demo.
Would the Eridians use western harmony?
So… back to our alien mineralized spider friends. What do they make of all this? Would they also appreciate what we think of as harmony?
Physically speaking, they experience pressure waves in a gaseous medium. They are highly perceptive so they would certainly be able to detect beating through close frequencies. Helmholtz’ mechanism should apply to them too.
They would also most likely notice the stability inherent in the ratios that gives us octaves, fifths and thirds, not because they have any preference for the numbers 2:1 or 3:2 but because those ratios minimize interference between their component frequencies.
“But wait,” the informed reader might ask. “Not all instruments are harmonic emitters like resonant strings (pianos, violins) or pipes (human voices, flutes, oboes). Some instruments have overtones which are not integer multiples of the fundamental, like bells, gongs, cymbals, or marimbas. 1D systems like strings or pipes are harmonic, but 2D or 3D systems tend to be inharmonic! Rocky is 3D!”
This is an astute observation. If Rocky and his Eridian friends were bell or gong-like in how they produced sound, they would not have neat islands of consonance around which they could craft a familiar harmony. If they were not harmonic emitters they might never come to appreciate integer ratios because it wouldn’t come naturally to them (unless they took the time to build Eridian violins and pianos). Human harmony, by contrast, is naturally imbued with Pythagoras’ ratios because our emitters are 1-dimensional tubes of air.
But the way Weir describes Rocky makes it explicitly clear that Eridians are harmonic instruments. Rocky produces sound by pushing gas past vibrating membranes inside a vocal pipe. We know this can be done at pressure because whales do it too. (And whales are harmonic too, in case you were wondering.) Eridians are mineralized, polyphonic whales. And we also know Eridians emit harmonic overtones because Rocky produces an octave, a major fifth, and a minor seventh!
But Rocky wouldn’t be completely beholden to western harmony. For one, he wouldn’t use 12-tone equal temperament (12-TET) as we do. Though I don’t have time to get into it fully here, 12-TET is the conventional western tuning system. We divide the octave into 12 equal logarithmic steps: 12 semitones. We do not do this for any fundamental physics reason. We can divide the octave arbitrarily. We needed a tuning system that gives us the “natural-sounding” intervals like the fifth and the third that conform to the ratios we discussed. A division into 12 doesn’t hit the ratios perfectly, but it’s close enough for us to not notice the differences.
Before we had 12-TET we would play or sing in “just intonation” (mathematically ideal intervals), but these only work in one key, and don’t allow you to modulate to other keys. Because complex music involves moving between keys we needed something that worked in all keys; hence we found a division that sounded close enough to the frequency ratios while working identically in every key. The kicker is that you can divide the octave into more parts to get closer to the mathematically ideal ratios, which is why we have 19-EDO and 31-EDO, but these are too complicated. Twelve was the smallest number of notes that also gave us reasonable approximations for what a fifth, a fourth, and a third should sound like.
And not all humans use 12-TET anyway. For instance, in the Middle East you have the maqam system, which uses quarter tones, giving musicians access to more notes in the scale and new harmonies. The only western rock band that has ever written a decent album in maqam tunings is King Gizzard & the Lizard Wizard. This is to say that tunings are definitely arbitrary.
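How close is “close enough”? Here is a small sketch that measures, in cents, how far the nearest step of an equal division of the octave lands from the ideal ratios. The helper name is mine, and the choice of which intervals and which divisions to compare is just for illustration.

```python
import math

JUST = {"perfect fifth": 3 / 2, "perfect fourth": 4 / 3,
        "major third": 5 / 4, "minor third": 6 / 5}

def nearest_step_error(ratio: float, divisions: int) -> tuple[int, float]:
    """Closest step of an equal division of the octave to `ratio`.

    Returns (step number, error in cents); positive error = tempered note is sharp.
    """
    target = 1200 * math.log2(ratio)     # ideal size in cents
    step_size = 1200 / divisions
    step = round(target / step_size)
    return step, step * step_size - target

for divisions in (12, 19, 31):
    print(f"{divisions} equal steps per octave:")
    for name, ratio in JUST.items():
        step, error = nearest_step_error(ratio, divisions)
        print(f"  {name:14s} step {step:2d}  error {error:+6.2f} cents")
```

With 12 steps the fifth lands within about 2 cents of 3:2, while the major third comes out nearly 14 cents sharp of 5:4, which is the compromise a barbershop quartet quietly corrects by ear.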
So Rocky and the Eridians would most likely not be using 12-TET tuning like we do. In the book it is mentioned that they get together and do big “thrumming” sessions where they discuss things collectively through harmony. But I am assuming in these cases they are using just intonation and eliminating beating by pitching up or down a few cents. This is what barbershop quartets do. Not to mention, the Eridians have incredible memory and cognitive faculties and the ability to do mental math, so they could probably use 351-TET or whatever if they had to agree on a system. There would be no need for them to vastly simplify and compress the frequency space into a mere 12 semitones as we do.
Birds are a good case study here. We know that birds can detect beating. Certain birds are aware of and prefer the harmonic series. The Hermit Thrush, for instance, specifically chooses notes from the harmonic series (harmonics 3 through 12). A study of Great Tits found that individuals that were most able to precisely sing “in tune” along the harmonic series exhibited better sexual fitness and status. But of course, as with the Eridians, birds are blissfully unaware of the constraints of fixed tuned instruments that led us to develop 12-TET.
So we would still have marginal idiosyncratic differences, but by and large I do think it’s reasonable for Andy Weir to assign the Eridians a shared understanding of harmony. Though harmony seems extremely cultural, it derives ultimately from basic properties of physics. Harmony is not a human invention but a discovery. The octave, the fifth, and the third are not arbitrary conventions but local maxima, places where the physics of vibrating systems produce unusually simple, low-interference patterns. The Eridians would be drawn to these regions for the same underlying reasons we are.
And this is why it feels like a missed opportunity.
Project Hail Mary is a story about building a shared language across an unimaginable biological divide, and Weir does an admirable job of grounding these speculative elements in math and science. Music is one domain in which convergence would be plausible and perhaps inevitable, but Weir only ever hints at it by having Rocky speak in fifths and sevenths. We are never shown why that overlap exists. He gives us the answer without showing the work. In a novel built on first principles, that’s the one place where handwaving feels the least justified and most conspicuous.