Years ago I discovered Jim Bumgardner’s “Whitney Music Box”. Recently I recreated it in Elm.
TL;DR: check it out.
I’ve always been fascinated by the structure behind music. I’m fascinated with the structure behind any performance or experience, really. Going into college I planned to double-major in math and music, but one fateful night I printed “Hello, World” to the BlueJ console for CS 101 and realized almost immediately that something had to make room.
Most of my exploration into the structure behind music took the form of studying music theory, but there’s so much rigorous math behind sound that continues to capture my attention and imagination. I remember spending a lot of time wondering about the overtone series. Why does it all fall into place like that? Why do we experience those particular mechanical waves, arranged in the way they are, in the way that we do?
Really, I mean, what the heck is music?
I don’t remember exactly when, but sometime in college I stumbled upon this link to something called the “Whitney Music Box,” by Jim Bumgardner. I stared at it for hours, fueled by my existing academic interests (and, perhaps, some collegiate substances). It made me think of the overtone series, about that mysterious link between mathematical structure and musical experience.
Years later, it popped up on my radar again and rekindled a lot of these college-era thoughts. I decided re-creating it would be a fun, enlightening, and perhaps nostalgic little side project.
I’ve always found that one of the best ways to understand something, especially the structure behind some experience, is to try to create it myself. So I did!
I decided to build it in Elm because I’m always looking for more concrete Functional Programming experience. If you’re curious, here’s a link to the repo.
Diving into the AudioContext documentation surfaced a deeper technical challenge: I wasn’t familiar with how synthesizers and oscillators work in general. That is exactly what AudioContext provides an API for (if you want to design the tone programmatically rather than play audio files), and the documentation assumes you already know it.
I spent a few relatively confusing days looking around for answers. Here is the working knowledge I consolidated.
We experience sound when mechanical sound waves in the air hit our eardrums. Different instruments produce sound waves with different shapes, which causes them to sound different (“timbre”, pronounced “tamber”). The simplest sound wave is the smooth sine wave, which produces a very pure, synth-y sound. Real-world instruments have all sorts of squiggles in their sound waves, corresponding to louder or softer overtones. An oscillator synthesizer can take the description of a periodic (repeating) wave and produce sound from it. If we give it a wave that looks a lot like, say, the wave a flute makes, the produced sound will sound a lot like a flute (though real-world instruments’ waves constantly change, which means they aren’t exactly periodic, and in general our ears can tell something is synthetic about a purely periodic wave).
So how do we programmatically describe arbitrarily-squiggly waves? The answer lies in the Fourier Theorem, which states that any periodic function (no matter how squiggly) can be approximated by combining some set of sine and cosine waves. This combination is called a Fourier Series: each sine and cosine wave sits at a whole-number multiple of the base frequency, and a weight attached to each one specifies how much of that component wave to include. The AudioContext oscillator API takes in these weights and produces the tone that results from combining them with the corresponding sine and cosine terms.
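To make the idea of "weights attached to sine and cosine waves" concrete, here is a minimal Python sketch. It is not code from the music box; the function names `fourier_sample` and `square_wave_weights` are made up for illustration. It uses the standard textbook weights for a square wave (4 / (πn) on the odd sine terms) to show a very un-sine-like shape emerging from nothing but sines.

```python
import math

def fourier_sample(cos_weights, sin_weights, t, frequency=1.0):
    """Evaluate a weighted sum of cosine and sine waves at time t (seconds).

    Index n of each list weights the wave at n times the base frequency;
    index 0 of cos_weights acts as a constant offset.
    """
    total = 0.0
    for n, (a, b) in enumerate(zip(cos_weights, sin_weights)):
        angle = 2 * math.pi * n * frequency * t
        total += a * math.cos(angle) + b * math.sin(angle)
    return total

def square_wave_weights(harmonics):
    """Textbook weights for a square wave: only odd sine terms, 4 / (pi * n)."""
    cos_weights = [0.0] * (harmonics + 1)
    sin_weights = [0.0] * (harmonics + 1)
    for n in range(1, harmonics + 1, 2):
        sin_weights[n] = 4 / (math.pi * n)
    return cos_weights, sin_weights

cos_w, sin_w = square_wave_weights(99)

# Sample the middle of each half of one period (base frequency 1 Hz).
high = fourier_sample(cos_w, sin_w, 0.25)  # close to +1
low = fourier_sample(cos_w, sin_w, 0.75)   # close to -1
```

Adding more harmonics pulls the flat parts closer to +1 and -1, which is the "approximated by" part of the theorem in action.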
I wasn’t quite wrapping my head around this until I found an excellent YouTube series about it, specifically this video and the one after that, so if you’re curious about this stuff I’d highly recommend giving these videos a watch.
Currently the sine terms are static, set to [0, 0, 1, 0, 1]. The first value isn’t actually part of the Fourier series; it sits in the slot the AudioContext API uses for the DC offset, which we ignore here by setting it to 0. The remaining values specify the amplitudes of the fundamental tone and the overtones, in order. So this set of terms specifies a tone with a muted fundamental tone and second overtone, and with the first and third overtones sounding at full weight. The result is the timbre of the sound you hear when the dots cross the line.
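As a rough illustration of how those five numbers map onto actual frequencies, here is a small Python sketch. The 220 Hz fundamental is an arbitrary assumption for the example, the helper names are hypothetical, and the sketch ignores any rescaling the browser may apply to the waveform.

```python
import math

SINE_TERMS = [0, 0, 1, 0, 1]  # from the post: index 0 unused, index 1 = fundamental

def describe_partials(sine_terms, fundamental_hz):
    """List (frequency in Hz, weight) for every partial with a non-zero weight."""
    return [
        (n * fundamental_hz, weight)
        for n, weight in enumerate(sine_terms)
        if n > 0 and weight != 0
    ]

def tone_sample(sine_terms, fundamental_hz, t):
    """Sum the weighted sine partials at time t (seconds)."""
    return sum(
        weight * math.sin(2 * math.pi * n * fundamental_hz * t)
        for n, weight in enumerate(sine_terms)
    )

# Assumed fundamental of 220 Hz, chosen only for illustration.
partials = describe_partials(SINE_TERMS, 220.0)
sample = tone_sample(SINE_TERMS, 220.0, 0.0003)
```

Here `partials` contains only the 440 Hz and 880 Hz components, the first and third overtones of 220 Hz. One thing this makes visible: with the fundamental muted and only those two partials present, the waveform repeats every 1/440 of a second, so the tone would likely be heard an octave above the oscillator's nominal frequency.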
In the future, I’m considering making those sine terms easily configurable in the UI so people can play around with changing the tone played in the music box. But that might not be this project anymore… that might be a graphical oscillator interface. We’ll see.
Thank you for reading!