created 2025-03-31, & modified, =this.modified
rel: 33 1-3 Koji Kondos Super Mario Soundtrack Reading Sounds - Captioned Media and Popular Culture by Sean Zdenek
Why I’m reading
Mentioned in Reading Sounds - Captioned Media and Popular Culture by Sean Zdenek.
Games are interactive, and that relationship is intriguing to explore with music. It’s a different type of challenge, because in a film you know precisely when each scene transitions and how long it lasts.
But this challenge also can, if the sound developer wishes, afford a dynamic and expressive experience that is also interactive.
This book is also from 2008, so it could be dated in a revealing way.
Introduction
Video games require radical revisions of older theories and approaches to sound in media.
On films it was written (and is now applicable to games):
whatever we do in our attempts to theorize, we need to welcome all the available sources of information, from all available perspectives, tainted or not, and try to put them in balance.
Games specifically use interactivity and nonlinearity. Interactivity is critiqued: “All classical, and even more so modern, art is interactive in a number of ways. Ellipses in literary narration, missing details of objects in visual art, and other representational shortcuts require the user to fill in the missing information.”
Unlike the consumption of many other forms of media in which the audience is a more passive ‘‘receiver’’ of a sound signal, game players play an active role in the triggering of sound events in the game (including dialogue, ambient sounds, sound effects, and even musical events).
Main Points
- Understanding how and why games are different from or similar to film or other linear audiovisual media in terms of the needs of audio production and consumption is useful to our understanding of game audio in general.
- Technology and the constraints it has placed on the production of game audio throughout history. The relationship between tech and aesthetics in video games is one of mutual influence.
The Rise of V.Games
Among the novelty games that preceded video games was bagatelle, a kind of bumper billiards which developed into pinball. Pinball games included various bells and buzzers that served to attract players and generate excitement.
Slot machines are also relevant here. Early slot machines (1907) rang a bell on a winning combination. Many of the same companies that were developing pinball machines also made slots. During the Prohibition many of these companies would become the first to market electronic video arcade games.
Sound was a key factor in generating the feeling of success.
The earliest electronic games, Tennis for Two (1958) and Spacewar! (1962), had no sound. The first mass-produced arcade computer game, Computer Space (1971), included a series of different sounds.
Pong was immediately successful and generated countless variants.
Sounds at this time were not fully an aesthetic decision, but were the result of the limited capabilities of the technology.
The balance was between making things that sounded good and making things that fit.
Space Invaders set an early precedent for continuous music.
‘Most music and sound in the arcade era (Donkey Kong and Mario Brothers) was designed little by little, by combining transistors, condensers, and resistance. And sometimes, music and sound were even created directly into the CPU port by writing 1s and 0s, and outputting the wave that becomes sound at the end. In the era when ROM capacities were only 1K or 2K, you had to create all the tools by yourself. The switches that manifest addresses and data were placed side by side, so you have to write something like ‘1, 0, 0, 0, 1’ literally by hand’
When referring to game processors (16-bit) the bits describe how much data the computer’s main processor can manipulate simultaneously.
Frogger was one of the first games to incorporate dynamic music.
Hip Tanaka:
The sound for games used to be regarded just as an effect, but I think it was around the time Metroid was in development when the sound started gaining more respect and began to be properly called game music… . Then, sound designers in many studios started to compete with each other by creating upbeat melodies for game music. The pop-like, lilting tunes were everywhere. The industry was delighted, but on the contrary, I wasn’t happy with the trend, because those melodies weren’t necessarily matched with the tastes and atmospheres that the games originally had. The sound design for Metroid was, therefore, intended to be the antithesis for that trend. I had a concept that the music for Metroid should be created not as game music, but as music the players feel as if they were encountering a living creature. I wanted to create the sound without any distinctions between music and sound effects… . As you know, the melody in Metroid is only used at the ending after you killed the Mother Brain. That’s because I wanted only a winner to have a catharsis at the maximum level. For this reason, I decided that melodies would be eliminated during the gameplay. By melody here I mean something that someone can sing or hum.
NOTE
Reminds me of Elden Ring OST, with this tonal ambiance sound that rubbed some players the wrong way at first.
Summary of Sound Synthesis
rel:
Granular Synthesis
- Programmable Sound Generators
    - sound chips designed for audio applications. Instrument sounds are typically created with both a waveform (tone generator) and an envelope generator
- Subtractive Synthesis
    - common in PSGs. A waveform is created by an oscillator, and a filter attenuates (subtracts) specific frequencies. The signal is then passed to the amp to control the amplitude and envelope.
- Frequency Modulation (FM)
    - 16-bit era advancement. It uses a modulating (usually sine) wave signal to change the pitch of another wave (known as the carrier). There can be many oscillators for each sound or instrument.
- Wavetable
    - A more realistic sound than FM synthesis, but more expensive, requiring its own RAM or ROM.
    - What the human ear recognizes most of each sound is the attack transient.
- Granular Synthesis
    - Based on the principle of microsound, hundreds or thousands of small grains of sound are mixed together to create an amorphous atmosphere.
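The FM and envelope ideas above can be sketched in a few lines. This is a minimal illustration, not how any particular sound chip worked: the sample rate, the function names `fm_samples` and `linear_envelope`, and the simple ramp-up/ramp-down envelope are all assumptions made for the example. It uses the common two-operator form, where a sine modulator shifts the phase of a sine carrier.

```python
import math

SAMPLE_RATE = 44_100  # samples per second (assumed, not from the notes)


def fm_samples(carrier_hz, modulator_hz, index, seconds, sample_rate=SAMPLE_RATE):
    """Return samples in [-1, 1] for a two-operator FM tone.

    The sine modulator pushes the carrier's phase back and forth;
    `index` sets how strongly, and so how bright the tone sounds.
    """
    count = int(seconds * sample_rate)
    samples = []
    for n in range(count):
        t = n / sample_rate
        modulator = math.sin(2 * math.pi * modulator_hz * t)
        samples.append(math.sin(2 * math.pi * carrier_hz * t + index * modulator))
    return samples


def linear_envelope(samples, attack_fraction=0.1):
    """Shape amplitude over time: ramp up during the attack, then ramp down."""
    count = len(samples)
    attack = max(1, int(count * attack_fraction))
    shaped = []
    for n, s in enumerate(samples):
        if n < attack:
            gain = n / attack
        else:
            gain = 1 - (n - attack) / max(1, count - attack)
        shaped.append(s * gain)
    return shaped


# Half a second of a 440 Hz carrier modulated at 220 Hz.
tone = linear_envelope(fm_samples(440.0, 220.0, index=2.0, seconds=0.5))
```

With `index=0` the modulator has no effect and the result is a plain sine wave, which is a quick way to hear what the modulation adds.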
Sound Waves
Sound waves are described by three properties: wavelength, frequency and amplitude. A fourth, velocity, follows from two of them: velocity = wavelength × frequency.
- Sine waves have only one frequency; they are pure and have no harmonics. In games, they are often used for effects (laser, alarm) or for flute-like melodic parts.
- Sawtooth waves, or ramp waves, have both odd and even harmonics. They suit bass parts because they produce a warm, round sound.
- Pulse waves alternate between on and off states; the proportion of each cycle spent on is known as the duty cycle. A square wave (50% duty cycle) contains only odd harmonics. They are often described as hollow sounding.
- Triangle waves contain only odd harmonics, like square waves, but the harmonics of a triangle wave fall off faster, so the sound is smoother.
- Noise contains every frequency within a range. It is also commonly used for percussion or rain sounds.
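As a rough sketch of the waveforms listed above, each can be written as a function of phase (position within one cycle, measured in cycles). The function names, the -1 to 1 output range, and the `render` helper are assumptions for illustration; real chips generate these shapes in hardware and at much lower resolution.

```python
import math
import random


def sine(phase):
    """Pure tone: a single frequency, no harmonics."""
    return math.sin(2 * math.pi * phase)


def sawtooth(phase):
    """Ramp from -1 up to 1 once per cycle."""
    return 2.0 * (phase % 1.0) - 1.0


def pulse(phase, duty=0.5):
    """High for the first `duty` fraction of each cycle, low for the rest."""
    return 1.0 if (phase % 1.0) < duty else -1.0


def triangle(phase):
    """Rise from -1 to 1 over the first half cycle, then fall back."""
    p = phase % 1.0
    return 4.0 * p - 1.0 if p < 0.5 else 3.0 - 4.0 * p


def noise(rng=random):
    """One sample of uniform white noise."""
    return rng.uniform(-1.0, 1.0)


def render(wave, frequency_hz, seconds, sample_rate=44_100):
    """Sample `wave` (a function of phase) at the given frequency."""
    count = int(seconds * sample_rate)
    return [wave(frequency_hz * n / sample_rate) for n in range(count)]


# A quarter-second square-ish lead at 440 Hz with a narrow 25% duty cycle.
lead = render(lambda p: pulse(p, duty=0.25), 440.0, 0.25)
```

Changing `duty` on the pulse wave is the cheapest way to vary timbre here: 0.5 gives the hollow square sound, while narrower values sound thinner and more nasal.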
DSP - Digital Signal Processing
- Echo - delayed signals 50ms or above
- Reverb - simulates sound waves reflecting off solid surfaces, typically with shorter delays than echo
- Chorus - a delayed sound added to the original with constant delay. Choruses may have more delayed sounds with multiple voices and layers.
- Time stretch - adjusting the speed of a sound without adjusting the pitch
- Compression - reducing the dynamic range of a sound. Compressors make loud sounds quieter and quiet sounds louder.
- EQ - attenuating (reducing or eliminating) or amplifying various frequency bands
- Filtering - specific frequency ranges can be emphasized or attenuated.
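Two of the effects above are simple enough to sketch directly on a list of samples. This is an illustrative sketch only: the function names, default values, and the hard-knee compressor curve are assumptions, and production effects are considerably more refined.

```python
def add_echo(samples, delay_ms=250.0, decay=0.5, sample_rate=44_100):
    """Mix in a quieter copy of the signal, delayed by `delay_ms`."""
    delay = int(sample_rate * delay_ms / 1000.0)
    out = list(samples) + [0.0] * delay  # leave room for the echo tail
    for n, s in enumerate(samples):
        out[n + delay] += s * decay
    return out


def compress(samples, threshold=0.5, ratio=4.0, makeup=1.0):
    """Reduce level above `threshold` by `ratio`, then apply makeup gain.

    Squashing the peaks and raising `makeup` above 1.0 is what makes
    loud sounds quieter and quiet sounds louder relative to each other.
    """
    out = []
    for s in samples:
        level = abs(s)
        if level > threshold:
            level = threshold + (level - threshold) / ratio
        sign = 1.0 if s >= 0 else -1.0
        out.append(sign * level * makeup)
    return out
```

For example, `compress(add_echo(samples), makeup=1.5)` would add a quarter-second echo and then narrow the dynamic range of the result.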
NES
The NES had a five-channel sound chip:
- two pulse wave channels, each with an 8-octave range
- a triangle wave channel, pitched 1 octave lower
- a noise channel for white-noise percussion
- a delta modulation channel (DMC) with two methods for sampling
    - pulse code modulation for speech
    - direct memory access for sound fx
Computers
The original Apple II had a one-channel beeper for warnings and errors. Later versions improved the sound capabilities and allowed third-party sound cards (which became difficult to program for, since different versions of the code had to be included to accommodate each sound card).
Classical music was common in earlier games, like C64 games. There was also mingling of popular and classic songs.
Cover tunes seemed largely the result of the whims of the games’ producers, and there was little or no concern for copyright infringements.
Certain games mirrored the Nintendo style of music, with longer overworld music and short boss loops. This approach to game audio suggests that what had developed even in the 8-bit era was as much a result of aesthetic as it was technology.
16-Bit and the Death of the Arcade
One of the major advances of the 16-bit era was FM synthesis.
Some companies such as Nintendo and Konami used custom-made speech chips for their arcade games. Walking around the arcade in the late 1980s, the machines would literally call out to the players, begging to be played.
Nintendo and Sega: The Home Console Wars
The PC Engine (TurboGrafx-16) was the first 16-bit (actually a dual 8-bit) home console.
Sega’s coin-op arcade games were key to the console’s success. They understood the importance of software to drive hardware sales. Popular arcade games were ported to the Genesis.
Stratovox (1980) is the first game with speech synthesis. Among the voices the player hears are the phrases “Help me, help me”, “Very good!”, “We’ll be back”, and “Lucky”. The phrase “Help me” is played during attract mode. Around this time Berzerk also used speech synthesis in game.
It could be argued that in the Genesis, despite the increased quality of sounds (in terms of realistic instruments) there was little change from the music of 8-bit games. But sound effects became more realistic sounding and vocal samples were far in advance of their predecessors.
ToeJam & Earl in Panic on Funkotron (1993) had moments where players “jam out” to the percussion in a simple Simon-type game.
One of the most distinguishing features of the Sega Genesis audio was the adoption of progressive-rock stylistic traits. The chip could mimic common progressive rock instruments.
It was a common element in some game music of the 16-bit era and beyond to avoid anything too ‘‘catchy’’ that might become annoying after many repetitions, in favor of various smaller melodic riffs which, collectively, could often be played like a longer epic soundtrack, with each tune thematically and instrumentally tied to each other.
In Sunsoft’s Pirates of Dark Water each distinct level had a different theme, but all music shared one bass line with a couple of melodic variations.
One final element worth noting is the use of modal harmony, exotic modes and chromaticism.
MIDI
MIDI was designed in 1983 to allow musical devices to be compatible in a standardized format. Only code (instructions such as which note to play) was transmitted rather than actual sounds. The later General MIDI standard laid out a template for 128 instruments and sound effects, so that the same number setting would be the same on any compatible device.
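To make "only code was transmitted" concrete, here is a small sketch that builds raw MIDI channel messages as bytes. The status bytes (0x90 Note On, 0x80 Note Off, 0xC0 Program Change) come from the MIDI 1.0 specification; the helper function names are made up for this example.

```python
def note_on(channel, note, velocity):
    """Build a 3-byte Note On message (channel 0-15, note and velocity 0-127)."""
    if not (0 <= channel <= 15 and 0 <= note <= 127 and 0 <= velocity <= 127):
        raise ValueError("channel must be 0-15; note and velocity 0-127")
    return bytes([0x90 | channel, note, velocity])


def note_off(channel, note):
    """Build a 3-byte Note Off message with release velocity 0."""
    if not (0 <= channel <= 15 and 0 <= note <= 127):
        raise ValueError("channel must be 0-15; note 0-127")
    return bytes([0x80 | channel, note, 0])


def program_change(channel, program):
    """Build a 2-byte message selecting one of the 128 instrument slots (0-127)."""
    if not (0 <= channel <= 15 and 0 <= program <= 127):
        raise ValueError("channel must be 0-15; program 0-127")
    return bytes([0xC0 | channel, program])


# Select instrument slot 0 on channel 0, then play and release middle C (note 60).
messages = [program_change(0, 0), note_on(0, 60, 100), note_off(0, 60)]
```

Three bytes describe a whole note event, which is why MIDI scores were so much smaller than recorded audio; the trade-off is that the receiving device decides what the instrument actually sounds like.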
LucasArts’ iMUSE (Interactive Music Streaming Engine) stopped the abrupt cuts seen in previous game music. Transitions could be mapped for things like room changes. For example, a defeat sequence could be cued when the fight was lost.
VGM Comes of Age (32-64bit+, 90s)
CDs
CD-ROM tech meant that audio was no longer reliant on sound card synthesis. The downside was that discs could only hold a maximum of 74 minutes of uncompressed Red Book audio.
RedBook Spec
- minimum track duration is 4 seconds (including a 2-second pause)
- maximum number of tracks is 99
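A quick calculation shows why uncompressed CD audio fills a disc so fast. The figures used here (44.1 kHz sample rate, 16-bit samples, two channels) are the standard Red Book audio parameters and are not stated in the notes above; the function name is just for illustration.

```python
SAMPLE_RATE_HZ = 44_100  # Red Book sample rate
BYTES_PER_SAMPLE = 2     # 16-bit samples
CHANNELS = 2             # stereo


def redbook_bytes(minutes):
    """Bytes of uncompressed CD audio for a given running time in minutes."""
    bytes_per_second = SAMPLE_RATE_HZ * BYTES_PER_SAMPLE * CHANNELS
    return int(minutes * 60 * bytes_per_second)


one_minute = redbook_bytes(1)   # about 10 MB for every minute of music
one_hour = redbook_bytes(60)    # most of a disc for an hour-long score
```

At roughly 10 MB per minute, every minute of streamed music competed directly with graphics and game data for disc space.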
3D or Surround Sound allowed for enhanced spatial fidelity of sound.
Beginning with Windows 95, there was DirectX, a series of multimedia APIs that improved the speed at which graphics cards could communicate, and allowed specialized hardware features to be used without writing hardware-specific code.
Players could now ward off annoyance at repetitive looping by selecting their own music (like in The Sims). Starting with The Sims 2, players could also put MP3s into a folder, which would then be played on an in-game radio.
NOTE
This in-game radio, which brings in elements of real life, is cool. It’s an early point of the virtual world blending in with the desktop environment and reality.
The Sony PlayStation
The Sony PlayStation had begun its life as a CD-ROM add-on component for Nintendo’s SNES system. Nintendo had joined forces with Sony to better compete with Sega in the video games market, but the two companies could not agree on the system, and Nintendo eventually signed a contract with Philips. Sony decided to press ahead with its own 32-bit system, the PlayStation.
N64
Nintendo then jumped to 64-bit, but did not rely on CD audio, using a general MIDI-based system instead. There were several custom sound programs for Nintendo developers to compose music, such as MusyX.
Banjo-Kazooie (Rare, 1998, music by Grant Kirkhope), for instance, had a dynamic MIDI-based score, which changed instruments in the track as the player moved about various locales.
Dreamcast
Sega was in trouble after the Genesis. They released the 128-bit Dreamcast, but it failed to capture consumers despite impressive capabilities.
5.1 = five full-bandwidth channels; the .1 stands for the low-frequency effects channel (subwoofer)
Technology, Process and Aesthetic
One way of organizing music is Music Cue Sheets like:
| File | FileName | Action | Time | Notes |
|---|---|---|---|---|
| 1 | dungeon_01 | nonlooped | 2:07 | Slightly dramatic or dark mood |
| 2 | caves_01 | looped | 1:37 | scarier, more foreboding than dungeon_01 |
| 3 | fields | looped | 1:00 | upbeat |
| 4 | gameover | nonlooped | 0:10 | decrescendo |
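
The cue sheet above maps naturally onto a small data structure that an audio team could query. This is only a sketch: the `Cue` type, its field names, and the helper functions are hypothetical, and the rows are copied from the table.

```python
from typing import NamedTuple


class Cue(NamedTuple):
    number: int
    filename: str
    looped: bool
    length: str  # "m:ss"; the minutes part may be empty, as in ":10"
    notes: str

    def seconds(self):
        """Convert the "m:ss" length to whole seconds."""
        minutes, secs = self.length.split(":")
        return int(minutes or 0) * 60 + int(secs)


CUE_SHEET = [
    Cue(1, "dungeon_01", False, "2:07", "Slightly dramatic or dark mood"),
    Cue(2, "caves_01", True, "1:37", "scarier, more foreboding than dungeon_01"),
    Cue(3, "fields", True, "1:00", "upbeat"),
    Cue(4, "gameover", False, ":10", "decrescendo"),
]


def total_seconds(cues):
    """Total running time of all cues, ignoring loop repeats."""
    return sum(cue.seconds() for cue in cues)


def looped_files(cues):
    """File names of the cues that are meant to loop."""
    return [cue.filename for cue in cues if cue.looped]
```

Keeping the sheet as data makes it easy to check a music budget, for example by comparing `total_seconds(CUE_SHEET)` against the amount of audio the platform can hold.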
Gameplay, Genre and the Functions of Game Audio
Overview of different diegetic and non-diegetic examples. Kinetic gestural interaction is where the player, as well as the character, bodily participates with the sound on screen. On a simple level any controller input is like this, but it is also present in games like Donkey Konga, or playing the guitar in Guitar Hero.
Linear music fails to adapt to onscreen events. Particularly important to games is the use of sound symbols, or leitmotifs, to help identify goals and focus the player’s perception on certain objects.
Immersion is ‘‘characterized by diminishing critical distance to what is shown and increasing emotional involvement in what is happening’’
Audio plays a role in immersion. Any kind of gameplay interruption, from drops in frame rate to sluggish audio feedback, detracts from it.
Dynamic Game Music
Koji Kondo’s tenets of dynamic music
- ability to create music that changes each playthrough
- the ability to create multicolored production by transforming themes in the same composition
- the ability to add new surprises and increase gameplay enjoyment
- ability to add musical elements as gameplay features
In the Halo: Combat Evolved score: “there is this ‘bored now’ switch, which is, ‘If you haven’t reached the part where you’re supposed to go into the alternative piece, and five minutes have gone by, just have a nice fadeout.’”
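The "bored now" switch described in the quote could be sketched as a gain function of elapsed time. Only the five-minute timeout comes from the quote; the ten-second linear fade, the function name, and the parameters are assumptions, and this is not the actual Halo implementation.

```python
def bored_now_gain(elapsed_s, timeout_s=300.0, fade_s=10.0):
    """Return playback gain (1.0 = full, 0.0 = silent) for the current cue.

    The cue plays at full volume until `timeout_s` has passed without the
    player reaching the next musical section, then fades out linearly.
    """
    if elapsed_s <= timeout_s:
        return 1.0
    faded = (elapsed_s - timeout_s) / fade_s
    return max(0.0, 1.0 - faded)
```

A game loop would call this every frame with the time since the cue started, and reset the timer whenever the player triggers the next piece.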
One method of reducing fatigue (stemming from repetitive or predictable music), especially in online games, is to have music fade in and out, slowly, at regular intervals.
Ten approaches to variability
- Variable tempo
    - increase tempo if time is running out
- pitch
    - with each successful strike of the sword, the musical pitch rises
- rhythm/meter
    - rhythmic changes for anticipation
- volume/dynamics
    - drop volume in menu screens
    - raise volume when a breakthrough occurs
- DSP/timbres
    - delay for dazed effects
    - reverb for dreamy scenes
- melodies (algo gen)
    - algorithms can also control which instruments are presented
- harmony (chordal arrangements, key or mode)
    - change of mood
    - meditative spaces
- mixing
    - bringing up percussion
- form (open form)
- form (branching parameter-based music)
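A few of the approaches above reduce to mapping a game parameter onto a musical one. The sketch below covers tempo, pitch, and volume; every name, threshold, and scaling factor is an assumption chosen for illustration rather than something taken from a real game.

```python
def tempo_for_time_left(base_bpm, seconds_left, warning_s=60.0, max_boost=1.5):
    """Speed the music up linearly once the timer drops below `warning_s`."""
    if seconds_left >= warning_s:
        return base_bpm
    urgency = 1.0 - max(0.0, seconds_left) / warning_s  # 0.0 up to 1.0
    return base_bpm * (1.0 + (max_boost - 1.0) * urgency)


def pitch_for_combo(base_hz, hits, semitones_per_hit=1):
    """Raise the pitch by a fixed number of semitones per successful strike."""
    return base_hz * 2 ** (hits * semitones_per_hit / 12)


def volume_for_state(in_menu, full=1.0, ducked=0.4):
    """Drop the music volume while a menu screen is open."""
    return ducked if in_menu else full
```

For instance, a 120 BPM track with 30 seconds left on a 60-second warning window would play at 150 BPM, halfway to the maximum boost.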
Open Form and Branching
Open form poses certain difficulties for Western audiences, given the goal-oriented, directed nature of linear music in the West. Most music we listen to has a clear beginning, middle, and end, and the music is designed to progress toward a final cadence.
Open form eliminates the dramatic curve. With no climax in the music, it can very easily become aural wallpaper.
A transition matrix builds on the idea of open form. The number of transitions to write can grow explosively as cues are added.
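The growth is easy to see if every cue may need its own bridge to every other cue. The sketch below assumes exactly that worst case (one transition per ordered pair of distinct cues); the function names and the placeholder transition labels are hypothetical.

```python
from itertools import permutations


def transition_count(cue_count):
    """Ordered pairs of distinct cues: each cue may need a bridge to every other."""
    return cue_count * (cue_count - 1)


def build_transition_matrix(cues):
    """Map each (from_cue, to_cue) pair to a placeholder transition name."""
    return {(a, b): f"{a}_to_{b}" for a, b in permutations(cues, 2)}


matrix = build_transition_matrix(["dungeon_01", "caves_01", "fields", "gameover"])
```

Four cues already need 12 transitions and twenty cues need 380, which is why composers often restrict which cues may follow which, or reuse a generic transition for many pairs.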