Understanding the Science of Music through Software Synthesis
Mark Dal Porto, Texas Woman's University
To better understand the basics of electronic music today, the study of analog synthesis is extremely helpful. Analog synthesizers are unsurpassed in being able to demonstrate basic sound synthesis. By connecting the various modules of an analog synthesizer with patch cords, one can effectively learn the rudiments of electronic music. By seeing and hearing sounds being manipulated in an analog synthesizer, one comes to understand the "mechanics" of how electronic music works.
However, how might one better understand the "science" of sound and music? In discussing this "science," I am referring to the study of the close interrelationship of music, physics, acoustics, psycho-acoustics, and mathematics. In discovering more about this interrelationship, I have found that the study and application of software sound synthesis is invaluable. The study of software sound synthesis to better understand the science of sound and music is comparable to studying analog synthesizers to acquire a more precise understanding of the basics of electronic music.
What, however, is software sound synthesis? How can it be used today to understand and aurally demonstrate basic physical laws that govern all music? In this article, I focus on one specific program for software sound synthesis, Csound, and explore some general concepts about this particular branch of synthesis.
What Software Sound Synthesis Is
Software sound synthesis uses a programming language designed to synthesize, or create, sound. Such a language makes all of today's digital synthesis and signal processing techniques available and allows you to use them in any combination. It provides a practically unlimited number of individually controllable voices and is, in essence, the equivalent of hundreds of MIDI synthesizers. These software programs provide a means to design digital instruments using sound-producing and processing modules called "unit generators." These unit generators include oscillators, reverbs, delay lines, filters, and MIDI controllers. Instruments are created by "patching" together a series of these generators and assigning them whatever values are desired. The user is given the ability to manipulate these instruments with extreme precision. An accompanying musical score is written by the user in the form of a "note list" or "event list," which specifies when and what notes each instrument is to play. When the instruments "play" the score, the program generates a series of discrete numbers (the samples) that constitutes a digital representation of the sound, and stores them in "soundfiles." These soundfiles can then be sent to a digital-to-analog converter for playback. This converter transforms the digital sequence of numbers into corresponding analog voltages so that loudspeakers can play back the soundfile.
The first software languages for sound synthesis date back to the set of programs named Music 1 through Music 5 (Roads, 1983). They were written by Max Mathews at AT&T Bell Laboratories in the late 1950s and 1960s. Numerous descendants followed, adapting the basic model to new software technologies and new generations of computers. At present, three software sound synthesis packages are widely used in the research community: cmix, cmusic, and Csound (Pope, 1993). The Csound language has been chosen here over cmix and cmusic for practical reasons. It is widely distributed by several Internet hosts and runs on many different computing platforms (see Appendix). The initial "c" in the names of these software packages indicates that all three are based on the C programming language.
The author of Csound is Barry Vercoe. Vercoe wrote his first software language for sound synthesis, MUSIC 360, back in 1969. To use MUSIC 360 back then, the composer would input a score by punching holes into paper cards--one card for each note! Tedious as this may seem, the program was used at some 40 universities and produced over 100 major works.
In 1971, Vercoe was invited to establish an electronic music facility at the Massachusetts Institute of Technology (MIT) where he continues to teach today. His first two years were spent designing a real-time digital synthesizer, and he wrote MUSIC 11 as a means of working out his hardware ideas in software. At the time, this language was written in such a way that it could only run on limited hardware, so in 1984 Vercoe decided to rewrite MUSIC 11 in the more portable C programming language, and Csound was born.
While there are numerous differences between these various programs when it comes to the details, software languages for sound synthesis all share one important similarity. Basically, they all follow a "modular" approach to sound synthesis. Each module (or "unit generator") is either creating or modifying a stream of numbers (an audio signal) and any number and type of modules may be "patched" together in a network. These modular networks are referred to as "computer instruments."
These "computer instruments" with their separate instrument modules are often represented graphically by flow charts which show symbols of the modules and the associated data flow between them (see Figures 1 and 2). This "data flow" has its analogy in analog synthesizers where one "patches" together various modules of the synthesizer via patch cords to produce the desired musical effect.
The following is a brief tutorial on how to create a soundfile using a Csound orchestra and score. You will see that every aspect of the sound (pitch, harmony, timbre, duration, dynamics, articulation, and special effects) is determined entirely by you.
A Csound Computer Orchestra
Working with Csound starts by designing one or more instruments in a standard ASCII text file called an orchestra file. This file contains your own individually created instrumental timbres.
Example 1 illustrates a simple orchestra file. We will call it "simple.orc."

In the orchestra above, only the text in bold is the actual orchestra code. All text preceded by a semicolon (;) is explanatory comment about the code. The semicolon is necessary to prevent these comments from being read and interpreted as errors when the orchestra is instructed to "perform" the score. The first line of code (sr=22050) defines the sample rate (commas are not allowed in numbers greater than 999). Most composers will work at a sampling rate of around 20 kHz because that rate represents a good trade-off between sound quality, computation time, and storage requirements. For example, one minute of 16-bit stereo sound at a 20 kHz rate requires about 4.8 MB of storage (20,000 samples per second x 2 channels x 2 bytes per sample x 60 seconds). The number 22050 has been selected since most computer sound cards support this sample rate (it is exactly half the 44100 Hz sample rate of a compact disc). The next line (nchnls=1) sets the number of channels in our soundfile: mono, stereo, or quad (1, 2, or 4). We have set it to one for a mono soundfile.
Next, our instrument is named "instr 1"; we will refer to it by that number when we create the score. The next line of code shows the first unit generator, an oscillator, which Csound calls "oscil." This is our actual source of sound. Every unit generator produces some "result": it either makes a sound or modifies one. The result has been labeled "asig," which stands for "audio signal." We also have to give the oscil three values: amplitude (or "volume"), frequency (or "pitch"), and a number that references a statement in our score containing a description of its waveshape. The values set here are 10000 for amplitude (the maximum possible amplitude is 32767), 440 for frequency (440 cycles per second), and 1 for function #1, found in the score file, which defines the waveform used by oscil.
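The storage trade-off mentioned above can be checked with a little arithmetic. The following is a minimal sketch, assuming 16-bit (2-byte) samples, which the amplitude ceiling of 32767 suggests, and ignoring any soundfile header; the function name is hypothetical.

```python
def storage_bytes(sample_rate, channels, seconds, bytes_per_sample=2):
    """Size of the raw sample data in an uncompressed soundfile, in bytes."""
    return sample_rate * channels * seconds * bytes_per_sample

# One minute of stereo at a 20 kHz rate, and one minute of the
# mono 22050 Hz setting used in the orchestra header.
stereo_20k = storage_bytes(20000, 2, 60)
mono_22050 = storage_bytes(22050, 1, 60)

print(stereo_20k / 1e6, "MB")   # 4.8 MB
print(mono_22050 / 1e6, "MB")   # 2.646 MB
```

Halving the sample rate or the channel count halves the storage, which is why a mono file at 22050 Hz is a comfortable setting for classroom demonstrations.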
What happens to the sound the oscillator creates? In order to hear it, we must send it out to our "digital" amplifier. The "out" statement in the next line of code sends the "asig" to an output. You can think of this "out" statement as a patch cord between your synthesizer and amplifier (with a digital-to-analog converter in between). The final line of code ("endin") tells the program that we are done designing the instrument (it stands for "end instrument").
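To make concrete what "sending samples to an output" amounts to, here is a rough Python sketch that computes one second of a 440 Hz sine at amplitude 10000 and stores it as 16-bit mono samples. It is only an analogy for what Csound does internally, not Csound's own code: the one-second duration, the direct use of math.sin in place of a wavetable, and the WAV container are all assumptions made for illustration.

```python
import io
import math
import struct
import wave

SR = 22050      # sample rate, matching sr=22050 in the orchestra header
AMP = 10000     # peak amplitude; the 16-bit ceiling is 32767
FREQ = 440.0    # cycles per second
DUR = 1.0       # seconds (an assumed note length)

n = int(SR * DUR)
# One integer sample per tick of the sample clock.
samples = [int(round(AMP * math.sin(2 * math.pi * FREQ * i / SR)))
           for i in range(n)]

buf = io.BytesIO()                 # in-memory stand-in for a soundfile on disk
with wave.open(buf, "wb") as wf:
    wf.setnchannels(1)             # mono, like nchnls=1
    wf.setsampwidth(2)             # 2 bytes = 16 bits per sample
    wf.setframerate(SR)
    wf.writeframes(struct.pack("<%dh" % n, *samples))
wav_bytes = buf.getvalue()
```

Writing `wav_bytes` to a file with a .wav extension would give a soundfile that any sound card player can send through its digital-to-analog converter.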
Since the above oscillator has no envelope, when we actually hear it, it will tend to have a click or "bump" sound at the beginning and end since "oscil" is abruptly turned on and off. If, however, we shape the amplitude of our oscillator with an envelope, we can avoid this noise. Hence, Example 2 illustrates shaping our oscillator with an envelope.

In Example 2, another unit generator called "linen" has been added to our orchestra. This is an envelope generator consisting of straight lines ("linen" stands for "linear envelope"). I have labeled the result of linen as "kenv" to stand for "kontrol envelope." (In Csound, the names of audio-rate results, the signals we want to hear, begin with an "a," and the names of control-rate results begin with a "k.") The linen in Csound takes four "arguments": amplitude, a rise time (in seconds), total duration of note, and a decay time (in seconds). Hence, the above linen indicates an amplitude of 10000, a rise time of .05 seconds, a total duration which will be found in the third p-field (p3) of the score ("p" stands for "parameter"), and a decay time of .1 seconds. If the duration of our note (p3 in the score) is 1 (which stands for 1 second), the description of our envelope will be as follows: .05 seconds linear rise time, .85 seconds sustain time, and .1 seconds linear decay time. Notice that .05 + .85 + .1 = 1, the total duration of our note. Next, since we want to shape the amplitude of oscil, we plug (or "patch") the result of linen ("kenv") into the amplitude argument of oscil as shown above. Recognize, too, that we could put "kenv" into the frequency argument of oscil (instead of the amplitude argument), which would create an envelope of our frequency. This pitch envelope would result in an up, sustain, and down glissando effect.
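The shape of this envelope can be sketched in a few lines of Python. This is an illustrative model of a linear rise, sustain, and linear decay, not Csound's actual linen code; in particular, how it behaves when the rise and decay segments overlap is an assumption.

```python
def linen(amp, rise, dur, decay, t):
    """Envelope value at time t (seconds): linear rise, sustain, linear decay."""
    if t < 0 or t > dur:
        return 0.0
    gain = 1.0
    if rise > 0 and t < rise:
        gain = t / rise                          # climbing from 0 toward 1
    if decay > 0 and t > dur - decay:
        gain = min(gain, (dur - t) / decay)      # falling from 1 back to 0
    return amp * gain

# The envelope described above: amplitude 10000, rise .05, duration 1, decay .1
for t in (0.0, 0.025, 0.05, 0.5, 0.9, 0.95, 1.0):
    print(t, round(linen(10000, 0.05, 1.0, 0.1, t)))

sustain = 1.0 - 0.05 - 0.1    # about .85 seconds at full amplitude
```

Because the envelope starts and ends at zero, multiplying the oscillator by it removes the abrupt on and off that causes the click.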
As mentioned earlier, a computer instrument's separate modules are frequently represented by flowcharts that show symbols of the modules and how the data flows between them. Figure 1 is a flowchart of our previous (Example 2) computer instrument.
As it is, our instrument can only play one note at the given amplitude and pitch. So, for our next example we will use pointers that say, "Go look at the score to get these values." In our orchestra, we will instruct our linear envelope generator to get its amplitude from the fourth column in the score, which we will refer to as "p4." The oscillator will now be told to get its frequency (or "pitch") from column 5, or "p5." The envelope generator's rise time will come from column 6, or "p6," and its decay time from column 7, or "p7." By using these pointers, our instrument will now be able to play different pitches at different amplitudes with different rise and decay times by looking at the score to obtain these values. The use of "pointers" in our instrument is illustrated in the next example, Example 3.

Let us see our flowchart again with the orchestral "pointers" now inserted into the appropriate places. Figure 2 is a flowchart of our previous (Example 3) computer instrument.
A Csound Computer Score
Along with the orchestra file, you must also create a score file. The score file's main duty is to tell the orchestra what notes to play utilizing the instruments found in your orchestra.
Example 4 illustrates a simple score file. We will name it "simple.sco."
As was the case in the orchestra, all comments in the score above are preceded by a semicolon (;) and are ignored when the score is "performed" by the orchestra. Text in bold is the actual score code. The first thing required in the score is the function statement that describes the oscillator's waveform. In our orchestra file, we told the oscillator to look for function 1 for its waveform definition, so we will put an "f1" for "function statement #1" at the beginning of this line. Using a set of routines called "Generative functions" (Gen functions), we can build any waveform we want by specifying its individual frequency and amplitude components. The one requested in the above score is a simple sine wave with only one harmonic (no "overtones" present).
After "f1" follows a "0" which refers to the start time for this function. The number "8192" specifies the size of the wavetable. This number tells the computer how many sample points it must use to represent this waveform as a wavetable. The number itself must be a power of 2 (e.g., 512, 1024...8192, etc.). A fairly large number here (such as the one we have used) will ensure that the waveform is very accurately drawn for a "clean" sound. The "10" refers to the specific Gen function being used to build the wave. Gen function 10 assumes we want in-phase, harmonic partials. How many partials are there in a sine wave? One, of course, so we end the line with a "1" to represent the fundamental.
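What the function statement and the oscillator do together can be modeled in Python. The sketch below is a simplified stand-in, not Csound's implementation: `gen10` sums in-phase harmonic sine partials into a table of the requested size and scales it to a peak of 1, and `oscil` reads that table repeatedly at the requested frequency without interpolation. Both function bodies are assumptions made for illustration.

```python
import math

def gen10(size, *strengths):
    """Build a wavetable from in-phase harmonic partials, normalized to peak 1."""
    table = [
        sum(s * math.sin(2 * math.pi * (h + 1) * i / size)
            for h, s in enumerate(strengths))
        for i in range(size)
    ]
    peak = max(abs(v) for v in table)
    return [v / peak for v in table]

def oscil(amp, freq, table, sr, nsamples):
    """Scan the table at `freq` cycles per second, scaling each value by `amp`."""
    size = len(table)
    step = freq * size / sr        # table positions advanced per output sample
    phase = 0.0
    out = []
    for _ in range(nsamples):
        out.append(amp * table[int(phase) % size])
        phase = (phase + step) % size
    return out

f1 = gen10(8192, 1)                            # one partial: a plain sine wave
asig = oscil(10000, 440, f1, 22050, 22050)     # one second of sound
```

Passing more strengths, for example `gen10(8192, 1, 0.5, 0.25)`, would add a second and third harmonic and give a brighter waveform from the same oscillator.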

In order to hear any sound, we need to specify some actual notes to play. Immediately following the f-statement we created some "note statements." The "i1" in the first column (p1) tells Csound to use instrument 1 from our orchestra (which is the only instrument that we actually created in the orchestra file). Column 2 (p2) after the "i1" represents the note's start time, while column 3 (p3) is its duration (both measured in seconds). We told the instrument that it will find its amplitude in column 4 (p4), so we will build a nice crescendo by raising the level of each note. The frequency values of each note are stated in column 5 (p5) in cycles per second. These particular frequency values create an A-major scale. We also instructed our instrument that the note's rise time (in seconds) would be found in column 6 (p6) and its decay time (in seconds) in column 7 (p7). For maximum variety, each note played in our score has a different rise and decay time. On the last line of the score we use an "e" statement which simply indicates the "end" of the score file. It functions just as "endin" does for the orchestra file.
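A note list with this seven-column layout is easy to generate by program. The Python sketch below prints a score of the same form. The specific amplitudes, rise times, and decay times are illustrative values chosen here, not the ones in Example 4, and the scale is assumed to be equal-tempered, starting on A at 440 cycles per second.

```python
A4 = 440.0
MAJOR_STEPS = [0, 2, 4, 5, 7, 9, 11, 12]   # semitones above the tonic

def note_statements(instr=1, dur=1.0):
    """Return one 'i' statement per scale degree: p1 through p7."""
    lines = []
    for n, semis in enumerate(MAJOR_STEPS):
        start = n * dur                      # p2: notes follow one another
        amp = 4000 + 2000 * n                # p4: rising level, a crescendo
        freq = A4 * 2 ** (semis / 12)        # p5: cycles per second
        rise = 0.01 * (n + 1)                # p6: seconds
        decay = 0.05 * (n + 1)               # p7: seconds
        lines.append(
            f"i{instr} {start:g} {dur:g} {amp} {freq:.2f} {rise:.2f} {decay:.2f}"
        )
    return lines

# Function statement first, then the notes, then the end marker.
score = ["f1 0 8192 10 1"] + note_statements() + ["e"]
print("\n".join(score))
```

The highest amplitude used is 18000, safely under the 32767 ceiling, and each note's rise and decay times together stay shorter than its duration.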
The above orchestra and score are now ready for performance. To direct the orchestra to play the score and compute a soundfile, we instruct our computer to execute the following command: csound simple.orc simple.sco (orchestra file first, then score file). The performance is then compiled into a soundfile, which is ready for playback through our computer sound card.
Of course, more can be done with this simple orchestra and score. You could also build chords by having several notes playing at once (just give those notes the same start time). Since we are working with cycles per second for indicating pitch, microtonal scales constructed of any interval would be extremely simple to create, while tempo changes merely involve adding a simple, one-line "tempo statement" anywhere within the score. If desired, we could easily enhance our instrument much further by creating a more complex waveform by adding more harmonics, or adding reverb, panning, etc. We are limited only by our imagination.
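Because pitch is given directly in cycles per second, a microtonal scale is just a list of numbers. As a minimal sketch, the hypothetical helper below divides the octave into any number of equal steps; the choice of 24 steps (quarter tones) and a 440 Hz starting pitch is only an example.

```python
def equal_division_scale(base_hz, divisions, octave_ratio=2.0):
    """Frequencies of a scale dividing one octave into `divisions` equal steps."""
    return [base_hz * octave_ratio ** (k / divisions)
            for k in range(divisions + 1)]

quarter_tones = equal_division_scale(440.0, 24)   # 24 steps per octave
for hz in quarter_tones[:5]:
    print(round(hz, 2))
```

Each resulting value could be placed in column 5 (p5) of a note statement, and giving several notes the same start time in column 2 (p2) would sound them together as a chord.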
Other Possibilities
Sounds with any number of partials (both harmonic and non-harmonic), an infinite variety of envelopes, all kinds of effects processing (reverbs, filters, delays, etc.), panning, random noise generation, various synthesis techniques (additive, subtractive, FM, wavetable, etc.), analysis/resynthesis, and phase vocoder effects are all possible. Any number of pre-existing soundfiles can be "read in" by Csound and processed with various effects. It also has the ability to read real-time control data. Hence, Csound can be linked via various MIDI devices such as sequencers and MIDI controllers to play alongside MIDI-controlled music. Even Standard MIDI Files can be used to generate the actual notes played in place of Csound score files! Utility programs are available that transcribe Standard MIDI Files into Csound scores and Csound scores into Standard MIDI Files.
Because of Csound's tremendous flexibility, all kinds of acoustical phenomena can be easily demonstrated with the utmost precision and control. Waveforms of all created sounds can be displayed in Csound (or any waveform editor) to better understand the physical nature and building blocks of sound. Because of this virtually unlimited power, software sound synthesis can demonstrate the "science" of music far better than any hardware synthesizer.
"Interdisciplinary" is an important "buzzword" these days. Interdisciplinary education refers to the interrelated and simultaneous study of two or more disciplines. This method of teaching and learning is becoming increasingly important in today's modern society. Its philosophy is to better understand the world in which we live by associating and linking all the complex elements of life into a composite whole. With the assistance of Csound, many related disciplines can be studied simultaneously, e.g., music, physics, acoustics, psycho-acoustics, and mathematics. I had the privilege to team-teach an interdisciplinary course entitled "The Science of Sound and Music" with physics professor Dr. Duane Dolejsi at Northern State University in the Fall of 1993. I must admit that this interdisciplinary setting was for me personally one of the most enjoyable and enlightening things I have ever done. Throughout the whole course, the use of Csound proved to be an effective teaching and research tool. With it, not only was I able to demonstrate aurally and visually the building blocks of sound and music, but it also enabled me to easily present all kinds of acoustical phenomena, special effects, and the sounds of various tuning systems.
Where Do We Go From Here?
Csound can be effectively incorporated into a school curriculum in a variety of settings. Just as the study of analog synthesis helps one understand the basics of electronic music, software sound synthesis is the logical next step toward a deeper understanding of all aspects of electronic music. It can also be used as an important tool in an interdisciplinary environment where music, physics, acoustics, psycho-acoustics, and/or mathematics may be effectively studied together.
Csound is available anonymously from various Internet sites and is obtainable on a wide variety of computer platforms, including IBM and Macintosh. This availability provides any synthesist easy access to one of the more powerful sets of tools available today. If you are not yet familiar with software sound synthesis, you owe it to yourself to better acquaint yourself with this powerful instrument in music.
Bibliography
Articles
Boulanger, R., & Miller, D. (1990, January). Beyond MIDI: The return of computer music. Electronic Musician, pp. 60-69.
Pope, S. T. (1993). Machine Tongues XV: Three packages for software sound synthesis. Computer Music Journal, 17(2), pp. 23-54.
Books
Dodge, C., & Jerse, T. A. (1985). Computer music: synthesis, composition, and performance. Schirmer Books.
Mathews, M. V., & Pierce, J. R. (Eds.). (1989). Current directions in computer music research. MIT Press.
Moore, F. R. (1990). Elements of computer music. Prentice-Hall.
Roads, C. (Ed.). (1989). The music machine. MIT Press.
Strawn, J. (Ed.). (1985). Digital audio signal processing: An anthology. A-R Editions.
Vercoe, B. (1993). Csound Manual. MIT Press.
Journals
Computer Music Journal. Roads, C. (Ed.). MIT Press.
Journal of the Acoustical Society of America.
Journal of the Audio Engineering Society.
World Wide Web Page
http://www.leeds.ac.uk/music/Man/c_front.html
Internet Sites (where Csound is available on various computer platforms):
Atari - ftp.maths.bath.ac.uk:pub/dream/*.ttp
MAC - cecelia.media.mit.edu:pub/Csound/csbeta/Csound.hqx
or ftp.ircam.fr:pub/incoming/Csound-SDII.sit.hqx
or ftp.maths.bath.ac.uk:pub/dream/Csound_68k.hqx
MAC - (with fpt coprocessor) - ftp.maths.bath.ac.uk:pub/dream/Csound_881.hqx
MAC - (with background processing) - ftp.latrobe.edu.au:pub/music/CsoundRB*.sit.hqx
NeXT - ftp.cs.orst.edu:/pub/next/binaries/sound (bundled with Csnd.app v1.6)
or ftp.maths.bath.ac.uk:pub/dream/Next68KCsound.tar.gz
PC/286-DOS (with fpt coprocessor) - ftp.maths.bath.ac.uk:pub/dream/csound_286_fpt.zip
PC/286-DOS (without fpt coprocessor)- ftp.maths.bath.ac.uk:pub/dream/csound_286.zip
PC/386/486/Pentium-DOS - ftp.maths.bath.ac.uk:pub/dream/csound_new.zip
PC/Windows - ftp.maths.bath.ac.uk:pub/dream/csoundr2.zip
PC/Windows95/NT - ftp.maths.bath.ac.uk:pub/dream/csound_win.zip
PowerPC (Power MAC) - beef.med.cornell.edu:pub/Csound.ppc.hqx
or ella.mills.edu:pub/ccm/csound.ppc/csound.ppc.hqx
or ftp.maths.bath.ac.uk:pub/dream/Csound.ppc.hqx
SGI - ftp.maths.bath.ac.uk:pub/dream/SGI/*
SUN - ftp.maths.bath.ac.uk:pub/dream/SPARC/csound (no utilities)
Csound Manual (obtainable at any of the following Internet Sites):
ASCII Text Version: ftp.bath.ac.uk:pub/jpff/Csound.manual
or vax1.umkc.edu:pub/dos/music/CSOUND.MANUAL
Postscript Version: ftp.bath.ac.uk:pub/jpff/Csound.man.ps.z
or vax1.umkc.edu:pub/dos/music/CSNDMAN2.PS