Real-time Interactive Digital Signal Processing of Audio using MSP

Leonard V. Ball, Jr.

School of Music, The University of Georgia

lvball@arches.uga.edu

The control of MIDI instruments, sequencers, and processors has become a standard technique for composers. Real-time digital signal processing of previously recorded or live audio, however, without the aid of "out-board" effects, has been problematic, at best. Until recently, techniques involving real-time control using the Macintosh platform could only be extended to live or sampled audio through extraordinary efforts to link software and hardware designed for disparate purposes. Completed works with which the author is familiar typically involved more than four simultaneously operating software programs, external signal processors, bulky digital converters, and multi-track digital recorders. MSP replaces these extended, intricate set-ups.

John Harvey's Missa Pro Mundo Morente, a 1998 multimedia work created for and produced at the University of Georgia Center for New Music, made use of a variety of software packages and hardware devices. Seven different software programs were required to complete the work. Opcode Music System's MAX initiated routines to record and reproduce information from the performance hall according to previously programmed instructions. Digidesign, Inc.'s Pro Tools and SampleCell II editor software, responding to the MAX call, recorded, stored, and reproduced sampled audio during the performance. Opcode's Open Music System (OMS), MIDI Machine Control (MMC) extension, and Inter-application Communication (IAC) bus allowed intra-computer communication between the various software programs. Scripting utilities such as Apple Computer, Inc.'s AppleScript and Binary Software's KeyQuencer provided the communication "glue" for the project. Hardware for the work included a TASCAM DA-88, Digidesign, Inc.'s Pro Tools DSP processing and routing cards in the company's expansion chassis, a Digidesign, Inc. 888 I/O Audio Interface, an Opcode Studio 5 MIDI Interface, and an Infusion Systems, Inc. I-Cube digitizer. During the performance, audience noise and a choral performance were systematically recorded, processed, and returned to the performance space as part of a complete multimedia work involving sound and video. Although the Harvey work was implemented successfully, it is noteworthy that it was produced in Roger and Phyllis Dancz Hall of the Center for New Music, an environment that specializes in this type of event. To produce the work in another performance space, even one with the requisite quadraphonic audio resources, would have been extremely difficult due to the extensive computer and digital audio requirements.

At the same time that Missa Pro Mundo Morente was in the preparatory stages at the University of Georgia, a new audio capability was being introduced to the MAX programming environment. Cycling '74, a company founded by David Zicarelli, announced the release of MSP. A set of object extensions for the MAX 3.5 environment, MSP allows the real-time manipulation of digital audio using the same graphical interface techniques already familiar to MAX users. To quote the Cycling '74 web-page description of MSP:

For anyone who has used a patchable analog synthesizer, using MSP will be a familiar experience. Put a bunch of modules on the screen. Connect an oscillator to a filter using patch cords, and the sound instantly changes to reflect what you just did. However, unlike the analog equipment, you only run out of patch cords and modules when you hit the processing capacity of your CPU.

MSP is also familiar to Max users, since it introduces only one new idea to Max: that of audio signal connections. MSP audio objects have inlets and outlets just like normal MAX objects. But instead of a control system, MSP objects describe a signal processing algorithm that normal MAX objects control.

These extensions, over 75 objects, provide extremely sophisticated "on-board" processing capabilities with the added advantage of real-time manipulation by performers using MIDI-capable instruments. As indicated in the above quote, MAX users work in a familiar environment, connecting modules (objects) in the "Patcher" with patch cords. Audio cables are differentiated from normal MAX data paths by adding a thicker, color-coded "sheath." MSP audio objects are identified with a tilde after the object name. Audio can be input from a variety of sources. The Apple Macintosh provides input from the internal CD, DVD (if the computer is so equipped), the Apple microphone, and external audio sources routed through the Macintosh computer's stereo input (Sound In). Audio files might also be created and recorded on the spot using DSP algorithms either provided with MSP or created by the user. Additionally, previously prepared sound files might be downloaded from other computers, storage drives, or the Internet. These files can be either Apple Computer's Audio Interchange File Format (AIFF) files or Digidesign, Inc.'s Sound Designer II (SD II) files. All that is necessary to work with audio is a fast Macintosh computer with adequate memory and, ideally, the processing power provided by a PowerPC chip. Currently, the Macintosh G3 is recommended.

In order to demonstrate MSP's unique, self-contained environment, the rest of this document will be dedicated to developing a Patcher that will process audio in the same manner as the set-up for the "Introit" of the Harvey work. To accomplish this project, objects from both MAX and MSP will be used. For this reason, from this point on, the programming environment will be referred to as MAX/MSP. Example 1, below, diagrams the audio software and hardware used in the Harvey Mass.

Example 1: Hardware and software requirements for the Harvey Mass.

In this set-up, data from the I-Cube digitizer was used to determine when the ambient sound of the performance space would be recorded and reproduced. After ten people entered the performance space, MAX, emulating MIDI control information that would normally be sent by a JL Cooper CS 10 Control Station, initiated periodic record and playback sessions using Pro Tools and SampleCell II. These sessions were timed to produce thirty-second Sound Designer II files. AppleScript and KeyQuencer played vital roles during this stage of the project. During playback through SampleCell II, MIDI control data generated by MAX was used to vary the frequency, panning, and amplitude of the newly recorded sound files as they were played back into the performance space. After sixty people had entered the performance space, MAX started playback of previously recorded and processed four-channel audio contained on the TASCAM DA-88 by issuing a play command in the MIDI Machine Control protocol. This command was routed via the Studio 5 interface.
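The cue logic described above, in which actions fire when the number of people who have entered reaches ten and then sixty, can be sketched in a few lines of Python. This is only an illustration of the counting idea, not the MAX patch used in the Harvey work; the class name `EntranceCues` and the cue labels are hypothetical, and the sensor input from the I-Cube is replaced by a plain method call.

```python
class EntranceCues:
    """Count entrances and report a cue when a target count is reached.

    Hypothetical illustration: the cue names and this class are assumptions,
    not part of the original MAX patch.
    """

    def __init__(self, cues):
        self.cues = dict(cues)  # maps an entrance count to a cue label
        self.count = 0

    def person_entered(self):
        """Register one entrance; return a cue label or None."""
        self.count += 1
        return self.cues.get(self.count)


cues = EntranceCues({10: "start_record_playback_cycle",
                     60: "start_da88_playback"})

# Simulate sixty people entering and collect the cues that fire.
fired = []
for _ in range(60):
    cue = cues.person_entered()
    if cue is not None:
        fired.append((cues.count, cue))
```

Each cue fires exactly once, on the entrance that matches its count, which is the same matching behavior the text later assigns to a "Select" object.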

In order to reproduce the above set-up, MAX/MSP would have to provide four distinct capabilities. First, MAX/MSP would have to be able to play previously recorded material upon demand. Second, the software should be able to record audio into sound files for subsequent playback. Third, MAX/MSP would have to be able to generate data to manipulate soundfiles while they were being played back through the system. Finally, MAX/MSP should provide at least four channels of simultaneous, discrete playback.

MAX/MSP does indeed have the capability to play back pre-recorded soundfiles upon demand. In order to accomplish this, one must first load the file into a storage area (a buffer) where it will be held until MAX/MSP is directed to "read" it. Upon demand, MAX/MSP will go to the named buffer and read the soundfile using additional information supplied at that time. Example 2, below, shows a MAX/MSP patcher window with a basic configuration capable of playing back a stereo soundfile. The inlet at the top of the page (a) receives a bang from a "Loadbang" object in the main patcher. This inlet sends the bang to three points in the patcher. First, the inlet sends the bang to a delay object (b) that holds it for sixty seconds. After sixty seconds, the bang is used to trigger a message to the "line" object (c), initiating a one-minute fade-in of the file once playback begins. "Line" does this by using the start value ("0"), the end value ("1"), and the time period ("60000" milliseconds) contained in the message box (d) to create a ramp of control values that are sent to the amplifiers (e). These are simple multiplier objects that control the amplitude of outgoing signals. Next, the bang from the inlet (a) is connected to a message box (f) which ensures that the required soundfile is loaded into the "sfplay~" object (g) when the main window opens. The third location to receive the incoming bang is the delay object (h) at the left of the window. This object ensures the file will begin playing one minute (60,000 milliseconds) after the main window opens. The playback is routed back to the main patcher through the outlets at the bottom of the page (i). Finally, as the "line" object is triggered, a bang is sent back to the main patch through an outlet (j) to start a one-minute crossfade with the playback of files recorded from the performance space.

Example 2: MAX/MSP Patcher designed to play back soundfiles upon demand.
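The fade-in performed by the "line" object and the multipliers in Example 2 amounts to a linear ramp of gain values applied to the signal. The Python sketch below illustrates that arithmetic only; it is not MSP code. The function names and the one-second update interval (`step_ms`) are assumptions chosen for readability.

```python
def line_ramp(start, end, duration_ms, step_ms=1000):
    """Return (time_ms, value) pairs for a linear ramp, endpoints included.

    step_ms is an assumed update interval, not a value taken from the patch.
    """
    steps = max(1, duration_ms // step_ms)
    return [(i * duration_ms / steps, start + (end - start) * i / steps)
            for i in range(steps + 1)]


def apply_gain(samples, gain):
    """Scale samples by a gain value, as a multiplier object would."""
    return [s * gain for s in samples]


# The message described in the text: from 0 to 1 over 60000 milliseconds.
ramp = line_ramp(0.0, 1.0, 60000)

# Halfway through the fade, a full-scale sample is heard at half amplitude.
half_time_ms, half_gain = ramp[len(ramp) // 2]
faded = apply_gain([1.0, -1.0], half_gain)
```

Running the same ramp from 1 to 0 on a second signal while this one rises gives the one-minute crossfade the text mentions.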

The ability to record and play back audio in real time is also easily attainable using MAX/MSP. First, to ensure that continuous audio is available for reproduction, at least two data buffers have to be created to hold successive audio recordings. Then, at a predetermined cue, audio is successively recorded into those buffers. As the second buffer fills, the first, already loaded, is available for playback, and vice versa. Example 3 contains a possible MAX/MSP patcher for the prescribed activity. In this example, a "Loadbang" object (a) has been substituted to begin the patch when the window opens rather than information from the I-Cube, as in the Harvey work. It would be a fairly simple matter, however, to use an "I-Cube" object routed to a "Select" object targeted for the number ten. When that number is received at the input, the "Select" object would output a bang to start the recording process. The "delay" object (b) set for one thousand milliseconds gives the window a chance to open before the analog-to-digital converter ("adc~") activates (c). The three five-second "delay" objects (d) set up the timing sequence so that the "record" objects (e) alternate their activity. The two buffers (f), located just below the LED meter used to view the input level of the microphone (g), are supplied with a "clear" message to empty the buffers of any unwanted material. This particular window also provides a button to start the recording process manually (h) and the prudent "all stop" to quit recording and playback when desired (i). Unattached lines at the right margin (j) connect to the playback section of the Patcher.
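The alternation between the two buffers can be written out as a simple schedule. The following Python sketch is an illustration of the timing idea rather than a translation of the patch: the function name `ping_pong_schedule` is hypothetical, the buffers are identified only by the indices 0 and 1, and the five-second segment length is taken from the "delay" objects described above.

```python
def ping_pong_schedule(cycles, segment_s=5):
    """List (start_s, recording_buffer, playing_buffer) for each segment.

    While one buffer records, the other (filled during the previous
    segment) is available for playback. Nothing can be played during the
    first segment because no buffer has been filled yet.
    """
    schedule = []
    for n in range(cycles):
        recording = n % 2
        playing = None if n == 0 else 1 - recording
        schedule.append((n * segment_s, recording, playing))
    return schedule


schedule = ping_pong_schedule(4)
for start_s, recording, playing in schedule:
    print(f"t={start_s:>2}s  record into buffer {recording}, "
          f"play from buffer {playing}")
```

After the first segment, playback is continuous: every segment reads the buffer that was filled during the segment before it.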

The third requirement, the ability to manipulate soundfiles while they are being played back through the MAX/MSP system, is established in Example 4, below. In true MAX fashion, we will cover systems in this Patcher as we have in the previous examples, from right to left. The two "systems" at the right of the window execute the playback from the buffers created and filled by the portion of the patcher described in Example 3 above. The "groove~" objects (a) call the appropriate buffers while the sub-systems labeled "Playback Direction" (b) alternate forward and reverse readings of the buffers (an extension of the process used in the Harvey work). The five-second "delay" objects seen in Example 3 above activate the appropriate playback systems, reading one buffer while the other receives new information. While each buffer is read, a sub-system labeled "Amplitude Modulation" (c) provides random signals to alter the resulting output of the audio as it is reproduced. While the modulation rate remains constant at 6 Hz, the depth of the modulation is altered based on numbers generated by the "drunk" object (d). To prevent exceeding a reasonable depth, this modulation index is limited to a top value of 0.30 and a minimum value of zero (no modulation).
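The amplitude modulation described here can be approximated outside MAX/MSP as follows. This Python sketch makes several assumptions that are not confirmed by the text: the random walk only imitates the general behavior of the "drunk" object, the integer walk is scaled by 0.01 to obtain the depth, and the gain formula is one common way to write amplitude modulation so that the gain never exceeds unity. Only the 6 Hz rate and the 0 to 0.30 depth range come from the description above.

```python
import math
import random


def drunk_walk(n, max_value=30, max_step=3, seed=None):
    """Bounded random walk of integers in [0, max_value] (assumed behavior)."""
    rng = random.Random(seed)
    value = rng.randint(0, max_value)
    values = []
    for _ in range(n):
        value += rng.randint(-max_step, max_step)
        value = min(max(value, 0), max_value)  # keep the walk inside its limits
        values.append(value)
    return values


def amplitude_modulate(samples, sample_rate, depth, rate_hz=6.0):
    """Apply a sinusoidal gain that swings between (1 - depth) and 1."""
    out = []
    for n, s in enumerate(samples):
        lfo = math.sin(2.0 * math.pi * rate_hz * n / sample_rate)
        gain = 1.0 - depth * 0.5 * (1.0 + lfo)
        out.append(s * gain)
    return out


# Depth values between 0 (no modulation) and 0.30, as limited in the text.
depths = [d * 0.01 for d in drunk_walk(200, seed=1)]

# One tenth of a second of a constant signal, modulated at maximum depth.
modulated = amplitude_modulate([1.0] * 100, sample_rate=1000, depth=0.30)
```

In a running patch the depth would change continuously as new values arrive from the random walk; here a single depth is applied to one block for clarity.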

Example 3: Audio recording in MAX/MSP

Example 4: Audio Playback and more in MAX/MSP.

The Panning system (e) is designed to provide random panning of the audio that is sent to the "dac~" object (f), the digital-to-analog converter that outputs the audio. The process uses amplitude to place the audio at various points in the stereo field. A "metro" object (g) triggers the random number generator (h) every 500 milliseconds to generate a number between 0 and 99. This number is then converted into a decimal value (a float) between 0 and 1 using a multiplier (i). The resulting number is subtracted from 1 in order to get its complement and the complement is converted to a positive value (j). Both the original decimal figure and its complement are then sent to multipliers (k) which act as signal amplifiers to the "dac~". To provide more dynamic control of both the panning and amplitude modulation, MIDI information could be routed from a synthesizer, wind-controller, or even from a digital camera as control information to place the audio in the stereo field or to manipulate the modulation index for frequency or amplitude modulation. The small system at the lower left of Example 4 contains the playback of prerecorded material. This object, TDML1, contains the sub-patch previously viewed in Example 2 above. Left and right output from that sub-patch is routed into the "dac~" object (f). The crossfade output (labeled "xfd") routes the bang from the sub-patch ("(j)" in Example 2) to the line object (m) in order to initiate the required decrescendo of manipulated performance-space audio.
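The panning arithmetic can be followed step by step in a short Python sketch. It is an illustration of the calculation, not MSP code; the function names are hypothetical, and the assignment of the original value to one channel and its complement to the other (which is left and which is right) is an assumption, since the text does not say.

```python
import random


def random_pan_gains(rng):
    """Return (gain_a, gain_b) for the two output channels.

    A random integer from 0 to 99 is scaled by 0.01 to a float, and the
    second gain is the absolute value of its complement, as in the text.
    """
    position = rng.randint(0, 99) * 0.01
    complement = abs(1.0 - position)
    return position, complement


def pan(sample, gains):
    """Scale one sample by each channel gain, as the two multipliers do."""
    gain_a, gain_b = gains
    return sample * gain_a, sample * gain_b


rng = random.Random(0)
# One new pan position per "metro" tick (every 500 ms in the patch).
positions = [random_pan_gains(rng) for _ in range(100)]
channel_a, channel_b = pan(1.0, positions[0])
```

Because the two gains always sum to one, this is a linear pan: the combined amplitude stays constant while the balance between the channels shifts.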

The complete patcher is shown in Example 5 below. Record, playback, panning, and prerecorded playback sections, all explained in previous examples, are clearly labeled for easy reference. When this patcher is opened, the sub-patch TDML1 is automatically loaded. The window shown in this example, therefore, is all that a performer would need to access in order to execute the patch.

Example 5: The complete Patcher window.

The final requirement, four channels of simultaneous, discrete audio playback, is not currently available without additional hardware. Audio output on the Macintosh computer is limited to stereo mini-jacks and on-board audio converters that can only be termed "adequate." If higher quality conversion and more channels are desired, however, there is a PCI expansion chassis that provides the possibility of professional quality throughput in four or more channels. One company, Magma, Inc., offers a 4-slot PCI expansion chassis with ears for mounting the unit in a standard 19" rack. This chassis, which fits neatly under a PowerBook G3, also has a host interface option specifically for that laptop. With the Magma expansion chassis, several PCI-based audio cards are available, including current Digidesign Pro Tools hardware that exceeds the processing power of the equipment used in the Harvey composition.

Conclusion

MSP offers an alternative to large equipment set-ups when processing digital audio in interactive environments. If the basic MAX program is already familiar to the user, the additional learning curve mainly involves information concerning digital audio and processing methods. This learning curve can be steep, depending on a user’s previous audio experience. However, when compared to learning several diverse software programs; finding, installing, and becoming adept at using different hardware; and establishing a stable, dependable working environment in which all software and hardware can operate, MSP is a viable, cost-effective, and efficient tool for today’s composer.

Reference Links

http://www.cycling74.com

http://www.digidesign.com

http://www.infusionsystems.com

References

Dobrian, J. Christopher (Revision 1.1, June 1998) MSP, The Documentation; What you can do with it. San Francisco, CA: Cycling '74.