Movement and Sound: Controlling Digital Audio with softVNS
School of Music, The University of Georgia
The Proposal
In Fall 1999, Bala Sarasvati, director of CORE Concert Dance Company, the University of Georgia's premier modern dance ensemble, invited me to create a work in collaboration with her ensemble. This was to be the fourth "formal" collaboration with CORE, and the desire was to make the work something unusual and, perhaps, more vital for the dancers. In the previous works, the composer and choreographer independently created pieces that would essentially stand alone. The choreographer and dancers created vital, connected movement that might be combined with any appropriate music. Meanwhile, I, as composer, wrote a piece that would be successful with or without the dancers. These two more or less complete and separate parts, when combined into an artistic whole, had always resulted in an effective, thought-provoking statement. As a composer, I found it both surprising and pleasing to watch the choreography and have new vistas opened as I listened to and reconsidered the already completed sound sculpture. The sound sculpture itself infused the dancers with renewed energy and focus, validating and emphasizing their previously conceived choreography.
The Opportunity
Although the previous manner of collaboration had always been effective, my own evolving interest in live, interactive performance suggested a different approach to the new project. Perhaps this was an opportunity to allow the dancers to become more involved in the actual sound production as the work unfolded. When approached with this idea, CORE was intrigued and eager to participate. Always interested and actively involved in technological advances, Ms. Sarasvati was known for her efforts to combine dance and computer-enhanced video. Most, if not all, of these efforts, while producing interesting and effective works of art, used outside sound sources or addressed instruments and sound cards that responded to the Musical Instrument Digital Interface (MIDI) language. Additionally, while the technology was innovative and, in many cases, had been developed specifically for a particular work by the technicians/musicians involved in the projects, the dancer (there was usually only one) was restricted in terms of the available dance space. This restriction came from the need to focus the camera on one particular field in order to provide specific hot spots that the dancer would use to activate MIDI sequences or video effects.
The Challenge
In addition to my interest in interactive composition and performance, my preference for audio sources has always been real sound processed to obtain the desired musical effect rather than patches or performances found on various synthesizers or MIDI-capable sound modules. My previous music had predominantly involved sampled audio files manipulated and arranged in digital audio sequencers such as Sound Designer II, Session Eight, Pro Tools, or Studio Vision Pro. This approach (using real-world, concrete sounds manipulated through analog and/or digital techniques) presented intriguing possibilities. Would it be possible for the dancers to initiate and, more importantly, control digital audio files during the dance? Would we be able to expand the dance floor to enable more than one dancer to interact with the files, or to provide space for suitable improvisation? If we found equipment and software to accomplish these desired elements, would we be able to afford it? As these questions were considered, a decision was made to commit to the use of digital audio in the proposed work.
We began to search for programs that would allow the dancers to activate, deactivate, and manipulate audio using physical gestures associated with the dance. It was hoped that the dancers, once they realized they could initiate and affect the sound, would begin to shape their movement to interact with the sound they were creating. Additional concerns arose due to limited funding for the musical part of the project. The chosen software application had to be compatible with the Macintosh platform and, ideally, be able to work with or increase the sophistication of software already on hand. This capability would eliminate the need to learn more than one program and reduce the funds required to purchase equipment. A reasonably priced video capture system that would be compatible with the existing computer and software was desired. Additionally, the system and its software should work with relatively inexpensive cameras, contain its own analog-to-digital converter, and present a quality image to enable critical extraction of motion data.
The Application
After researching the available systems and software, the Very Nervous System III seemed the most likely application for the proposed project. This hybrid system, developed in the early eighties by David Rokeby, used moderately priced video cameras and dedicated external digitizers to capture, convert, and extract motion information from live video. Additionally, it required the Macintosh, our platform of choice, to perform its functions. I contacted Mr. Rokeby, who informed me that he had just developed a new, software-only version named softVNS that provided most of the functionality of the hybrid VNSIII. There were, of course, advantages and disadvantages to the new version. Where the VNSIII system required the external converters to receive and digitize video, softVNS used the new ATI Technologies Xclaim VR 128, an internal PCI card, to capture and convert video. Use of a video capture card that fit in an available internal slot of the Macintosh would eliminate the transport of additional equipment to the performance site. On the other hand, the VNSIII hardware system provided support for multiple cameras while softVNS supported only one video input, a limitation of the Xclaim VR 128 card. Additionally, the softVNS software did not support multiple Xclaim VR 128 cards. An extremely inviting aspect of both VNSIII and softVNS was the fact that they both employed Opcode's MAX as the basic software system. MAX, already installed on the computer and equipped with the MSP object-set, was capable of providing the desired real-time audio processing.
The Piece
In preparatory meetings, an overall theme began to emerge for the project. The underlying "script" involved several sections. First, a street person would be introduced to the audience through some initial activity. After that activity she would fall asleep and begin to dream. As the dream progressed she would enter the dream, first as an observer and then as a full participant. Finally, upon awakening, she would return to her real situation and slowly exit the stage. Due to the clearly defined sections of the projected piece and the implied street-life orientation, the initial working title, CityScapes, quickly became StreetScenes. The varied levels of simultaneous activity that might occur during the work's unfolding called for a multi-level electronic approach allowing several different audio events to be mixed at a summing stage. To provide a cohesive work, it was determined that a sound backdrop would have to be employed. Example 1, below, shows the master patch for StreetScenes, Master SSv1-Mix. This patcher provided a summing stage for vocal input (the Sscenes1 object), the audio backdrop (the Prerecorded object), and the dancers' input (Dancers 1). It also timed the vocal input's contribution to the work.
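For readers more comfortable with text than with patch cords, the idea of a summing stage can be sketched in a few lines of Python. This is only an illustration of the concept, not a transcription of the Master SSv1-Mix patcher: the function name `mix`, the per-source `levels`, and the use of plain lists of samples are all assumptions made for the example.

```python
# Illustrative sketch only: a three-input summing stage.
# The names and the per-source level controls are hypothetical;
# the actual mixing was done by MSP objects inside the MAX patcher.

def mix(vocal, backdrop, dancers, levels=(1.0, 1.0, 1.0)):
    """Sum three equal-length blocks of samples, each scaled by its own level."""
    if not (len(vocal) == len(backdrop) == len(dancers)):
        raise ValueError("all three sources must have the same block length")
    vocal_level, backdrop_level, dancers_level = levels
    return [
        vocal_level * v + backdrop_level * b + dancers_level * d
        for v, b, d in zip(vocal, backdrop, dancers)
    ]


# Two-sample blocks from each source, mixed at unity gain.
mixed = mix([0.5, 0.0], [0.25, 0.25], [0.0, -0.5])
```

Keeping a separate level for each source mirrors the reason for a summing stage in the first place: the vocal input, the backdrop, and the dancers' material can each be balanced independently before they are combined.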
The Dancer 1 object in the MasterSSv1 window (Example 1) represents the patcher that contains the softVNS object and provides control for the audio files the dancers would initiate and affect. As shown in Example 2, the softVNS object consists of an input at the top left of the box and eight outputs that provide the indicated information.
Example 1. Master for StreetScenes, v1.

Example 2. The softVNS object with input and output routings identified.

The actual working state of the softVNS object (in this case, as part of the MAX StreetScenes v1 patcher) is shown in Example 3. The sensitivity of the softVNS object is adjusted using the "sense $1" message box and a MAX number box to set the level. The total motion value extracted from the incoming video is obtained from output one of the object. The "unpack" object separates the extracted motion data into six separate regions. In StreetScenes v.1 only four regions of the six were monitored. Due to the enormous amount of data extracted from captured video, and the probable high rate of information turnover, a softVNS system "smooth" object was used to filter the data rate and slow the effect on the audio. Finally, before sending the data to the objects that played the sound files, the smoothed data was reduced using the division object to bring numbers into a usable range.
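The chain just described (unpack six regions, keep four, smooth, divide) can be approximated in Python as a rough sketch. Several things here are assumptions rather than facts about softVNS or MAX: the smoothing is modeled as a simple one-pole low-pass filter, which may differ from what the "smooth" object actually does, and the names `unpack_regions`, `Smoother`, and `process_frame`, as well as the smoothing factor and divisor values, are hypothetical.

```python
# Illustrative sketch only: unpack -> select -> smooth -> divide.
# The filter below is an assumed stand-in for the softVNS "smooth"
# object, whose real algorithm is not described in this article.

def unpack_regions(packed):
    """Split one frame of motion data into six per-region values."""
    if len(packed) != 6:
        raise ValueError("expected six region values")
    return list(packed)


class Smoother:
    """One-pole low-pass filter: moves a fraction of the way to each new value."""

    def __init__(self, factor=0.2):
        self.factor = factor  # 0 < factor <= 1; smaller means slower response
        self.value = 0.0

    def step(self, x):
        self.value += self.factor * (x - self.value)
        return self.value


def process_frame(packed, smoothers, divisor=4.0):
    """Return scaled, smoothed values for the four inner regions only."""
    regions = unpack_regions(packed)
    inner = regions[1:5]  # the two outer regions are left unconnected
    return [s.step(v) / divisor for s, v in zip(smoothers, inner)]


# One smoother per monitored region, fed a single frame of region data.
smoothers = [Smoother(factor=0.5) for _ in range(4)]
controls = process_frame([100, 80, 0, 40, 8, 100], smoothers, divisor=4.0)
```

Because each smoother keeps its own running value, a sudden burst of motion in one region raises that region's control value gradually over several frames instead of jumping, which is the slowing effect on the audio that the patch was after.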
Example 3. SoftVNS portion of the Dancers1 patch.

Motion extraction using the softVNS object is configured by double-clicking the object itself. As seen in Example 4, when you initially enter the object you have a number of choices, defined by the tabs at the top of the dialog box. The Presets tab provides a variety of grids, a choice of a number of rows or columns, the opportunity to load and save setups, and an option to move immediately to the head tracking process. The Process tab allows the user to select from Presence, Motion, Light Levels, or RGB processes. Creating the map that will provide trigger areas for specific events is accomplished under the Map tab. There, the user can either select a simple grid or freely draw regions to selectively separate data. The ability to adjust the incoming image for the best resolution and use either the NTSC or PAL system is provided under the Image tab.
In Example 4, we see the available Map tab selections. The image display window reflects a simple grid of one row and six columns. This particular configuration was used for StreetScenes v.1. All columns were active when softVNS was turned on, but in StreetScenes v.1 only the four inner regions were monitored for motion data. This is reflected in Example 3 by the two unconnected outputs at either end of the "unpack" object. Although there was no adverse reaction in StreetScenes v.1, this situation is noted due to its processor impact (shown just below the image display window in Example 4). Motion data from the two unmonitored regions was still captured, and the processor was still taxed, even though that data was not used in calculations.
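The processor cost of unmonitored regions is easier to see in a small Python sketch. It assumes that motion is measured by summing the absolute difference between two successive grayscale frames, which is a common technique but is not confirmed here as the method softVNS uses; the function name `region_motion` and the tiny test frames are hypothetical.

```python
# Illustrative sketch only: per-region motion on a 1-row, 6-column grid.
# Frame differencing is an assumed technique, not the documented
# softVNS algorithm. Frames are lists of rows of grayscale values.

def region_motion(previous, current, columns=6):
    """Sum absolute pixel differences within each column region."""
    height = len(current)
    width = len(current[0])
    totals = [0] * columns
    for y in range(height):
        for x in range(width):
            region = x * columns // width  # which column this pixel falls in
            totals[region] += abs(current[y][x] - previous[y][x])
    return totals


# A 12-pixel-wide, 2-pixel-high frame: each region is two pixels wide.
previous = [[0] * 12 for _ in range(2)]
current = [row[:] for row in previous]
current[0][3] = 10   # movement inside region 1 (monitored)
current[1][11] = 5   # movement inside region 5 (not monitored)

all_regions = region_motion(previous, current)
monitored = all_regions[1:5]  # only the four inner regions are used
```

Every pixel is visited and all six totals are computed before the outer two are discarded, so leaving an output unconnected saves nothing at the extraction stage. Reducing the load would require changing the map itself so that the unused areas are never measured.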
Captured motion data in StreetScenes v.1 was routed to four sound stacks to handle the dancers' sound production. Example 5 shows the "sfplay~" object responsible for playing designated sound files. Also shown is the processing stack that applied incoming motion data to control the amplitude of the audio sent to the two MSP "*~" amplifiers from "sfplay~."
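In text form, the amplitude stack amounts to turning a motion value into a gain and multiplying both audio channels by it. The Python sketch below illustrates that idea under stated assumptions: the 0 to 127 control range, the linear mapping, and the names `control_to_gain` and `apply_gain` are hypothetical and are not taken from the Dancers1 patcher.

```python
# Illustrative sketch only: motion value -> gain -> stereo multiply.
# The control range and the linear mapping are assumptions; in the
# patch this role was played by two MSP multiply objects.

def control_to_gain(value, max_value=127.0):
    """Clamp a scaled motion value and map it linearly to a gain of 0.0 to 1.0."""
    clamped = max(0.0, min(float(value), max_value))
    return clamped / max_value


def apply_gain(left, right, gain):
    """Scale both channels of a stereo block by the same gain."""
    return [s * gain for s in left], [s * gain for s in right]


# No motion silences the file; full motion passes it at full level.
silent = control_to_gain(0)
full = control_to_gain(127)

# A two-sample stereo block attenuated to half level.
left_out, right_out = apply_gain([1.0, -0.5], [0.5, 0.25], 0.5)
```

Clamping before the division keeps an unexpectedly large burst of motion from driving the gain above 1.0, which is one reason the patch reduced the raw numbers to a usable range before they reached the amplifiers.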
Audio files were routed to the "sfplay~" objects in three sets of four files that were sent on the composer's initiative and controlled by the "sfctrl~" object. Example 6 contains the area of the Dancer1 patcher responsible for this activity.
Example 4. The Setup Window.

Example 5. The "sfplay~" object and accompanying amplitude stack.

The Performance
StreetScenes v.1 received three performances during the Spring 2000 Annual Seasonal Performance of the CORE Concert Dance Company. While it is impossible to provide specific examples of the resulting installation in a written document, a QuickTime file that uses excerpts of the performance to demonstrate the use of softVNS is located on the Dance Center for New Music web site.
Example 6. The audio file section of the Dancer1 patcher.

Conclusions
SoftVNS is an extremely adaptable, flexible, and powerful tool for capturing and applying motion data to real-time digital audio processing. The program comes with an informative set of examples, a supportive manual, and a painless installation process. While the principal activity in StreetScenes v.1 involved initiating a sound file and then controlling its amplitude, extending the data's influence to control timbre, pitch, and/or location should be relatively easy. SoftVNS is under constant development as David Rokeby uses the program for his own installations. Currently, a beta version of softVNS supports multiple cameras as long as additional ATI Xclaim VR 128 cards are available. The final version of update 1.10 promises increased color support and, possibly, support for additional video capture cards. Version 2.0, already in development, promises increased flexibility through a modular software design.
Related Web Sites:
http://www.interlog.com/~drokeby/vnsII.html Provides historical and technical information concerning the evolution of the Very Nervous System.
http://www.interlog.com/~drokeby/softVNS.html Provides basic softVNS information, an online version of the softVNS manual, and upgrade information.