Bounce Metronome - Visual & Auditory Synchronization

Robert Walker, a.k.a. "Robert the Inventor," developed a graphics program called Bounce Metronome that produces both an auditory cueing system similar to a click track and a visual cueing system using digital video. Bounce Metronome offers a selection of vectored shapes that mimic a conductor's baton for the visual cueing system. The visual cues, which can vary from a bouncing ball to a cone-shaped baton, articulate over horizontal planes and along elliptical paths. Different metronomic patterns can be displayed at different tempi and changed almost without limitation. Bounce Metronome is a flexible and nimble product, and I have performed numerous tests with this program for application to polytemporal music.

Since the software program can output video (avi wrapper, NTSC format at 30 fps), my strategy was to use Bounce Metronome to "conduct" each part of a polytemporal composition. The various instrumental parts would then be synchronized using a couple of different strategies I describe below and on the Network Video page.

It is assumed, for the purpose of this application, that the audio click track produced with each rendering of a Bounce Metronome project file would be discarded. I have incorporated the audio click track into the examples below only as a means of testing the accuracy of Bounce Metronome's visual component. It is the application of Bounce Metronome's visual stimuli that favors its adoption for use during the performance of polytemporal music.

I would also point out that the shape of the conducting path can be modified in Bounce Metronome. In most tests shown on this page, I superimposed an elliptical conducting path with a visible circumference because that path uses the least amount of horizontal space. However, in the video shown above, I've confined the path to a more traditional, horizontal plane, which is the path most easily recognized by musicians. The tempo dial and the readout of the elapsed time would be excluded from a final render as they provide no useful information to the performer.

For polytemporal composers using the Elapsed Time with a Stopwatch approach, or for those using an audible Click Track to synchronize the parts, the display of elapsed time and the measure count demonstrated here might be helpful. For composers using the Network Video approach, such as myself, the tempo readout dial and elapsed time would be eliminated during a final render as they are distracting for the performers. These elements are included here simply for testing purposes.

Playback Configurations

The overall approach is to assign a visually conducted graphic display of each part (tempo map) to its own area of the video screen. For example, when composing for a string quartet, I produce four different videos, each containing a separate tempo map appropriate for that part. After rendering each part to its own video file, I have a couple of options for constructing the final, visual playback format.

Option 1-A:

Under Option 1-A, I could combine the four separate videos onto a single split screen where each part occupies one quadrant, as shown above. All of the performing musicians would watch one large monitor or projected image and follow the moving baton in the quadrant that represents their instrument. Personally, I've never tried this option, but technically it's possible and rather simple to set up.
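If you want to assemble that quadrant layout yourself, one way (my own suggestion, not a feature of Bounce Metronome; the file names are hypothetical) is to tile the four rendered part-videos with ffmpeg's xstack filter. The Python below merely assembles the command line:

```python
# Build an ffmpeg command that tiles four part-videos into a 2x2 split screen.
# Requires an ffmpeg build with the xstack filter; file names are hypothetical.
inputs = ["violin1.avi", "violin2.avi", "viola.avi", "cello.avi"]

cmd = ["ffmpeg"]
for f in inputs:
    cmd += ["-i", f]

# Quadrant layout: top-left, top-right, bottom-left, bottom-right
layout = "0_0|w0_0|0_h0|w0_h0"
cmd += ["-filter_complex",
        "[0:v][1:v][2:v][3:v]xstack=inputs=4:layout=" + layout + "[v]",
        "-map", "[v]", "quad.mp4"]

print(" ".join(cmd))
# To execute from Python: subprocess.run(cmd, check=True)
```

Each player then watches only the quadrant belonging to their instrument, as described above.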

A "beat number" has been added to the tip of the baton; such features are optional elements available in Bounce Metronome. There are other ways to impart more precise information about the progress of the instruments during the construction phase, but this rendering includes all that is needed for a usable performance configuration.

Option 1-B:

To avoid the problem of positioning a large monitor within view of all players, a smaller monitor can be assigned to each player and attached to their music stand. In this case, the monitor could be a tablet or mobile phone (provided you have access to the video input of that device), a laptop, or just a small separate monitor of the kind typically used in video production. This configuration would use the same video file as Option 1-A. To better convey the concept of this option, a diagram is shown on the right and can be downloaded here. This configuration is accomplished by splitting a single video output into four separate monitor display lines, each of which feeds a separate monitor attached to a music stand or platform. Each player would be watching the same split-screen video, but a piece of cardboard would be placed over the screen to cover the three views not used by that particular player. Splitting monitor feeds is a relatively cheap and common practice, and I have used this technique where set-up time is at a minimum. However, I caution against using mobile phones as monitors: dividing the screen into four quadrants produces an image too small for practical use. Instead, a small 15" monitor (measured diagonally) with a 16:9 aspect ratio produces acceptable results.

Option 2:

In this variation, a Bounce Metronome file is created for each instrument in a string quartet and rendered to a separate video file. Those four video files are then played back using four computers, which are synchronized over an ethernet network. The layout for this option is shown on the right and can be downloaded as a pdf here.

This option represents the Video Network approach as described on a different page of this website. In the Video Network approach, each musician has their own computer and monitor (or similar playback device, such as a tablet or laptop) which is hardwired (avoid wireless connections) to an ethernet network, referred to in the IT world as a LAN. There are no split screens in use, as only that player's part appears on their own monitor, driven by their own computer. In this approach to synchronization, a master computer (think of adding one more computer to the configuration) is used to control the other four computers. In other words, the four computers chase the timing dictated by that one master computer. The master computer also starts and stops all four computers at precisely the same time. (In actual use, one of the four computers can be designated as the master, which eliminates the need for a separate, fifth computer.)
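As a rough illustration of the master/chase relationship (my own minimal sketch, not the protocol of any actual video-sync product), the master can broadcast a single agreed-upon start time, and each playback computer then counts down to that instant on its own clock:

```python
import socket
import struct
import time

def broadcast_start(sock, addr, delay=0.5):
    # Master: pick a start time slightly in the future and send it out
    start_at = time.time() + delay
    sock.sendto(struct.pack("!d", start_at), addr)
    return start_at

def receive_start(sock):
    # Player: receive the scheduled start time, then wait for that instant
    data, _ = sock.recvfrom(64)
    (start_at,) = struct.unpack("!d", data)
    return start_at

# Loopback demo: one machine standing in for both master and player
player = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
player.bind(("127.0.0.1", 0))
master = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)

scheduled = broadcast_start(master, player.getsockname())
received = receive_start(player)
assert received == scheduled  # every player counts down to the same instant
```

A real system would also keep the clocks aligned during playback, for example with periodic timestamps, not just at the start.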

The terms slave and master are the same words used to identify the relationship between networked computers in the IT world, but the actual protocol is a little different. This synchronization scheme used in a Networked Video configuration was perfected during the era of analog video, which reaches back about 30 years. Even the word "sync" has a different meaning in these two fields.

With reference to the Option 2 configuration, Bounce Metronome does not have the capability to link all of the performers' computers together via an ethernet controller. Bounce Metronome could, however, be used to create the four independent video files which would eventually be shown over the ethernet video network. The advantage of the Video Network approach lies in its capability for unlimited expansion and the flexibility it offers composers for being able to accommodate almost any requirement they might conceive.

I have not produced a Networked Video project where Bounce Metronome was used to create the complete and final synchronizing video files. Instead, I have always conducted the visual cues myself, recording each part separately and then forming them into an Option 1 or Option 2 configuration. When conducting each part, I follow the printed score and listen to the click track from the midi file for that instrument. When I tried using Bounce Metronome for this phase, I found myself spending too much time programming each part in the many sub-menus of Bounce Metronome. This is likely a comment on my own shortcomings; someone else might have a much shorter learning curve. Once I started composing with nested ramps and disassociated tempi and time signatures, it was much quicker to simply conduct the parts myself.

Because Networked Video was the most flexible system to work with when I started exploring polytemporal composition, it has become my preferred approach to synchronizing polytemporal compositions. If Robert Walker ever changes his coding to allow midi input, his software would be a welcome component in the field of polytemporal music. As you can probably imagine, he has been asked this midi-input question so many times that the response is now published on his website. More on this topic can be found below and on the Research - Visual Graphics for Synchronizing page.

So . . .

What would the visual synchronizing program for an Option 1 project generated by Bounce Metronome look like? I have performed several tests along this line, a small sample of which are included herein. Robert Walker has an example of a phasing program on his own website, although it was not designed to be used as a synchronizing system for live performance.

The image above is a screen capture from Pro Bounce Metronome where two instruments are involved in the phasing technique. The aspect ratio of the finished product is a standard 16:9. The program was constructed with a 12/8 time signature, but it is conducted in 6/4. The program is nothing more than a repetition of 16-measure tempo ramps with 8 measures of acceleration followed by 8 measures of deceleration. (See the section on Tempo Maps for an understanding of variables used in constructing tempo ramps).

In this program, at the end of a 16-measure tempo ramp, the accelerating part would be precisely one eighth note ahead of the non-accelerating part. You can also see that I have stripped some of the informative elements from Bounce Metronome's output, e.g. the tempo dial, to simplify the design and allow the two finished renderings - one for each instrumental part in the score - to be combined onto a split screen.
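As a back-of-envelope check of that one-eighth-note claim, using my own simplified model -- tempo linear in time, which may not match Bounce Metronome's internal formula -- the ramped part's gain over the steady part is the ramp duration times half the peak tempo difference, so the peak tempo needed to gain exactly one beat falls out directly:

```python
# Simplified model: the ramped part rises linearly from B0 to Bpeak and back,
# so its average tempo over the whole ramp is (B0 + Bpeak) / 2.
# Beats gained over the steady part = ramp_minutes * (Bpeak - B0) / 2.
def peak_bpm_for_one_beat_gain(b0, ramp_minutes):
    # Solve ramp_minutes * (Bpeak - b0) / 2 == 1 for Bpeak
    return b0 + 2.0 / ramp_minutes

b0 = 72.0                   # hypothetical steady tempo in beats per minute
ramp_minutes = 16 * 6 / b0  # 16 measures of 6 beats at roughly b0
print(peak_bpm_for_one_beat_gain(b0, ramp_minutes))  # 73.5
```

What this toy calculation suggests is how gentle such a ramp can be: phasing by a single beat over 16 measures needs only a modest rise above the steady tempo.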


Unlike the screen capture image of the Option 1 program, the rendered video in the above example uses 8-measure tempo ramps rather than 16-measure ramps. By turning off the audio click track, you and a friend can play along with the .mp4 video above to test the performance of Bounce Metronome's visual cueing system and feel what it's like to play a "phasing tempo ramp." You can play any rhythmical pattern you want, or use the pattern Reich used in "Clapping Music." Depending on the streaming speed of your internet connection and our server, it may be best to download the video and play it from your own computer.

A title is incorporated at the beginning for just a few seconds because, after a while, all of these variations start to look alike. A two-measure count-in has been added to the beginning to assure a clean start. You will notice the Measure Count does not advance until the two count-in measures are completed.

The Devil's Details

The above screen capture comes from a two-part composition where the ictus is clearly visible in both parts. When constructing tests of Bounce Metronome, I often chose a 6 beat conducting metric just to complicate the image and see if the baton was easy to follow. I also added the beat numbers to the end of the baton along with an ictus for each beat, both of which are shown here. This screen capture demonstrates how tricky the interpretation of a visual cueing system can be. The Accelerating Part on the right has reached the 1st Beat of the 17th Measure while the Non-Accelerating Part on the left has just reached the start of the 6th Beat of Measure 16. This does not mean the Accelerating Part is now 1 full measure ahead of the Non-Accelerating part, because the Start of the 17th Measure is actually the End of the 16th Measure. Therefore, the Accelerating Part is only 1 Beat, i.e. one eighth note in 12/8, ahead of the Non-Accelerating Part, which is the desired effect for this formula. If you were to listen to the audio track at this point, you would hear the completed formation of a new rhythm according to the "phasing technique" as defined by Reich.
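The beat counting in that paragraph can be verified directly. Assuming the 6-beat conducting metric described above, a part that has just reached a given beat of a given measure has completed the following number of beats (the helper function is my own, not part of Bounce Metronome):

```python
BEATS_PER_MEASURE = 6  # the 6-beat conducting metric used in this test

def beats_completed(measure, beat):
    # Beats fully elapsed when the baton has just reached `beat` of `measure`
    return (measure - 1) * BEATS_PER_MEASURE + (beat - 1)

accel  = beats_completed(17, 1)  # start of measure 17 == end of measure 16
steady = beats_completed(16, 6)  # just reaching beat 6 of measure 16
print(accel - steady)  # 1
```

The difference is one beat, not one measure, exactly as described: the start of the 17th measure is the end of the 16th.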

The image above and the following image are screen captures from a video editing program showing side-by-side (i.e. split-screen) comparison of two Bounce Metronome files involved in the same two-part composition. The image above was taken at Measure 25 while the image below was taken at Measure 145. Any number of things could have taken place between the first and final measures, including tempo ramps of different lengths designed to "phase" the two parts as well as some disassociated tempi and time signatures applied to one or both instruments.

What we see in the screen capture above is that the program has reached Measure 25, where the "accelerated" instrument on the right is now precisely 1 full beat ahead of the "steady" instrument on the left. If you were looking at the score for both parts, the bar lines would no longer line up. (In other words, the instrument on the right has just reached Beat 2 of Measure 25, while the instrument on the left has just reached Beat 1 of Measure 25.) What's important to notice is that the elapsed time for both instruments is exactly the same: 54.017 seconds. This indicates that Bounce Metronome is working correctly.

The screen capture above occurs at Measure 145 and indicates that the "accelerated" instrument is now a full measure ahead of the instrument on the left, while the elapsed time for both programs is 324.017 seconds. This is what you want to see: verification that your math was correct (assuming this is what you wanted to happen at this point in time), that your entries in the various sub-menus of Bounce Metronome were correct, and that Bounce Metronome itself has performed flawlessly.

In both of these screen capture images, you will notice that the ictus in each part is slightly behind the audio cue associated with that visual. (Or, the audio cues are slightly ahead of the visual cues.) This artifact could suggest a couple of things: 1) Bounce Metronome takes a little longer to display the visual cues than to produce the audio cues; 2) standard video works at 30 frames per second, so you're going to be off by just a hair due to that tolerance. It may be possible to improve this tolerance simply by switching to 60 FPS video. The American composer Milton Babbitt once suggested that the computations and performance notes of his own work needed to be adjusted to this degree of precision.
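Point 2 is simple arithmetic: a visual cue can only land on a frame boundary, so it can sit up to one frame's duration away from the exact audio instant, and doubling the frame rate halves that worst case:

```python
# Worst-case gap between an audio event and the nearest rendered video frame
for fps in (30, 60):
    frame_ms = 1000.0 / fps  # duration of one frame in milliseconds
    print(f"{fps} fps: one frame lasts {frame_ms:.1f} ms")
# 30 fps: one frame lasts 33.3 ms
# 60 fps: one frame lasts 16.7 ms
```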

With reference to video resolution, I haven't found any practical benefit to increasing the video resolution above 720p, and that stipulation would apply to Options 1 and 2 cited earlier. However, with reference to frames per second, there may be a benefit to upgrading to 60 FPS. Unfortunately, this upgrade increases equipment costs and file size, which further burdens the synchronizing demands of Networked Video as a production platform. From my own experience, I can see that recording and rendering at 60 FPS would help in the perception of baton movement, whether that baton is produced by a vectored object, as in Bounce Metronome, or comes from a video camera and editing system pointed at a live conductor. At the moment, since I am still in the experimental phase, the costs of moving to 60 FPS are tough to justify. But if you have the money and the hardware capacity to work at 60 FPS, do so and let me know what happens.

DAW Dodging

It tells you something about the state of composing with DAWs (digital audio workstations) that only two commercial products (Logic and Ableton Live) have decided to at least acknowledge the demands of polytemporal music by interfacing with VPL (Visual Programming Language) products such as Max/MSP. But this interfacing doesn't involve the compositional functions at the heart of their DAW programs, like the piano roll and its sub-menus. I know of no DAW program that provides for the independent adjustment of tempo for each track in a composition constructed with its own software, regardless of whether those tracks consist of modified .wav files or midi code. Instead, the DAW industry has been moving in the opposite direction: quantization, i.e. making those bar lines line up, forcing those tempi to match, eliminating any variation in tempo so measures can be duplicated ad infinitum (looping); and sampling, i.e. .wav files stolen and reused, hijacked and seamlessly woven into new compositions, a practice embraced and supported under the misnomer of "creativity." "Celebrity with a laptop" is the way the trend is described. Accommodating the quest for quantizing and sampling has been the demand up to this point, and the manufacturers have willingly obliged. And why shouldn't they? Since when is it their role to enforce the ethical and artistic standards of their users?

Some time ago I discussed the lack of polytemporal applications with the representative of a large music software company, and he quickly pointed out the lack of market demand for such a product. He was right, of course. There is little monetary incentive for the major players to make an investment in this obscure, esoteric and elitist market. If there has been any movement in supporting the evolution of polytemporal music, it's been in electronically producing/rendering/distributing polytemporal music, not in composing and performing live examples thereof.

Even the top notation software companies like Sibelius, Finale and Dorico have avoided polytemporal music because the math gets too complicated so quickly. How can you have independent parts represented in a master score where the bar lines don't line up and the elapsed time is different for different parts? To what extent can you continue to make tuplet relationships mathematically rational? To satisfy the demands of polytemporal music, you'd have to have X number of variable clocks, each assigned to a separate instrumental part or track, all working under a master clock for synchronization. How can you compose AND playback something with independently adjustable tempi assigned to each part while generating something that represents a master score?
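The "X variable clocks under a master clock" requirement can be sketched in a few lines (the data structure and names below are mine, not any DAW's or notation program's API): each track owns its own tempo map that converts master elapsed time into that track's beat position.

```python
# Sketch: many per-track clocks driven by one master elapsed time.
class TrackClock:
    def __init__(self, segments):
        # segments: list of (start_sec, bpm); each bpm holds until the next entry
        self.segments = sorted(segments)

    def beats_at(self, t):
        # Convert master elapsed time t (seconds) into this track's beat position
        beats = 0.0
        prev_t, prev_bpm = self.segments[0]
        for start, bpm in self.segments[1:]:
            if t <= start:
                break
            beats += (start - prev_t) * prev_bpm / 60.0
            prev_t, prev_bpm = start, bpm
        return beats + (max(t, prev_t) - prev_t) * prev_bpm / 60.0

fast = TrackClock([(0, 60), (10, 90)])  # jumps to 90 BPM after 10 seconds
slow = TrackClock([(0, 60)])            # steady 60 BPM throughout
print(fast.beats_at(20), slow.beats_at(20))  # 25.0 20.0
```

At the same master time the two tracks sit at different beat positions, which is exactly why bar lines stop lining up in a master score.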

I have to acknowledge the slow infiltration of VPL (Visual Programming Language, e.g. Max/MSP and PureData) software programs into the field of polytemporal composition and performance. Such products have been championed by students and music departments in academia for at least a decade -- since they get the educational/student discounts from Apple, and the cost of research is partially absorbed by tuition and the pursuit of master's and doctoral dissertations. Hardware/software companies like ProTools, which presents itself as an "industry standard" for digitally recording and editing audio of all kinds, have shown no interest in the creation of polytemporal music itself. Just being compatible with the final file format is enough for them. How can you blame them? When was the last time something developed for classical music and jazz became a profitable investment? After all, haven't we moved on from atonal abstractionism? Isn't minimalism new enough? Why do we have to keep introducing something we can't aesthetically understand or appreciate? Who ordered the invention of deliberate confusion and the perfection of unwanted chaos? Since when is cacophony art?

Almost Bounce Metronome

The most discouraging aspect of using Bounce Metronome as a platform for synchronizing polytemporal music lies in the inability to use midi as a source for constructing/inputting the varying tempi of each part. Being able to input midi would expedite the construction of the visual cueing system enormously. Unlike digital audio (.wav, .mp3, .ogg, FLAC, AAC, ALAC), midi files are small and malleable to changes in tempo. You can vary the tempo of midi files without affecting the tonal qualities of any sound and with a minimum amount of computer code, because midi files do not capture/sample an actual sound and store it as a .wav file. It's only been in the last few years that you've been able to change the timing/tempo of an audio track without affecting the pitch and quality of that sound, and even then the degree of modification has its limits. Digital audio files are much larger than midi files and are subject to all kinds of speed and bandwidth limitations. They consistently stress the computer's processing power and demand much more memory. And if you happen to compose on a notation platform, e.g. Sibelius, Finale or Dorico, extracting midi files from the score is easily accomplished.
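The malleability of midi comes from how the Standard MIDI File format represents tempo: a set-tempo meta event stores microseconds per quarter note in three bytes, so retempoing a part means rewriting a few bytes rather than resampling audio. A quick illustration:

```python
def bpm_to_midi_tempo(bpm):
    # Standard MIDI Files store tempo as microseconds per quarter note
    return round(60_000_000 / bpm)

def midi_tempo_bytes(bpm):
    # The set-tempo meta event carries that value as a 3-byte big-endian field
    return bpm_to_midi_tempo(bpm).to_bytes(3, "big")

print(bpm_to_midi_tempo(120))  # 500000 microseconds per quarter note
print(midi_tempo_bytes(120))   # the three bytes 0x07 0xA1 0x20
```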

Nevertheless, to test the viability of using Bounce Metronome for tempo ramps with disassociated time signatures and tempi, I started inputting the varying tempi for each part of a simple duet using Bounce Metronome's scripting process. While examining the rendered files, I began to notice some peculiarities in the results. I must admit that the anomalies I found are small, almost imperceptible unless you're looking for them, with the largest offset to the predicted synchronization being something in the neighborhood of 1/2 second. Since I have no insight into the structure of the Bounce Metronome program, my reasoning as to what might be causing these anomalies could be far off. Regardless, I found that after a few minutes of use, there can be a delay of approximately 17-19 frames (SMPTE time code at 30 fps) that sporadically occurs when a tempo ramp starts, stops or changes direction. This delay seems to be reconciled later on in the program as the video output continues. In other words, there seems to be a "catch-up" phase which rectifies small offsets during the tempo ramping.

In trying to understand what was happening, and considering my novice grasp of the inner workings of Bounce Metronome, I could only surmise that the program constructs its acceleration and deceleration tempo maps on a measure-by-measure basis. That is, adjustments to tempo may be made only at the end of full measures rather than continuously throughout the measure (or even between beats), producing a "stair-step" effect during the execution of the ramp. Regardless of the cause, my research found that at the end of a tempo ramp, or at a point where the ramp changes from acceleration to deceleration, there is a short lag in the video display, as if the program were making corrections for any accumulated delays -- possibly caused by the "stair-stepping" effect.
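To get a feel for how much a measure-by-measure stair-step could matter, here is a small simulation -- my own model with hypothetical numbers, not a measurement of Bounce Metronome. It runs the same linear ramp, from 60 to 90 BPM over 8 six-beat measures, once with the tempo updated every beat and once with the tempo updated only at barlines:

```python
# Hypothetical ramp: 60 -> 90 BPM over 8 measures of 6 beats each
BEATS, MEASURES = 6, 8
BPM0, BPM1 = 60.0, 90.0
total_beats = BEATS * MEASURES

def beat_len(bpm):
    return 60.0 / bpm  # seconds per beat

# Smooth ramp: tempo re-interpolated at every beat
smooth = sum(beat_len(BPM0 + (BPM1 - BPM0) * i / total_beats)
             for i in range(total_beats))

# Stair-step: tempo only updated at the start of each measure
stepped = sum(BEATS * beat_len(BPM0 + (BPM1 - BPM0) * m / MEASURES)
              for m in range(MEASURES))

print(f"smooth ramp: {smooth:.3f} s")
print(f"stair-step:  {stepped:.3f} s")
print(f"offset:      {(stepped - smooth) * 1000:.0f} ms")
```

In this toy case the barline-only version finishes almost a second late relative to the per-beat version, so it would indeed need a "catch-up" correction somewhere -- consistent with, though certainly not proof of, the behavior described above.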

The cause of this anomaly is purely a guess on my part. The inconsistent appearance of the phenomenon is also puzzling. Is there something wrong with the configuration of my computer? Is there something incorrect in the construction of the scripts I entered into Bounce Metronome?

I haven't yet mentioned that Bounce Metronome allows for the construction of "straight/linear" and "curved" tempo ramps. In a "straight" tempo ramp, the acceleration and deceleration are at a constant rate, i.e. the shape of the ramp is linear, or appears to be linear taking into account the stair-stepping effect described above. In a "curved" ramp, the beginning acceleration or ending deceleration speeds are bent along a curve at one end of the ramp. I tested both "straight" and "curved" ramps and observed the same "catching up" phenomenon described earlier. This "catching-up" action produces a subtle "hiccup" or delay in the movement of the conductor's baton that can be observed after the file is rendered.

It may be that at the tempi I was testing, the hiccup was more noticeable. Or it may be that my computer (quad-core Xeon with 12 GB of RAM) was just not fast enough to handle the code, or that I had not adjusted the memory buffer in my RAM or on my video card to compensate for the "catching-up." I may also have been able to compensate for those hiccups by stretching or contracting (called warping in DAW lingo) the rendered .avi video file of each part to achieve a smoother and more precise result. And, considering all the nuances involved in building a visual cueing system, I may be describing a phenomenon that has no significant impact when used for synchronizing polytemporal music files.

In my search for a solution to synchronizing polytemporal music, I contacted several software developers in the field of video synchronization, and each told me that part of the problem lies in the way Microsoft's Operating System handles video. Everyone agreed that programming video with Apple's OS was much easier. The more time I spent with Bounce Metronome the more I realized how much processing speed was required for such a graphic endeavor, even with today's powerful computers. The program seeks to achieve "zero latency," which, as we know, does not actually exist. What we're really after is the absence of any "perceptible delay." I used to think "perceptible delay" was a line somewhere around 1/30th of a second, but now I believe 1/60th of a second is a better goal. That degree of video accuracy is achievable with today's equipment, but at a much higher cost.

Bounce Metronome comes with another program called "Tune Smithy," which is described as a fractal tune generator -- an automated process for generating variations in melodic structures. "Tune Smithy" also allows the application of micro tuning in your music, and it includes a program called "Audio Pitch Tracer," which I haven't had time to explore. As such, Bounce Metronome doesn't really need to fill this polytemporal void to be a viable product with many applications. It's just that it's so close.

By the way, I saw that Robert Walker had decided to port his software for operation on a few of Apple's many OS versions. I have no information as to the success of that venture.