Vol. 39 Issue 2 Reviews
The Second International Workshop on Cross-disciplinary and Multicultural Perspectives on Musical Rhythm and Improvisation

The Second International Workshop on Cross-disciplinary and Multicultural Perspectives on Musical Rhythm and Improvisation, October 12-15, 2014, New York University Abu Dhabi (NYUAD), Abu Dhabi, United Arab Emirates. Information about the conference is available at https://sites.google.com/site/nyurhythmworkshop2014/program/.

Reviewed by Andrew Lambert and Florian Krebs
London, United Kingdom and Linz, Austria

The Second International Workshop on Cross-disciplinary and Multicultural Perspectives on Musical Rhythm and Improvisation was sponsored and hosted by New York University Abu Dhabi (NYUAD), United Arab Emirates, and took place in mid-October 2014. Balancing theory and practice, the workshop provided a forum for cross-disciplinary discussion of approaches to musical analysis and creation. The main aim of the workshop was to examine musical rhythm from four engaged viewpoints: music theory, musicology, and ethnomusicology; music modeling through computation; music perception and cognition; and neuroscience. Of the 41 participants who took part in the workshop, 24 were invited to give lectures about their work. The participants came from a variety of research backgrounds, including musicology, psychology, neuroscience, computer science, music information retrieval (MIR), philosophy, and music composition.

The four-day program contained 24 lectures and three breakout sessions, whose topics were defined by the participants during the workshop. The participants discussed topics including: creative collaborations, datasets and evaluation, interfaces for rhythmic creation and transformation, and many more. A highlight of the conference was a concert by saxophonist Barak Schmool, percussionists Akshay Anantapadmanabhan and Gideon Alorwoyie, and computer musicians Jaime Oliver and Gérard Assayag during the first evening.

The workshop covered topics that can be broadly categorized into data and evaluation, psychology and rhythm, automatic description of musical rhythm, computer improvisation and composition, rhythm in non-Western music, and rhythm outside of music. Below, we summarize selected contributions to these categories, omitting the category “rhythm in non-Western music,” which is outside the scope of this review.

Data and Evaluation

Most research in musicology and MIR is data-driven. Choosing the right data for an experiment is crucial and often determines its outcome. Two breakout sessions discussed this topic.

According to Xavier Serra, it is important to distinguish the terms “research corpus” and “test dataset.” A research corpus is created to enable research for a set of problems and therefore is more general purpose than a test dataset, which is created for a specific research question in a restricted context. It was noted that most existing corpora and datasets are not created in a consistent and standardized way. Best practices for generating annotated datasets (Peeters and Fort, 2012) as well as a JavaScript Object Notation (JSON) format for storing multiple annotations (Humphrey et al., 2014) have been published previously.

Apart from data, researchers also need methods to evaluate their systems and theories. This was the topic of a second breakout session. A recent paper assessed the evaluation measures currently used for audio beat tracking tasks and showed that many measures can be improved to better match the perception of human listeners (Davies and Böck, 2014). Additional research is therefore necessary to revisit current evaluation methods and re-evaluate them using meaningful user studies.

Juan Pablo Bello talked about potential problems when using genre-annotated datasets for research on rhythmic similarity. Genre classification is often used as a proxy for rhythmic similarity because it is hard to collect ground truth for the latter. However, a study on the Latin Music Database showed that high genre classification performance does not always translate into high rhythmic similarity. If the variability of the dataset within a genre is too low, genre classification systems will often model rhythmically irrelevant features.

Psychology and Rhythm

The workshop had two major contributions from the fields of psychology and neuroscience. Psychologist Guy Madison presented his work on deterministic multi-level patterns (MLPs) and their uses in rhythm research. MLPs are rhythmic patterns containing multiple metrical levels: hierarchies of binary or ternary subdivisions of time. MLPs are isochronous, containing events at each associated metrical level. Where multiple levels coincide, a louder event is created. Madison has used these patterns in many psychological listening experiments. One such experiment investigated tapping variability in relation to the number of metrical levels (Madison, 2014). Several MLPs were created, each containing a different number of metrical levels. Participants were asked to tap to the pattern, and their micro-timing variations were recorded. The study found that MLPs with more metrical levels reduced timing variability, suggesting an underlying neural mechanism capable of integrating information across multiple levels.
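The construction of such a pattern can be illustrated with a minimal Python sketch. It assumes that each position on the fastest grid carries one event and that its intensity is simply the count of metrical levels coinciding there; the function name `multi_level_pattern` and this counting rule are illustrative assumptions, not Madison's actual stimulus-generation procedure.

```python
def multi_level_pattern(subdivisions):
    """Return per-position intensities for one cycle of a multi-level pattern.

    `subdivisions` lists the binary (2) or ternary (3) split applied at each
    successive metrical level, slowest level first. Every position on the
    fastest grid carries an event (the pattern is isochronous); its intensity
    is the number of metrical levels that coincide at that position.
    This counting rule is an assumption made for illustration.
    """
    for factor in subdivisions:
        if factor not in (2, 3):
            raise ValueError("each subdivision must be 2 or 3")

    # Number of positions on the fastest grid within one cycle.
    length = 1
    for factor in subdivisions:
        length *= factor

    # Period, in grid steps, of each level: the whole cycle first, then finer.
    periods = [length]
    for factor in subdivisions:
        periods.append(periods[-1] // factor)

    return [sum(1 for p in periods if pos % p == 0) for pos in range(length)]


# Three binary subdivisions give four metrical levels over eight positions.
print(multi_level_pattern([2, 2, 2]))  # [4, 1, 2, 1, 3, 1, 2, 1]
```

Dropping entries from `subdivisions` yields patterns with fewer metrical levels, which is the kind of manipulation the tapping experiment varies.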

Edward Large presented his latest research on the neurodynamics of music and the nonlinear resonance model of rhythm perception. Nonlinear resonance models the cochlea and populations of neurons as interconnected banks of resonating nonlinear oscillators (Large, 2010). The oscillators have natural frequencies distributed across a spectrum, and when stimulated by a signal they resonate nonlinearly, producing larger amplitude responses at certain frequencies. These amplitude responses occur at integer ratios to the pulse and can account for pattern completion, tonal relationships, and the perception of meter. Large also took the opportunity to launch his lab’s new MATLAB software toolbox, which implements the nonlinear resonance model in a neural network and incorporates Hebbian learning in the oscillators’ connections.

Godfried Toussaint investigated several mathematical models to explain how human listeners perceive complexity in monophonic rhythmic patterns. He found that the more sub-symmetries a pattern has, the simpler it tends to be perceived (Toussaint and Beltran, 2013). In his experiments, the number of sub-symmetries correlated better with human judgments of complexity than other measures, such as Papentin’s L1 complexity measure.
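As a rough illustration of the idea, the following Python sketch counts sub-symmetries in a rhythm written as a string of onsets ("x") and rests ("."). It assumes that a sub-symmetry is a contiguous segment of at least two symbols that reads the same forwards and backwards; the exact definition and minimum length used in the cited paper may differ, and the function name is hypothetical.

```python
def count_subsymmetries(pattern, min_length=2):
    """Count contiguous mirror-symmetric segments of a rhythm string.

    `pattern` is a sequence such as "x..x..x." (onsets and rests).
    A segment is counted when it has at least `min_length` symbols and
    equals its own reversal. This definition is an assumption made for
    illustration; consult the cited paper for the authors' measure.
    """
    count = 0
    n = len(pattern)
    for start in range(n):
        for end in range(start + min_length, n + 1):
            segment = pattern[start:end]
            if segment == segment[::-1]:
                count += 1
    return count


# A perfectly regular pattern has more mirror-symmetric segments than an
# irregular one of the same length.
print(count_subsymmetries("x.x.x.x."))
print(count_subsymmetries("x..x.xx."))
```

Under this assumed definition, a higher count would correspond to a pattern that listeners tend to judge as simpler.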

Automatic Description of Musical Rhythm

One of the main topics of the workshop was the automatic description and analysis of musical rhythm, which lies at the heart of many MIR systems. Andre Holzapfel presented a system that uses learned rhythmic patterns inside a hidden Markov model (HMM) framework to find beats and downbeats in audio signals. He evaluated the system on a culturally diverse corpus (Turkish, Greek, and Indian music including 5/8, 7/8, 8/8, 9/8, and 10/8 time signatures) and achieved state-of-the-art performance with one unified model (Holzapfel et al., 2014). Nevertheless, exact inference in this model (which explicitly modeled 14 different rhythmic classes) is computationally very demanding, and Monte Carlo methods were proposed to infer the metrical structure in a manageable amount of time.

Anja Volk presented a computational model that assigns a metric weight to each note of a musical piece (Volk, 2008). In this model, a piece is represented via onset times extracted from MIDI data. Pulses of different frequencies are then extracted from the onset times. The more pulses coincide with an onset, the higher its metric weight. This method can be used to analyze the metric characteristics of different genres and was employed to analyze Ragtime rhythms.
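A simplified Python sketch of this kind of weighting is given below. It assumes integer onset times, treats a "pulse" as a maximal chain of at least three equally spaced onsets, and lets each chain contribute its length (number of onsets minus one) raised to a power to every onset it contains. These choices, the function names, and the default `power=2` reflect one reading of the approach and are assumptions rather than the model's definitive specification.

```python
def local_pulses(onsets, min_onsets=3):
    """Find maximal chains of equally spaced onsets (integer times assumed)."""
    onset_set = set(onsets)
    ordered = sorted(onset_set)
    chains = set()
    for i, start in enumerate(ordered):
        for nxt in ordered[i + 1:]:
            period = nxt - start
            if start - period in onset_set:
                continue  # the chain with this period begins earlier
            chain = [start]
            while chain[-1] + period in onset_set:
                chain.append(chain[-1] + period)
            if len(chain) >= min_onsets:
                chains.add(tuple(chain))
    # Discard chains that are entirely contained in another chain.
    return [c for c in chains
            if not any(c != d and set(c) <= set(d) for d in chains)]


def metric_weights(onsets, power=2):
    """Sum, for each onset, the weighted lengths of the pulses passing through it."""
    weights = {onset: 0 for onset in onsets}
    for chain in local_pulses(onsets):
        for onset in chain:
            weights[onset] += (len(chain) - 1) ** power
    return weights


# Onsets on a grid of two, plus one extra onset at time 3.
print(metric_weights([0, 2, 3, 4, 6, 8]))
```

In this example the onset at time 3 lies on only two short chains and so receives a lower weight than the onsets on the main grid.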

Hema Murthy spoke about the recognition of the stroke type in Carnatic percussion music. Her system performs onset (stroke) detection, combining the amplitude of the signal with a modified group delay function. Then stroke recognition is performed using HMMs trained on the isolated strokes.

Computer Improvisation and Composition

There was a large cohort of computer musicians contributing to the workshop. Gérard Assayag and Marc Chemillier, from L'Institut de Recherche et Coordination Acoustique/Musique (IRCAM), presented updates on the OMax software and its offshoot, ImproteK, with a focus on how they handle rhythm. OMax listens to incoming audio and creates a model of the musical structure. When this structure is learned, non-contiguous jumps can be made within it to generate new material. This is known as stylistic reinjection. The result is a real-time interactive musical agent that plays with the same musical “logic” and sound as the performer, a kind of musical clone. OMax does not model rhythm explicitly, but is able to capture rhythmic similarity using structural modeling techniques.

ImproteK follows the same stylistic reinjection model as OMax, but introduces modeling of some rhythmic aspects of the input, namely the beat. The software implements a beat tracker based on Large’s nonlinear resonance model, which helps the system decide when to make a reinjection. During the initial listening phase, ImproteK tracks the beat of the input signal. The software then takes responsibility for leading the beat once it starts producing output. This, Chemillier admits, is a simplification of the complex interaction that occurs in group improvisation.

George Lewis presented an overview of how rhythm is handled within his improvising system Voyager. The system is a complex interaction of multiple agents, each responsible for a small part of the musical whole. The way these different systems interact with each other (and the input signal) results in an emergent rhythmicity that has been fine-tuned by Lewis over years of development. Voyager does not require a musical input to function and is able to self-generate musical events.

In contrast, Jaime Oliver’s systems are entirely dependent upon performers in that they are interactive computer music systems following the paradigm of a musical instrument. The music created with such systems must be performed, so the ability to perform rhythmic material was designed into the instrument itself. Oliver developed a video input system that tracks complex physical gestures on a tabletop and maps them to musical gestures in sound. The output gestures can be simple percussive clicks or complex generative phrases, and are performer-driven.

In addition to presenting their research, Assayag and Oliver also performed in concert. OMax was used in a duet with a live saxophone improvisation, and Oliver performed one of his own compositions. The two then joined the other performers from the evening in a large ensemble improvisation.

Rhythm Outside of Music

Rhythm is not an exclusive domain of music, but exists in every other temporal phenomenon. During the workshop, William Sethares presented his research on rhythm in speech and the question of how to measure the difference between two given speakers. Sethares applies a method known as dynamic time warping (DTW) to audio features of a spoken sentence’s spectrogram to align two utterances. DTW produces a table of aligned time points, which can then be used in a resynthesis of the sentence to produce time-aligned audio. Sethares intends to use this temporal information to see whether the origin of an accent can be determined using machine learning techniques. He also intends to examine the micro-timing of dramatic performances, such as multiple renditions of a Hamlet soliloquy, to study the emotional effects and personal differences between them.
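The alignment step can be illustrated with a textbook dynamic time warping routine in Python. This is a generic sketch, not Sethares's implementation: it assumes each utterance has already been reduced to a sequence of scalar features (real spectrogram frames would be vectors with a suitable distance), and the function name `dtw_path` and the default absolute-difference distance are illustrative choices.

```python
def dtw_path(a, b, dist=lambda x, y: abs(x - y)):
    """Align two feature sequences with classic dynamic time warping.

    Returns the total alignment cost and the list of aligned index pairs
    (i, j), meaning frame i of `a` is matched to frame j of `b`.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = cheapest alignment of the first i frames of a
    # with the first j frames of b.
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = dist(a[i - 1], b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # frame of a repeated
                                 cost[i][j - 1],      # frame of b repeated
                                 cost[i - 1][j - 1])  # both advance

    # Trace back from the end to recover the table of aligned time points.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        candidates = [(cost[i - 1][j - 1], i - 1, j - 1),
                      (cost[i - 1][j], i - 1, j),
                      (cost[i][j - 1], i, j - 1)]
        _, i, j = min(candidates)
    path.reverse()
    return cost[n][m], path


# The second sequence lingers on the middle value, as a slower speaker might.
total, path = dtw_path([1, 2, 3], [1, 2, 2, 3])
print(total, path)  # 0.0 [(0, 0), (1, 1), (1, 2), (2, 3)]
```

The returned path plays the role of the table of aligned time points: frame 1 of the first sequence is stretched to cover frames 1 and 2 of the second.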


The verbose title of this workshop does not do justice to the myriad perspectives and disciplines bridged by the deceptively simple term “rhythm.” By the end of the proceedings, we had barely scratched the surface of the current state of knowledge and future research directions. Nevertheless, with new collaborations forged, such as the production of a new annotated dataset of Indian Tala vocalizations, and talk of an accompanying book on the subject, we can undoubtedly say that the workshop was a success.

During the discussion groups, the question often arose of what we all mean when we speak of rhythm. Two breakout sessions specifically dealt with what rhythm means in terms of musical universalities, other musical contexts, and parameters such as timbre and pitch. It was widely agreed that although there is no rhythmic universal, there is much common ground. With all the varying perspectives on rhythm, perhaps a comprehensive definition of what it is cannot be found. One thing is certain, however: when studying rhythm we are studying an epiphenomenon. It is an aspect of the temporal domain that, taken by itself, does not seem to be anything but a useful construct for speaking in temporal terms.


Davies, M. E., and S. Böck. 2014. “Evaluating the Evaluation Measures for Beat Tracking.” Proceedings of the 15th International Society for Music Information Retrieval Conference (ISMIR).

Holzapfel, A., F. Krebs, and A. Srinivasamurthy. 2014. “Tracking the ‘odd’: Meter inference in a culturally diverse music corpus.” Proceedings of the 15th International Society for Music Information Retrieval Conference (ISMIR).

Humphrey, E. J., et al. 2014. “JAMS: A JSON Annotated Music Specification for Reproducible MIR Research.” Proceedings of the 15th International Society for Music Information Retrieval Conference (ISMIR).

Large, E. W. 2010. “Neurodynamics of Music.” Music Perception 36: 201–231.

Madison, G. 2014. “Sensori-motor synchronisation variability decreases with the number of metrical levels in the stimulus signal.” Acta Psychologica 147: 10–16.

Peeters, G., and K. Fort. 2012. “Towards a (better) Definition of Annotated MIR Corpora.” Proceedings of the 13th International Society for Music Information Retrieval Conference (ISMIR).

Toussaint, G. T., and J. F. Beltran. 2013. “Subsymmetries predict auditory and visual pattern complexity.” Perception 42(10): 1095–1100.

Volk, A. 2008. “The Study of Syncopation using Inner Metric Analysis: Linking Theoretical and Experimental Analysis of Metre in Music.” Journal of New Music Research 37(4): 259–273.