Shun Ishiwaka + YCAM "Echoes for unknown egos ― manifestations of sound" - Agents

Category: Performance

Published: 2022-06-04

This is an improvised percussion performance, created as a joint project between Japanese drummer/percussionist Shun Ishiwaka and YCAM (Yamaguchi Center for Arts and Media) InterLab.

Teaser / Making

Agents

I was in charge of developing several of the software systems used in the performance. We took several approaches to extending musical creativity in improvisation by creating "alter egos" of Ishiwaka. My work on this project was to develop Agents that respond to Ishiwaka's playing in real time, and "Meta Agents" that observe the musical state of his playing and orchestrate the other Agents accordingly.

Melody Agent

The Melody Agent plays generated melodies in real time alongside Ishiwaka's drumming.

Melody generation uses Google Magenta's MusicVAE (16-bar Trio model). During the performance, generation runs in parallel at three temperatures (0.5, 1.0, and 1.5). Combined with post-generation effect settings, this yields 16 playback patterns, which the Meta Agent switches between in real time. The effect variations were designed by Ishiwaka himself and include:

Two voices: randomly stacking intervals of a 3rd or 5th
Expressive: applying pitch shift and delay
Random chord: stacking random chords
High / low register only: restricting pitch range
Unison two octaves below: doubling the melody two octaves down
Repeated notes: tremolo-style 16th notes at a random tempo
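
A few of these effects can be pictured as simple transformations on a list of notes. The following is only a minimal sketch, not the code used in the performance: the (pitch, start, duration) note layout, the function names, the choice of a major 3rd and a perfect 5th, and the reading of "restricting pitch range" as filtering are all assumptions made for illustration.

```python
import random

# A note is (midi_pitch, start_beat, duration_beats). This layout is a
# hypothetical stand-in for whatever representation the real system uses.

def two_voices(notes, rng):
    """Add a second voice a 3rd or 5th above each note, chosen at random."""
    out = []
    for pitch, start, dur in notes:
        # Assumption: major 3rd (4 semitones) or perfect 5th (7 semitones).
        interval = rng.choice([4, 7])
        out.append((pitch, start, dur))
        if pitch + interval <= 127:  # stay inside the MIDI pitch range
            out.append((pitch + interval, start, dur))
    return out

def octave_double_below(notes, octaves=2):
    """Double each note the given number of octaves below."""
    shift = 12 * octaves
    out = []
    for pitch, start, dur in notes:
        out.append((pitch, start, dur))
        if pitch - shift >= 0:  # skip doublings that fall below MIDI note 0
            out.append((pitch - shift, start, dur))
    return out

def restrict_register(notes, low, high):
    """Keep only the notes whose pitch lies in [low, high] (assumed: filtering)."""
    return [note for note in notes if low <= note[0] <= high]
```

Because each effect takes a note list and returns a note list, effects written this way can be chained, for example `octave_double_below(restrict_register(melody, 60, 127))`.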

The system is split into two processes: a Python UDP server for melody generation with MusicVAE, and a client that sends MIDI to a piano/synthesizer and handles playback.
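
The two-process split can be sketched with Python's standard `socket` module. This is an illustrative outline only: the JSON message format, the function names, and the fixed placeholder melody are assumptions, and the place where the real server would call MusicVAE is marked in a comment rather than implemented.

```python
import json
import socket

def handle_request(payload: bytes) -> bytes:
    """Decode a JSON request and build a JSON reply containing a melody."""
    request = json.loads(payload.decode("utf-8"))
    temperature = float(request.get("temperature", 1.0))
    # Placeholder melody as [pitch, start_beat, duration_beats] triples.
    # The real server would sample from MusicVAE at this point.
    notes = [[60, 0.0, 0.5], [62, 0.5, 0.5], [64, 1.0, 1.0]]
    return json.dumps({"temperature": temperature, "notes": notes}).encode("utf-8")

def serve_once(sock: socket.socket) -> None:
    """Server side: answer exactly one datagram on an already-bound UDP socket."""
    payload, client_addr = sock.recvfrom(65535)
    sock.sendto(handle_request(payload), client_addr)

def request_melody(server_addr, temperature, timeout=2.0):
    """Client side: request a melody and return the decoded reply."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        message = json.dumps({"temperature": temperature}).encode("utf-8")
        sock.sendto(message, server_addr)
        reply, _ = sock.recvfrom(65535)
    return json.loads(reply.decode("utf-8"))
```

Keeping generation behind a socket means the playback client never blocks on the model: it can keep sending MIDI while the next melody is still being generated in the other process.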

Sampler Agent

The Sampler Agent plays back recorded samples of Ishiwaka's past piano playing alongside his drumming.

As the agent with the highest fidelity to Ishiwaka's own identity, the Sampler uses actual recordings of his performances. Three variants were implemented:

Piano Sampler: uses ~40 min of Ishiwaka's solo piano improvisation over his own drums as samples
Sax Sampler: searches for saxophone phrases from Ishiwaka and Kei Matsumaru's improvisation sessions (~40 min) that match the current drum playing
Drums Sampler: retrieves drum pattern samples triggered by saxophone input

Similarity search is performed using Librosa audio features (Zero Crossing Rate, Melspectrogram, Tempogram) and FLANN (Fast Approximate Nearest Neighbor Search) for real-time performance. Sample playback is triggered by Rhythm AI's generated rhythm patterns. To distinguish the agent's sound from live playing, an intentional pitch-modulation effect is applied.
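
The idea of feature-based sample retrieval can be illustrated without any audio libraries. The sketch below is a simplified stand-in, not the project's pipeline: it computes the zero crossing rate by hand, uses RMS energy as a hypothetical second feature in place of the mel spectrogram and tempogram, and replaces FLANN's approximate search with an exact brute-force nearest-neighbor lookup. All function names are made up for this example.

```python
import math

def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)

def rms(frame):
    """Root-mean-square energy of the frame (hypothetical extra feature)."""
    return math.sqrt(sum(x * x for x in frame) / len(frame))

def features(frame):
    """Map an audio frame to a small feature vector."""
    return (zero_crossing_rate(frame), rms(frame))

def nearest_sample(query, index):
    """Return the name of the indexed sample closest to the query vector.

    index is a list of (name, feature_vector) pairs. This is exact
    brute-force search; an approximate index such as FLANN serves the
    same purpose when the sample library is too large for a linear scan.
    """
    return min(index, key=lambda item: math.dist(query, item[1]))[0]

def sine(cycles, length=100):
    """Test signal: a sine wave with the given number of cycles per frame."""
    return [math.sin(2 * math.pi * cycles * t / length) for t in range(length)]
```

For example, with an index built from `sine(2)` and `sine(20)`, a query frame of `sine(18)` retrieves the 20-cycle sample, because its zero crossing rate is much closer.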

Meta Agent

The Meta Agent controls the playing style of the other agents and decides which combination of agents is active.

The Meta Agent was not part of the original plan, but emerged from a request by Ishiwaka during a residency in February 2022: "I want an 'ear' that listens to my playing and decides what comes next."

Audio features (ZCR, Melspectrogram, Tempogram) from the microphone and MIDI signals (note count, velocity, number of drums) are accumulated per second and projected onto 2D coordinates via PCA. Ishiwaka monitored the coordinate space during rehearsals and used a spreadsheet to define which agents to activate, and how, in each region of that space. This allowed even agents without microphone input (like Pongo and Cymbal) to respond to Ishiwaka's playing state, and enabled Ishiwaka himself to design the orchestration of the entire performance semi-autonomously.
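
The projection and lookup steps can be sketched in plain Python. This is a rough illustration under stated assumptions, not the actual system: PCA is fitted here with a small power-iteration routine, and the CSV layout, region bounds, and agent assignments are invented to stand in for the spreadsheet Ishiwaka edited.

```python
import csv
import io
import math

def fit_pca_2d(rows, iters=200):
    """Fit a 2-component PCA by power iteration; returns (mean, [pc1, pc2])."""
    n, d = len(rows), len(rows[0])
    mean = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - mean[j] for j in range(d)] for r in rows]
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    components = []
    for _component in range(2):
        v = [1.0 + 0.1 * j for j in range(d)]  # arbitrary non-zero start vector
        for _step in range(iters):
            w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm == 0.0:
                break
            v = [x / norm for x in w]
        eigval = sum(v[i] * cov[i][j] * v[j] for i in range(d) for j in range(d))
        components.append(v)
        # Deflate so the next pass finds the next-largest direction.
        cov = [[cov[i][j] - eigval * v[i] * v[j] for j in range(d)]
               for i in range(d)]
    return mean, components

def project(x, mean, components):
    """Project one feature vector onto the fitted 2D coordinate space."""
    centered = [xi - mi for xi, mi in zip(x, mean)]
    return tuple(sum(c * p for c, p in zip(centered, pc)) for pc in components)

# Hypothetical spreadsheet export: one rectangular region per row.
REGIONS_CSV = """x_min,x_max,y_min,y_max,agents
-10,0,-10,10,Melody
0,10,-10,0,Sampler;Cymbal
0,10,0,10,Melody;Pongo
"""

def load_regions(text):
    """Parse the region table into ((x0, x1, y0, y1), [agent, ...]) pairs."""
    regions = []
    for row in csv.DictReader(io.StringIO(text)):
        bounds = tuple(float(row[k]) for k in ("x_min", "x_max", "y_min", "y_max"))
        regions.append((bounds, row["agents"].split(";")))
    return regions

def agents_for(point, regions):
    """Return the agents of the first region containing the point, else []."""
    x, y = point
    for (x0, x1, y0, y1), agents in regions:
        if x0 <= x < x1 and y0 <= y < y1:
            return agents
    return []
```

In use, each second's accumulated feature vector would be passed through `project` and the resulting point through `agents_for`, so editing the table changes the orchestration without touching the code.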

Reactions and Discoveries

Sampler: The Paradox of "Sounding Non-human"

My initial idea for using the sampler was to capture the raw, human feeling—the subtle fluctuations and textural nuances that only a person can produce. It was fascinating that what I originally aimed to make human-sounding ended up sounding non-human.

— TOKION: "A Technological Co-creation of AI and Improvisation — Shun Ishiwaka × Kei Matsumaru, Part 1"

The Sampler, designed as the most faithful alter ego, paradoxically produced a non-human quality through the process of search and playback—an unexpected discovery for the development team as well. In response, intentional pitch variation was added as a sound design choice to highlight the agent's distinct presence during performance.

Meta Agent: The Feeling of Having an "Ear"

After testing the prototype, Ishiwaka remarked that it felt like the system had "gained an ear," and proposed using it to coordinate the entire performance. Rather than improvising with each agent in sequence, the Meta Agent enabled multiple agents to play simultaneously, with Ishiwaka entering into a full ensemble. Since Ishiwaka personally designed the relationship between coordinate space and agent behavior via a spreadsheet, the orchestration of the whole performance was effectively carried out by his own "alter egos."