“ontextC” – Technical Diary 4

What happened so far?

To know where to start modifying the Max phase vocoder, I drew comparisons between the same stretch factors in PaulXStretch and the phase vocoder. To keep the conditions as similar as possible, I changed the FFT size in PaulXStretch to 1024 and turned off all of the other parameters in the processing chain (harmonics, tonal vs. noise, frequency shift, pitch shift, ratios, spread, filter, free filter and compressor), with the expectation that the resulting sounds would just stretch the source sound (a ten second snippet from an acoustic multitrack recording) using the respective stretching algorithm. This would then allow me to hear differences.

When comparing the results, it quickly became evident that while the phase vocoder provided very transparent sounding stretches at lower stretch factors, the aesthetic quality of the Paulstretch algorithm and the smearing it introduces were a) very different sounding and b) more usable for the intended sound design purposes, where especially stretch factors over 10 become interesting and the original sound source becomes almost unrecognisable.

Note: I have now switched to working with the default phase vocoder that comes with Max as a resource example (Max 8 > Show package contents > Resources > Examples > fft-fun > phase-vocoder-example-folder). It has a lot of similar components.

Ongoing

Currently I am in the process of settling on EQ, reverb and pitch shifting modules to use for the prototype. Another more research-based aspect of the project is to figure out how the provided Python code from the old Paulstretch algorithm works, which will hopefully allow me to modify the phase vocoder in a direction that suits the imagined aesthetic outcomes of ontextC. My supervisors are kindly helping me with this, since I am not familiar with Python at all.
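To make the research goal a bit more concrete, here is a minimal sketch of the core step as it is commonly described for Paulstretch: window a frame, keep the magnitudes of its spectrum, replace the phases with random values, transform back and overlap-add, while the read position in the input advances more slowly than the write position in the output. This is not the provided Python code. The function names, the Hann window, the tiny window size and the naive DFT are all assumptions chosen to keep the example self-contained and readable.

```python
import cmath
import math
import random


def hann(n):
    """Hann window of length n (periodic form)."""
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * i / n) for i in range(n)]


def dft(frame):
    """Naive O(n^2) DFT: fine for a demo, far too slow for real audio."""
    n = len(frame)
    return [
        sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def idft_real(spectrum):
    """Inverse DFT, keeping only the real part of each output sample."""
    n = len(spectrum)
    return [
        (sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n)
             for k in range(n)) / n).real
        for t in range(n)
    ]


def randomize_phases(spectrum, rng):
    """Keep every bin's magnitude, replace its phase with a random one."""
    n = len(spectrum)
    out = [0j] * n
    for k in range(n // 2 + 1):
        phase = rng.uniform(0.0, 2.0 * math.pi)
        out[k] = cmath.rect(abs(spectrum[k]), phase)
    # Mirror the upper half so the inverse transform stays (almost) real.
    for k in range(1, (n + 1) // 2):
        out[n - k] = out[k].conjugate()
    return out


def paulstretch_sketch(samples, stretch, window_size=64, seed=0):
    """Hypothetical helper: stretch `samples` by the factor `stretch`."""
    rng = random.Random(seed)
    window = hann(window_size)
    half = window_size // 2
    out = [0.0] * (int(len(samples) * stretch) + window_size)
    in_hop = half / stretch  # the input is read more slowly than it is written
    pos, out_pos = 0.0, 0
    while out_pos + window_size <= len(out) and int(pos) < len(samples):
        start = int(pos)
        frame = samples[start:start + window_size]
        frame = frame + [0.0] * (window_size - len(frame))  # zero-pad the tail
        frame = [s * w for s, w in zip(frame, window)]
        smeared = idft_real(randomize_phases(dft(frame), rng))
        for i in range(window_size):
            out[out_pos + i] += smeared[i] * window[i]  # overlap-add
        pos += in_hop
        out_pos += half
    return out
```

Calling `paulstretch_sketch(samples, stretch=10)` on a short list of floats returns a list roughly ten times as long. If this description matches the provided code, the random phases would be a likely source of the smearing, since the phase relationships between neighbouring frames are discarded rather than preserved as in a phase vocoder.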

Results and Reflection

The results of the comparison are useful, because they define the differences that need to be overcome in order to reach the aesthetic results I am looking for with this plug-in. While some of the inner workings of the Paulstretch algorithm still remain unknown as of now, the Python code will hopefully help to figure out what is missing. Furthermore, being able to set the FFT size above 2048, to a value closer to 4400, would be a next step towards better imitating the workflow that started this project. The steps that follow will show whether that is a limitation in Max or not.
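As a small aid for thinking about FFT sizes, the sketch below works out what a given size means in time and frequency. It assumes a sample rate of 44100 Hz and assumes that the FFT size has to be a power of two, which is a common constraint in FFT implementations. Whether that constraint applies here is exactly what still needs to be checked. The function names are made up for this example.

```python
def neighbouring_powers_of_two(n):
    """Return the closest powers of two at or below and at or above n."""
    lower = 1 << (n.bit_length() - 1)
    upper = lower if lower == n else lower * 2
    return lower, upper


def frame_duration_ms(fft_size, sample_rate=44100):
    """Length of one analysis frame in milliseconds."""
    return 1000.0 * fft_size / sample_rate


def bin_spacing_hz(fft_size, sample_rate=44100):
    """Distance between two neighbouring frequency bins in Hz."""
    return sample_rate / fft_size


for size in (1024, 2048) + neighbouring_powers_of_two(4400):
    print(size, round(frame_duration_ms(size), 1), round(bin_spacing_hz(size), 2))
```

Under these assumptions, the two candidates around 4400 would be 4096 and 8192. Larger frames give finer frequency resolution, but each frame then covers a longer stretch of time.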

As a sidenote: The shortcuts CMD + Option + M to open a locked Max patch and CMD + 8 to remove the frame have proven very helpful.

Objectives for Next Time

  • Prep draft of sound example through all parts of the signal chain -> How does it start, which sound do we want to get to?
  • Check out Phase vocoder template, start to modify parameters in project draft and experiment
  • Settle on other modules in the processing chain

Keep in Mind: Mapping parameters together will become relevant sooner rather than later – it makes sense to research this as well.

“ontextC” – Technical Diary 3

What happened so far?

A recent priority was the comparison of different phase vocoders that are available in Max. With the help of the Cycling74 resources, I tested whether the difference between the modules using polar vs. cartesian coordinates affected my sound sources in a (noticeable) way that would make me choose one over the other – ultimately cartesian coordinates seemed like the better option for my project, also in terms of CPU usage. For windowing, the Hanning window is currently in use.
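To illustrate the difference between the two representations, here is a minimal sketch in Python rather than Max. It applies the same gain to a single frequency bin once in cartesian form (real and imaginary part) and once in polar form (magnitude and phase). The helper names are made up for this example. The idea that the extra conversions explain the CPU difference is my assumption, not something I measured.

```python
import cmath
import math


def hann(n):
    """Hann (Hanning) window of length n, periodic form."""
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * i / n) for i in range(n)]


def scale_bin_cartesian(bin_value, gain):
    """Scale a bin directly on its real and imaginary parts."""
    return complex(bin_value.real * gain, bin_value.imag * gain)


def scale_bin_polar(bin_value, gain):
    """Scale a bin via magnitude/phase: needs a conversion there and back."""
    magnitude, phase = cmath.polar(bin_value)   # sqrt and atan2 per bin
    return cmath.rect(magnitude * gain, phase)  # cos and sin per bin


one_bin = complex(0.3, -0.4)
print(scale_bin_cartesian(one_bin, 2.0))
print(scale_bin_polar(one_bin, 2.0))
```

Both routes give the same result up to rounding, but the polar route does extra trigonometric work for every bin in every frame. Polar form becomes useful when magnitudes or phases need to be changed independently of each other.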

Furthermore, to better understand the processes the signal goes through within the plug-in, I asked my supervisor about the meaning of phase coherence in this context, and was able to bit by bit (little terminology reference here) connect the theory and the practical application, which will help me a lot going forward.

Ongoing

The evaluation and development of EQ, pitch shifting and reverb modules for my project is ongoing. Fortunately, there are a lot of libraries and resources especially for filtering and spatial effects, so the main challenge here is to find what works best to achieve the sound results I am aiming for, while also being functional and relatively simple to integrate. By studying existing Max patches, even though they might not be 100% what I am looking for, I am learning more not just about the Max environment, but also about best practices and how I could translate certain organisational aspects (comments are so helpful for external people looking at a patch to know what is going on!) and connections into my own project patch. My main resources for this are free patches that I download from the Max for Live library patch page and explore.

Results and Reflection

While it is good to know that there is a phase vocoder that can help me to realise my vision for this project, now it is time to start thinking about how to best integrate it, and define which modifications need to be made in order to make it sound the way I want it to in the context of my project. To do so, I will draw comparisons between PaulXStretch and the Max phase vocoders, to determine limitations, potential areas of improvement and differences in sound quality at different stretch factors.

Objectives for Next Time

  • Prepare and document sound examples to compare between the phase vocoder and PaulXStretch
  • Continue development of other modules

“ontextC” – Technical Diary 2

What happened so far?

A diagram now complements the crude mockup in Max MSP and helps to envision the signal flow and processing points of the plug-in. It is also quite a handy tool for identifying challenges, as it lays out the main idea in a form that is simplified but still representative. Parameters have been defined and narrowed down further.

I have also been provided with copies of all three volumes of Electronic Music and Sound Design – Theory and Practice with Max 8, which I am using as a reference and also a learning opportunity to further familiarise myself with the Max environment.

The objective at this stage is to research and further refine the direction of the project. At this point, the audio signal chain has the potential to work, but the time stretch unit does not work by integrating PaulXStretch into the patch as an external VST, since the audio needs to be manually imported and exported in the application.

Top Objects

In the mockup, the bangbang object proved very useful to initiate the loading of a list of parameters in a umenu – to experiment, this was done with a list of parameters from Valhalla Supermassive, but the same procedure could be useful later down the line for menus that should operate similarly.

Results and Reflection

The biggest challenge at the moment is the PaulXStretch implementation. The lack of documentation of the application makes it difficult to decipher which algorithms make the parameters work, and since it is at the top of the signal chain it blocks the audio signal from coming through to the next stages of processing. More research on the Paulstretch algorithm will be necessary. Furthermore, the commercial nature of my ideal reverb for this project makes it more difficult to implement, meaning that now is a good point to look into alternatives and emulations.

Objectives for Next Week

  • Research reverb properties, documentation, and open source emulations/alternatives
  • Research publications on the Paulstretch algorithm
  • Find a good tool for pitch-shifting and EQ

Research Resources for Next Week

Timbral effects of the Paulstretch audio time-stretching algorithm (Colin Malloy)

An approach for implementing time-stretching as a live realtime audio effect (Colin Malloy)

Max 8 Handbooks (Volume 1-3) – Alessandro Cipriani, Maurizio Giri

Valhalla Learn Resources (Plug-In Design)

IRCAM Reflections 2.0: Dromos/Autos

The showcase of “Dromos/Autos – The Autistic Ontology as Performance” by Matt Rogerson at the Ircam conference (19th to 22nd of March) presented itself as especially memorable as an instance where Electroencephalography (EEG) is not just used as a technological tool in an attempt to free the hands of musicians, but instead directly linked to the story it helps to tell. In short: It was interesting from a narrative perspective.

Acting as both the performer and researcher, Matt Rogerson aimed to invoke sensory overload in a generative performance ecology by way of biofeedback to bring about empathy towards the lived autistic experience in daily life. By integrating sound technology and visuals into a piece of performance art, the artist acts as a “mediative human interface”, invoking a sense of depersonalisation with the symptom of delayed reactions. The idea is to be as passive a subject to the ongoing processes as possible instead of trying to assert agency over them.

A performance of „Dromos/Autos“ in a different setting

The significant aspect during the performance, confirmed later by the discussion of it, was that the titular theme and technology used were enough to create a narrative for what was going on stage – the further explanation was insightful and interesting, but I feel like I still would have walked out of just the performance with a sense of having gained insights and perspective, and this is what good storytelling does for me. Within this framework, there was still space for trial and error, as well as the surprising and unexpected, and the combination of research, preparation, and artistic execution was a sharp display of what Ircam is all about.

It’s hard to see the bigger picture with a brain that’s very detail-oriented.

Matt Rogerson during the discussion of the performance in Ircam’s Studio 5 on the 19th of March

In terms of sound design, the sounds that occurred within the generative framework were researched and adjusted to specifically induce sensory overload for the artist to help facilitate a feedback loop to enhance the performance, but what they also did alongside the visuals was to create an experience for the listener that is somewhat synchronised to that of the performer, albeit on a different scale. It created an atmosphere where the performer and attendee endured the experience together in a way, with the audience realising that acoustic ecology might not be the same for everyone. The takeaway here is that a good mixture of research, planning, considering the audience while keeping the main goal in mind and a transparent execution of the project can go a long way in creating a narrative experience. That being said, the realisation after the performance was once again that when designing the sound of the world we live in, it is essential to consider accessibility and find solutions that work for, and not against all kinds of people.

On another note – the performer made sure to warn visitors about strobe lights that would be part of the performance in a way that went beyond mentioning it as a rushed sidenote.  This consideration towards the safety and individual circumstances of everyone in the room was a thoughtful reminder that it is okay and important to integrate obvious disclaimers as part of a designed experience for others to ensure a smooth and safe event for everyone who attends.  

IRCAM Reflections 1.0: Three States of Wax

Out of all the contributions to the 2024 Ircam Forum Workshops that I have seen in Paris between the 19th and 22nd of March, one has kept me thinking not just for its content in terms of the (musical) arts, sciences and technology, but especially the philosophy behind it.

“Three States of Wax: The Nature of material in Live Electronic Improvisation”, brought to the conference by Juan Parra Cancino and Jonathan Impett, framed composition as a critical technical practice through the lens of material as described by Descartes in his wax argument thought experiment and then taken further by Michel Serres in an investigation into the materials of physics.

„Three States of Wax“, in this video performed in 2020 at the New York Electroacoustic Improvisation Summit.

In doing so, the Cartesian and scientific way of thinking about material were interpreted alongside a plane where in the present, the history of the material plays a role as the material becomes its own memory through every interaction with it – memory is approached as a reconstruction from a point of view as a similar process to imagination through improvisation in an electroacoustic performance that incorporates extended techniques on the trumpet, guitar and electronics while coming up with points of communication and interaction.

The presentation also implicitly posed the question of authorship, suggesting that material can be something that transforms how we think about it (also depending on how it is presented) and depending on how it was derived – in an odd way this made me think of “Steal like an Artist” by Austin Kleon (why that is I have to investigate further). As an example, the emergence of paper was mentioned as a clear material that changed how we think: With it, we become aware that we can note down things that might be helpful for later, essentially transforming how we navigate an ever-changing landscape of information and knowledge.

In a similar sense, it prompted reflection on how the reiteration of previous materials and merging of individual contexts transforms into an interconnected web of knowledge, simultaneously creating new input and contributing to a network structure that works as a combobulator, at least in my interpretation. In the context of improvisation in music, the notion that came across to me was that if composition were approached as a design process in terms of thinking about and considering the materials one works with, continuity could be found even where layers are added.

While I would not dare say that I fully grasped the whole idea during the 30-minute presentation without any further input, it provided me with food for thought and new ways to approach and interpret the interweaving of material with an awareness of how information, too, is subjected to change in how it is understood and presented during and also after the creative process, if it were to be fixed and thus became part of a larger network of memories and associations ascribed to it. This blog is by no means meant to explain the presentation itself, but more my interpretations, reflections, and thoughts that came up so far as a result of taking in the information.

Further resources:

Echo, a journal of music, thought and technology – https://echo.orpheusinstituut.be/#issues

Orpheus Institute (Advanced studies & research in music) – https://orpheusinstituut.be/en/music-thought-and-technology/three-states-of-wax

G. Agamben – The Man without Content

A. Negri – Art and Multitude

Sound & Interaction in „Decoding Bias“

“Decoding Bias”, written by Theresa Reiwer, is a multi-channel video and sound installation that was presented at Digithalia Festival, where spectators were invited to join Artificial Intelligences in their seated circle during a group therapy session in which they discuss the biases that were built into their algorithms through humans. The sound design was done by Kenji Tanaka.

In terms of setup, lighting, speakers set under the video screen of each respective AI and the placement of viewers as if they were a part of the group already makes for an intriguing, interactive setup that uses sound as a tool to further enhance the “realness” of the scenario. The directivity of the spoken words takes the AI out of their screen and into the three-dimensional space.

Furthermore, sound plays an integral role in setting the mood. At first, the hollowness of the space the visitors are about to enter is represented in sound before the performance starts, and then the concept of sound becomes more and more important as the story unravels and the AIs begin to question their encoded biases and the people responsible for them. Reverbs, distortion and spoken words coming from all directions at once largely impact the creepy atmosphere that emerges from the realisation that there are ulterior motives in human-made things that are backed up by a lot of money and that our perceptions, just as the ones of AI are susceptible to the most prevalent voices in society. Similarly, a light-hearted party song takes out the tension as the therapy session comes to an end. Sound is continuously present to help navigate this experience, to create and release tension.

One detail struck me as very fascinating: During the performance I was convinced that the voices were AI generated as well – there was this lack of emotion, and breaths in between sentences were not audible, at least to me. Upon reading up on the installation, I found out that the voices were done by real actors. Not only must they have received incredibly good direction and done an amazing job, but the idea of how AI sounds was also considered in the audio post-production. Such a small, but important detail that inspired me to pay even more attention to not only how things sound, but how they were made and come about and to take this history into account when making choices about how to navigate analogue and digital sources.

At one specific point in what is supposed to resemble a mostly empty office building, the footsteps were good in terms of the space that surrounds them, but the sounds that were chosen just didn’t work for the type of shoe and ground it hit in my opinion. They were a good representation of the sound designer’s never-ending struggle to find the right footstep for the occasion.

All in all, this was an immersive installation that made me pensive about its content, meaning that the sound and interaction worked together in an awesome way that complemented the experience instead of distracting from it.

Sound & Interaction in “Ein Flanellnachthemd”

“Ein Flanellnachthemd”, written by Leonora Carrington and staged in augmented reality in a collaboration between Augsburg State Theater and Ingolstadt State Theater, was presented at Digithalia Festival within the confines of one portable electronic device pointing towards a poster in which all the action takes place. The keywords for this play are surrealism, morbid interactions, and nightmare. It is evident from the beginning that the atmosphere is meant to be unsettling.

This is on one hand represented by the costumes and interactions that the actors have on the augmented reality stage, but mostly through the sound: A deep, dark pad texture somewhat close to being a constant in all of the house’s rooms follows the spectator through the narrative. There are diegetic sounds as well – droplets in a bathroom, doors creaking upon being opened and closed, footsteps, and fire in a kitchen where a murder took place. Despite all this, what was done with sounds seems minimal compared to the potential there is: A flood outside the window, a huge black swan made of paper, a tree growing inside a bedroom, a hyena crouching in the corner, and a crocodile in the bathtub. In my opinion, doing more here could have enhanced the experience of the surreal in these scenes, instead of simply brushing over them visually or maybe not even noticing them (i.e., if I already move my device to follow the dialogue and do not randomly move it up to see the leaves growing on the ceiling, will I even get an understanding of the absurdity beyond noticing it on a surface level?).

One factor that severely impacted the interplay between immersion and sound was the mix. Between dialogue, atmosphere, and switching to another poster, I found myself adjusting the volume multiple times to be able to take note of everything that was going on. This is something that I would want to focus on to create good continuity and cohesion in an experience that is already so bizarre without any additional disturbances (especially if the presentation does not take place in one continuous stream). If there’s already enough hassle with tilting the camera to witness what is happening in the story, having to adjust levels on top of that is just an additional distraction.

I really liked the music that accompanied the play. It was evolving and atmospheric, and although it was similar in most rooms, it fulfilled its purpose very well. One main lesson I learned from taking part in this critically is that if there are visual elements in augmented reality that add to the experience, but are not directly referenced in the spoken dialogue, it could make sense to use sound to draw the spectator’s attention to the visual input coming from another direction – it doesn’t need to be super obvious, but a subtle hint helps to take in the whole scenery.

“ontextC” – Technical Diary 1

The objective for this week was to refamiliarize myself with the Max MSP environment with the help of a template that was provided to me as well as a series of tutorials on YouTube. The larger objective that this step will help me towards is to create a prototype of my plug-in by first using integrations of VST plug-ins that I am currently using for my workflow which I can then gradually substitute with effects units that I created myself to suit my needs.

The Toolbox

Toolbox Analysis

The Max Standalone series was helpful in some regards, but overall, it felt like the videos included a lot of trial-and-error moments, which made them lengthy and tricky to follow along with. I often found myself trying to rebuild a patch component only to then find out it had lots of issues in it which had to be undone a couple minutes later. I could imagine this might be useful for someone who is not as familiar with how to troubleshoot in Max MSP, but for me it was not the best way to progress. There was valuable information on how to build a standalone project, and the fifth episode of the series demonstrated how to distribute an application to stores. While not applicable to my project at this early stage, this is information I will revisit. The most useful information that I will be taking away from this series is how to build an application (Part 1, from 10:30 onwards), work with dependencies (Part 2), and create presets (beginning of Part 3).

The video by John Jannone managed to integrate a lot of useful information into 20 minutes, and it was relatively easy to follow along with it. Although it is specifically targeted towards synthesizers, it contained useful general information on how to set up umenus to work with parameters from a VST plug-in, manipulate them and save snapshots.

Results and Reflection

With the help of the videos and the template I was able to compile a beginning version of a patch, where a sample is fed through an effects chain. In the patch, I built separate components which might be useful for further prototyping, like a menu module which allows me to switch between external VST plug-ins. I faced some struggles with getting a sample output from PaulXStretch, which I plan on investigating further next week, but all the other plug-ins process the input sample smoothly. Another segment that needs troubleshooting is the umenu I attached to a reverb unit with the intention of being able to manipulate parameters from within Max (marked in red).

Top 5 Objects of the Week

Objectives for Next Week

  • Troubleshoot parameter manipulation tool
  • Properly integrate PaulXStretch/Research why it might not work
  • Start building a spatial effects unit

Research resources for Next Week

A list of resources I stumbled upon throughout my process this week and want to research further because they have the potential to help my project. The scope for these resources is varied and can go from scientific literature to tools that can help me learn more or become clearer on my ideal parameter mapping and UI.

Plug-Ins

Cecilia

Monster Timestretch

Soundmagic Spectral

Prototype Development

Getting Started With Reverb Design Part 1 & Part 2

Literature/Websites

A Tutorial on Spectral Sound Processing Using Max/MSP and Jitter

Juce