“ontextC” – Technical Diary 9

What happened so far?

Recently, time spent working on the project was dedicated to figuring out how best to turn it into an exhibit that is valuable both for the user and for research purposes. I knew that it would be important to keep the interface intuitive, and at the same time not to clutter it with information. Furthermore, a good solution was needed to collect parameter data – after some research and experiments I found that the coll object would work best for my purpose, with its ability to capture an index number and separate data input with commas, allowing me to then export the anonymous results as a CSV file. The save button and volume adjustments were non-negotiable, but I struggled a bit with how best to implement options to play back the source sound as well as the processed sound in a way that made sense just from looking at the interface. Another aspect I considered was that I would need a “phantom” slider for the interface visible to the user: after the previous person saves, the parameter always jumps to a random value, while the slider looks as if it is back at the centre. This way, test subjects cannot copy the results from the previous person and really have to rely on their hearing to match the processed audio as closely as possible to the source sound.
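The phantom slider and the CSV logging described above can be sketched outside Max roughly as follows. This is only a Python illustration of the idea, not the patch itself: the names `PhantomSlider`, `run_session` and `to_csv`, the visible range of -1 to 1, and the way the hidden start value is combined with the visible position are all assumptions made for the example.

```python
import csv
import io
import random


class PhantomSlider:
    """Slider that always looks centred but starts from a hidden random value."""

    def __init__(self, lo, hi, rng=None):
        self.lo, self.hi = lo, hi
        self.rng = rng if rng is not None else random.Random()
        self.reset()

    def reset(self):
        # Called after every save: pick a new hidden start, show the knob at centre.
        self.start = self.rng.uniform(self.lo, self.hi)
        self.visible = 0.0  # visible position in [-1, 1], 0 = centre

    def move(self, visible):
        # Clamp the visible position to the slider's travel.
        self.visible = max(-1.0, min(1.0, visible))

    @property
    def value(self):
        # Full visible travel spans the whole parameter range in either
        # direction, so every value stays reachable from any hidden start.
        raw = self.start + self.visible * (self.hi - self.lo)
        return max(self.lo, min(self.hi, raw))


def run_session(slider, index, visible_moves):
    """Simulate one participant: apply moves, record the result, then reset."""
    start = slider.start
    for v in visible_moves:
        slider.move(v)
    row = (index, round(start, 3), round(slider.value, 3))
    slider.reset()
    return row


def to_csv(rows):
    """Format (index, hidden start, final value) rows as comma-separated text."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["index", "hidden_start", "final_value"])
    writer.writerows(rows)
    return buf.getvalue()
```

Storing the hidden start value next to the final value is a design choice for the sketch: it would make it possible to check later whether the starting point influenced where people ended up.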

Preliminary interface for the exhibition/survey

Ongoing

During a supervisor meeting, we tried to think of a way to improve the playback situation – ideally three buttons at the centre of the screen would be enough. One option would be to have the playback of the original sound be gated, so that whenever it stops playing, the processed sound starts automatically. It is definitely something that still needs more thought and a better practical solution.

Results and Reflection

That this part of the project will be shown to the public definitely added a new challenge, because now it is not just about whether the software makes sense to me, but also whether it can be translated to a first-time user with little to no experience. The idea of people using their hearing to adjust the parameter in a sort of audioscope-like manner is very interesting to me though, and I look forward to seeing the results – I wonder how accurate the resolution of the parameter has to be for people to not notice a significant difference anymore, and how much it varies between people.

Objectives for Next Time

  • Finalise exhibit version (software)
  • Figure out physical exhibition setup
  • Write a guideline for the showcase supervisors on how to set up the exhibit and turn it on and off

“ontextC” – Technical Diary 8

What happened so far?

After building a working signal chain with the vb.stretch~ external, I worked on fixing some bugs that I had noticed in the patch, but that had not been a priority so far because the signal chain had not been fully functional. This included adjusting the filter indexes in the parametric EQ to reflect the features I wanted for my production process (1 – low shelf, high pass, 2 – bell, 3 – bell, 4 – high shelf, low pass), correcting the units and patching in the pitch shift unit to reflect semitone and cent adjustments separately, and implementing a line object on the reverb faders to remove crackling while changing a parameter. Then I started working on the patch in presentation mode to show only the parts of it that I also wanted accessible during my production process. To do this, I worked with my initial sketch from the first semester and the GUI capabilities within Max, and used Max4Live to cross-reference the result. I also tried to make the signal flow (in series) somewhat clear through the interface, but it definitely still needs some cleaning up. This necessity was also reflected during my first testing session with a Max4Live export in Ableton Live, but it was good to see that the parameter selection was already working quite well for my production process, as I had hoped. I also managed to set up a simple preset function (but I am hoping to advance that as well with proper dropdown menu presets).
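Two of the fixes above rest on small pieces of arithmetic, sketched here in Python rather than Max. The function names `pitch_ratio` and `ramp` are made up for this illustration; the first assumes standard twelve-tone equal temperament, the second imitates the kind of linear ramp a line object produces between two fader values, without claiming to match its exact timing.

```python
def pitch_ratio(semitones, cents=0.0):
    """Frequency ratio for a shift given in semitones plus cents (12-TET)."""
    # 100 cents make one semitone, 12 semitones make one octave (ratio 2).
    return 2.0 ** ((semitones + cents / 100.0) / 12.0)


def ramp(start, target, steps):
    """Linear ramp from start to target in equal steps, ending exactly on target."""
    if steps < 1:
        raise ValueError("steps must be at least 1")
    return [start + (target - start) * i / steps for i in range(1, steps + 1)]
```

For example, `pitch_ratio(12)` gives an octave (a ratio of 2), and `ramp(0.0, 1.0, 4)` moves a fader from 0 to 1 in four equal steps instead of one jump, which is what removes the crackling.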

Rudimentary GUI loosely based on my original sketch, using internal Max GUI tools.

Ongoing

On the basis of this patch, I am starting to plan out the look and feel of the exhibit version, where only one parameter will be adjustable (probably the stretch factor). Considerations for this endeavour are: usability, how playback of the source sound and the processed sound should be triggered, an index number for survey content and a volume adjustment to cater to individual hearing sensitivity.

Results and Reflection

This stage of the process was very exciting! The testing stage made me remember why I had wanted to set out on this process in the first place, and it was very satisfying to hear the first working results playing back through my DAW. Since it was also my first time seriously working on a graphical user interface, that came with new challenges and insights, and I look forward to where my GUI research and testing will lead me.

Objectives for Next Time

  • create mockup for exhibit version
  • figure out an effective play/stop mechanism for alternating between the processed and original sound
  • test GUI and figure out which changes to make in which order (also consider typography, style…)

“ontextC” – Technical Diary 7

What happened so far?

While I managed to get a (very imperfect, but at least audible) signal through my phase vocoder pfft patch, changing the FFT size manually and incrementally while playing the audio was not possible within its framework. I researched options for this, and found that something similar to the block~ object in Pure Data might help fix this problem, but unfortunately all the equivalents or similar objects I found during my search did not work for this purpose, so I had to look into other options. I briefly considered writing an external, but quickly realized that this would require a whole new toolbox and set of skills, which would not work within the timeframe I had set for myself. But while studying Max patches from others I stumbled across a promising option: Volker Böhm’s vb.stretch~, an external which is based on the Paulstretch algorithm and provides the parameters I had wanted to include in my compiled plug-in anyway. I was not entirely sure why I had not come across it earlier during my research, because I had already looked for externals once, but I decided to try it out in the context of my patch and came up with sound results that were so far the most similar to what I was looking (or in this case listening) for.

Exploring the parameter options of the external

Ongoing

With a working patch, now the plan is to fine tune parameters, iron out inconsistencies and get a more refined prototype with a simple GUI working.

Results and Reflection

Honestly, while I was glad to have found a solution with sound results I liked, I initially felt a bit disappointed and discouraged that my intended solution did not work out the way I had wanted it to, since I had already put so many hours into exploring and setting it up. But that is part of an iterative process, and it is a process I have learned a lot from – much more than had I immediately found the external. The current setup allows me to more freely explore and improve other aspects of the patch, and gives me more time to work on usability and actually using and testing the patch in my own productions.

Objectives for Next Time

  • fix EQ inconsistencies and pitch shift units
  • look into and start setting up a (simplified) GUI for testing in the form of a max4live device
  • plan which parameters might be best to explore for the exhibit

“ontextC” – Technical Diary 6

What happened so far?

One thing that I found has helped me quite a lot when building my setup was to study and learn from other plug-in constructions that worked with effects similar to the one I am trying to achieve (there’s lots of them available to download for free) for practice. Of course, you could always just look at the connections and figure out what is going on, but for me personally copying them into a new patch object by object and really having to think about which connection was made why helped significantly improve my general understanding of the Max environment, and how I could best organise my complex, growing patches for my own understanding.

Insight into one of the patches I re-built – here I learned that colouring patcher cables can really be a huge help, especially as patches grow larger and larger. It’s a simple thing really, but it helps!

To get a cross-platform overview of how the problem can be approached, I also looked at some Pure Data patches and examined what was done differently there.

Here’s a list of the patchers I learned from:

For Future Reference

I found that the block~ object in Pure Data seemed like a really useful option for working with FFT sizes, especially FFT sizes that are supposed to be changeable through a parameter, so it might be worth looking into a Max equivalent/alternative for this.

Ongoing

I found that if I am to do a version of the patch for the exhibit, I would like to try it with just one or two parameters in order to prevent information overload for the target audience and make the procedure straightforward and easy to understand. I also used my learning experiences to note down GUI designs that I found easy to navigate, and which constructions worked intuitively for me to inform my own GUI once it is time to create that.

Results and Reflection

While studying these patches dedicated to stretching sound, I found a lot of methods and patching ideas to come closer to an extremely time-stretched result – however, I still found that most of the units did sound close enough to what I wanted to achieve for me to adapt them for my prototype, so this will definitely be a priority for the next stage of the project. Nonetheless, this little excursion helped me get to know my preferred Max workflow a lot, helped me to navigate patches made by others better and gave me new perspectives on problem solving and syntax.

Objectives for Next Time

  • look into jitter objects to determine graphical user interface possibilities
  • integrate stretch units into the prototype with working signals
  • research block~ equivalents and alternatives for Max

“ontextC” – Technical Diary 5

What happened so far?

Towards the end of last semester and over the summer, implementation became the main topic of the process. I managed to find decent placeholder models for the EQ, pitch shifting and reverb units in the Max default resource examples (in the “pitch and time”/“effects” folder, access by right click > object > open.amxd). With these, I did some testing using exports from the original Paulstretch software to make sure the results could work in the context of what I am trying to create.

Although initially I was headed towards just slightly modifying the phase vocoder that is available for Max, I realised that for my understanding of the algorithm and Max itself it might be better to start and troubleshoot from scratch, to get a result that I could fully explain and modify as needed. To do so, I used my Python analysis and the available GitHub repository to break down the most important steps of the algorithm (to recap in overview terms: Fourier Transform > Windowing Function > Spectral Processing > Inverse Fourier Transform > Interpolation and Smoothing), both conceptually and mathematically, so I would be able to send the signal through the correct processing chain in Max for the output I am looking for. This also required me to go back into my mathematical education a little bit in order to properly understand what I was working with.

Ultimately I aimed for 4 manually changeable parameters for now: Window size (to control spectral resolution), Overlap (to control overlap between windows), Stretch factor (the most important one) and a Smoothing parameter which is supposed to help create a smoother output with fewer artefacts.
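To make the processing chain and the parameters more concrete, here is a minimal Python sketch of a Paulstretch-style stretch using only the standard library. It is an illustration under assumptions, not the Max patch and not the original software: the function names are made up, the overlap is fixed at 50 % instead of being a parameter, a Hann window is used, the smoothing parameter and any amplitude correction are left out, and the window size has to be a power of two because of the simple FFT used here. In this sketch each block is windowed before it is transformed, the phases are replaced with random ones, and the blocks are windowed again and overlap-added after the inverse transform.

```python
import cmath
import math
import random


def fft(x, inverse=False):
    """Recursive radix-2 FFT (unnormalised); len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2], inverse)
    odd = fft(x[1::2], inverse)
    sign = 1 if inverse else -1
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(sign * 2j * math.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def hann(n):
    """Hann window of length n."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / n) for i in range(n)]


def stretch(samples, factor, window_size=1024, rng=None):
    """Window, FFT, randomise phases, inverse FFT, window again, overlap-add."""
    rng = rng if rng is not None else random.Random()
    win = hann(window_size)
    half = window_size // 2
    hop_out = half                 # 50 % overlap on the output side
    hop_in = hop_out / factor      # the input is read 'factor' times more slowly
    padded = list(samples) + [0.0] * window_size
    n_frames = int(len(samples) / hop_in) + 1
    out = [0.0] * (n_frames * hop_out + window_size)
    for frame in range(n_frames):
        pos = int(frame * hop_in)
        block = [padded[pos + i] * win[i] for i in range(window_size)]
        spectrum = fft(block)
        # Keep the magnitudes, replace the phases with random ones, and
        # mirror the bins so that the inverse transform is real-valued.
        new = [0j] * window_size
        new[0] = abs(spectrum[0])
        new[half] = abs(spectrum[half])
        for k in range(1, half):
            new[k] = cmath.rect(abs(spectrum[k]), rng.uniform(0, 2 * math.pi))
            new[window_size - k] = new[k].conjugate()
        block = [v.real / window_size for v in fft(new, inverse=True)]
        for i in range(window_size):
            out[frame * hop_out + i] += block[i] * win[i]
    return out
```

A larger `window_size` gives finer spectral resolution at the cost of time resolution, and `factor` sets how much more slowly the input is read than the output is written, which is where the stretching comes from.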

For Future Reference

Another new consideration that came up during this process was that it might be useful to have a tuner of some sort integrated to be able to tune the edited audio as needed for the current project. However, this is not a priority right now.

Ongoing

I am currently also trying to plan first listening experiences, to be able to test my prototype in the future. My supervisor suggested I look into webMUSHRA to set up listening test scenarios, and another idea was to set up a sonified “Find the Mistake” station at the exhibition so people could playfully get results for me to evaluate, in a less controlled context of course.

Results and Reflection

The stage of the project I am in right now is not the most rewarding in that I don’t get any immediate results at the moment, as I am setting up and testing the patch based on my notes and the process I noted down for the audio signal, but I know it is essential to create a sounding prototype and am hopeful that it will pay off. Either way, I have learned a lot about digital signal processing during my research for this phase of the project, which is always useful.

Objectives for Next Time

  • Get sound through the new signal chain
  • Come up with test scenarios and mockups
  • If I get that far: Try to get reloadable presets set up

“ontextC” – Technical Diary 3

What happened so far?

A recent priority was the comparison of different phase vocoders that are available in Max. With the help of the Cycling74 resources, I tested whether the difference between the modules using polar vs. cartesian coordinates affected my sound sources in a (noticeable) way that would make me choose one over the other – ultimately cartesian coordinates seemed like the better option for my project, also in terms of CPU usage. For windowing, the Hanning window is currently in use.
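As a small illustration of what the two coordinate systems mean for a single spectral bin, the Python sketch below applies the same gain to a bin once in cartesian form (real and imaginary part) and once by way of polar form (magnitude and phase). The function names are made up for this example, and it is not a measurement of the Max modules; it only shows that the polar route needs extra conversions in both directions, which is one plausible reason for a difference in CPU usage.

```python
import cmath


def scale_cartesian(re, im, gain):
    """Scale a bin directly on its real and imaginary parts."""
    return re * gain, im * gain


def scale_polar(re, im, gain):
    """Scale the same bin by converting to magnitude/phase and back."""
    mag, phase = cmath.polar(complex(re, im))   # needs a square root and an arctangent
    z = cmath.rect(mag * gain, phase)           # needs a cosine and a sine
    return z.real, z.imag
```

Both functions return the same bin up to rounding error; they differ only in how much work is done to get there.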

Furthermore, to better understand the processes the signal goes through within the plug-in, I asked my supervisor about the meaning of phase coherence in this context, and was able to bit by bit (little terminology reference here) connect the theory and the practical application, which will help me a lot going forward.

Ongoing

The evaluation and development of EQ, pitch shifting and reverb modules for my project is ongoing. Fortunately, there are a lot of libraries and resources especially for filtering and spatial effects, so the main challenge here is to find what works best to achieve the sound results I am aiming for, while also being functional and relatively simple to integrate. By studying existing Max patches, even though they might not be 100% what I am looking for, I am learning more not just about the Max environment, but also about best practices and how I could translate certain organisational aspects (comments are so helpful for external people looking at a patch to know what is going on!) and connections into my own project patch. My main resources for this are free patches that I download from the Max for Live library patch page and explore.

Results and Reflection

While it is good to know that there is a phase vocoder that can help me to realise my vision for this project, now it is time to start thinking about how to best integrate it, and define which modifications need to be made in order to make it sound the way I want it to in the context of my project. To do so, I will draw comparisons between PaulXStretch and the Max phase vocoders, to determine limitations, potential areas of improvement and differences in sound quality at different stretch factors.

Objectives for Next Time

  • Prepare and document sound examples to compare between the phase vocoder and PaulXStretch
  • Continue development of other modules