Sound demos for "Neural Synthesis of Sound Effects Using Flow-Based Deep Generative Models"
We demonstrate sound effect variations synthesized by the models described in our paper.
Note that the example sounds used to condition the models are qualitatively different from the training and test data
with which the models were trained and evaluated in the paper.
All the original sounds were obtained from Freesound (https://freesound.org/) and downsampled to 16 kHz when needed.
1. Examples of explosion sounds
Variations synthesized using mel spectrograms computed from explosion sounds.
1.1 Dimensionality of the mel spectrogram conditioner
Demonstration of results from models using 10-band and 30-band mel spectrograms as conditioners at 50k training iterations
(10ch_50k and 30ch_50k).
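To give a rough sense of what the number of mel bands changes, the sketch below computes the centre frequencies of 10 and 30 triangular mel filters for 16 kHz audio. It is illustrative only: the HTK-style mel formula, the 0 Hz to Nyquist range, and the function names (hz_to_mel, mel_to_hz, mel_band_centres) are assumptions made for this example and are not taken from the paper.

```python
import math


def hz_to_mel(f_hz):
    # HTK-style mel scale (an assumption; other mel variants exist).
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(m):
    # Inverse of hz_to_mel.
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)


def mel_band_centres(n_bands, sample_rate=16000, f_min=0.0):
    """Return the centre frequency in Hz of each of n_bands mel filters."""
    f_max = sample_rate / 2.0  # Nyquist frequency
    m_min, m_max = hz_to_mel(f_min), hz_to_mel(f_max)
    # n_bands triangular filters need n_bands + 2 equally spaced mel points;
    # the interior points are the filter centres.
    step = (m_max - m_min) / (n_bands + 1)
    return [mel_to_hz(m_min + step * (i + 1)) for i in range(n_bands)]


if __name__ == "__main__":
    for n in (10, 30):
        centres = mel_band_centres(n)
        print(n, "bands, centres (Hz):", [round(c) for c in centres])
```

With fewer bands each filter covers a wider frequency range, so a 10-band conditioner describes the spectral envelope more coarsely than a 30-band one.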
Blast [1]
Model 10ch_50k
Model 30ch_50k
Explosion [2]
Model 10ch_50k
Model 30ch_50k
Large explosion [3]
Model 10ch_50k
Model 30ch_50k
Guns explosion [4]
Model 10ch_50k
Model 30ch_50k
1.2 Training iterations
Demonstration of results from a model using 20 mel bands at different training iterations
(20ch_10k, 20ch_50k and 20ch_200k).
Blast [1]
20ch_10k
20ch_50k
20ch_200k
Explosion [2]
20ch_10k
20ch_50k
20ch_200k
Large explosion [3]
20ch_10k
20ch_50k
20ch_200k
Guns explosion [4]
20ch_10k
20ch_50k
20ch_200k
1.3 Examples of post-processing strategies
Demonstration of results from 20ch_50k using different post-processing strategies. We refer to the paper for more details.
Blast [1]
Unprocessed
20ch_50k
Ultraprocessed
Explosion [2]
Unprocessed
20ch_50k
Ultraprocessed
2. Examples of style transfer
Variations synthesized using mel spectrograms computed from non-explosion sounds using the model 20ch_50k.