VGGSound Samples

All samples are generated by our model, SoundReactor-ECT (NFE=4), using the causal stereo full-band VAE.
The model is trained on the VGGSound [1] dataset.

References

  1. H. Chen, W. Xie, A. Vedaldi, and A. Zisserman, "VGGSound: A Large-scale Audio-Visual Dataset," ICASSP 2020.