Generative audio is the creation of new audio files by models trained on databases of audio clips. This technology differs from assistant voices such as Apple's Siri or Amazon's Alexa, which stitch together a collection of recorded fragments on demand.
Generative audio works by using neural networks to learn the statistical properties of an audio source and then generating new audio that reproduces those properties.[1]
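The learn-then-reproduce idea can be illustrated with a deliberately small sketch. It is not a neural network: a two-coefficient linear predictor fitted by least squares stands in for one, and a pure sine wave stands in for the audio source. The function names (`make_source`, `fit_ar2`, `generate`) and all parameter values are hypothetical, chosen only for illustration.

```python
import math

def make_source(n=200, freq=0.05):
    """Toy 'audio source' (an assumption): a pure sine wave of n samples."""
    return [math.sin(2 * math.pi * freq * t) for t in range(n)]

def fit_ar2(x):
    """Learn a, b so that x[t] is approximated by a*x[t-1] + b*x[t-2].

    Solves the 2x2 least-squares normal equations directly.
    """
    s11 = s12 = s22 = r1 = r2 = 0.0
    for t in range(2, len(x)):
        p1, p2, y = x[t - 1], x[t - 2], x[t]
        s11 += p1 * p1
        s12 += p1 * p2
        s22 += p2 * p2
        r1 += p1 * y
        r2 += p2 * y
    det = s11 * s22 - s12 * s12
    a = (r1 * s22 - r2 * s12) / det
    b = (r2 * s11 - r1 * s12) / det
    return a, b

def generate(a, b, seed, n):
    """Reproduce the learned behaviour by feeding predictions back in."""
    out = list(seed)
    while len(out) < n:
        out.append(a * out[-1] + b * out[-2])
    return out[:n]

source = make_source()                        # 200 observed samples
a, b = fit_ar2(source)                        # learned from the source
generated = generate(a, b, source[:2], 300)   # 300 samples, longer than the source
```

In this sketch the generated sequence continues past the end of the training clip, which is the sense in which the model reproduces the source's properties rather than replaying stored fragments. A practical system would replace the linear predictor with a neural network trained on real recordings.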
Implications
With this technology, a person's voice can be replicated to speak phrases that they may never have spoken. A synthetic version of a public figure's voice could therefore be used against them.[2]
Technology
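This method uses a generative adversarial network (GAN), a deep learning technique in which two models are trained against each other: a generator that produces audio and a discriminator that tries to distinguish generated audio from real audio. The competition pushes the generator toward realistic output.[3]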
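The adversarial update structure can be sketched with a toy example. The assumptions here are heavy: each "audio sample" is a single number, real samples are drawn around 4.0, the generator has one parameter `theta`, and the discriminator is a logistic function with parameters `w` and `c`. The names, the learning rate, and the data distribution are all hypothetical, and the sketch shows only how the two updates alternate, not a working audio model.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def train(steps=10, batch=64, lr=0.1, seed=0):
    rng = random.Random(seed)
    theta = 0.0        # generator: fake sample = theta + noise
    w, c = 0.0, 0.0    # discriminator: D(x) = sigmoid(w * x + c)
    for _ in range(steps):
        # Toy stand-ins for real and generated audio (assumed distributions).
        real = [rng.gauss(4.0, 0.5) for _ in range(batch)]
        fake = [theta + rng.gauss(0.0, 0.5) for _ in range(batch)]

        # Discriminator step: gradient ascent on
        # log D(real) + log(1 - D(fake)).
        gw = gc = 0.0
        for x in real:
            d = sigmoid(w * x + c)
            gw += (1.0 - d) * x
            gc += 1.0 - d
        for x in fake:
            d = sigmoid(w * x + c)
            gw -= d * x
            gc -= d
        w += lr * gw / batch
        c += lr * gc / batch

        # Generator step: gradient ascent on log D(fake), using the
        # freshly updated discriminator.
        gt = sum((1.0 - sigmoid(w * x + c)) * w for x in fake) / batch
        theta += lr * gt
    return theta, w, c

theta, w, c = train()
```

After a few steps the discriminator scores samples near 4.0 higher than the generator's output, and the generator's parameter moves in the direction the discriminator prefers. Adversarial training of this kind can oscillate rather than settle, so the sketch makes no claim about where `theta` ends up over a long run.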
References
- "Fake news: you ain't seen nothing yet". The Economist. July 2017. Retrieved 2017-07-01.
- Zotkin, D. N.; Shamma, S. A.; Ru, P.; Duraiswami, R.; Davis, L. S. (April 2003). "Pitch and timbre manipulations using cortical representation of sound". 2003 IEEE International Conference on Acoustics, Speech, and Signal Processing, 2003. Proceedings. (ICASSP '03). Vol. 5. pp. V–517–20. doi:10.1109/ICASSP.2003.1200020. ISBN 978-0-7803-7663-2. S2CID 10372569.
- Mobin, Shariq (October 2016). "Voice Conversion using Convolutional Neural Networks". arXiv:1610.08927 [stat.ML].