Deepfake Audio Detection
A model was developed and trained on the created dataset to detect deepfake audio in a multilingual setting with high accuracy.
Abstract
Advances in technology have led to significant developments in AI-based tools that offer easier and more efficient ways to streamline workflows. Although these advances bring vast opportunities, they also introduce new challenges for cybersecurity and privacy and raise ethical concerns. One such threat is the rise of deepfakes and the continued advancement of deepfake generation technologies. Because multilingual datasets in this field are scarce, a multilingual dataset was created using TED talks and three audio generation models for four languages: English, Arabic, Chinese (Mandarin), and Spanish, which are among the most widely used languages in the world. The dataset also made it possible to compare which methods are better at generating deepfake audio. In addition, a model was developed and trained on the created dataset to detect deepfake audio in a multilingual setting with high accuracy.