On Speech Datasets in Machine Learning for Healthcare

3 Citations • 2023

Jekaterina Novikova, Aparna Balagopalan

The cross-lingual method improves aphasia detection over unilingual baselines, and early results on the newly collected dataset show promise for achieving a strong baseline in Alzheimer's disease detection.

Abstract

Multilingual speech datasets in the medical domain are scarce and often have small sample sizes. We address this problem by employing cross-lingual transfer methods and by collecting a large longitudinal dataset of impaired speech. The cross-lingual method improves aphasia detection over unilingual baselines, and early results on the newly collected dataset show promise for achieving a strong baseline in Alzheimer’s disease detection.