SPEAKER IDENTIFICATION RESEARCH BASED ON CONDITIONAL WASSERSTEIN GENERATIVE ADVERSARIAL NETWORKS

Zhang Gaofeng; Liu Tian; Xie Xiaomin; Ma Qun

doi:10.3969/j.issn.1000-386x.2025.08.029

Zhang Gaofeng, Liu Tian, Xie Xiaomin, Ma Qun. SPEAKER IDENTIFICATION RESEARCH BASED ON CONDITIONAL WASSERSTEIN GENERATIVE ADVERSARIAL NETWORKS[J]. Computer Applications and Software, 2025, 42(8): 213-218,241. DOI: 10.3969/j.issn.1000-386x.2025.08.029

Abstract

In low-resource scenarios, traditional speaker identification methods cannot extract enough effective information for network training, which leads to model overfitting. Inspired by the success of generative adversarial networks (GANs) in image processing, we propose a speaker identification method based on the conditional Wasserstein GAN (C-WGAN). FBANK features of real samples are fed to the generator as conditional input to control which simulated samples it synthesizes. The Wasserstein distance is used to measure the divergence between speech feature distributions, which stabilizes training and prevents mode collapse. Experiments show that the method achieves a classification error rate (CER) of 1.96%, a reduction of 67.2% and 53.9% compared with the x-vector and CNN baselines, respectively. It also remains highly competitive at low sampling rates.
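
As a rough illustration of why the Wasserstein distance is a convenient measure of divergence between feature distributions, the sketch below computes the empirical Wasserstein-1 distance between two equal-sized sets of scalar values. It is a simplification under stated assumptions, not the paper's implementation: the function name `wasserstein_1d` and the toy values are hypothetical, real FBANK features are multi-dimensional, and in WGAN training the distance is approximated by a critic network rather than computed by sorting.

```python
def wasserstein_1d(xs, ys):
    """Empirical Wasserstein-1 distance between two equal-sized 1-D samples.

    In one dimension the optimal transport plan pairs the i-th smallest
    value of xs with the i-th smallest value of ys, so the distance is
    the mean absolute difference of the sorted samples.
    """
    if len(xs) != len(ys) or len(xs) == 0:
        raise ValueError("samples must be non-empty and of equal size")
    paired = zip(sorted(xs), sorted(ys))
    return sum(abs(a - b) for a, b in paired) / len(xs)


# Hypothetical scalar "feature" values, for illustration only.
real = [0.0, 1.0, 2.0]
near = [0.5, 1.5, 2.5]     # generated samples close to the real ones
far = [10.0, 11.0, 12.0]   # generated samples with no overlap at all

print(wasserstein_1d(real, near))  # 0.5
print(wasserstein_1d(real, far))   # 10.0
```

The distance keeps growing as the generated values move further from the real ones, even when the two sets do not overlap. This graded signal is the property usually credited with making Wasserstein-based GAN training more stable.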
