In this thesis, we present a variant of the GRU architecture called the Dual-Sequences Gated Attention Unit (DS-GAU), in which statistics pooling is computed from each TDNN layer of the x-vector baseline and passed through the DS-GAU layer, aggregating more information from the varying temporal contexts of the input features during frame-level training. Our proposed architecture was trained on the VoxCeleb2 dataset, and the resulting feature vector is referred to as a DSGAU-vector. We evaluated our system on the VoxCeleb1 dataset and the Speakers in the Wild (SITW) dataset and compared the experimental results with the x-vector baseline system. The results show that our proposed method achieved relative improvements in EER of up to 11.6%, 7.9%, and 7.6% over the x-vector baseline on the VoxCeleb1 dataset.
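As a rough illustration of the idea described above (not the thesis's actual implementation), the following PyTorch sketch shows per-TDNN-layer statistics pooling fed through a recurrent gated unit that accumulates information across layers. The module name DSGAUSketch, the layer dimensions, and the use of a standard nn.GRUCell in place of the custom DS-GAU are all assumptions made for illustration.

```python
import torch
import torch.nn as nn

def stats_pooling(x):
    # x: (batch, channels, time) -> concatenate mean and std over the time axis
    return torch.cat([x.mean(dim=2), x.std(dim=2)], dim=1)

class DSGAUSketch(nn.Module):
    """Hypothetical sketch: statistics pooling from each TDNN layer is passed
    through a GRU-like gated unit to aggregate multi-layer temporal context."""
    def __init__(self, feat_dim=30, tdnn_dim=512, gau_dim=256):
        super().__init__()
        # Stack of TDNN (dilated 1-D conv) layers, as in the x-vector baseline
        self.tdnns = nn.ModuleList([
            nn.Conv1d(feat_dim, tdnn_dim, kernel_size=5, dilation=1),
            nn.Conv1d(tdnn_dim, tdnn_dim, kernel_size=3, dilation=2),
            nn.Conv1d(tdnn_dim, tdnn_dim, kernel_size=3, dilation=3),
        ])
        self.relu = nn.ReLU()
        # A standard GRUCell stands in for the DS-GAU recurrence; the thesis
        # defines its own gated attention variant here.
        self.gau = nn.GRUCell(input_size=2 * tdnn_dim, hidden_size=gau_dim)
        self.embedding = nn.Linear(gau_dim, gau_dim)  # the "DSGAU-vector"

    def forward(self, x):
        # x: (batch, feat_dim, time) frame-level acoustic features
        h = x.new_zeros(x.size(0), self.gau.hidden_size)
        for tdnn in self.tdnns:
            x = self.relu(tdnn(x))
            stats = stats_pooling(x)   # (batch, 2 * tdnn_dim)
            h = self.gau(stats, h)     # aggregate statistics across TDNN layers
        return self.embedding(h)       # utterance-level speaker embedding

# Usage (hypothetical shapes): emb = DSGAUSketch()(torch.randn(4, 30, 200))  # -> (4, 256)
```

Treating the per-layer pooled statistics as a sequence, as in this sketch, is what lets a gated recurrence combine temporal contexts of different widths, since each successive TDNN layer covers a larger receptive field.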