Chulalongkorn University Theses and Dissertations (Chula ETD)

Advancing voice spoofing detection in Thai : a comprehensive dataset and performance analysis on speaking styles and channel effects

Other Title (Parallel Title in Other Language of ETD)

การยกระดับการตรวจจับเสียงปลอมในภาษาไทยด้วยชุดข้อมูลที่ครอบคลุม พร้อมทั้งวิเคราะห์ผลกระทบของรูปแบบการพูดและช่องสัญญาณต่อประสิทธิภาพของระบบ

Author

Ticho Urai, Faculty of Engineering

Year (A.D.)

2025

Document Type

Thesis

First Advisor

Ekapol Chuangsuwanich

Faculty/College

Faculty of Engineering (คณะวิศวกรรมศาสตร์)

Department (if any)

Department of Computer Engineering (ภาควิชาวิศวกรรมคอมพิวเตอร์)

Degree Name

Master of Engineering

Degree Level

Master's Degree

Degree Discipline

Computer Engineering

DOI

10.58837/CHULA.THE.2025.188

Abstract

Voice authentication is increasingly used in applications such as banking and call center verification, but it faces serious risks from spoofed voices. Recent advances in text-to-speech (TTS) and voice cloning make it possible to generate highly natural fake speech, creating an urgent need for robust anti-spoofing systems. While most prior work focuses on English, little research addresses the Thai language. To fill this gap, we present the Chula Spoofed Speech (CSS) dataset, a large-scale Thai corpus containing 1.3M utterances of both bona fide and synthetic speech. The synthetic samples are generated using five state-of-the-art TTS systems from the same utterances as the bona fide data, covering a wide range of speakers, ages, and speaking styles. To establish benchmark results, we train strong baseline models, including AASIST and RawNet2, under different conditions. Our experiments reveal that unseen attacks and, importantly, unseen speaking styles significantly degrade performance, highlighting the necessity of style diversity in anti-spoofing datasets. Finally, we evaluate real-world telephony conditions, revealing both the strengths and limitations of current approaches.

Other Abstract (Other language abstract of ETD)

Voice authentication is a technology widely used in applications such as banking and identity verification at telephone call centers, but it faces serious risks from voice spoofing. Recent advances in text-to-speech and voice cloning technology make it possible to generate highly natural fake speech, creating an urgent need to develop effective systems for defending against spoofed-voice attacks. However, most prior research has focused on English, while research on Thai remains limited. To address this problem, we present the Chula Spoofed Speech (CSS) dataset, a large-scale Thai speech corpus containing more than 1.3 million utterances, covering both bona fide and synthetic speech. The synthetic samples were generated using five state-of-the-art text-to-speech systems from the same utterances as the bona fide data, covering a wide range of speakers, genders, ages, and speaking styles. To establish benchmark results, we trained strong baseline models, namely AASIST and RawNet2, under different conditions. Our experiments reveal that new types of attacks and, importantly, previously unseen speaking styles significantly degrade the performance of spoofed-speech detection systems, indicating the need for speaking-style diversity in anti-spoofing datasets. Finally, we evaluated the systems under real-world usage conditions on telephone networks, revealing both the strengths and limitations of current approaches.

Creative Commons License

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.

Recommended Citation

Urai, Ticho, "Advancing voice spoofing detection in Thai : a comprehensive dataset and performance analysis on speaking styles and channel effects" (2025). Chulalongkorn University Theses and Dissertations (Chula ETD). 75120.
https://digital.car.chula.ac.th/chulaetd/75120

Included in

Computer Engineering Commons, Computer Sciences Commons
