Chulalongkorn University Theses and Dissertations (Chula ETD)

Diffusion model-based synthetic image augmentation for enhancing vehicle detection model performance

Other Title (Parallel Title in Other Language of ETD)

การเพิ่มประสิทธิภาพของแบบจำลองตรวจจับยานพาหนะด้วยการขยายข้อมูลภาพสังเคราะห์จากแบบจำลองการแพร่

Pawaris Parnphotong, Faculty of Engineering

Year (A.D.)

2025

Document Type

Thesis

First Advisor

Punnarai Siricharoen

Faculty/College

Faculty of Engineering (คณะวิศวกรรมศาสตร์)

Department (if any)

Department of Computer Engineering (ภาควิชาวิศวกรรมคอมพิวเตอร์)

Degree Name

Master of Science

Degree Level

Master's Degree

Degree Discipline

Computer Science

DOI

10.58837/CHULA.THE.2025.175

Abstract

Modern object detection models degrade markedly in adverse weather and poor lighting. The main cause is imbalanced training data: datasets consist predominantly of normal daytime scenes, so models are less robust when they face challenging scenarios. Traditional data augmentation techniques exist, but they are limited to pixel-level adjustments and therefore add little diversity. This research instead uses diffusion models to generate realistic synthetic images that simulate new atmospheric and environmental conditions. To make synthetic data generation as efficient as possible, we investigated a crucial factor: the quality of the reference images. We introduce a Quality-Aware Reference Selection method that filters candidate reference images by resolution, blur level, and object clarity. In experiments on a Thailand highway traffic surveillance dataset (14,856 images covering 14 vehicle classes) with the Qwen-Image-Edit model, Quality-Aware Reference Selection with 2,768 synthetic images achieved a mean mAP50 of 84.5% ± 0.6%. This is a statistically significant improvement (p < 0.05) over the Baseline (83.2% ± 1.0%), Random Reference Selection (83.5%), and Traditional Augmentation (84.4%), even though Traditional Augmentation used a much larger set of 10,788 images. The findings confirm that synthetic image augmentation is most effective when reference images are selected for quality. With this selection step, the detector reached higher mAP50 while using about 74% fewer added images (2,768 versus 10,788) than traditional augmentation.
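The abstract names three filtering criteria (resolution, blur level, object clarity) but does not state how each is measured. The following is a minimal Python sketch of such a filter under stated assumptions, not the thesis implementation: blur is approximated by the variance of a 4-neighbour Laplacian (a common sharpness heuristic swapped in here), object clarity is approximated by the smallest annotated bounding-box area, and every name and threshold (`Candidate`, `is_good_reference`, `select_references`, `min_side`, `min_sharpness`, `min_box_area`) is hypothetical.

```python
from dataclasses import dataclass
from typing import List, Sequence

# Grayscale image as rows of pixel intensities (assumed representation).
Gray = Sequence[Sequence[float]]


def laplacian_variance(img: Gray) -> float:
    """Variance of the 4-neighbour Laplacian over interior pixels.

    Low values suggest a blurry image. This is a stand-in blur measure,
    not necessarily the one used in the thesis.
    """
    h, w = len(img), len(img[0])
    responses = [
        img[y - 1][x] + img[y + 1][x] + img[y][x - 1] + img[y][x + 1]
        - 4 * img[y][x]
        for y in range(1, h - 1)
        for x in range(1, w - 1)
    ]
    if not responses:
        return 0.0
    mean = sum(responses) / len(responses)
    return sum((r - mean) ** 2 for r in responses) / len(responses)


@dataclass
class Candidate:
    """Hypothetical record for one candidate reference image."""
    name: str
    gray: Gray
    box_areas: List[float]  # pixel areas of annotated vehicle boxes


def is_good_reference(
    c: Candidate,
    min_side: int = 720,           # hypothetical resolution threshold
    min_sharpness: float = 100.0,  # hypothetical blur threshold
    min_box_area: float = 1024.0,  # hypothetical object-clarity proxy
) -> bool:
    """Return True only if the candidate passes all three checks."""
    h, w = len(c.gray), len(c.gray[0])
    if min(h, w) < min_side:
        return False
    if laplacian_variance(c.gray) < min_sharpness:
        return False
    # Reject images with no annotated vehicles or with very small ones.
    return bool(c.box_areas) and min(c.box_areas) >= min_box_area


def select_references(candidates: List[Candidate], **thresholds) -> List[str]:
    """Names of the candidates that pass the quality filter, in order."""
    return [c.name for c in candidates if is_good_reference(c, **thresholds)]


if __name__ == "__main__":
    # Tiny synthetic inputs: a flat (blurry) image and a checkerboard (sharp).
    flat = [[128] * 6 for _ in range(6)]
    checker = [[255 * ((x + y) % 2) for x in range(6)] for y in range(6)]
    pool = [
        Candidate("blurry", flat, [16.0]),
        Candidate("sharp", checker, [16.0]),
        Candidate("sharp_no_vehicles", checker, []),
    ]
    print(select_references(pool, min_side=4, min_sharpness=100.0,
                            min_box_area=4.0))
```

Only images that pass the filter would then be sent to the image-editing model as references. In practice the thresholds would need tuning on the target dataset, and a real pipeline would more likely compute the blur measure with an image library than in pure Python.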

Other Abstract (Other language abstract of ETD)

ปัญหาสำคัญของแบบจำลองการตรวจจับวัตถุในปัจจุบัน คือประสิทธิภาพที่ลดลงอย่างชัดเจนเมื่อต้องทำงานในสภาพอากาศที่เลวร้ายหรือสภาพแสงที่ไม่เอื้ออำนวย สาเหตุหลักเกิดจากความไม่สมดุลของชุดข้อมูลที่ใช้ฝึกสอนแบบจำลอง ซึ่งมักประกอบด้วยภาพเหตุการณ์ปกติในเวลากลางวันเป็นส่วนใหญ่ ส่งผลให้แบบจำลองมีประสิทธิภาพลดลงเมื่อเจอสถานการณ์ที่หลากหลาย แม้จะมีเทคนิคการเพิ่มข้อมูลแบบดั้งเดิมแต่ก็ทำได้เพียงปรับแต่งในระดับพิกเซลซึ่งมีข้อจำกัดในการสร้างความหลากหลายของข้อมูล งานวิจัยนี้จึงนำเสนอแนวทางการใช้แบบจำลองการแพร่ เพื่อสร้างภาพสังเคราะห์ที่จำลองบรรยากาศและสภาพแวดล้อมใหม่ได้อย่างสมจริง เพื่อให้การสร้างข้อมูลสังเคราะห์มีประสิทธิภาพสูงสุด งานวิจัยชิ้นนี้จึงได้ศึกษาปัจจัยสำคัญคือคุณภาพของภาพต้นฉบับ โดยงานวิจัยนี้ได้นำเสนอวิธีการคัดกรองคุณภาพที่พิจารณาจากความละเอียด ความเบลอ และความชัดเจนของวัตถุ จากการทดสอบกับชุดข้อมูลกล้องวงจรปิดบนทางหลวงในประเทศไทย (จำนวน 14,856 ภาพ ครอบคลุมยานพาหนะ 14 ประเภท) ด้วย Qwen-Image-Edit ผลการทดลองชี้ให้เห็นว่า วิธีการคัดเลือกภาพต้นฉบับที่มีคุณภาพ โดยใช้ภาพสังเคราะห์จำนวน 2,768 ภาพ สามารถเพิ่มค่าความแม่นยำเฉลี่ย (mAP50) ได้ถึง 84.5% ± 0.6% ซึ่งเป็นการเพิ่มขึ้นอย่างมีนัยสำคัญทางสถิติ (p < 0.05) เมื่อเทียบกับวิธีที่ไม่ใช้ภาพสังเคราะห์ (83.2% ± 1.0%) วิธีการสังเคราะห์ภาพแบบสุ่มภาพต้นฉบับ (83.5%) และการเพิ่มข้อมูลแบบดั้งเดิมที่ใช้จำนวนภาพมากถึง 10,788 ภาพ (84.4%) บทสรุปของงานวิจัยนี้ยืนยันว่า การเพิ่มข้อมูลด้วยภาพสังเคราะห์จะมีประสิทธิภาพสูงสุดเมื่อมีการคัดเลือกคุณภาพของภาพต้นฉบับอย่างเหมาะสม ซึ่งถือเป็นปัจจัยสำคัญในการเพิ่มค่าความแม่นยำเฉลี่ย (mAP50) ของแบบจำลองการตรวจจับวัตถุอย่างมีนัยสำคัญทางสถิติ (p < 0.05) โดยใช้ภาพสังเคราะห์น้อยกว่าวิธีการแบบดั้งเดิมถึง 74%

Creative Commons License

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.

Recommended Citation

Parnphotong, Pawaris, "Diffusion model-based synthetic image augmentation for enhancing vehicle detection model performance" (2025). Chulalongkorn University Theses and Dissertations (Chula ETD). 75094.
https://digital.car.chula.ac.th/chulaetd/75094

Included in

Computer Sciences Commons
