Chulalongkorn University Theses and Dissertations (Chula ETD)

การใช้กลุ่มของภาพฉากเพื่อจำแนกวิดีโอจากรายการโทรทัศน์

Other Title (Parallel Title in Other Language of ETD)

Using clustered frames to classify videos from television programs

Author

อิทธิศักดิ์ เผือกศรี, Faculty of Engineering

Year (A.D.)

2019

Document Type

Thesis

First Advisor

สุกรี สินธุภิญโญ

Faculty/College

Faculty of Engineering (คณะวิศวกรรมศาสตร์)

Department (if any)

Department of Computer Engineering (ภาควิชาวิศวกรรมคอมพิวเตอร์)

Degree Name

Master of Science

Degree Level

Master's Degree

Degree Discipline

Computer Science

DOI

10.58837/CHULA.THE.2019.1147

Abstract

This research presents a video classification method based on a two-dimensional convolutional model and semi-supervised learning. High-performing video classification is generally achieved with deep learning. However, as the number of videos keeps growing, training a model to classify them requires substantial computing power. This research therefore proposes training a two-dimensional convolutional model on stacked frames and clustering the frames with a self-organizing map before building the program-type classification model. The program-type classification model is presented in four forms: voting, entropy calculation, a neural network, and long short-term memory. The number of frames processed for clustering is also assessed by comparing training time against accuracy. The method is evaluated by training on a dataset of 912 videos from television programs, covering 18 classes. Under 5-fold cross-validation, the method achieves an average accuracy of 71.98% and takes approximately 40 minutes on average to train. It is also compared with other models, such as a three-dimensional convolutional model and a convolutional model combined with long short-term memory, and evaluated on the standard Hollywood2 dataset, where it achieves an average accuracy of 93.72%.
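The frame-clustering step can be illustrated with a minimal self-organizing map in NumPy. This is only a sketch under stated assumptions, not the thesis implementation: the grid size, learning rate, neighborhood width, decay schedule, and the function names `train_som` and `assign_clusters` are all hypothetical, and the input is assumed to be one feature vector per frame (for example, CNN activations).

```python
import numpy as np

def train_som(features, grid=(3, 3), epochs=20, lr=0.5, sigma=1.0, seed=0):
    """Fit a small SOM; returns unit weights of shape (rows*cols, n_features).

    All hyperparameters here are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    rows, cols = grid
    # Position of every unit on the 2-D map, shape (rows*cols, 2).
    coords = np.array([(r, c) for r in range(rows) for c in range(cols)], dtype=float)
    weights = rng.normal(size=(rows * cols, features.shape[1]))
    for epoch in range(epochs):
        decay = 1.0 - epoch / epochs  # linearly shrink step size and neighborhood
        for x in rng.permutation(features):
            # Best-matching unit: the unit whose weight vector is closest to x.
            bmu = np.argmin(np.linalg.norm(weights - x, axis=1))
            # Gaussian neighborhood around the BMU, measured on the map grid.
            dist2 = np.sum((coords - coords[bmu]) ** 2, axis=1)
            h = np.exp(-dist2 / (2 * (sigma * decay + 1e-3) ** 2))
            # Pull each unit toward x in proportion to its neighborhood weight.
            weights += (lr * decay) * h[:, None] * (x - weights)
    return weights

def assign_clusters(features, weights):
    """Map each frame feature vector to the index of its nearest SOM unit."""
    d = np.linalg.norm(features[:, None, :] - weights[None, :, :], axis=2)
    return np.argmin(d, axis=1)

# Usage with random stand-in data: 20 frames, 4 features each.
frames = np.random.default_rng(1).normal(size=(20, 4))
som_weights = train_som(frames)
frame_clusters = assign_clusters(frames, som_weights)
```

Each frame ends up with a cluster index between 0 and 8 for the assumed 3x3 grid; those indices are what a downstream video-level classifier would consume.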

Other Abstract (Other language abstract of ETD)

This research presents a video classification approach that combines a two-dimensional Convolutional Neural Network (CNN) with semi-supervised learning. Video classification is usually performed with deep learning techniques. However, given the number of online videos today, training such models requires high computing power. We present a traditional technique that trains a two-dimensional CNN on stacked frames, and we propose using a Self-Organizing Map (SOM) to cluster video frames. We then classify the videos using simple voting, entropy calculation, neural networks, and Long Short-Term Memory (LSTM). We also determine the number of frames to use for clustering by comparing accuracy against training time. The approach is evaluated on a real-world dataset of 912 videos from TV programs, covering 18 classes. Under five-fold cross-validation, our method achieved an average accuracy of 71.98% with an average computing time of approximately 40 minutes. We also compared the proposed method with baseline models, including C3D and CNN-LSTM, and evaluated it on Hollywood2, where it achieved an average accuracy of 93.72%.
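Two of the video-level decision rules named in the abstract, simple voting and entropy, can be sketched in a few lines of Python. The abstract does not say how entropy enters the final decision, so the sketch below only computes it as a measure of how much the per-frame labels disagree; the function names and the example labels are hypothetical.

```python
from collections import Counter
import math

def majority_vote(frame_labels):
    """Return the most frequent per-frame label (ties go to the label seen first)."""
    return Counter(frame_labels).most_common(1)[0][0]

def label_entropy(frame_labels):
    """Shannon entropy, in bits, of the per-frame label distribution."""
    counts = Counter(frame_labels)
    total = len(frame_labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

# Hypothetical per-frame predictions for a single video.
frame_labels = ["news", "news", "sports", "news"]
video_label = majority_vote(frame_labels)   # label assigned to the whole video
uncertainty = label_entropy(frame_labels)   # 0 when all frames agree
```

A low entropy means the frames agree on one class, so the vote can be trusted more; a high entropy flags videos whose frames are spread across several classes.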

Creative Commons License

This work is licensed under a Creative Commons Attribution-NonCommercial-No Derivative Works 4.0 International License.

Recommended Citation

เผือกศรี, อิทธิศักดิ์, "การใช้กลุ่มของภาพฉากเพื่อจำแนกวิดีโอจากรายการโทรทัศน์" (2019). Chulalongkorn University Theses and Dissertations (Chula ETD). 9523.
https://digital.car.chula.ac.th/chulaetd/9523

Included in

Computer Sciences Commons
