Chulalongkorn University Theses and Dissertations (Chula ETD)

Text localization and extraction from background with texture and noise in digital images using adaptive thresholding and convolutional neural network

Other Title (Parallel Title in Other Language of ETD)

การระบุและคัดแยกข้อความออกจากพื้นหลังที่มีพรรณลักษณ์และสัญญาณรบกวนในภาพดิจิทัลโดยใช้การขีดแบ่งปรับตัวและโครงข่ายประสาทแบบสังวัฒนาการ

Pukjira Pattaranuprawat, Faculty of Science

Year (A.D.)

2019

Document Type

Thesis

First Advisor

Rajalida Lipikorn

Faculty/College

Faculty of Science (คณะวิทยาศาสตร์)

Department (if any)

Department of Mathematics and Computer Science (ภาควิชาคณิตศาสตร์และวิทยาการคอมพิวเตอร์)

Degree Name

Master of Science

Degree Level

Master's Degree

Degree Discipline

Applied Mathematics and Computational Science

DOI

10.58837/CHULA.THE.2019.16

Abstract

In recent years, text localization has received growing attention from researchers because many problems remain unsolved. We propose a novel method for finding the position of text in an image. In the first step, the color image is converted to a grayscale image, and an average filter is applied to smooth the background and reduce noise. Adaptive thresholding then converts the grayscale image to a binary image. In the second step, 8-neighbor connected component analysis generates bounding boxes that locate text candidates, and the size of each bounding box is used to classify it as a text candidate or a non-text candidate. In the third step, the width-to-height ratio and Otsu's thresholding are used to divide each text candidate into character candidates. A convolutional neural network then classifies each character candidate as character or non-character. The proposed method was evaluated on the ICDAR 2013 data set, which contains 229 training images and 233 test images. The method achieves a recall of 71.87% and a precision of 36.59%. These results indicate that the method finds character bounding boxes effectively but does not yet remove non-character bounding boxes well.
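
The first two steps described in the abstract can be illustrated with a minimal pure-Python sketch. It is not the thesis implementation: the function names, the luminance weights, the window half-width `k`, the threshold `offset`, the size limits, and the assumption of dark text on a lighter background are all hypothetical choices made for illustration, since the abstract does not state them.

```python
from collections import deque

def to_gray(rgb):
    """Convert rows of (r, g, b) tuples to rows of luminance values."""
    return [[0.299 * r + 0.587 * g + 0.114 * b for (r, g, b) in row] for row in rgb]

def box_mean(img, k):
    """Average filter: mean over a (2k+1)x(2k+1) window clipped at the border."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            ys = range(max(0, y - k), min(h, y + k + 1))
            xs = range(max(0, x - k), min(w, x + k + 1))
            out[y][x] = sum(img[j][i] for j in ys for i in xs) / (len(ys) * len(xs))
    return out

def adaptive_threshold(img, k, offset):
    """Mark a pixel as foreground (1) if it is darker than its local mean minus offset.

    Assumption: text is darker than its surroundings.
    """
    local = box_mean(img, k)
    h, w = len(img), len(img[0])
    return [[1 if img[y][x] < local[y][x] - offset else 0 for x in range(w)]
            for y in range(h)]

def bounding_boxes(binary):
    """Return (x_min, y_min, x_max, y_max) for each 8-connected foreground component."""
    h, w = len(binary), len(binary[0])
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for y0 in range(h):
        for x0 in range(w):
            if not binary[y0][x0] or seen[y0][x0]:
                continue
            seen[y0][x0] = True
            queue = deque([(y0, x0)])
            x_min = x_max = x0
            y_min = y_max = y0
            while queue:
                y, x = queue.popleft()
                x_min, x_max = min(x_min, x), max(x_max, x)
                y_min, y_max = min(y_min, y), max(y_max, y)
                # Visit all 8 neighbours, including the diagonals.
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < h and 0 <= nx < w
                                and binary[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            boxes.append((x_min, y_min, x_max, y_max))
    return boxes

def filter_by_size(boxes, min_side, max_side):
    """Keep boxes whose width and height both lie in [min_side, max_side]."""
    kept = []
    for (x0, y0, x1, y1) in boxes:
        width, height = x1 - x0 + 1, y1 - y0 + 1
        if min_side <= width <= max_side and min_side <= height <= max_side:
            kept.append((x0, y0, x1, y1))
    return kept
```

Chaining `to_gray`, `adaptive_threshold`, `bounding_boxes`, and `filter_by_size` yields a list of text-candidate boxes. Because the labeling uses 8-neighbor connectivity, two pixels that touch only diagonally end up in the same component, which 4-neighbor connectivity would split.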

Other Abstract (Other language abstract of ETD)

ในช่วงไม่กี่ปีที่ผ่านมางานวิจัยเกี่ยวกับการหาตำแหน่งของข้อความได้รับความสนใจเพิ่มมากขึ้นเนื่องจากยังมีปัญหาเกี่ยวกับการหาตำแหน่งข้อความอีกหลายปัญหาที่ยังไม่สามารถแก้ไขได้ ในงานวิจัยนี้นำเสนอวิธีใหม่ในการหาตำแหน่งของข้อความในภาพ ขั้นตอนแรกของวิธีที่นำเสนอคือการปรับปรุงรูปภาพ โดยการแปลงภาพสีให้เป็นภาพระดับสีเทา จากนั้นใช้ตัวกรองค่าเฉลี่ยเพื่อปรับปรุงภาพ โดยตัวกรองค่าเฉลี่ยจะทำให้พื้นหลังกลมกลืมและทำให้สัญญาณรบกวนลดลง หลังจากนั้นจะใช้วิธีขีดแบ่งปรับตัวเพื่อแปลงภาพระดับสีเทาให้เป็นภาพขาวดำ ขั้นตอนที่สองจะเป็นการใช้ส่วนประกอบที่เชื่อมติดกันแบบแปดเพื่อนบ้านเพื่อสร้างกล่องสี่เหลี่ยม เพื่อหาตำแหน่งของข้อความและใช้ขนาดของกล่องข้อความเพื่อจำแนกส่วนประกอบในกล่องสี่เหลี่ยมว่าส่วนประกอบที่น่าจะเป็นข้อความหรือส่วนประกอบที่ไม่ใช่ข้อความ ขั้นตอนที่สามจะใช้อัตราส่วนความกว้างต่อความสูงและวิธีขีดแบ่งแบบออสซุ เพื่อแยกส่วนประกอบที่น่าจะเป็นข้อความให้เป็นส่วนประกอบที่น่าจะเป็นตัวอักษร จากนั้นใช้โครงข่ายประสาทแบบสังวัตนาการเพื่อทำการจำแนกส่วนประกอบว่าใช่ตัวอักษรหรือไม่ใช่ตัวอักษร ข้อมูลที่ใช้ในการประเมินวิธีที่นำเสนอเป็นข้อมูลที่ได้มาจาก ICDAR 2013 ซึ่งประกอบไปด้วยภาพที่ใช้สำหรับสอนโครงข่าย 229 ภาพและภาพที่ใช้เป็นตัวทดสอบ 233 ภาพ ผลจากการจำแนกแสดงให้เห็นว่าวิธีที่นำเสนอให้ค่าการเรียกคืนเท่ากับ 71.87% และค่าเที่ยงเท่ากับ 36.59% จากผลลัพธ์ค่าการเรียกคืนและค่าเที่ยงเท่า งานวิจัยนี้สามารถหากล่องข้อความของตัวอักษรได้อย่างมีประสิทธิภาพแต่ยังไม่สามารถลบกล่องข้อความที่ไม่ใช่ตัวอักษรได้ดีพอ

Creative Commons License

This work is licensed under a Creative Commons Attribution-NonCommercial-No Derivative Works 4.0 International License.

Recommended Citation

Pattaranuprawat, Pukjira, "Text localization and extraction from background with texture and noise in digital images using adaptive thresholding and convolutional neural network" (2019). Chulalongkorn University Theses and Dissertations (Chula ETD). 8392.
https://digital.car.chula.ac.th/chulaetd/8392

Included in

Mathematics Commons
