Chulalongkorn University Theses and Dissertations (Chula ETD)

การศึกษาการรู้จำตัวอักษรพิมพ์ภาษาไทยโดยวิธีซินแทกติก

Other Title (Parallel Title in Other Language of ETD)

A study on recognition of printed Thai characters by the syntactic method

สนธยา เมรินทร์, Faculty of Engineering

Year (A.D.)

1994

Document Type

Thesis

First Advisor

สมชาย จิตะพันธ์กุล

Faculty/College

Faculty of Engineering (คณะวิศวกรรมศาสตร์)

Degree Name

Master of Engineering

Degree Level

Master's Degree

Degree Discipline

Electrical Engineering

DOI

10.58837/CHULA.THE.1994.3

Abstract

This research aims to study the recognition of printed Thai characters by the syntactic method. The method is divided into two stages: tree structure analysis and feature analysis. Once the image data has been vectorized, it is passed on for recognition. The first stage is a coarse classification, which consists of converting the vectors into a tree of primitives and measuring the distance between the tree of the character to be recognized and the tree of each template character. The second stage is a fine classification, in which the distinctive features of the character are analyzed to increase recognition accuracy. If the recognition result does not meet the acceptance criterion, the character's vectors are refined by cutting off excess parts or joining broken parts of the character, and the character is then recognized again by the same method, until the result is satisfactory or the vectors cannot be refined any further. The recognition rate obtained is about 97% of the 966 characters tested (101 characters are used as templates), with an average processing time of 1.09 seconds per character on an IBM-PC 486sx running at 33 MHz.

Other Abstract (Other language abstract of ETD)

This study proposes a method for recognizing printed Thai characters by the syntactic method. The method combines syntactic tree structure analysis with global feature analysis. After vectorization, tree structure analysis is used for rough classification: the vectors are transformed into a postfix tree of primitives, and the distance is measured between the tree of the input character and the tree of each template character whose head lies in the same zone as the head of the input character. Global feature analysis is then used for fine classification, in which the dominant characteristics of the character are analyzed to increase recognition accuracy. If a character is not recognized, its vectors are enhanced by joining two closely spaced broken lines or by cutting the shortest line of the input character. The recognition procedure is then repeated until the result is accepted; otherwise the character is rejected. The recognition rate is about 97% of 966 test characters, with 101 characters used as templates, and the average recognition time is 1.09 seconds per character on an IBM-PC 486sx machine with a 33 MHz clock speed.
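
The control flow described in the abstract (rough classification against same-zone templates, then repair and retry on failure) can be sketched roughly as follows. This is an illustration under stated assumptions, not the thesis's implementation: the `Template` fields, the primitive names (`line`, `loop`, `join`, `spur`), the `repair` callback, and the `max_distance` and `max_rounds` parameters are all hypothetical. A plain Levenshtein edit distance over the postfix primitive sequence stands in for the tree distance, which the abstract does not specify, and the fine classification stage is only marked by a comment.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence, Tuple


@dataclass(frozen=True)
class Template:
    label: str                 # character this template stands for
    head_zone: int             # hypothetical index of the zone holding the character's head
    postfix: Tuple[str, ...]   # primitives of the tree, listed in postfix order


def edit_distance(a: Sequence[str], b: Sequence[str]) -> int:
    """Levenshtein distance between two primitive sequences (stand-in for a tree distance)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        cur = [i]
        for j, y in enumerate(b, start=1):
            cur.append(min(prev[j] + 1,               # delete x
                           cur[j - 1] + 1,            # insert y
                           prev[j - 1] + (x != y)))   # substitute, or match at no cost
        prev = cur
    return prev[-1]


def coarse_classify(postfix, head_zone, templates, max_distance):
    """Return labels of same-zone templates within max_distance, nearest first."""
    scored = [(edit_distance(postfix, t.postfix), t.label)
              for t in templates if t.head_zone == head_zone]
    return [label for d, label in sorted(scored) if d <= max_distance]


def recognize(postfix, head_zone, templates,
              repair: Callable[[Tuple[str, ...]], Optional[Tuple[str, ...]]],
              max_distance: int = 0, max_rounds: int = 5) -> Optional[str]:
    """Classify; on failure repair the input and retry. None means the character is rejected."""
    for _ in range(max_rounds):
        candidates = coarse_classify(postfix, head_zone, templates, max_distance)
        if candidates:
            # A fine classification step would choose among the candidates here.
            return candidates[0]
        repaired = repair(postfix)
        if repaired is None or repaired == postfix:
            return None        # nothing left to repair, so reject
        postfix = repaired
    return None


def drop_spur(postfix):
    """Toy repair: remove one hypothetical 'spur' primitive, or report that none is left."""
    if "spur" not in postfix:
        return None
    items = list(postfix)
    items.remove("spur")
    return tuple(items)


TEMPLATES = [
    Template("A", 0, ("line", "loop", "join")),
    Template("B", 0, ("line", "line", "join")),
    Template("C", 1, ("line", "loop", "join")),
]

clean = recognize(("line", "loop", "join"), 0, TEMPLATES, drop_spur)
noisy = recognize(("line", "spur", "loop", "spur", "join"), 0, TEMPLATES, drop_spur)
unknown = recognize(("loop",), 0, TEMPLATES, drop_spur)
```

In this toy run the noisy input only matches after two repair rounds, and the input with no close template and nothing left to repair is rejected. Restricting the comparison to templates with the same head zone is what keeps the rough stage cheap, since each input is scored against only part of the template set.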

Creative Commons License

This work is licensed under a Creative Commons Attribution-NonCommercial-No Derivative Works 4.0 International License.

Recommended Citation

เมรินทร์, สนธยา, "การศึกษาการรู้จำตัวอักษรพิมพ์ภาษาไทยโดยวิธีซินแทกติก" (1994). Chulalongkorn University Theses and Dissertations (Chula ETD). 63312.
https://digital.car.chula.ac.th/chulaetd/63312
