Chulalongkorn University Theses and Dissertations (Chula ETD)

Neural-network based K-value prediction in clustering problems without distance computation

Other Title (Parallel Title in Other Language of ETD)

Prediction of the K value based on neural networks in clustering problems without distance computation

Author

Rohhan Rabari, Faculty of Science

Year (A.D.)

2025

Document Type

Thesis

First Advisor

Chidchanok Lursinsap

Faculty/College

Faculty of Science (คณะวิทยาศาสตร์)

Department (if any)

Department of Mathematics and Computer Science (ภาควิชาคณิตศาสตร์และวิทยาการคอมพิวเตอร์)

Degree Name

Master of Science

Degree Level

Master's Degree

Degree Discipline

Computer Science and Information Technology

DOI

10.58837/CHULA.THE.2025.227

Abstract

Clustering remains a pivotal component of unsupervised learning, central to tasks such as data exploration and pattern discovery. However, most conventional clustering algorithms are parametric: they require one or more parameters to be specified in advance, most notably the number of clusters (k). The choice of these parameters can drastically alter the clustering outcome, leading to unstable or misleading results, particularly when the true structure of the data is unknown. This thesis introduces a novel framework that transforms raw input into a latent vector representation, enabling a neural network to predict the optimal number of clusters automatically, without any prior parameter specification. Trained on a wide range of synthetically generated datasets, spanning single-cluster to multi-cluster scenarios, the model learns to extract and generalize distributional patterns from the data. Evaluated with 10-fold cross-validation, the model achieves validation accuracies exceeding 89% for 2D data and 91% for 3D data, with precision and recall consistently above 86% and 89%, respectively. When benchmarked against traditional methods such as DBSCAN, the proposed model shows superior stability and prediction consistency, avoiding the over-segmentation and parameter sensitivity inherent in conventional approaches. This parameter-free approach eliminates the need for manual tuning and offers a more robust and scalable alternative for cluster number estimation across diverse datasets.
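The abstract does not specify the latent representation or the network architecture, so the following is only a minimal sketch of the data side of such a pipeline, under stated assumptions. It generates synthetic 2D point sets with a known number of clusters, maps each set to a fixed-length vector with a grid histogram (an assumed stand-in for the thesis's latent representation, chosen because it needs no pairwise distances), and builds 10-fold cross-validation splits. The names `make_blobs`, `grid_histogram`, and `k_fold_indices`, the bin count, and the cluster spread are all hypothetical; the classifier itself is left out.

```python
import random


def make_blobs(k, n_per_cluster=100, spread=0.03, rng=None):
    """Sample k Gaussian blobs in the unit square; returns a list of (x, y)."""
    rng = rng or random.Random()
    points = []
    for _ in range(k):
        # Hypothetical choice: keep centres away from the border.
        cx, cy = rng.uniform(0.15, 0.85), rng.uniform(0.15, 0.85)
        for _ in range(n_per_cluster):
            points.append((rng.gauss(cx, spread), rng.gauss(cy, spread)))
    return points


def grid_histogram(points, bins=8):
    """Map a point set to bins*bins cell-occupancy fractions.

    Only binning is used, so no pairwise distances are computed.
    This is an assumed featurization, not the thesis's own.
    """
    counts = [0] * (bins * bins)
    for x, y in points:
        # Clamp indices so points outside [0, 1) land in border cells.
        col = min(bins - 1, max(0, int(x * bins)))
        row = min(bins - 1, max(0, int(y * bins)))
        counts[row * bins + col] += 1
    total = len(points)
    return [c / total for c in counts]


def k_fold_indices(n, folds=10, rng=None):
    """Yield (train_idx, val_idx) pairs for k-fold cross-validation."""
    rng = rng or random.Random()
    idx = list(range(n))
    rng.shuffle(idx)
    for f in range(folds):
        val = idx[f::folds]  # every folds-th shuffled index
        val_set = set(val)
        train = [i for i in idx if i not in val_set]
        yield train, val


rng = random.Random(0)
dataset = []  # list of (feature_vector, true_k) pairs
for _ in range(50):
    k = rng.randint(1, 5)  # from single-cluster to multi-cluster cases
    dataset.append((grid_histogram(make_blobs(k, rng=rng)), k))

splits = list(k_fold_indices(len(dataset), folds=10, rng=rng))
```

Each `(feature_vector, true_k)` pair could then be fed to any classifier over the possible values of k, with one model trained per fold. The fixed-length vector is what lets a single network handle point sets of different sizes.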

Other Abstract (Other language abstract of ETD)

Clustering remains an important component of unsupervised learning, central to tasks such as data exploration and pattern discovery. However, most conventional clustering algorithms are parametric, requiring one or more parameters to be specified in advance, in particular the number of clusters (k). These predefined parameters can greatly change the clustering outcome, leading to unstable or misleading results, especially when the true structure of the data is unknown. This thesis introduces a new framework that transforms raw input into a latent vector representation, enabling a neural network to predict the appropriate number of clusters automatically without any prior parameter specification. The model is trained on a diverse range of synthetically generated datasets, from single-cluster to multi-cluster scenarios, and learns to extract and generalize distributional patterns from the data. This parameter-free approach eliminates the need for manual tuning and offers a robust and scalable alternative for estimating the number of clusters across diverse datasets.

Creative Commons License

This work is licensed under a Creative Commons Attribution-NonCommercial-No Derivative Works 4.0 International License.

Recommended Citation

Rabari, Rohhan, "Neural-network based K-value prediction in clustering problems without distance computation" (2025). Chulalongkorn University Theses and Dissertations (Chula ETD). 75239.
https://digital.car.chula.ac.th/chulaetd/75239

Included in

Computer Sciences Commons
