Chulalongkorn University Theses and Dissertations (Chula ETD)

Path exploration with random network distillation on multi-agent reinforcement learning

Other Title (Parallel Title in Other Language of ETD)

การสำรวจเส้นทางด้วยการกลั่นตัวโครงข่ายแบบสุ่มบนการเรียนรู้เสริมกำลังหลายตัวแทน

Korawat Charoenpitaks, Faculty of Engineering

Year (A.D.)

2019

Document Type

Thesis

First Advisor

Yachai Limpiyakorn

Faculty/College

Faculty of Engineering (คณะวิศวกรรมศาสตร์)

Department (if any)

Department of Computer Engineering (ภาควิชาวิศวกรรมคอมพิวเตอร์)

Degree Name

Master of Science

Degree Level

Master's Degree

Degree Discipline

Computer Science

DOI

10.58837/CHULA.THE.2019.162

Abstract

Intrinsic motivation is a promising way to improve the performance of reinforcement learning algorithms in complex environments. It enhances exploration without explicit guidance from the designer and can be applied to any environment. This makes it well suited to multi-agent reinforcement learning, where environment complexity is higher than usual. This research presents an exploration model that uses intrinsic motivation built from the random network distillation algorithm to improve the performance of multi-agent reinforcement learning, and compares it with the benchmark in different scenarios. The concept of a clipping ratio is introduced to enforce a limit on the optimization magnitude. Defined relative to the extrinsic reward, the clipping ratio truncates excessive magnitudes that may destabilize the optimization. The experiments were carried out on two multi-agent architectures: 1) Individual Intrinsic Motivation Architecture, and 2) Centralized Intrinsic Motivation Architecture. The experimental results showed that, in very complex environments, the Centralized Intrinsic Motivation Architecture combined with a small clipping ratio could increase performance. Both architectures achieved a win rate of up to 70%, higher than the benchmark's best of 43% in the 2s3z environment.
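The two mechanisms named in the abstract can be illustrated with a minimal sketch. It is not the thesis implementation: the linear "networks", the function names, and in particular the rule in `clip_intrinsic` (capping the bonus at `clip_ratio` times the extrinsic reward scale) are assumptions made only for illustration, since the abstract does not state the exact formula. The sketch shows the general random network distillation idea: a fixed random target network, a predictor trained to imitate it, and the prediction error used as a novelty bonus that shrinks as an observation becomes familiar.

```python
import random


def make_random_linear(in_dim, out_dim, rng):
    """Return a random weight matrix (list of rows) standing in for a network."""
    return [[rng.gauss(0.0, 1.0) for _ in range(in_dim)] for _ in range(out_dim)]


def apply_linear(weights, obs):
    """Apply the linear map to one observation vector."""
    return [sum(w * x for w, x in zip(row, obs)) for row in weights]


def intrinsic_reward(target, predictor, obs):
    """Novelty bonus: mean squared error between predictor and fixed target."""
    t = apply_linear(target, obs)
    p = apply_linear(predictor, obs)
    return sum((a - b) ** 2 for a, b in zip(t, p)) / len(t)


def train_predictor(target, predictor, obs, lr=0.05):
    """One gradient step on the squared error; the target is never updated."""
    t = apply_linear(target, obs)
    p = apply_linear(predictor, obs)
    n = len(t)
    for i, row in enumerate(predictor):
        grad_scale = 2.0 * (p[i] - t[i]) / n
        for j in range(len(row)):
            row[j] -= lr * grad_scale * obs[j]


def clip_intrinsic(r_int, r_ext_scale, clip_ratio):
    """Hypothetical clipping rule (assumption, not taken from the thesis):
    cap the non-negative bonus at clip_ratio times the extrinsic reward scale."""
    limit = clip_ratio * abs(r_ext_scale)
    return min(r_int, limit)


rng = random.Random(0)
target = make_random_linear(4, 3, rng)     # fixed random network
predictor = make_random_linear(4, 3, rng)  # trained to match the target
obs = [1.0, 0.5, -0.5, 0.25]               # made-up observation

before = intrinsic_reward(target, predictor, obs)
for _ in range(50):
    train_predictor(target, predictor, obs)
after = intrinsic_reward(target, predictor, obs)

# The bonus for a repeatedly visited observation decays, and clipping
# bounds how much of it is added to the learning signal.
clipped = clip_intrinsic(before, r_ext_scale=1.0, clip_ratio=0.5)
```

Under these assumptions, a small `clip_ratio` keeps the exploration bonus from dominating the extrinsic reward, which matches the abstract's stated purpose of truncating excessive magnitudes.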

Other Abstract (Other language abstract of ETD)

Intrinsic motivation is a promising option for increasing the capability of reinforcement learning algorithms in complex environments. The method extends exploration ability without relying on explicit values from the creator and can be applied generally to any environment, which makes it suitable for multi-agent reinforcement learning, where the environment is more complex than usual. This research proposes an exploration model that uses intrinsic motivation from the random network distillation algorithm to improve the performance of multi-agent reinforcement learning, and compares the results with benchmark results in several environments. The researcher also introduces the concept of a clipping ratio to enforce a limit on the optimization magnitude, based on a ratio derived from the extrinsic reward. Using the clipping ratio truncates excess magnitude that may make the optimization unstable. The experiments were conducted on two different multi-agent architectures: the Individual Intrinsic Motivation Architecture and the Centralized Intrinsic Motivation Architecture. The results show that, when the environment is very complex, the Centralized Intrinsic Motivation Architecture combined with a small clipping ratio improves performance more than usual, achieving a win rate of up to 70% in both architectures, which is higher than the best rate of 43% reported for the standard benchmark in other research evaluated on the 2s3z environment.

Creative Commons License

This work is licensed under a Creative Commons Attribution-NonCommercial-No Derivative Works 4.0 International License.

Recommended Citation

Charoenpitaks, Korawat, "Path exploration with random network distillation on multi-agent reinforcement learning" (2019). Chulalongkorn University Theses and Dissertations (Chula ETD). 8538.
https://digital.car.chula.ac.th/chulaetd/8538

Included in

Computer Sciences Commons
