Chulalongkorn University Theses and Dissertations (Chula ETD)

การประยุกต์ใช้การเรียนรู้แบบเสริมกำลังกับการวางแผนทางการเงิน

Other Title (Parallel Title in Other Language of ETD)

An application of reinforcement learning to financial planning

Author

ภัควัลย์ จันทรศิริภาส, Faculty of Commerce and Accountancy

Year (A.D.)

2019

Document Type

Thesis

First Advisor

เสกสรร เกียรติสุไพบูลย์

Faculty/College

Faculty of Commerce and Accountancy (คณะพาณิชยศาสตร์และการบัญชี)

Department (if any)

Department of Statistics (ภาควิชาสถิติ)

Degree Name

Master of Science

Degree Level

Master's Degree

Degree Discipline

Statistics

DOI

10.58837/CHULA.THE.2019.1395

Abstract

This research aims to apply reinforcement learning to financial planning, in order to choose the best proportion of assets to consume and the best proportion to invest in risky assets in each period over a household's lifetime. The results from reinforcement learning, which are approximations, are compared with the exact solutions from the MDP approach. The reinforcement learning algorithm used in this research is SARSA with ε-greedy action selection, and values are approximated by a regression model whose predictors are features from a Radial Basis Function (RBF) kernel. The study finds that the error between the estimated optimal value and the MDP solution tends to converge to zero, indicating that reinforcement learning can be applied to financial planning. However, the original SARSA takes a long time to learn. When action selection is adjusted to emphasize exploration in the early stage, the error decreases, showing that SARSA with early-stage exploration performs better than the original. In addition, when the effects of varying several factors are examined for SARSA with early-stage exploration, the error between the estimated optimal value and the MDP is smallest when the initial weights come from a linear regression model, 200 features are used, and the learning rate and the probability of choosing an exploratory action decay over time from initial values of 0.1 and 0.9, respectively. When the optimal actions are applied in simulation, however, the resulting financial plans differ greatly from the MDP solution. Using initial weights from a linear regression model, 300 features, and a learning rate and exploration probability that decay over time from initial values of 0.1 and 0.9, respectively, gives the results closest to the MDP. This shows that even when the error of the optimal value is at its lowest, the solution from reinforcement learning still has a high error compared with the MDP solution.

Other Abstract (Other language abstract of ETD)

In this study, reinforcement learning is applied to a financial planning problem to find the optimal proportion of assets to consume and the optimal proportion to invest in risky assets. The solutions from the reinforcement learning approach are compared with the exact solutions from an MDP approach. The algorithm used is SARSA with ε-greedy action selection, where the value approximation employs a regression method with Radial Basis Function (RBF) features. In the experiments, the error between the optimal value estimated by reinforcement learning and the exact MDP solution tends to converge to zero, indicating that reinforcement learning can be applied to a financial planning problem. The algorithm is then adjusted to place more emphasis on exploration in the early stage. The errors from the adjusted algorithm are lower than those from the original algorithm, showing that the adjusted algorithm is more efficient. In addition, when the factors of the early-exploration SARSA algorithm are varied, the error between the optimal value from reinforcement learning and from the MDP is lowest when the initial weights are taken from a linear regression model, 200 features are used, and the learning rate and ε decay over time from initial values of 0.1 and 0.9, respectively. When the optimal actions are used in simulation, however, the resulting financial plans differ substantially from those of the MDP. The simulation that uses 300 features instead gives the result most similar to the MDP. This shows that even when the error of the optimal value is lowest, the result from reinforcement learning can still differ considerably from the result of the MDP.
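The abstract names the ingredients (SARSA, ε-greedy selection, a linear value model on RBF features, and a learning rate and ε that decay from 0.1 and 0.9) but not the problem specification. The following is a minimal sketch of how those pieces might fit together, not the thesis's implementation. The horizon, action grid, RBF centers and bandwidth, log utility, two-point risky return, discount factor, and decay schedule are all hypothetical choices made for illustration, and only 13 features are used rather than the 200 to 300 reported.

```python
import math
import random

# Hypothetical toy problem: every constant below is an illustrative assumption.
HORIZON = 5                                   # number of decision periods
ACTIONS = [(c, k) for c in (0.2, 0.5, 0.8)    # c: share of wealth consumed
                  for k in (0.0, 0.5, 1.0)]   # k: share of savings in the risky asset
CENTERS = [(w, t) for w in (0.1, 0.4, 0.7, 1.0) for t in (0.0, 0.5, 1.0)]
WIDTH = 0.5                                   # RBF bandwidth


def rbf_features(wealth, period):
    """Gaussian RBF features of the state (wealth, normalized period) plus a bias."""
    t = period / HORIZON
    feats = [math.exp(-((wealth - cw) ** 2 + (t - ct) ** 2) / (2 * WIDTH ** 2))
             for cw, ct in CENTERS]
    return feats + [1.0]


def q_value(weights, feats, a):
    """Linear action value: one weight vector per action."""
    return sum(w * f for w, f in zip(weights[a], feats))


def epsilon_greedy(weights, feats, epsilon, rng):
    """Explore with probability epsilon, otherwise pick the greedy action."""
    if rng.random() < epsilon:
        return rng.randrange(len(ACTIONS))
    return max(range(len(ACTIONS)), key=lambda a: q_value(weights, feats, a))


def step(wealth, a, rng):
    """Consume, invest the rest, and draw an assumed two-point risky return."""
    c, k = ACTIONS[a]
    consumption = c * wealth
    risky_return = 1.25 if rng.random() < 0.5 else 0.85
    growth = k * risky_return + (1 - k) * 1.02
    # Log utility of consumption is an assumption, not taken from the thesis.
    return math.log(consumption), (wealth - consumption) * growth


def train(episodes=2000, alpha0=0.1, eps0=0.9, gamma=0.95, seed=0):
    rng = random.Random(seed)
    weights = [[0.0] * (len(CENTERS) + 1) for _ in ACTIONS]
    for ep in range(episodes):
        decay = 1.0 / (1.0 + ep / 200.0)      # assumed decay schedule
        alpha, eps = alpha0 * decay, eps0 * decay
        wealth = 1.0
        feats = rbf_features(wealth, 0)
        a = epsilon_greedy(weights, feats, eps, rng)
        for period in range(HORIZON):
            reward, wealth = step(wealth, a, rng)
            target = reward
            last = period + 1 == HORIZON
            if not last:
                # On-policy target: bootstrap from the action actually chosen next.
                next_feats = rbf_features(wealth, period + 1)
                next_a = epsilon_greedy(weights, next_feats, eps, rng)
                target += gamma * q_value(weights, next_feats, next_a)
            td_error = target - q_value(weights, feats, a)
            for i, f in enumerate(feats):
                weights[a][i] += alpha * td_error * f
            if not last:
                feats, a = next_feats, next_a
    return weights
```

Calling `train()` returns one weight vector per action; a greedy plan for a given state can then be read off by taking the action with the largest `q_value`. In this toy setup any wealth left after the final period is simply discarded, which is a simplification rather than a modeling claim about the thesis.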

Creative Commons License

This work is licensed under a Creative Commons Attribution-NonCommercial-No Derivative Works 4.0 International License.

Recommended Citation

จันทรศิริภาส, ภัควัลย์, "การประยุกต์ใช้การเรียนรู้แบบเสริมกำลังกับการวางแผนทางการเงิน" (2019). Chulalongkorn University Theses and Dissertations (Chula ETD). 9771.
https://digital.car.chula.ac.th/chulaetd/9771

Included in

Statistics and Probability Commons
