Chulalongkorn University Theses and Dissertations (Chula ETD)

A channel and spatial attention feature extraction method for remote sensing image super-resolution using vision transformer

Other Title (Parallel Title in Other Language of ETD)

วิธีสกัดคุณลักษณะแบบช่องและเชิงพื้นที่ที่สนใจสำหรับการสร้างภาพความละเอียดสูงยิ่งยวดของภาพถ่ายทางไกลด้วยตัวแปลงการเห็น

Naveed Sultan, Faculty of Engineering

Year (A.D.)

2023

Document Type

Thesis

First Advisor

Supavadee Aramvith

Faculty/College

Faculty of Engineering (คณะวิศวกรรมศาสตร์)

Department (if any)

Department of Electrical Engineering (ภาควิชาวิศวกรรมไฟฟ้า)

Degree Name

Master of Engineering

Degree Level

Master's Degree

Degree Discipline

Electrical Engineering

DOI

10.58837/CHULA.THE.2023.862

Abstract

In recent years, convolutional neural networks have achieved remarkable advances in remote sensing image super-resolution, a task complicated by the complexity and variability of textures and structures. Current deep learning-based super-resolution models concentrate less on high-frequency features, which leads to suboptimal performance in capturing contours, texture, and spatial information. State-of-the-art deep learning methods now focus on feature extraction from remote sensing images using attention mechanisms. However, these methods are still incapable of effectively identifying and utilizing key content attention. Existing transformer models do not focus on capturing spatial information in their encoder and decoder parts. To address these problems, we propose a channel and spatial attention feature extraction method using a vision transformer that effectively extracts and hierarchically embeds the features. A spatial gated feed-forward network is introduced in the transformer encoder and decoder to capture non-linear spatial information while embedding and fusing the features. The proposed model was trained on the UCMerced dataset at three scale factors. The experimental results show that our model focuses on the relevant features and suppresses irrelevant ones, enhancing the quality of the reconstructed images. Our model achieved PSNR gains of 0.35 dB, 0.22 dB, and 0.19 dB at scale factors of 2, 3, and 4, respectively, compared to existing models.
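The abstract reports its results as PSNR differences in dB. As a minimal sketch of how such a difference is commonly computed, the snippet below uses the standard PSNR definition, 10·log10(MAX² / MSE). The function name `psnr`, the flat-list image representation, and the sample pixel values are hypothetical illustrations; the thesis's actual evaluation protocol (color channel used, border cropping, test split) is not stated on this page.

```python
import math


def psnr(reference, reconstructed, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized images.

    Both images are flat sequences of pixel intensities (an assumption
    made to keep the sketch dependency-free).
    """
    if len(reference) != len(reconstructed) or not reference:
        raise ValueError("images must be non-empty and the same size")
    # Mean squared error over all pixels.
    mse = sum((r - s) ** 2 for r, s in zip(reference, reconstructed)) / len(reference)
    if mse == 0:
        return math.inf  # identical images have unbounded PSNR
    return 10.0 * math.log10(max_value ** 2 / mse)


# Hypothetical 2x2 high-resolution patch and two reconstructions of it.
hr = [52, 55, 61, 59]
sr_baseline = [50, 56, 60, 61]  # MSE = 2.5
sr_improved = [51, 55, 61, 60]  # MSE = 0.5

# A "gain" is the PSNR difference between two models on the same reference.
gain_db = psnr(hr, sr_improved) - psnr(hr, sr_baseline)
```

Because PSNR is logarithmic in the mean squared error, the gain depends only on the ratio of the two errors: here it equals 10·log10(2.5 / 0.5), about 6.99 dB. Gains such as those quoted in the abstract would normally be averaged over a full test set rather than measured on a single patch.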

Other Abstract (Other language abstract of ETD)

ปัจจุบัน โครงข่ายประสาทเทียมแบบคอนโวลูชันมีความก้าวหน้าเป็นอย่างมากในการสร้างภาพความละเอียดสูงยิ่งยวดของภาพถ่ายทางไกล เนื่องจากความซับซ้อนและความเปลี่ยนแปลงของพื้นผิวและโครงสร้างในภาพถ่ายทางไกล โดยปัจจุบันโมเดลสร้างภาพความละเอียดสูงยิ่งยวดภายใต้การเรียนรู้เชิงลึก มุ่งเน้นไปที่ลักษณะความถี่สูงน้อยลง ซึ่งทำให้ประสิทธิภาพในการจับเส้นขอบ ข้อมูลพื้นผิวและข้อมูลเชิงพื้นที่ไม่ดี วิธีการที่ใช้เทคโนโลยีการเรียนรู้เชิงลึกล่าสุด เน้นไปที่การสกัดคุณลักษณะของภาพถ่ายทางไกลโดยใช้วิธีสกัดคุณลักษณะที่สนใจ อย่างไรก็ตาม วิธีการเหล่านี้ยังไม่สามารถระบุข้อมูลที่สำคัญ และสัญญาณที่สนใจในส่วนข้อมูลที่เป็นประโยขน์ โมเดลตัวแปลงอื่นๆ ไม่เน้นข้อมูลเชิงพื้นที่ในส่วนของการเข้าและถอดรหัส เพื่อแก้ปัญหานี้ เราเสนอโมดูลวิธีสกัดคุณลักษณะแบบช่องและเชิงพื้นที่ที่สนใจ โดยใช้ตัวแปลงการเห็น ซึ่งสกัดและฝังลักษณะได้อย่างมีประสิทธิภาพ โดยโครงข่ายแบบเกตเชิงพื้นที่ชนิดป้อนไปข้างหน้า ถูกนำมาใช้ในส่วนของตัวแปลงการเข้าและถอดรหัสเพื่อหาข้อมูลเชิงพื้นที่แบบไม่เป็นเชิงเส้น ขณะการรวมและฝังคุณลักษณะ โมเดลได้รับการฝึกฝนด้วยชุดข้อมูล ยูซีเอ็มเมอร์ซีต จำนวนสามอัตราส่วน ซึ่งผลการทดลองแสดงให้เห็นว่าโมเดลที่เราเสนอเน้นที่คุณลักษณะที่เกี่ยวข้อง สามารถปรับลดลักษณะที่ไม่เกี่ยวข้อง และเพิ่มคุณภาพของภาพ โดยโมเดลของเรามีค่าพีเอสเอ็นอาร์ดีกว่า 0.35 0.22 และ 0.19 เดซิเบลที่อัตราขยาย สอง สาม และ สี่เท่า ตามลำดับเมื่อเปรียบเทียบกับวิธีการอื่นๆ

Creative Commons License

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.

Recommended Citation

Sultan, Naveed, "A channel and spatial attention feature extraction method for remote sensing image super-resolution using vision transformer" (2023). Chulalongkorn University Theses and Dissertations (Chula ETD). 11989.
https://digital.car.chula.ac.th/chulaetd/11989

Included in

Electrical and Electronics Commons
