Target Tracking Control for an Unmanned Surface Vessel: Optimal Control vs Reinforcement Learning

Frafjord, Aksel Johan

Master thesis

Published version

Open

Frafjord_acit2023.pdf (4.181Mb)

Permanent link

https://hdl.handle.net/11250/3100591

Publication date

2023

Collections

TKD - Master in Applied Computer and Information Technology (ACIT)

Abstract

This thesis studies the development and performance of Nonlinear Model Predictive Control (NMPC) and Reinforcement Learning (RL) for a target-tracking problem. The methodology involves developing an NMPC controller and RL agents and comparing their performance through simulated experiments. In the simulations, the controllers steer the Otter unmanned surface vessel (USV) to track a virtual target. The resulting NMPC controller held a stable tracking error of approximately 0.7 m at a refresh rate of 1.6 Hz. The best-performing RL agent, by contrast, showed two distinct phases of behaviour. In the first phase, the agent kept the error between 0 and approximately 2 m. However, once the observations went beyond the observation space experienced during training, the agent generated infeasible control signals, causing the Otter to circle the target while tracking it. The agent achieved a 1 kHz refresh rate. In conclusion, the NMPC may need to be faster for practical implementation, and the RL agents require further development to be reliable. For future work, it is therefore suggested to use the NMPC as an expert and apply imitation learning when training the agents, to combine the strengths of both methods.

Publisher

OsloMet - Oslo Metropolitan University