Potential-Based Reward Shaping Preserves Pareto Optimal Policies

Mannion, Patrick; Devlin, Sam; Karl, Mannion; Duggan, Jim

Date

2017-05

Abstract

Reward shaping is a well-established family of techniques that have been successfully used to improve the performance and learning speed of Reinforcement Learning agents in single-objective problems. Here we extend the guarantees of Potential-Based Reward Shaping (PBRS) by providing a theoretical proof that PBRS does not alter the true Pareto front in multi-objective Reinforcement Learning (MORL) domains. We also contribute the first empirical studies of the effect of PBRS in MORL problems.

URI

https://research.thea.ie/handle/20.500.12065/2391

Collections

Other - School of Science, ATU Galway City [8]

The following license files are associated with this item:

Creative Commons

Except where otherwise noted, this item's license is described as Attribution-NonCommercial-NoDerivs 3.0 Ireland