We want robots to move safely in everyday environments such as homes and shops, without harming humans, other robots, their surroundings, or themselves. At the same time, we explore how effectively a single policy learned by reinforcement learning can modulate a robot's behaviour, from risk-averse (cautious) to risk-neutral (maximizing average reward), using a novel algorithm that we call risk-conditioned distributional soft actor-critic (RC-DSAC).
[J. Choi, C. Dance, et al. Risk-Conditioned Distributional Soft Actor-Critic for Risk-Sensitive Navigation. ICRA 2021.]
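To make the risk-conditioning idea concrete, here is a minimal sketch, not the authors' implementation: all module names, network sizes, and the choice of CVaR as the distortion risk measure are illustrative assumptions. The point it shows is that both the actor and a quantile critic take a risk parameter beta as an extra input, so a single learned policy can be steered from cautious (small beta) to risk-neutral (beta = 1, i.e. the mean return) at run time.

```python
# Sketch of risk-conditioning (assumed names and sizes, not the RC-DSAC code).
import torch
import torch.nn as nn

N_QUANTILES = 32          # number of quantile atoms in the critic (assumption)
OBS_DIM, ACT_DIM = 10, 2  # placeholder observation/action dimensions


class RiskConditionedActor(nn.Module):
    """Policy network that receives the risk level beta as an extra feature."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(OBS_DIM + 1, 64), nn.ReLU(),
            nn.Linear(64, ACT_DIM), nn.Tanh(),
        )

    def forward(self, obs, beta):
        # One policy covers all risk levels because beta is part of its input.
        return self.net(torch.cat([obs, beta], dim=-1))


class RiskConditionedQuantileCritic(nn.Module):
    """Distributional critic: outputs quantile samples of the return Z(s, a; beta)."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(OBS_DIM + ACT_DIM + 1, 64), nn.ReLU(),
            nn.Linear(64, N_QUANTILES),
        )

    def forward(self, obs, act, beta):
        return self.net(torch.cat([obs, act, beta], dim=-1))


def cvar_objective(quantiles, beta):
    """CVaR_beta of the return distribution: mean of the lowest beta-fraction
    of quantiles. beta -> 1 recovers the ordinary mean (risk-neutral);
    small beta focuses on worst cases (risk-averse)."""
    sorted_q, _ = torch.sort(quantiles, dim=-1)
    tau = (torch.arange(N_QUANTILES) + 0.5) / N_QUANTILES   # quantile midpoints
    weights = (tau.unsqueeze(0) <= beta).float()             # keep lowest quantiles
    # clamp avoids division by zero for extremely small beta
    return (sorted_q * weights).sum(-1) / weights.sum(-1).clamp(min=1.0)


if __name__ == "__main__":
    actor, critic = RiskConditionedActor(), RiskConditionedQuantileCritic()
    obs = torch.randn(4, OBS_DIM)
    beta = torch.rand(4, 1)            # sample a risk level, e.g. once per episode
    act = actor(obs, beta)
    q = critic(obs, act, beta)
    print(cvar_objective(q, beta))     # distorted value the actor would maximize
```

In this sketch, training the actor to maximize the distorted value while conditioning both networks on randomly sampled beta is what lets a single policy interpolate between cautious and risk-neutral behaviour without retraining.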