Automatically learning fallback strategies with model-free reinforcement learning in safety-critical driving scenarios