Tail-latency-aware scheduler for inference workloads