Multimodal variational autoencoders for sensor fusion and cross generation