Semantic representations of images and videos