A comprehensive benchmark for evaluating LLM-generated ontologies