New insights into hierarchical clustering and linguistic normalization for speaker diarization