A multimodal approach to music transcription