Closing the Loop on Speech to Music Translation: Automatically Generating Synthetic Percussive Sequences on the Mridangam from Konnakol
Document Type
Conference Proceeding
Source of Publication
2025 IEEE International Conference on Acoustics, Speech, and Signal Processing Workshops (ICASSPW 2025), Workshop Proceedings
Publication Date
5-27-2025
Abstract
This paper presents a pipeline to convert spoken sequences of Konnakol, a South Indian vocal percussion language, into synthetic rhythmic sequences performed on the mridangam. We fine-tune the Whisper speech-to-text model on Konnakol data, enabling accurate transcription of spoken sequences despite the small size of our dataset (approximately 15 minutes). The transcriptions are rhythmically encoded in a format compatible with the Konnakol Typewriter, a web application that converts these sequences into mridangam audio. The transcriptions also serve as input to a Markov model, which generates new rhythmic sequences that can likewise be processed through the Konnakol Typewriter to produce mridangam audio. The fine-tuned Whisper model achieves very low error rates, making it well suited to this task. The pipeline not only facilitates the transcription of Konnakol but also opens possibilities for creating educational tools, preserving cultural heritage, and generating data for rhythm-based applications. Future work will focus on refining the process to improve accuracy and versatility.
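The Markov generation step mentioned in the abstract can be illustrated with a minimal sketch. It assumes a first-order chain over whitespace-separated syllable tokens. The abstract does not specify the model order, the rhythmic encoding, or the training corpus, so the syllable lists and the function names `build_transitions` and `generate` below are hypothetical and are not the authors' implementation.

```python
import random
from collections import defaultdict


def build_transitions(sequences):
    """Map each syllable to the list of syllables observed right after it."""
    transitions = defaultdict(list)
    for seq in sequences:
        for current, following in zip(seq, seq[1:]):
            transitions[current].append(following)
    return dict(transitions)


def generate(transitions, start, length, rng=None):
    """Walk the chain from `start`, stopping early at a syllable with no successor."""
    rng = rng or random.Random()
    out = [start]
    while len(out) < length:
        choices = transitions.get(out[-1])
        if not choices:
            break  # dead end: this syllable only ever ended a phrase
        # Repeated entries in `choices` make frequent successors more likely.
        out.append(rng.choice(choices))
    return out


# Hypothetical transcriptions, standing in for the fine-tuned Whisper output.
corpus = [
    ["ta", "ka", "di", "mi"],
    ["ta", "ka", "ta", "ki", "ta"],
    ["ta", "ki", "ta", "ta", "ka", "di", "mi"],
]

transitions = build_transitions(corpus)
phrase = generate(transitions, "ta", 8, random.Random(0))
print(" ".join(phrase))
```

Every adjacent pair in the generated phrase was observed in the input, so the output stays within the vocabulary of the transcriptions. In the pipeline described above, such a phrase would still need to be written in the Konnakol Typewriter's rhythmic encoding before it could be rendered as mridangam audio.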
DOI Link
ISBN
[9798331519315]
Publisher
IEEE
Disciplines
Computer Sciences
Keywords
Automatic Speech Recognition (ASR), Carnatic Music, Konnakol Transcription, Machine Learning, Markov Chain Generation
Scopus ID
Recommended Citation
Krishnan, Gopika; Drabek, Julia; Anantapadmanabhan, Akshay; Ganguli, Kaustuv Kanti; and Guedes, Carlos, "Closing the Loop on Speech to Music Translation: Automatically Generating Synthetic Percussive Sequences on the Mridangam from Konnakol" (2025). All Works. 7434.
https://zuscholars.zu.ac.ae/works/7434
Indexed in Scopus
yes
Open Access
no