Investigating Temporal Convolutional Networks for Automated Stroke Transcription in the Mridangam
Document Type
Conference Proceeding
Source of Publication
2025 IEEE International Conference on Acoustics, Speech, and Signal Processing Workshops (ICASSPW 2025), Workshop Proceedings
Publication Date
5-27-2025
Abstract
This study explores the use of Temporal Convolutional Networks (TCNs) for automatic stroke transcription of the mridangam using two approaches: 1) predicting individual strokes from audio segments by leveraging contextual information from previous strokes; 2) predicting stroke sequences, capturing temporal dependencies and patterns within stroke segments. The TCNs demonstrated strong performance, indicating that they are well suited to the task. In the first approach, we surpass one established model from previous research by 7% in classification accuracy and another by 4%. The second approach showed promising results, demonstrating that the model learns sequential information effectively. We also demonstrate the model's sensitivity to contextual information by experimenting with sequences containing shuffled strokes, which results in a dramatic decline in performance.
DOI Link
ISBN
9798331519315
Publisher
IEEE
Disciplines
Computer Sciences
Keywords
Automatic Drum Transcription (ADT), Carnatic Music, Machine Learning, Mridangam Transcription, Temporal Convolutional Networks
Scopus ID
Recommended Citation
Krishnan, Gopika; Anantapadmanabhan, Akshay; Ganguli, Kaustuv Kanti; and Guedes, Carlos, "Investigating Temporal Convolutional Networks for Automated Stroke Transcription in the Mridangam" (2025). All Works. 7433.
https://zuscholars.zu.ac.ae/works/7433
Indexed in Scopus
yes
Open Access
no