Investigating Temporal Convolutional Networks for Automated Stroke Transcription in the Mridangam

Document Type

Conference Proceeding

Source of Publication

2025 IEEE International Conference on Acoustics, Speech, and Signal Processing Workshops (ICASSPW 2025), Workshop Proceedings

Publication Date

5-27-2025

Abstract

This study explores the use of Temporal Convolutional Networks (TCNs) for automatic stroke transcription of the mridangam through two approaches: 1) predicting individual strokes from audio segments by leveraging contextual information from previous strokes; 2) predicting stroke sequences by capturing temporal dependencies and patterns within stroke segments. The TCNs demonstrated strong performance, indicating that they are well suited to the task. In the first approach, we surpass one established model from previous research by 7% in classification accuracy and another by 4%. The second approach showed promising results, demonstrating proficient learning of sequential information. We also demonstrate the model's sensitivity to contextual information by experimenting with sequences containing shuffled strokes, which results in a dramatic decline in performance.

ISBN

9798331519315

Publisher

IEEE

Disciplines

Computer Sciences

Keywords

Automatic Drum Transcription (ADT), Carnatic Music, Machine Learning, Mridangam Transcription, Temporal Convolutional Networks

Scopus ID

105007714216

Indexed in Scopus

yes

Open Access

no
