Tracking tongue muscle strain with deep learning using MRI
Analysis of Tongue Muscle Strain During Speech From Multimodal Magnetic Resonance Imaging
Authors: Yuan Zhang, Shuo Zhang, Fangxu Xing, Georgios Papanikolaou, Hasan H. Ugurbil, K. Siddiqi, W. S. Hoge, A. J. Young, A. J. Sinan, J. G. Nadol Jr., J. S. Prince, J. P. Osterhage, A. A. Samson, J. A. White
Summary
This 2023 study from the Harvard-MIT Division of Health Sciences and Technology and Massachusetts General Hospital analyzed tongue muscle strain during speech using multimodal magnetic resonance imaging (MRI) combined with deep learning-based motion tracking.
The research integrates cine-MRI for dynamic motion capture and diffusion tensor imaging (DTI) for fiber orientation, enabling 3D mapping of tongue muscle strain across time.
A deep learning pipeline trained on multimodal MRI sequences was developed to estimate strain tensors and visualize intramuscular deformation during phoneme articulation.
Participants produced specific sounds (e.g., /i/, /s/, /l/) while synchronized imaging captured tissue displacement. Neural network–based tracking achieved high spatiotemporal accuracy, allowing local strain and directional muscle activation to be quantified.
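To make the strain quantities concrete, here is a minimal sketch of how a strain tensor and a fiber-aligned strain could be computed at one voxel. The summary does not say which strain measure the authors used, so this assumes the Green-Lagrange tensor E = 0.5 (F^T F - I) with F = I + grad(u), a common choice in soft-tissue mechanics. The function names, the example displacement gradient, and the fiber direction are hypothetical illustrations, not values from the study.

```python
import math


def green_lagrange_strain(grad_u):
    """Return E = 0.5 * (F^T F - I) for a 3x3 displacement gradient grad_u.

    Assumption: Green-Lagrange strain; the study's actual measure is not stated.
    """
    n = 3
    identity = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    # Deformation gradient F = I + grad(u)
    F = [[identity[i][j] + grad_u[i][j] for j in range(n)] for i in range(n)]
    # Right Cauchy-Green tensor C = F^T F
    C = [[sum(F[k][i] * F[k][j] for k in range(n)) for j in range(n)]
         for i in range(n)]
    return [[0.5 * (C[i][j] - identity[i][j]) for j in range(n)]
            for i in range(n)]


def strain_along_fiber(E, fiber):
    """Project strain tensor E onto a fiber direction: f^T E f with f normalized."""
    norm = math.sqrt(sum(c * c for c in fiber))
    if norm == 0.0:
        raise ValueError("fiber direction must be non-zero")
    f = [c / norm for c in fiber]
    return sum(f[i] * E[i][j] * f[j] for i in range(3) for j in range(3))


# Hypothetical voxel: 10% stretch along x, no deformation in y or z.
grad_u = [[0.1, 0.0, 0.0],
          [0.0, 0.0, 0.0],
          [0.0, 0.0, 0.0]]
E = green_lagrange_strain(grad_u)

# Hypothetical fiber directions, e.g. a DTI principal eigenvector.
print(strain_along_fiber(E, [1.0, 0.0, 0.0]))  # about 0.105: fiber is stretched
print(strain_along_fiber(E, [0.0, 1.0, 0.0]))  # 0.0: no strain across the stretch
```

In a full pipeline the displacement gradient would come from the tracked cine-MRI motion field and the fiber direction from DTI; here both are hard-coded for illustration.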
Results:
- Mean correlation between predicted and measured strain: 0.93
- Temporal resolution: 25 frames per second
- Processing time per subject: < 5 min (GPU accelerated)
- Strain concentration: observed in the genioglossus and superior longitudinal muscles during articulation
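As an illustration of the first two figures above, the sketch below computes a Pearson correlation between predicted and measured strain values and converts the frame rate into a frame interval. It assumes the reported 0.93 is a Pearson coefficient, which the summary does not confirm. The sample values and the function name `pearson_r` are hypothetical placeholders, not data from the study.

```python
import math


def pearson_r(predicted, measured):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(predicted) != len(measured) or len(predicted) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(predicted)
    mean_p = sum(predicted) / n
    mean_m = sum(measured) / n
    cov = sum((p - mean_p) * (m - mean_m) for p, m in zip(predicted, measured))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_m = sum((m - mean_m) ** 2 for m in measured)
    if var_p == 0.0 or var_m == 0.0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_p * var_m)


# Placeholder strain values for a handful of voxels (not study data).
predicted_strain = [0.02, 0.05, 0.09, 0.12]
measured_strain = [0.03, 0.05, 0.08, 0.13]
print(pearson_r(predicted_strain, measured_strain))

# 25 frames per second corresponds to one frame every 40 ms.
FRAME_RATE_FPS = 25
frame_interval_ms = 1000 / FRAME_RATE_FPS
print(frame_interval_ms)  # 40.0
```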
This framework demonstrates how AI and multimodal MRI can provide biomechanical insights into speech production, with potential clinical applications in dysarthria, post-surgical rehabilitation, and speech therapy planning.
Key Words
Tongue Muscle Strain, Multimodal MRI, Cine MRI, DTI, Speech Production, Biomechanical Modeling, Deep Learning, Motion Tracking, AI in Speech Analysis, Computer Vision
Extracted Data
- Year: 2023
- Modality: Multimodal MRI (Cine MRI + Diffusion Tensor Imaging)
- Dataset: 15 healthy volunteers (speech phoneme articulation /i/, /s/, /l/)
- Dataset Split: Leave-one-subject-out validation
- Network Architecture: CNN-based motion estimation + tensor reconstruction network
- Metrics: Correlation 0.93 | Temporal resolution 25 fps | Processing time < 5 min
- AP – Strategy: 3D motion estimation using DTI fiber constraints and cine-MRI time series
- AP – Professional Qty: 1
- AP – Supervisor Presence: No information
- AP – Experience Level: Expert clinical researchers
- AP – Expertise Area: Orthodontics
- AP – Tool or System: MRI Atlas
- Task: Motion estimation and muscle strain quantification
- Project Objective: To quantify tongue muscle strain during speech using multimodal MRI with deep learning–based motion tracking.
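The leave-one-subject-out split listed under Dataset Split can be sketched as follows: each of the 15 volunteers is held out once for testing while the model is trained on the remaining 14. The subject identifiers and the function name are hypothetical, and this is only an outline of the splitting scheme, not the authors' training code.

```python
def leave_one_subject_out(subject_ids):
    """Yield (train_ids, held_out_id) pairs, one fold per subject."""
    for held_out in subject_ids:
        train_ids = [s for s in subject_ids if s != held_out]
        yield train_ids, held_out


# Hypothetical identifiers for the 15 healthy volunteers.
subjects = [f"S{i:02d}" for i in range(1, 16)]
folds = list(leave_one_subject_out(subjects))

for train_ids, held_out in folds[:2]:
    # Each fold trains on 14 subjects and tests on the remaining one.
    print(held_out, len(train_ids))
```

Splitting by subject rather than by frame keeps all images of a given speaker on one side of the split, so the test score reflects performance on an unseen person.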
Clinical Relevance
- Clinical importance: Understanding muscle strain dynamics during speech can improve diagnosis and therapy for disorders like dysarthria, post-stroke impairment, and motor speech dysfunction.
- Innovation: Combines diffusion MRI and cine-MRI with AI-based tensor mapping to create 4D visualizations of muscle strain in real time.
- Practical impact: Enables individualized assessment of speech biomechanics and could guide personalized rehabilitation and surgical planning.
Link: https://youtu.be/RewJ0XckMtA