Visual assessment of articulatory motion of the vocal tract in speech has been greatly advanced by recent developments in high-speed real-time MRI. However, variation in speaking rate across subjects hinders quantitative analysis of data from larger study groups. We present a pipeline of methods that aligns audio and image data in the time domain and produces temporally matched image volumes across subjects. Comparison of the cross-correlation score before and after time alignment showed increased similarity between source and target image sequences, enabling the production of preprocessed multi-subject data for subsequent statistical atlas construction studies.
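As a rough illustration of the kind of similarity check described above, the sketch below computes a frame-wise normalized cross-correlation score between a source and target image sequence, as one might do before and after temporal alignment. This is a minimal sketch, not the authors' pipeline: it assumes both sequences have already been resampled to the same number of frames and the same spatial grid, and the example data and alignment step are purely hypothetical.

```python
# Minimal sketch: quantify similarity between two real-time-MRI image
# sequences with a per-frame normalized cross-correlation score.
# Assumes sequences of shape (T, H, W) that are already temporally matched.
import numpy as np

def normalized_cross_correlation(a: np.ndarray, b: np.ndarray) -> float:
    """Pearson-style normalized cross-correlation of two images."""
    a = a.astype(np.float64).ravel()
    b = b.astype(np.float64).ravel()
    a -= a.mean()
    b -= b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0

def sequence_similarity(source: np.ndarray, target: np.ndarray) -> float:
    """Mean frame-wise NCC between two image sequences of shape (T, H, W)."""
    assert source.shape == target.shape, "sequences must be temporally matched"
    scores = [normalized_cross_correlation(s, t) for s, t in zip(source, target)]
    return float(np.mean(scores))

# Hypothetical example: similarity before and after a time alignment step.
rng = np.random.default_rng(0)
target = rng.random((50, 64, 64))                 # target subject's image volume
source_shifted = np.roll(target, 5, axis=0)       # source with a temporal offset
source_aligned = target + 0.05 * rng.random((50, 64, 64))  # after alignment (simulated)

print("before alignment:", sequence_similarity(source_shifted, target))
print("after alignment: ", sequence_similarity(source_aligned, target))
```

In this toy setup the score rises once the temporal offset is removed, which mirrors the before/after comparison reported in the abstract.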