Magnetic Resonance Imaging (MRI) produces scans of parts of the body which, unlike X-ray photography, show soft tissue and carry no radiation risk. However, MRI scans cannot be taken at anywhere near the rate at which the tongue moves during articulation. The attached videos of tongue and lip movements in the articulation of short phrases were therefore gathered by a procedure known as gated MRI, video MRI or stroboscopic MRI. The same procedure has been used to show movements within the heart as it beats. It exploits the fact that the heart beats at a (roughly) regular rate: by taking a scan at a slightly later stage in each successive beat and then showing the scans in sequence, the illusion of continuous movement within the heart can be produced.

For watching the articulation of a phrase in the mouth, the drawback is that the subject must repeat the utterance 18 to 20 times to the beat of a metronome. The more precisely the subject repeats the phrase, the better the final image quality will be. The procedure starts with a short pre-MRI training session to get the subject used to the phrases and the metronome.

The outcome is a series of 68 frames, showing a sagittal section of the head, which can then be converted to video. Because the procedure involves a number of repetitions gated between metronome beats, it is circular; thus the beginning of a phrase shows up not as the first of the 68 frames but at a place that varies from phrase to phrase. The frames have to be carefully inspected to identify this beginning, though subjects usually place the first accented syllable of the phrase on the metronome beat. Thus, the first frame usually shows the vowel in that syllable and the last frame wraps around into that same vowel.
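The re-ordering described above can be sketched in a few lines of Python. This is only an illustration, not the procedure actually used for these videos: the function name rotate_to_start and the start index 41 are hypothetical, and the integers stand in for the 68 image frames. In practice the start frame is found by inspecting the images.

```python
def rotate_to_start(frames, start_index):
    """Reorder a circular sequence so that frames[start_index] comes first.

    Nothing is dropped: the frames before start_index are moved to the end.
    """
    if not frames:
        return []
    k = start_index % len(frames)
    return frames[k:] + frames[:k]


# Integers stand in for the 68 image frames of one phrase.
frames = list(range(68))

# Hypothetical value: the frame judged, by inspection, to begin the phrase.
phrase_start = 41

ordered = rotate_to_start(frames, phrase_start)
```

After this step the sequence begins at the chosen frame and its last element is the frame that immediately preceded it in the original numbering.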

Audio is not available. The noise-reduction techniques that have to be applied to overcome the very loud sound of the MRI scanner result in audio which has a ‘tinny’ quality and sounds non-human, like a Dalek.

Because the repetitions are circular, there may be regressive (i.e. anticipatory) and progressive influences between the end of a phrase and its beginning. (Some of the phrases were spoken with a pause between the end of one repetition and the beginning of the next, but not all of them were.) Hence a number of frames showing such assimilation have to be discarded at the boundary. Once the sequence of frames has been treated in this way, it can be converted to video. Finally, to enable the videos to be stopped at appropriate places, they have been slowed down by a factor of twelve. Thus, for example, Video 1 ‘Show a bull’ lasted 2 seconds in real time, but is here slowed down to 24 seconds.
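The trimming and the slow-down arithmetic can be illustrated with a short Python sketch. The function names and the number of frames dropped at each end are hypothetical. The frame-rate figure also assumes, purely for illustration, that all 68 frames span the 2 seconds of the phrase, which the description above does not state.

```python
def drop_boundary_frames(frames, n_head, n_tail):
    """Discard n_head frames from the start and n_tail frames from the end."""
    return frames[n_head:len(frames) - n_tail]


def slowed_playback(n_frames, real_duration_s, slowdown):
    """Return (slowed duration in seconds, frames per second when slowed)."""
    slowed_duration_s = real_duration_s * slowdown
    return slowed_duration_s, n_frames / slowed_duration_s


# Integers stand in for the re-ordered image frames of one phrase.
frames = list(range(68))

# Hypothetical counts: how many frames show assimilation at each boundary.
kept = drop_boundary_frames(frames, n_head=2, n_tail=3)

# Video 1: 2 seconds in real time, slowed by a factor of twelve.
duration_s, fps = slowed_playback(68, 2.0, 12)
```

With these figures the slowed video lasts 24 seconds, matching the example given for Video 1, and under the stated assumption the frames would be shown at just under three per second.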

Before matching commentaries to videos, readers and viewers should familiarise themselves with the general appearance of the vocal tract. Start from the spine and the spinal vertebrae. In front of vertebrae 4-5-6 can be seen first the oesophagus (food pipe) and then the trachea (windpipe). The larynx and the vocal cords cannot be seen. In front of the first two vertebrae is the back wall of the pharynx (throat). The epiglottis, which folds down to close off the entrance to the trachea during swallowing, can be seen at the top of the trachea. The big blob above and forward of the epiglottis is the tongue. Above the tongue, the thin sliver is the hard palate, and where it broadens towards the back is the soft palate. In videos containing a nasal, the soft palate can be seen moving away from the back wall of the pharynx, thus opening up the airway to the nose. Where the hard palate curls downwards at the front is the alveolar ridge (teeth ridge). The teeth cannot be seen (although we can tell where the upper teeth are when the tip of the tongue moves forward to form [θ,ð]). At the very front of the mouth are the lips, which we can see coming together to form [p,b,m].

Images and videos prepared for this website by Alan Cruttenden. Thanks to Greg Kochanski; to Chris Alves; to the Phonetics Laboratory, Oxford University; and to the Churchill Hospital, Oxford.

