dc.description.abstract | With the advent of PET, fMRI, and MRI, the brain areas responsible for speech are no longer shrouded in mystery. Even so, many speech disorders still lack effective treatments. Whereas earlier speech models focused mainly on vowels, this study proposes combining a brain model with a speech model to simulate Chinese syllables and tone changes. With the integrated model, designated consonants can be added to simulate consonant-vowel (CV) structures, and the model can ultimately be applied to simulating hypotheses about speech disorders in order to find effective treatments.
The brain and speech models used in this study were DIVA (Directions Into Velocities of Articulators) and GODIVA (Gradient Order DIVA). The DIVA model contains the speech sound map (SSM), the auditory state and error maps, the somatosensory state and error maps, the articulatory velocity and position maps, and the cerebellum. These components correspond, respectively, to the left anterior premotor cortex, the parietal and temporal cortex, the parietal lobe cortex, the motor cortex, and the cerebellar cortex. The GODIVA model simulates the left inferior frontal sulcus, the frontal operculum, and the pre-supplementary motor area, which respectively represent the phonological performance area, the speech structure performance area, and the speech auditory mapping area.
Our approach was to use the component shared by the two models, the speech auditory map, as the projection from GODIVA to DIVA. The output of the GODIVA model served as the brain signal instruction. The first step was to convert this brain instruction into an auditory signal, and then to use the neural network model to adjust the fundamental frequency of the learning target. Finally, we compared the simulation results with actual sound spectra and vocal tract shapes. For the vowel spectra, the simulated results fell within the typical formant regions for every vowel except /ㄨ/ (/u/), although all of them lay near the boundaries of those regions. The first formants of the tested vowels tended toward 450 Hz and the second formants toward 1600 Hz. For the vocal tract shapes of CV structures, we selected the stop consonants combined with the diphthong /ㄞ/ (/ai/). Because Chinese stop consonants are distinguished by aspiration rather than by voicing, only the tongue position and the intensity of aspiration had to be adjusted. The simulation results clearly followed the same trend as the actual vocal tract shapes. However, because the speech structure of the DIVA model is divided into only six parts (labial, alveolar ridge, hard palate, velum, uvula, and pharynx), some consonants cannot be simulated accurately. In addition, the choice of vowel affects the stability of the simulation. In future work, we hope that the vocal tract shape of the DIVA model can be modified so that all Chinese tonal syllables can be simulated accurately.
| en_US |