Manusya, Journal of Humanities

Several studies have measured the duration of segments (i.e. consonants and vowels) to classify languages according to their speech rhythm. This research investigates whether Principal Component Analysis (PCA), a new method of analyzing segment-timing parameters for language classification, can be used to classify twelve Southeast Asian languages according to their timing patterns. The twelve languages examined are Malay, Cebuano, Standard Thai, Southern Thai, Tai Yuan, Vietnamese, Hmong, Mien, Burmese, Sgaw Karen, Mon and Khmer. Spontaneous speech was recorded from three speakers of each language. For each speaker, the vocalic, consonantal, voiced, and unvoiced intervals in 30 seconds of speech, excluding pauses and hesitations, were measured and analyzed using the three language typological classification models of Ramus et al. (1999), Grabe and Low (2002), and Dellwo et al. (2007). Eight parameters calculated from the durations of all intervals were then examined. In addition, PCA was used to explore the relations among the parameters. The PCA results show that the twelve languages can be classified into four groups: 1) Mon - Khmer; 2) Burmese - Hmong; 3) Vietnamese - Southern Thai - Tai Yuan; and 4) Malay - Cebuano. Standard Thai, Sgaw Karen, and Mien are not explicitly clustered with the other languages. The phonetic and phonological characteristics that seem to influence the twelve-language classification are the number of syllables in a word, the presence or absence of tone, and phonation type.
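As a rough illustration of how timing parameters of this kind are derived from interval durations, the sketch below computes five metrics commonly associated with the cited models: %V, ΔV and ΔC (Ramus et al. 1999) and the raw and normalized Pairwise Variability Index (Grabe and Low 2002). The abstract does not list the eight parameters used in the study, so this selection, the function names, the use of the population standard deviation, and the sample durations are all assumptions made for illustration, not the authors' procedure.

```python
from statistics import mean, pstdev


def percent_v(vocalic, consonantal):
    """%V: share of total speech time taken up by vocalic intervals."""
    total = sum(vocalic) + sum(consonantal)
    return 100.0 * sum(vocalic) / total


def delta(durations):
    """Delta metric: standard deviation of interval durations.

    Assumption: population standard deviation; a given study may use
    the sample standard deviation instead.
    """
    return pstdev(durations)


def rpvi(durations):
    """Raw PVI: mean absolute difference between successive intervals."""
    pairs = zip(durations, durations[1:])
    return mean(abs(a - b) for a, b in pairs)


def npvi(durations):
    """Normalized PVI: each successive difference is divided by the
    mean of the two intervals, then the average is scaled by 100."""
    pairs = zip(durations, durations[1:])
    return 100.0 * mean(abs(a - b) / ((a + b) / 2.0) for a, b in pairs)


def rhythm_metrics(vocalic, consonantal):
    """Collect the illustrative metrics for one speaker (durations in seconds)."""
    return {
        "percent_v": percent_v(vocalic, consonantal),
        "delta_v": delta(vocalic),
        "delta_c": delta(consonantal),
        "npvi_v": npvi(vocalic),
        "rpvi_c": rpvi(consonantal),
    }


# Hypothetical durations in seconds, not data from the study.
sample_vocalic = [0.10, 0.20, 0.10, 0.20]
sample_consonantal = [0.10, 0.10, 0.10, 0.10]
metrics = rhythm_metrics(sample_vocalic, sample_consonantal)
```

In a workflow like the one the abstract describes, one such vector of parameters per speaker or per language would form a row of the matrix passed to PCA.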
