Instructors: Chilin Shih and Greg P. Kochanski
May 20 - August 8, 2002, 6:00 - 9:25 P.M., Wednesdays
The goal of this course is to apply statistical, mathematical, and physiological concepts to understand language and to describe it quantitatively. We emphasize hands-on experience: using on-line databases, testing and verifying hypotheses, and acquiring basic computational skills. Students will use what they learn to build a software model of phoneme duration and to identify the language in which a document is written.
There is one required textbook: The Cartoon Guide to Statistics, by Larry Gonick and Woollcott Smith. HarperPerennial, New York, 1993. ISBN 0-06-273102-5.
Conditional probabilities, Bayes' Theorem, MAP (maximum a posteriori) estimation, Good-Turing estimation.
Frequency, ratio, rank, mean, median, standard deviation, Zipf's law, graphical display, N-grams, CART, multiple linear regression.
Linguistic units: phones, syllables, words, phones per second, documents, linguistic distribution, durations, document language.
Articulator motions, articulatory definition of phonemes, springs, masses, and accelerations, muscle physiology, control strategies.
Language identification, author identification, modeling of speech segment durations.
Unix file management, pipe, tr, sort, uniq, basic programming: sh, awk.
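As a rough illustration of the kind of exercise these tools support (it is not taken from the course materials), the sketch below counts word frequencies in Python, mirroring what a tr, sort, and uniq pipeline does at the shell. The function name word_frequencies and the sample sentence are made up for this example, and folding everything to lowercase is an assumption rather than a course requirement.

```python
import re
from collections import Counter


def word_frequencies(text):
    """Return (word, count) pairs, most frequent first; ties sorted alphabetically."""
    # Roughly what `tr -sc 'A-Za-z' '\n'` does: every run of non-letters
    # acts as a separator. Lowercasing is an extra step assumed here.
    words = re.findall(r"[A-Za-z]+", text.lower())
    counts = Counter(words)
    # Roughly what `sort | uniq -c | sort -rn` does: order by descending count.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical sample text, for illustration only.
sample = "The cat saw the dog. The dog saw nothing."
for word, count in word_frequencies(sample):
    print(count, word)
```

The resulting list is already in rank order, so it can feed directly into a rank-versus-frequency plot of the kind used to examine Zipf's law.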