Instructors: Chilin Shih and Greg P. Kochanski
May 20 - August 8, 2002, Wednesdays, 6:00 - 9:25 P.M.
The goal of this course is to apply statistical, mathematical, and physiological concepts to understand language and to describe it quantitatively. We emphasize hands-on experience: using on-line databases, testing and verifying hypotheses, and acquiring basic computational skills. Students will apply what they learn to build a software model of phoneme duration and to identify the language in which a document is written.
There is one required textbook: The Cartoon Guide to Statistics, by Larry Gonick and Woollcott Smith. HarperPerennial, New York, 1993. ISBN 0-06-273102-5.
H2 Syllabus
H3 Probability concepts: Conditional probabilities, Bayes' Theorem, MAP (maximum a posteriori) estimation, Good-Turing estimation.
H3 Statistical concepts: Frequency, ratio, rank, mean, median, standard deviation, Zipf's law, graphical display, N-grams, CART, multiple linear regression.
H3 Linguistics concepts: Linguistic units (phones, syllables, words, documents), phones per second, linguistic distributions, durations, document language.
H3 Physiological concepts: Articulator motions; articulatory definition of phonemes; springs, masses, and accelerations; muscle physiology; control strategies.
H3 Applications: Language identification, author identification, modeling of speech segment durations.
H3 Computation skills: Unix file management; pipes; tr, sort, and uniq; basic programming in sh and awk.
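As an illustration of how the tools listed under computation skills fit together, here is a minimal sketch of a word-frequency pipeline. It is not taken from the course materials: the function name word_freq and the sample sentence are made up for this example, and it assumes plain ASCII English text.

```sh
#!/bin/sh
# word_freq (hypothetical name): read text on standard input and print
# one "count word" line per distinct word, most frequent first.
word_freq() {
    tr -cs 'A-Za-z' '\n' |          # turn every run of non-letters into one newline
    tr 'A-Z' 'a-z' |                # fold upper case to lower case
    sort |                          # bring identical words together
    uniq -c |                       # count each group of identical words
    sort -rn |                      # order by count, largest first
    awk 'NF == 2 { print $1, $2 }'  # drop empty tokens, normalize spacing
}

# Made-up sample input; the first output line should be "3 the".
printf 'The cat saw the dog. The dog ran.\n' | word_freq
```

The resulting list is already ordered by rank, so it can serve as a starting point for the frequency, rank, and Zipf's law topics above. Words with equal counts may come out in a different order depending on the locale.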