The thread linking all of my research projects is that I build and solve computational models of complex systems. My research is in computational phonetics, learning about the strategies we use to produce speech. Since speech and language are complicated things, this involves large experiments and data sets.
Broadly speaking, my work is in the overlap of Linguistics, Psychology and Computer Science.
My summer 2008 project involves testing speech recognition systems on the conversational parts of the British National Corpus (BNC), in collaboration with John Coleman (Oxford) and Jiahong Yuan (UPenn).
The BNC has data that is almost unobtainable in these days of careful ethics approval and release forms. It contains conversations recorded by people who wore tape recorders as they went about their normal business, capturing the chaos of day-to-day speech. The BNC has been available as transcribed text, but so far the speech in it has been inaccessible. We're currently working on aligning the text to the speech.
From an automatic speech recognition (ASR) point of view, it is a very challenging set of data because of the wide variety of recording conditions. The noise ranges from the rustle of paper to bird song to traffic sounds. The room acoustics vary from outdoors to the interior of an automobile.
Greg Kochanski | Last Modified Thu Jul 9 22:12:52 2015