Greg Kochanski

Introduction: What is Good Data

A lecture in Advanced Core Training in Linguistics (ACTL), 23 October 2009, University College London. The first lecture of the "Getting Good Data" course. Instructor: Dr. Greg Kochanski

Course Description

In research, there is no substitute for getting good data: working with defective data is always difficult, and best avoided. The aim of this course is to introduce beginning research students to a variety of "best practices" for getting language data of various types. A secondary aim is to showcase some interesting linguistic resources that students might not otherwise have encountered or thought of using. In some respects, you might think of it as advanced training in research methods.

There will be five 1h30m lectures, the first four of which are supplemented by 1-hour tutorials in the following week.

Lecture:

The actual lecture given is at http://kochanski.org/gpk/teaching/0910ACTL_UCL/getting_good_data.pdf

Experimental Design

Good data is the result of a good experimental design.

• Scientific bloopers and self-delusion. Let the subject (rather than you) answer the question. (Reading: pathological science and systematic errors. [Source])
• Almost all experiments need statistics because probability and statistics allow us to describe and reason about our ignorance. In an experiment involving humans, compare the number of things we can control to the number of possibly relevant things that we cannot. The latter is larger. (Reading: Why use probabilities and statistics? [Source])
• Counting things and sampling errors. What's the difference between a frequency and a probability? A probability is a theoretical model; a frequency is what you actually count in a real sample.
• Counting things and statistical sampling. Opportunity sampling, stratified sampling, random sampling. [Source]
• A 5-minute introduction to hypothesis testing. (Reading: Hypothesis testing by eye. [Source])
• Estimating the size of an experiment.
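
To make the distinction between a frequency and a probability concrete, here is a minimal Python sketch. It is an illustration, not material from the lecture: the "true" probability p_true = 0.3, the sample sizes, and the function names are all assumptions chosen for the example. It draws simulated yes/no observations and compares the counted frequency with the model probability, alongside the textbook standard error sqrt(p(1-p)/n) for a proportion.

```python
import math
import random


def observed_frequency(p, n, rng):
    """Simulate n yes/no observations with model probability p.

    Returns the fraction of 'yes' outcomes: a frequency counted in a sample.
    """
    hits = sum(1 for _ in range(n) if rng.random() < p)
    return hits / n


def standard_error(p, n):
    """Typical size of the sampling error for a proportion (binomial model)."""
    return math.sqrt(p * (1.0 - p) / n)


if __name__ == "__main__":
    rng = random.Random(0)   # fixed seed so the run is repeatable
    p_true = 0.3             # assumed model probability (hypothetical value)
    for n in (10, 100, 1000, 10000):
        f = observed_frequency(p_true, n, rng)
        se = standard_error(p_true, n)
        print(f"n={n:6d}  frequency={f:.3f}  typical sampling error={se:.3f}")
```

The probability stays fixed at 0.3 because it belongs to the model; the frequency changes from sample to sample, and its scatter shrinks roughly as one over the square root of the sample size.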

Tutorial:

Estimating the size of experiments. Practical design of experiments. Discussion of tips and techniques for particular kinds of data collection.
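
As a rough illustration of what estimating the size of an experiment can look like, the sketch below uses the standard normal-approximation formula n = z^2 p(1-p) / E^2 for measuring a proportion to within a margin of error E. This is one common textbook approach and not necessarily the method used in the tutorial; the function name, the default z = 1.96 (about 95% confidence), and the example values are assumptions.

```python
import math


def sample_size_for_proportion(margin, p_guess=0.5, z=1.96):
    """Approximate number of independent observations needed so that an
    estimated proportion lies within +/- margin of the true value.

    p_guess: prior guess at the proportion (0.5 is the most cautious choice).
    z: normal quantile for the desired confidence (1.96 for about 95%).
    """
    if not 0.0 < margin < 1.0:
        raise ValueError("margin must be between 0 and 1")
    if not 0.0 < p_guess < 1.0:
        raise ValueError("p_guess must be between 0 and 1")
    return math.ceil(z * z * p_guess * (1.0 - p_guess) / (margin * margin))


if __name__ == "__main__":
    for margin in (0.10, 0.05, 0.02):
        n = sample_size_for_proportion(margin)
        print(f"margin of +/-{margin:.2f} needs about {n} observations")
```

Halving the margin of error roughly quadruples the required number of observations, which is why it pays to do this estimate before collecting any data. The formula assumes independent observations; repeated measurements from the same speaker usually carry less information than the raw count suggests.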

Students may wish to read "The Cartoon Guide to Statistics," by Larry Gonick and Woollcott Smith, HarperCollins Publishers, 1993. ISBN 0-06-273102-5.