Greg Kochanski

How do I compute Z-scores for perfection?

Question:

How do I compute Z-scores when I have only a small number of trials?

In our experiment each stimulus was repeated only 7 times (because the whole test was already very long) which led to several hit rates of 100% correctness and also to several false alarm rates of 0%. But in the z-transformation there are no corresponding values for these results. So one could only take approximate values for both cases which, to my mind, adulterates the discrimination results.

Answer:

That's true as far as it goes, but it is a classic mistake.

You need to remember that the observed frequency (i.e. the result of your experiment) is not the same thing as the underlying probability of success. The frequency is indeed the single most likely value of the probability (i.e. it is the MLE, or Maximum Likelihood Estimator), but it is only an estimate. So, if you observe 7 successes out of 7 trials, the most likely value of the probability is 100%, but 99% is nearly as likely, as is 98%, and probabilities down to about 85% remain reasonably consistent with the data (a true probability of 85% would still produce 7 out of 7 about a third of the time). Thus, a wide range of probabilities is consistent with your observation, and each probability corresponds to a different Z-score.

You can ignore this problem if you have enough data that the underlying probability of success must be close to your observed frequency. However, here, where the data are sparse and where the Z-score changes so dramatically near 100%, you cannot ignore it.

A proper solution involves Bayesian techniques and takes a fair bit of work and math, but a simple approximate solution is what's known in the trade as "add half". Compute P2 = (successes + 0.5) / (successes + 0.5 + failures + 0.5), and that will give you a number (called the ELE, or Expected Likelihood Estimator) which sits in the middle of the range of acceptable probability estimates, instead of at the extreme edge (as the MLE does). For 7 successes out of 7, P2 = 7.5/8 = 93.75% rather than 100%. Z-scores obtained from P2 are always finite and will be much more useful. For more details, you can look at the Good-Turing estimator, another reasonable solution to the problem, and one that is a bit better than ELE.
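As a rough illustration of the "add half" correction, here is a minimal Python sketch. The function names (add_half, z_score, relative_likelihood) are made up for this example, and it assumes that the Z-score in question is the inverse of the standard normal cumulative distribution, and that the discrimination measure is the usual difference between the hit-rate and false-alarm Z-scores. Neither assumption is spelled out in the question, so treat this as one plausible reading rather than a definitive recipe.

```python
from statistics import NormalDist  # standard library, Python 3.8+


def add_half(successes, failures):
    """Add-half (ELE) estimate: (s + 0.5) / (s + 0.5 + f + 0.5)."""
    return (successes + 0.5) / (successes + failures + 1.0)


def z_score(p):
    """Z-score for probability p (assumed: inverse standard normal CDF)."""
    return NormalDist().inv_cdf(p)


def relative_likelihood(p, successes, failures):
    """Binomial likelihood of p, relative to its maximum at the MLE."""
    def likelihood(q):
        return q ** successes * (1.0 - q) ** failures

    mle = successes / (successes + failures)
    return likelihood(p) / likelihood(mle)


# 7 hits out of 7 and 0 false alarms out of 7, as in the question.
hit_rate = add_half(7, 0)          # 7.5 / 8 = 0.9375 instead of 1.0
false_alarm_rate = add_half(0, 7)  # 0.5 / 8 = 0.0625 instead of 0.0

# Assumed discrimination measure: z(hit rate) - z(false-alarm rate).
discrimination = z_score(hit_rate) - z_score(false_alarm_rate)

# How plausible is an underlying probability of 85% after 7 out of 7?
plausibility_85 = relative_likelihood(0.85, 7, 0)
```

With the raw frequencies of 100% and 0%, z_score would raise an error, because the inverse normal CDF is not finite at 0 or 1; the add-half estimates keep both Z-scores finite.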


Last Modified Thu Sep 1 10:41:32 2005
