Module collect_aesop1

A script to run a reading experiment. Text stimulus, verbal response. It puts text up on a screen, the subject reads it, and the program records what the subject says. It is designed for recording largish blocks of text (paragraph sized) but is flexible and automated.

Usage

Run it as:

       collect_data -d some_directory groupname

and it will read some_directory/stimuli/groupname.fiat, display the stimuli, record speech, and write some_directory/response/groupname/datecode.fiat, which contains all the input stimuli along with metadata. It also produces one .wav file per utterance, named some_directory/response/groupname/datecode/StimulusNumber_RepetitionNumber.wav
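
The file layout described above can be sketched as follows. This is only an illustration, not the program's own code: the function names session_paths and utterance_wav are hypothetical, the datecode is passed in because its exact format is not documented here, and the numbers in the .wav name are assumed to be written without zero padding.

```python
import os


def session_paths(directory, groupname, datecode):
    """Return (stimulus file, metadata file, audio directory) for one session.

    Hypothetical helper: the datecode is supplied by the caller because
    its real format is not specified in this documentation.
    """
    stimuli = os.path.join(directory, "stimuli", groupname + ".fiat")
    response_dir = os.path.join(directory, "response", groupname)
    metadata = os.path.join(response_dir, datecode + ".fiat")
    audio_dir = os.path.join(response_dir, datecode)
    return stimuli, metadata, audio_dir


def utterance_wav(audio_dir, stimulus_number, repetition_number):
    """Build StimulusNumber_RepetitionNumber.wav inside audio_dir.

    Assumption: the two numbers are plain, unpadded integers.
    """
    name = "%d_%d.wav" % (stimulus_number, repetition_number)
    return os.path.join(audio_dir, name)
```

Calling session_paths("some_directory", "groupname", "datecode") gives the three locations named in the paragraph above, and utterance_wav then names an individual recording inside the audio directory.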

These are Fiat 1.2 format data files, and can be read with gmisclib.fiatio from the Sourceforge speechresearch project. The Fiat format is defined at http://www.phon.ox.ac.uk/files/pdfs/fiat.pdf.

Features

Source code is available, so it can be modified as necessary.
Written in Python using the widely available GTK package for ease of installation.
Instructions to the subject and the texts the subject reads can be written in any Unicode characters, so the software can be used for most languages.
Customizable via a file that defines the experiment.
Yields a file of metadata describing exactly what went on.
Suitable for reading paragraph-sized chunks of text.

Operation

The software is most conveniently run when the experimenter has the keyboard and the subject has the mouse. In the body of the experiment, the subject clicks "Next" to see a prompt, then speaks. Then, the experimenter types "q" to terminate the recording. The software then pauses, waiting for the subject to click "Next" (to go on to the next stimulus) or "Repeat" (to read the stimulus again). Alternatively, the experimenter can hit the space bar to go forward, or the "r" key to repeat a reading.

Note that the software can enforce a limit on how many times a text can be read, via the Maxreps value.

If the experimenter types 's', the recording is terminated and deleted, effectively skipping that stimulus. A comment is left in the output file, but no other metadata for this utterance. The experimenter can also type "x" -- this acts like "q", but leaves a mark in the "flag" column.
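
The effect of the three keys that end a recording can be summarized in a small sketch. The function name end_of_recording and its return convention are hypothetical and chosen only for illustration; the actual program handles these keys inside its GTK event handlers.

```python
def end_of_recording(key):
    """Map an experimenter key press to (keep_audio, flag).

    Hypothetical summary of the behaviour described in the text.
    Returns None for keys that do not terminate the recording.
    """
    if key == "q":
        return (True, 0)      # keep the audio; the flag column is 0
    if key == "x":
        return (True, 1)      # like "q", but mark the flag column
    if key == "s":
        return (False, None)  # audio deleted; only a comment is written
    return None
```

The flag is None for "s" because, as stated above, a skipped stimulus leaves no metadata row, so there is no flag column to fill in.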

The software records one audio .wav file for each line in the control file, and writes one line in the output metadata file.

In typical operation, the groupname selects which experimental group a given subject is in. A subject is then simply identified by the filename of the output metadata file, so the data is naturally anonymized.

However, if a pre-existing subject ID needs to be carried through, or if a single subject comes in for more than one session, then the easiest solution is to use a different groupname for each subject. Typically, you'd use the subject ID number as the groupname, and simply make a copy of the input (control) file for each subject in a group.

Control File Format

The program reads a file (also in Fiat format) that controls many aspects of the experiment (on a stimulus-by-stimulus basis if need be). The variables below affect the experiment; any other data in the input file is simply passed to the output metadata file.

Values can be set in the header of the input Fiat file (in which case they have effect throughout the experiment) and/or they can be given columns of their own. If they have a column of their own, and a value is present then that value over-rides the default. (Note that the code %na in a column means that no value is given, so the value specified in the header, if any, would be used.) To say this again: the values set in the header and the columns of data in the input file are merged together. When the program is looking for a value, it looks first in the column data, then if nothing is found, it takes the value from the header.
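
The lookup rule can be sketched in a few lines. This is a minimal illustration under stated assumptions: plain dictionaries stand in for the header and for one row of column data, and the function name lookup is hypothetical; it is not the way the program or gmisclib.fiatio represents these internally.

```python
NA = "%na"  # Fiat code meaning "no value given"


def lookup(name, row, header, default=None):
    """Look in the row's column data first, then fall back to the header."""
    value = row.get(name, NA)
    if value != NA:
        return value
    return header.get(name, default)


# Illustrative data only: a header default and two stimulus rows.
header = {"Maxreps": "2", "B_next": "Next"}
rows = [
    {"text": "First paragraph.", "Maxreps": "%na"},  # falls back to header
    {"text": "Second paragraph.", "Maxreps": "3"},   # column over-rides
]
values = [lookup("Maxreps", row, header) for row in rows]
# values == ["2", "3"]
```

A name that appears neither in the row nor in the header yields the default, which is None in this sketch.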

The software has three text areas: a small one above for instructions to the subject, another small one above for status information, and a big one below for the material to read.

INSTR_* : Instructions to be given to the subject at various points in the experiment. The instructions appear in the upper area.
INSTR_key_for_first : Show this just before the first stimulus is presented.
INSTR_last_chance : Warn the subject that this is their last try for this stimulus.
INSTR_continue_repeat : Ask whether to continue to the next stimulus or repeat the last one.
INSTR_continue_norepeat : Much like INSTR_continue_repeat, except that this is presented on the last stimulus, when there is nothing more to do, but the subject could still try the final stimulus again.
INSTR_read : "Please read the text below" or similar.
INSTR_welcome : An instruction to present at the beginning of the experiment. (E.g. "Welcome")
INSTR_thanks : An instruction to present at the end of the experiment (E.g. "Thank you")
B_repeat : The text of the "Repeat stimulus" button
B_next : The text of the "Next stimulus" button
STAT_recording : What to put in the "status" box when the recorder is running for the first time on a stimulus
STAT_repeating : What to put in the "status" box when the recorder is running on a repeat of a stimulus.
Maxreps : How many times can you repeat a stimulus?
BigInfo : What to put on the main screen while the subject is waiting. (This is typically blank -- it is a way to emphasize unusual instructions.)

Output (Metadata) File Information

All the input control information is copied to the output metadata file. Additional columns are added as follows:

stimulusTime1 : A moment shortly before the stimulus is visible.
stimulusTime2 : A moment shortly after the stimulus is visible.
d : A directory containing data, relative to the directory that holds the metadata file.
f : The root of a particular utterance's audio file, within d. d and f are used together, so the path to the audio file is d/f.wav, starting at the directory that holds the output metadata file.
recordStartTime1 : A moment shortly before the recording starts.
recordStartTime2 : A moment shortly after the recording process has been forked. Unfortunately, we do not know if the recording has started yet or not, but at least the recording process has been created. On modern Linux systems (c2009) the recording starts no more than 50ms after recordStartTime1.
RecordEndTime1 : A moment shortly after the recording program has shut down.
i : An integer count of which utterance, 0...N.
rep : An integer count of how many times the subject has attempted this utterance.
flag : Zero, or one (if the 'x' key was pressed during the utterance).
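
As a small illustration of how the d and f columns combine, the sketch below resolves the audio file for one row of the output metadata file. The function name audio_path is hypothetical, and the values of d and f are assumed to have already been read from the file as strings.

```python
import os


def audio_path(metadata_file, d, f):
    """Return the path of the .wav file for one metadata row.

    d is taken relative to the directory holding the metadata file,
    and f is the root of the audio file name within d.
    """
    base = os.path.dirname(metadata_file)
    return os.path.join(base, d, f + ".wav")
```

Because d is relative to the metadata file, the response directory can be moved as a whole without breaking the links between the metadata and the audio.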

Software downloads should also be available from the "speechresearch" project on http://sourceforge.org, the Oxford University Library system, and http://kochanski.org/gpk .

This software is copyright Greg Kochanski (2010) and is available under the Lesser Gnu Public License, version 3 or higher. It was funded by the UK's Economic and Social Research Council under project RES-062-23-1323. This is available from http://sourceforge.org/projects/speechresearch, http://kochanski.org/gpk/papers/2010/aesop_data_collect, and http://www.phon.ox.ac.uk/files/releases/2008aesopus2_data_collect.tar

Copyright: Greg Kochanski, 2010

License: Gnu Public License, version 3 or higher.

Contacts: gpk@kochanski.org, greg.kochanski@phon.ox.ac.uk

Version: Aesop.0.20.2

Note: Please cite in academic papers as "data collection software used in 'Rhythm measures with language-independent segmentation', Anastassia Loukina, Greg Kochanski, Chilin Shih, Elinor Keane and Ian Watson, Proceedings of the 10th Annual Conference of the International Speech Communication Association (Interspeech 2009). ISSN 1990-9772 Brighton, UK, 7--10 September 2009, pp 1531--1534. The software may be downloaded from http://www.phon.ox.ac.uk/files/releases/2008aesopus2_data_collect.tar . (URL checked ZZZ/ZZZ/ZZZ.)"

Classes

autokilled_process
This class represents a subprocess that's automatically started and automatically killed by __del__ or an explicit call to close().

gui
This is the Graphical User Interface for the experimental data collection software.

experiment_c
A class that defines the sequence of the experiment.
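
The pattern described for autokilled_process, a subprocess that is shut down by an explicit close() or by __del__, can be sketched as below. This is not the module's implementation: the class name AutoKilled and its attributes are hypothetical, and the sketch uses Popen.terminate() from the standard library, whereas the real class may signal the child differently.

```python
import subprocess
import sys


class AutoKilled(object):
    """Hypothetical stand-in: start a child, stop it on close() or __del__."""

    def __init__(self, argv):
        self.p = None            # set first, so __del__ is safe if Popen fails
        self.returncode = None
        self.p = subprocess.Popen(argv)

    def close(self):
        """Terminate the child if it is still running; safe to call twice."""
        p, self.p = self.p, None
        if p is not None:
            if p.poll() is None:
                p.terminate()
            p.wait()
            self.returncode = p.returncode

    def __del__(self):
        self.close()


# Example: start a long-running child, then shut it down explicitly.
child = AutoKilled([sys.executable, "-c", "import time; time.sleep(60)"])
child.close()
```

Relying on close() is the more predictable route; __del__ acts only as a safety net, since the moment at which Python calls it is not guaranteed.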

Functions

run(argv)

Variables

ROOT = '/projects/aesop/data_files'

TEXT_ROOT = '/projects/aesop/Texts_for_recording'

RATE = 16000

CHANNELS = 2

LMARGIN = 100

RMARGIN = 100

CID_R = 1

__package__ = None

Imports: os, signal, datetime, subprocess, fiatio, gpkmisc, die, EC, gtk, gobject