collect_aesop1

#! python

"""A script to run a reading experiment.  Text stimulus, verbal response.
It puts text up on a screen, the subject reads it, and the
program records what the subject says.   It is designed for
recording largish blocks of text (paragraph sized) but is
flexible and automated.

Usage
=====

Run it as::

    collect_data -d some_directory groupname

and it will read I{some_directory}C{/stimuli/}I{groupname}C{.fiat},
display stimuli, record speech, and write
I{some_directory}C{/response/}I{groupname/datecode}C{.fiat} that contains
all the input stimuli, along with metadata.
It also produces one .wav file per utterance,
named I{some_directory}C{/response/}I{groupname/datecode/StimulusNumber_RepetitionNumber}C{.wav}

These are Fiat 1.2 format data files, and can be read with
L{gmisclib.fiatio} from the U{Sourceforge<http://www.sourceforge.org>}
U{speechresearch project<http://sourceforge.net/projects/speechresearch/>}.
The Fiat format is defined at U{http://www.phon.ox.ac.uk/files/pdfs/fiat.pdf}.

Features
========

    - Source code is available, so it can be modified as necessary.

    - Written in Python using the widely available GTK package for ease of installation.

    - Instructions to the subject and the texts the subject reads can be
        written in any Unicode characters, so the software can be used for
        most languages.

    - Customizable via a file that defines the experiment.

    - Yields a file of metadata describing exactly what went on.

    - Suitable for reading paragraph-sized chunks of text.

Operation
=========

The software can be conveniently run if the experimenter
has the keyboard and the subject has the mouse.   In the
body of the experiment, the subject clicks "next" to see
a prompt, then speaks.   Then, the experimenter types
"q" to terminate the recording.   The software then pauses, waiting
for the subject to click "Next" (to go on to the next stimulus)
or "Repeat" to read the stimulus again.
Alternatively, the experimenter can hit the space bar to go
forward, or the "r" key to repeat a reading.

Note that the software can enforce a limit on how many times a
text can be read, via the C{Maxreps} value.

If the experimenter types 's', the recording is terminated and
deleted, effectively skipping that stimulus.  A comment is left in
the output file, but no other metadata for this utterance.   The experimenter
can also type "x" -- this acts like "q", but leaves a mark in the
"flag" column.

The software records one audio .wav file for each line in the control file,
and writes one line in the output metadata file.

In typical operation, the I{groupname} selects which experimental group
a given subject is in.   A subject is then simply identified by the filename
of the output metadata file, so the data is naturally anonymized.

However, if a pre-existing subject ID needs to be carried through,
or if a single subject comes in for more than one session, then the
easiest solution is to use a different I{groupname} for each subject.
Typically, you'd use the subject ID number as the groupname, and simply
make a copy of the input (control) file for each subject in a group.

Control File Format
===================

The program reads a file (also in Fiat format) that controls many aspects of the experiment
(on a stimulus-by-stimulus basis if need be).   The variables below affect the experiment;
any other data in the input file is simply passed to the output metadata file.

Values can be set in the header of the input Fiat file (in which case they have effect
throughout the experiment) and/or they can be given columns of their
own.   If they have a column of their own, and a value is
present, then that value overrides the default.
(Note that the code C{%na} in a column means that no value is given,
so the value specified in the header, if any, would be used.)
To say this again: the values set in the header and the columns of data in the input
file are merged together.  When the program is looking for a value, it looks first
in the column data, then if nothing is found, it takes the value from the header.

The software has three text areas: a small one above for instructions to the
subject, another small one above for status information,
and a big one below for material to read.


    - INSTR_* : Instructions to be given to the subject at various points
        in the experiment.  The instructions appear in the upper area.

        - C{INSTR_key_for_first} : Show this just before the first stimulus is presented.
        - C{INSTR_last_chance} : Warn the subject that this is their last try for this stimulus.
        - C{INSTR_continue_repeat} : Ask whether to continue to the next stimulus or repeat the last one.
        - C{INSTR_continue_norepeat} : Much like continue repeat, except this is
            presented on the last stimulus, when there is nothing
            more to do, but you could still try the final stimulus
            again.
        - C{INSTR_read} : "Please read the text below" or similar.
        - C{INSTR_welcome} : An instruction to present at the beginning of the experiment.  (E.g. "Welcome")
        - C{INSTR_thanks} : An instruction to present at the end of the experiment.  (E.g. "Thank you")
    - C{B_repeat} : The text of the "Repeat stimulus" button
    - C{B_next} : The text of the "Next stimulus" button
    - C{STAT_recording} : What to put in the "status" box when the recorder is running
        for the first time on a stimulus
    - C{STAT_repeating} : What to put in the "status" box when the recorder is running
        on a repeat of a stimulus.
    - C{Maxreps} : How many times can you repeat a stimulus?
    - C{BigInfo} : What to put on the main screen while the subject is waiting.
        (This is typically blank -- it is a way to emphasize unusual instructions.)

Output (Metadata) File Information
==================================

All the input control information is copied to the output metadata file.   Additional columns are
added as follows:

    - C{stimulusTime1} : A moment shortly before the stimulus is visible.
    - C{stimulusTime2} : A moment shortly after the stimulus is visible.
    - C{d} : A directory containing data, relative to the directory that holds the metadata file.
    - C{f} : The root of a particular utterance's audio file, within C{d}.   C{d} and C{f}
        are used together, so the path to the audio file is C{d/f.wav}, starting at the directory
        that holds the output metadata file.
    - C{recordStartTime1} : A moment shortly before the recording starts.
    - C{recordStartTime2} : A moment shortly after the recording process has been forked.
        Unfortunately, we do not know if the recording has started yet or not,
        but at least the recording process has been created.   On modern Linux
        systems (c. 2009) the recording starts no more than 50ms after recordStartTime1.
    - C{RecordEndTime1} : A moment shortly after the recording program has shut down.
    - C{i} : An integer count of which utterance, 0...N.
    - C{rep} : An integer count of how many times the subject has attempted this utterance.
    - C{flag} : Zero, or one (if the 'x' key was pressed during the utterance).

Software downloads should also be available from the "speechresearch" project
on http://sourceforge.org, the Oxford University Library system, and
http://kochanski.org/gpk .

This software is copyright Greg Kochanski (2010) and is
available under the Lesser Gnu Public License, version 3 or higher.
It was funded by the UK's Economic and Social Research
Council under project RES-062-23-1323.   This is available from
http://sourceforge.org/projects/speechresearch,
http://kochanski.org/gpk/papers/2010/aesop_data_collect, and
http://www.phon.ox.ac.uk/files/releases/2008aesopus2_data_collect.tar

@copyright: Greg Kochanski, 2010
@license: Gnu Public License, version 3 or higher.
@contact: gpk@kochanski.org
@contact: greg.kochanski@phon.ox.ac.uk
@version: Aesop.0.20.2
@note: Please cite in academic papers as data collection software used in
    "Rhythm measures with language-independent segmentation",
    Anastassia Loukina, Greg Kochanski, Chilin Shih, Elinor Keane and Ian Watson
    Proceedings of the 10th Annual Conference of the International
    Speech Communication Association (Interspeech 2009). ISSN 1990-9772
    Brighton, UK, 7--10 September 2009, pp 1531--1534.
    The software may be downloaded from
    http://www.phon.ox.ac.uk/files/releases/2008aesopus2_data_collect.tar .
    (URL checked ZZZ/ZZZ/ZZZ.)
"""

import os
import signal
import datetime
import subprocess
from gmisclib import fiatio
from gmisclib import gpkmisc
from gmisclib import die
import exp_collection as EC
import gtk
import gobject

ROOT = '/projects/aesop/data_files'
TEXT_ROOT = '/projects/aesop/Texts_for_recording'
RATE = 16000
CHANNELS = 2

LMARGIN = 100   # Blank margin to left of stimulus area
RMARGIN = 100   # Blank margin to right of stimulus area

CID_R = 1

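The "Control File Format" section of the module docstring says a value is looked up first in the stimulus's own column, and then in the header if the column is missing or holds C{%na}. The following standalone sketch illustrates that precedence with plain dictionaries. The function name `lookup` and the sample data are hypothetical, and it is an assumption that `%na` reaches the caller as a literal string; the real lookup is done by `self.get` in the base class, which is not shown in this listing.

```python
NA = '%na'  # code the docstring gives for "no value in this column"


def lookup(name, row, header, default=None):
    """Return row[name] unless it is absent or '%na'; then header[name]; then default."""
    value = row.get(name, NA)
    if value != NA:
        return value
    return header.get(name, default)


# Hypothetical control data: one header value, two stimulus rows.
header = {'Maxreps': '2', 'B_next': 'Next'}
rows = [
    {'text': 'First paragraph', 'Maxreps': '%na'},   # falls back to the header
    {'text': 'Second paragraph', 'Maxreps': '5'},    # column overrides the header
]
maxreps = [int(lookup('Maxreps', r, header)) for r in rows]
```

With this data the first row inherits the header's limit and the second row uses its own, which is the stimulus-by-stimulus control the docstring describes.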

class autokilled_process(object):
    """This class represents a subprocess that's automatically
    started and automatically killed by C{__del__} or an explicit call to C{close}().
    """

    def __init__(self, args, **kv):
        self.p = subprocess.Popen(args, **kv)

    def __del__(self):
        self.close()

    def close(self):
        if self.p.poll() is None:
            os.kill(self.p.pid, signal.SIGINT)
        self.p.wait()
        return self.p.returncode



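The class above starts a subprocess and sends it SIGINT when it is closed or garbage-collected. As an illustration only, the same start-then-interrupt pattern can be written as a context manager, which makes the moment of shutdown explicit instead of relying on C{__del__}. This is a sketch written for Python 3 on a POSIX system, not part of the original module; the name `interrupted_on_exit` is hypothetical, and a sleeping Python child stands in for the recorder.

```python
import signal
import subprocess
import sys
from contextlib import contextmanager


@contextmanager
def interrupted_on_exit(args, **kv):
    """Start a subprocess; on leaving the block, send SIGINT if it is still running."""
    p = subprocess.Popen(args, **kv)
    try:
        yield p
    finally:
        if p.poll() is None:
            p.send_signal(signal.SIGINT)
        p.wait()


# Stand-in for the recorder: a child that would otherwise run for 30 seconds.
child = [sys.executable, '-c', 'import time; time.sleep(30)']
with interrupted_on_exit(child, stderr=subprocess.DEVNULL) as proc:
    pass  # ... the subject would be speaking here ...
returncode = proc.returncode
```

The exact return code after an interrupt depends on the child program, which is presumably why `S_recording` accepts more than one value from the recorder.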

class gui(EC.GUI_base):
    """This is the Graphical User Interface for the experimental data
    collection software.
    @note: all the S_* functions represent states during the data collection
        process.   The program hops from one to the other, around in a
        loop through the S_* functions for each utterance.
    """

    def set_button_texts(self, rs, ns, repeat=None, next=None):
        """The GUI has two buttons for the subject to press.  One moves on to the
        next paragraph to read; the other repeats the current paragraph.

        @param rs: Should the 'repeat' key accept clicks?
        @type rs: L{bool}
        @param ns: Should the 'next' key accept clicks?
        @type ns: L{bool}
        @param repeat: A label for the "repeat" button
        @type repeat: str or None
        @param next: A label for the "next" button
        @type next: str or None
        """
        if repeat is not None:
            self._repeat.set_label(repeat)
        if next is not None:
            self._next.set_label(next)
        self._repeat.set_sensitive(rs)
        if rs:  # KLUGE
            self._repeat.hide()
            self._repeat.show()
        self._next.set_sensitive(ns)
        if ns:  # KLUGE
            self._next.hide()
            self._next.show()


    def __init__(self, extra_line_space=5, extra_para_space=5,
                 top_font=None, stim_font=None):
        """Creates an instance of the GUI.  (Normally there is just one.)
        @param extra_line_space: How many extra pixels should separate one line of the stimulus from the next?
        @type extra_line_space: L{int}, in pixels
        @param extra_para_space: How many extra pixels should separate one paragraph of the stimulus from the next?
        @type extra_para_space: L{int}, in pixels
        @param top_font: The name of the font used in the top section of the GUI.
        @param stim_font: The name of the font used to present the stimulus
        @type stim_font: L{str}, passed into L{pango.FontDescription}
        @type top_font: L{str}, passed into L{pango.FontDescription}
        """
        self._next = None
        self._repeat = None
        EC.GUI_base.__init__(self, extra_line_space=extra_line_space,
                             extra_para_space=extra_para_space,
                             top_font=top_font, stim_font=stim_font)


    def connect_experiment(self, expcall, log):
        """Connect the GUI to the class that defines the experiment.
        @param log: A place to log everything that happens in the experiment
        @type log: normally an instance of L{fiatio.merged_writer}
        @type expcall: a function pointer
        @param expcall: a function that knows how to change the experiment's state in
            response to keyboard and mouse events.
        """
        # We need to stick in the extra "None" in these
        # calls to substitute for the "event" argument
        # that isn't supplied by Button clicks, but is
        # supplied by keyboard keypress events.
        self._repeat.connect("clicked", expcall, None, log)
        self._next.connect("clicked", expcall, None, log)
        EC.GUI_base.connect_experiment(self, expcall, log)




class experiment_c(EC.experiment_base):
    """A class that defines the sequence of the experiment.
    """

    def __init__(self, hdrs, stimlist, log, outname):
        # You can set paragraph and line spacing and fonts
        # by adding extra arguments to the GUI_base constructor.
        self.gui = gui(top_font="Serif 12", stim_font="Serif 24")
        self.gui._stim.set_left_margin(LMARGIN)
        self.gui._stim.set_right_margin(RMARGIN)
        EC.experiment_base.__init__(self, stimlist, hdrs, self.gui)

        self.info = {}
        self.p = None
        self.repcount = 0
        self.gui.connect_experiment(self.event, log)
        self.status = '-'
        self.outname = outname


    def instruct(self, s):
        """Give an instruction to the subject.
        """
        text = self.get('INSTR_%s' % s)
        self.gui.instr_win().get_buffer().set_text(text)



    def present(self, s):
        """Present a stimulus to the subject.
        """
        if 'textfile' in s and s['textfile']:
            text = EC.get_text(os.path.join(self.hdr['TEXT_ROOT'], s['textfile']))
        elif s['text'].startswith('@'):
            text = EC.get_text(os.path.join(self.hdr['TEXT_ROOT'], s['text'][1:]))
        else:
            text = s['text']
        self.info['stimulusTime1'] = datetime.datetime.now().isoformat()
        self.gui.stim_win().get_buffer().set_text(text)
        gobject.idle_add(self.set_timing_info, 'stimulusTime2')


    def clear_stimulus(self):
        self.gui.stim_win().get_buffer().set_text('')


    def set_timing_info(self, name):
        self.gui.window.get_screen().get_display().sync()
        # Ensure that the display is actually
        # showing the prompt.
        self.info[name] = datetime.datetime.now().isoformat()
        return False


    def get_audio_file_name(self):
        self.info['d'] = self.outname
        self.info['f'] = "%06d_%1d" % (self.i, self.repcount)
        return os.path.join(self.info['d'], self.info['f']) + '.wav'


    def start_recorder(self):
        audiofilename = self.get_audio_file_name()
        gpkmisc.makedirs(os.path.dirname(audiofilename))
        args = ['arecord', '-t', 'wav', '-f', 'S16_LE',
                '-r', str(RATE), '-c', str(CHANNELS),
                audiofilename
                ]
        print 'subprocess', args
        self.info['recordStartTime1'] = datetime.datetime.now().isoformat()
        self.p = autokilled_process(args)
        self.info['recordStartTime2'] = datetime.datetime.now().isoformat()
        return audiofilename


    def ok_cont(self):
        return not self.is_last_stimulus()


    def ok_rep(self):
        return self.i>=0 and self.repcount<int(self.get('Maxreps'))



    def S_waiting(self, ev, log):
        """Waiting for the user to do something.
        """
        if self.first_entry():
            if self.i < 0:
                self.instruct('key_for_first')
            elif self.is_last_stimulus():
                self.instruct('last_chance')
            elif self.repcount >= int(self.get('Maxreps')):
                self.instruct('continue_norepeat')
            else:
                self.instruct('continue_repeat')

            self.gui.set_button_texts(self.ok_rep(), True,
                                      self.get('B_repeat'),
                                      self.get('B_next'))
            biginfo = self.get('BigInfo', '')
            if biginfo:
                if biginfo.startswith('@'):
                    # print '@BigInfo from', self.hdr['TEXT_ROOT'], biginfo[1:]
                    biginfo = EC.get_text(os.path.join(self.hdr['TEXT_ROOT'], biginfo[1:]))
                    # print '@BigInfo got', biginfo
                self.gui.stim_win().get_buffer().set_text(biginfo)

        # print 'S_waiting', ev
        if ev is self.gui._next or ev == ' ':
            self.clear_stimulus()
            return self.S_continue
        elif ev is self.gui._repeat or ev == 'r':
            self.clear_stimulus()
            return self.S_repeat
        return None


    def S_continue(self, ev, log):
        # print 'S_continue', ev
        if self.ok_cont():
            self.repcount = 0
            self.next_stimulus()
            self.status = self.get('STAT_recording')
            return self.S_present
        # print '-> S_final'
        return self.S_final


    def S_repeat(self, ev, log):
        # print 'S_repeat', ev
        if self.ok_rep():
            self.repcount += 1
            self.status = self.get('STAT_repeating')
            return self.S_present
        return self.S_waiting



    def S_present(self, ev, log):
        # print 'S_present', ev
        self.instruct('read')
        s = self.get_current_stimulus()
        assert s, "Empty stimulus"
        self.info = s
        audiofilename = self.start_recorder()
        self.gui.status_push(CID_R, '%s %s' % (self.status, os.path.basename(audiofilename)))
        self.gui.set_button_texts(False, False, None, None)
        self.present(s)
        self.info['i'] = self.i
        self.info['rep'] = self.repcount
        return self.S_recording


    def S_recording(self, ev, log):
        # print "S_recording", ev
        if isinstance(ev, str) and (ev==u'q' or ev==u'x'):
            iret = self.p.close()
            self.info['RecordEndTime1'] = datetime.datetime.now().isoformat()
            assert iret in [0,1], "Bad return code from arecord: %d." % iret
            EC.check_wav(self.get_audio_file_name(), self.gui)
            self.p = None
            if ev == u'x':
                self.info['flag'] = 1
            else:
                self.info['flag'] = 0
            self.gui.status_pop(CID_R)
            self.clear_stimulus()
            log.datum(self.info)
            log.flush()
            if not self.ok_cont() and not self.ok_rep():
                return self.S_final
            return self.S_waiting
        elif isinstance(ev, str) and ev==u's':
            iret = self.p.close()
            os.remove(self.get_audio_file_name())
            self.p = None
            self.gui.status_pop(CID_R)
            self.clear_stimulus()
            log.comment('Skipped')
            log.flush()
            if not self.ok_cont() and not self.ok_rep():
                return self.S_final
            return self.S_waiting
        return None



    def S_initial(self, ev, log):
        if self.first_entry():
            self.instruct('welcome')
            self.gui.set_button_texts(False, True,
                                      self.get('B_repeat'), self.get('B_next'))
            self.gui.status_push(CID_R, 'Waiting.')
        if ev is self.gui._next or isinstance(ev, str):
            return self.S_waiting
        return None


    def S_final(self, ev, log):
        # print "S_final"
        if self.first_entry():
            self.instruct('thanks')
            gobject.timeout_add(2000, self.gui.destroy, None)
        return None



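Each S_* method in the class above is a state: it receives an event and returns either the next state or C{None} to stay where it is. The dispatcher that calls these methods lives in `EC.experiment_base`, which is not part of this listing, so the following is only a minimal sketch of how such a loop could work. The class `TinyMachine` is hypothetical, and it is an assumption that a newly entered state is called once with no event and that transitions are followed until a state returns C{None}.

```python
class TinyMachine(object):
    """Hypothetical dispatcher: a state is a bound method that returns
    the next state, or None to stay put and wait for another event."""

    def __init__(self):
        self.state = self.S_initial
        self.trace = []

    def event(self, ev):
        # Run the current state, then keep following transitions until
        # some state returns None (meaning "wait for the next event").
        nxt = self.state(ev)
        while nxt is not None:
            self.state = nxt
            nxt = self.state(None)

    def S_initial(self, ev):
        self.trace.append('initial')
        if ev == ' ':
            return self.S_waiting
        return None

    def S_waiting(self, ev):
        self.trace.append('waiting')
        if ev == 'q':
            return self.S_final
        return None

    def S_final(self, ev):
        self.trace.append('final')
        return None


m = TinyMachine()
m.event('x')   # ignored: S_initial stays put
m.event(' ')   # S_initial -> S_waiting, entered once with ev=None
m.event('q')   # S_waiting -> S_final
```

Returning the next state as a bound method keeps each state's logic in one place, and pass-through states such as `S_continue` and `S_repeat` fit the same pattern because they always return a successor.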

def run(argv):
    global ROOT, TEXT_ROOT
    arglist = argv[1:]
    while arglist and arglist[0].startswith('-'):
        arg = arglist.pop(0)
        if arg == '--':
            break
        elif arg == '-d':
            ROOT = arglist.pop(0)
        elif arg == '-t':
            TEXT_ROOT = arglist.pop(0)
        else:
            die.die("Unrecognized flag: %s" % arg)

    try:
        subjectID = arglist[0]
    except IndexError:
        die.die("Need to specify subject ID")

    datecode = datetime.datetime.now().strftime('%y%m%dT%H%M')
    d, c = fiatio.read_merged(open(os.path.join(ROOT, "stimuli", subjectID) + ".fiat",
                                   'r')
                              )
    h = {'subjectID': subjectID, 'ROOT': ROOT, 'TEXT_ROOT': TEXT_ROOT}
    outname = os.path.join(ROOT, "response", subjectID, datecode)
    print 'LOG file=', outname
    gpkmisc.makedirs(os.path.dirname(outname))
    log = fiatio.merged_writer(open(outname + '.fiat', 'w'))
    die.info("Writing output to %s" % outname)
    experiment = experiment_c(h, d, log, outname)
    try:
        log.headers(experiment.get_hdrs())
        log.header('Start', datetime.datetime.now().isoformat())
        experiment.gui.status_push(1, "Log file = %s" % outname)
        experiment.gui.main()
        log.header('End', datetime.datetime.now().isoformat())
        log.close()
    finally:
        experiment.close()


if __name__ == '__main__':
    import sys
    run(sys.argv)

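The output layout described in the module docstring follows from two format strings in the listing: the datecode built in `run` and the per-utterance file name built in `get_audio_file_name`. The sketch below reproduces just that path arithmetic so the naming can be checked without GTK or a recorder. The helper names `response_root` and `audio_path`, and the sample directory, group name and date, are hypothetical.

```python
import datetime
import os


def response_root(root, groupname, when):
    """Build <root>/response/<groupname>/<datecode>, using the datecode format from run()."""
    datecode = when.strftime('%y%m%dT%H%M')
    return os.path.join(root, 'response', groupname, datecode)


def audio_path(outname, i, rep):
    """Build <outname>/<i as six digits>_<rep>.wav, as get_audio_file_name does."""
    f = "%06d_%1d" % (i, rep)
    return os.path.join(outname, f) + '.wav'


# Hypothetical session: group "groupA", started 5 March 2010 at 14:07.
when = datetime.datetime(2010, 3, 5, 14, 7)
outname = response_root('some_directory', 'groupA', when)
metadata = outname + '.fiat'       # the output metadata file
wav = audio_path(outname, 3, 1)    # stimulus 3, first repeat
```

Note that the listing stores the full `outname` in the C{d} column, so whether C{d} ends up relative to the metadata file's directory, as the docstring says, depends on how C{ROOT} was given on the command line.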
