readiter(mlf_fn,
postfix='.wav',
datapath='.',
strict=True,
findfile=True,
pathedit=None,
time_quantum=1e-07,
verbose=False)
Read an HTK Master Label File (MLF). The datapath and pathedit
arguments handle the case where the MLF has been moved or where, for
some other reason, the filenames in the MLF don't point to the actual
data.
- Parameters:
mlf_fn (str) - Filename of the MLF to read.
strict (bool) - If true, raise an exception if an audio file cannot be found.
time_quantum (float) - A factor to convert the time information in the MLF to real
units of time (like seconds). Nominally, time_quantum=1e-7
seconds for MLF files, but that is not exact for sampling rates
(like 11025 samples/sec) whose sampling interval is not an
integral multiple of 100 nanoseconds.
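The time_quantum caveat above can be checked with a little arithmetic. The sketch below is only an illustration: the helper names are hypothetical, and the "effective" quantum rests on the assumption that whoever wrote the MLF rounded the sampling interval to a whole number of 100 ns units, which this documentation does not confirm.

```python
from fractions import Fraction

# Nominal HTK time unit: 100 nanoseconds, expressed in seconds.
HTK_UNIT = Fraction(1, 10_000_000)

def sample_interval_in_htk_units(sample_rate_hz):
    """Exact sampling interval, measured in 100 ns units (hypothetical helper)."""
    return Fraction(1, sample_rate_hz) / HTK_UNIT

def effective_time_quantum(sample_rate_hz):
    """Seconds per MLF time unit, ASSUMING the sampling interval was
    rounded to a whole number of 100 ns units when the MLF was written."""
    units_per_sample = round(sample_interval_in_htk_units(sample_rate_hz))
    return float(Fraction(1, sample_rate_hz) / units_per_sample)

for rate in (16000, 11025):
    exact = sample_interval_in_htk_units(rate)
    print(rate, float(exact), exact.denominator == 1, effective_time_quantum(rate))
```

At 16000 samples/sec the interval is a whole number of units (625), so 1e-7 is exact; at 11025 samples/sec it is about 907.03 units, so under the stated assumption a value slightly different from 1e-7 would be the one to pass as time_quantum.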
- Returns: an iterator producing
dict(str: various)
- An iterator of dictionaries of the form
{'filespec': path, 'i': i, 'd': d, 'f': f, 'symbols': [...]}.
Each dictionary corresponds to one utterance, or one "label
file" in the MLF. Keys 'd' and 'f' are only present if
findfile==True; in that case os.path.join(x['d'], x['f']) is
the path to the corresponding audio. x['filespec'] is
the path information in the MLF, x['i'] is an
int indexing which utterance this is within the MLF,
and x['symbols'] is the label information for that
utterance. It is a list of tuples produced by parse_label_line.
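As a minimal sketch of consuming the returned dictionaries, the code below builds the audio path only when 'd' and 'f' are present. The helper audio_path and the sample data are hypothetical stand-ins (the real iterator would come from readiter), and the 'symbols' lists are left empty because the tuple layout produced by parse_label_line is not described here.

```python
import os

def audio_path(utt):
    """Return the audio path for one utterance dict, or None when the
    'd' and 'f' keys are absent (as when findfile is false)."""
    if 'd' in utt and 'f' in utt:
        return os.path.join(utt['d'], utt['f'])
    return None

# Hypothetical stand-in for the dictionaries yielded by readiter(...).
sample_utterances = [
    {'filespec': '*/utt1.lab', 'i': 0, 'd': 'data', 'f': 'utt1.wav', 'symbols': []},
    {'filespec': '*/utt2.lab', 'i': 1, 'symbols': []},
]

for utt in sample_utterances:
    # Print index, MLF path info, resolved audio path, and label count.
    print(utt['i'], utt['filespec'], audio_path(utt), len(utt['symbols']))
```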