Module percep_spec2

Perceptual spectrum for speech.

Version: $Revision: 1.43 $

Date: $Date: 2007/03/07 23:54:40 $

Classes
	roughness_c This should be based on Hutchinson and Knopoff 1978; Kameoka and Kuriyagawa 1969a; Viemeister 1988; Aures 1985; Plomp and Levelt 1965.
	peakalign_c
	fft_filterbank

Functions

tau_lp(fc)
This is the cutoff frequency of the modulation transfer function in the ear, as a function of frequency.

cochlear_filter(fofc)
Transfer function taken from "Improved Audio Coding Using a Psychoacoustic Model Based on a Cochlear Filter Bank", IEEE Transactions on Speech and Audio Processing, 10(7), October 2002, pages 495-503.

threshold(f)
The result is expressed as a pressure amplitude.

accum_abs_sub(a, b, o)
This does a little computation in a block-wise fashion to keep all the data within the processor's cache.

list_accum_abs_sub_dly(tk, dly)

process_voicing(tk, tick, Dt)
Returns: (avg, min) where avg-min is the voicing estimator.

block_percep_spec(data, dt, Dt, **kwargs)
This computes the perceptual spectrum in blocks and glues them together.

perceptual_spec(data, dt, Dt, bmin=CBmin, bmax=CBmax, db=BBSZ, do_mod=0, do_dissonance=False, PlompBouman=True, do_peakalign=False, e=None)
This returns something roughly like the neural signals leaving the ear.

test()

Variables
	pylab = `None`
	CBmax = `erb_scale.f_to_erb(8000.0)`
	CBmin = `erb_scale.f_to_erb(50.0)`
	BBSZ = `0.5`
	E = `0.333`
	Neural_Tick = `1e-4`
	DISS_BWF = `0.25`
	MAX_DISS_FREQ = `250.0`
	HP_DISS_FREQ = `30.0`
	BLOCK_EDGE = `0.3`

Imports: sys, M, erb_scale, gpkmisc, die, numpy, VM, power, PS, gammatone2

Function Details

tau_lp(fc)

This is the cutoff frequency of the modulation transfer function in the ear, as a function of frequency. From R. Plomp and M. A. Bouman, "Relation of hearing threshold and duration of tone pulses", JASA 31(6), page 749ff, June 1959.

cochlear_filter(fofc)

Transfer function taken from Frank Baumgarte, "Improved Audio Coding Using a Psychoacoustic Model Based on a Cochlear Filter Bank", IEEE Transactions on Speech and Audio Processing, 10(7), October 2002, pages 495-503.

threshold(f)

The result is expressed as a pressure amplitude. Crudely taken from Handbook of Perception, vol. 4: Hearing, E. C. Carterette and M. P. Friedman, editors, Academic Press 1978, ISBN 0-12-161904-4. The curve near 70 dB is used.

accum_abs_sub(a, b, o)

This does a little computation in a block-wise fashion to keep all the data within the processor's cache. It computes sum( abs(a-b) ).
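
As an illustration of the block-wise idea, here is a minimal sketch rather than the module's actual code. The function name blockwise_abs_diff_sum and the block parameter are hypothetical, and the role of the o argument is not documented here, so the sketch simply returns the total.

```python
import numpy as np

def blockwise_abs_diff_sum(a, b, block=4096):
    """Return sum(abs(a - b)), computed one block at a time (hypothetical helper)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    if a.ndim != 1 or a.shape != b.shape:
        raise ValueError("a and b must be 1-D arrays of equal length")
    total = 0.0
    for start in range(0, a.size, block):
        stop = start + block
        # Only one block-sized temporary exists at a time, which keeps
        # the working set small enough to stay in cache.
        total += np.abs(a[start:stop] - b[start:stop]).sum()
    return total
```

The block size that actually fits in cache depends on the processor; 4096 samples is only a placeholder.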

process_voicing(tk, tick, Dt)

Parameters:

Dt (float) - the sampling rate of the output vectors.
tick (float) - the sampling rate of the vectors in tk.
tk ([ numpy.ndarray, ...]) - a list of vectors, each of which represents the neural firing of a particular point in the cochlea.

Returns:

(avg, min) where avg-min is the voicing estimator. avg: float, min: float
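
The documentation does not say how avg and min are obtained. One plausible reading, given that accum_abs_sub computes sum( abs(a-b) ) and list_accum_abs_sub_dly takes a delay, is an average-magnitude-difference measure over a range of delays. The sketch below works under that assumption only; the name amdf_voicing and the max_lag parameter are hypothetical, and it handles a single vector rather than the list tk.

```python
import numpy as np

def amdf_voicing(x, max_lag):
    """Return (avg, min) of the mean absolute difference over lags 1..max_lag.

    Assumed technique: average magnitude difference function (AMDF).
    This is not confirmed to be what process_voicing does.
    """
    x = np.asarray(x, dtype=float)
    if not 0 < max_lag < x.size:
        raise ValueError("max_lag must be between 1 and len(x) - 1")
    d = np.array([np.abs(x[lag:] - x[:-lag]).mean()
                  for lag in range(1, max_lag + 1)])
    return d.mean(), d.min()
```

For a periodic (voiced) signal the difference nearly vanishes at the pitch period, so avg-min is large; for noise the difference is roughly the same at every lag, so avg-min stays near zero.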

block_percep_spec(data, dt, Dt, **kwargs)

This computes the perceptual spectrum in blocks and glues them together. It's useful when the data is too large to fit in memory.
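
The general block-and-glue pattern can be sketched as follows. This is an illustration under assumptions, not the module's implementation: process_in_blocks, block_len, edge and fn are hypothetical names, and fn is assumed to return an array of the same length as its input. The BLOCK_EDGE variable suggests the module pads its blocks in a similar way, but that is a guess.

```python
import numpy as np

def process_in_blocks(data, block_len, edge, fn):
    """Apply fn to overlapping blocks of data and concatenate the trimmed results."""
    data = np.asarray(data, dtype=float)
    pieces = []
    for start in range(0, data.size, block_len):
        stop = min(start + block_len, data.size)
        # Pad each block with `edge` extra samples on both sides so that
        # edge effects of fn fall outside the part that is kept.
        lo = max(start - edge, 0)
        hi = min(stop + edge, data.size)
        out = fn(data[lo:hi])
        # Keep only the samples that belong to [start, stop).
        pieces.append(out[start - lo:start - lo + (stop - start)])
    return np.concatenate(pieces) if pieces else data.copy()
```

The padding must be at least as long as the memory of fn (for a filter, its impulse response), otherwise the seams between blocks will show.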

perceptual_spec(data, dt, Dt, bmin=CBmin, bmax=CBmax, db=BBSZ, do_mod=0, do_dissonance=False, PlompBouman=True, do_peakalign=False, e=None)

This returns something roughly like the neural signals leaving the ear. It filters into 1-ERB-wide bins, then takes the cube root of the amplitude.

Returns:

(channel_info, data, time_offset), where

time_offset is the time of the first output datum relative to the time of the first input sample.
channel_info
data
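
To make the channel layout and the cube-root step concrete, here is a rough sketch. It assumes the Glasberg and Moore (1990) ERB-number formula, which may differ from what erb_scale.f_to_erb actually uses, and it assumes that e defaults to the module variable E = 0.333. The helper names erb_to_f, channel_centers and compress are hypothetical, and the filtering itself is omitted.

```python
import numpy as np

def f_to_erb(f):
    """Frequency in Hz to ERB number (Glasberg and Moore 1990; assumed formula)."""
    return 21.4 * np.log10(4.37e-3 * np.asarray(f, dtype=float) + 1.0)

def erb_to_f(e):
    """Inverse of f_to_erb: ERB number to frequency in Hz."""
    return (10.0 ** (np.asarray(e, dtype=float) / 21.4) - 1.0) / 4.37e-3

def channel_centers(fmin=50.0, fmax=8000.0, db=0.5):
    """Center frequencies spaced db ERB apart, mirroring bmin, bmax and db."""
    return erb_to_f(np.arange(f_to_erb(fmin), f_to_erb(fmax), db))

def compress(amplitude, e=0.333):
    """Cube-root-like compression of an amplitude envelope."""
    return np.abs(amplitude) ** e
```

With db = 0.5 and 1-ERB-wide bins, adjacent channels overlap by about half their width.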