Podcast from Resonance FM show

I hosted this programme on Resonance104.4fm in the week leading up to The Voice Symposium, featuring contributions from Sophie Scott, Emma Bennett, Joe Banks, Pradheep Shanmugalingam, Carolyn McGettigan and Holly Pester.

You can stream or download the podcast here:

‘The Voice’ broadcast on Resonance104.4fm by thevoxlab

Variable Speechiness

Sophie Scott – my main point of contact at the ICN – mentioned the work of her colleague Stuart Rosen, so I looked him up. This paper by him and a number of others (including Sophie) is freely available online, and is really interesting. In broad terms – and I hope I’ve got this right – the experiment they describe was designed to test whether the areas of the left hemisphere which are specialised for language processing are responding to specific acoustic features of speech, or rather to its lexical properties.

What appealed to me was their invention of a set of stimuli which varied in two respects: in their ‘speechiness’, that is to say their speech-like acoustic features, and in their intelligibility as language.

To make these stimuli they took recordings of short sentences like ‘The clown had a funny face’ and ‘The wife helped her husband’ and reduced them to variations on two dimensions: spectrum and amplitude. You can make sentences which are neither speech-like nor intelligible by keeping either spectrum or amplitude constant; and you can make sentences which are speech-like but still unintelligible by mixing the spectra of one phrase with the amplitudes of another.

The subjective reports of what these stimuli sounded like are very suggestive. When both spectrum and amplitude are kept constant, the sound is like ‘wind in the trees’ or ‘electronic vowel sounds’. When the amplitude changes but the spectrum is held constant, the stimuli are ‘rhythmic’, ‘like a nursery rhyme’. When spectral variation is combined with a constant amplitude, the descriptions vary from ‘like speech with the bits taken out’ to ‘like an alien’ or ‘a lunatic raving’. Finally, the most speech-like but non-intelligible combination, in which the spectra and amplitudes from two different phrases are run together, was described as ‘very much like speech’, like someone ‘with a regional accent’ or like ‘aliens again’.

It may not be the point of this experiment, but I’m really interested in this gradation of speechiness, from wind and electronics through nursery rhymes and pure rhythm, up through aliens and madness to a regional accent – almost speech but no gold watch! It’s interesting that animal noises don’t get mentioned, although the authors do note that they treated the stimuli to make them ‘sound less like bird calls’.

So what did the study find? The speech-like but unintelligible modulations were processed bilaterally, a finding at odds with the theory that the left temporal lobe is specialised to process all speech-related sounds. As the authors write, it seems that ‘crucial left temporal lobe systems involved in speech processing are not driven simply by low level acoustic features of the speech signal but require the presence of linguistic information to be activated.’

If you want to have a listen to some of the stimuli, a few WAV samples are here. This is the supplementary material for a subsequent paper by Carolyn McGettigan from the ICN; unfortunately the paper itself is only available to subscribers. The WAV files aren’t labelled, but if you look at their names you can tell the conditions:

intSmodAmod = intelligible speech where spectrum and amplitude modulate

SmodA0 = unintelligible speech; spectrum modulates but amplitude is constant

S0Amod = unintelligible speech; spectrum is constant but amplitude modulates

SmodAmod = unintelligible speech; spectrum and amplitude modulate but from different sentences
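
If you download a batch of these files, the naming scheme above can be decoded mechanically. Here is a minimal Python sketch of how that might look. It assumes only that each file name contains one of the condition codes listed above; the function name `describe_condition` and the example file names are my own inventions for illustration, and the real files may be named with extra prefixes or numbering that I haven’t checked.

```python
import re

# Pattern inferred from the four condition codes listed above:
# an optional "int" (intelligible), then S + "mod" or "0" for the spectrum,
# then A + "mod" or "0" for the amplitude. Anything else in the file name
# (numbering, extension) is ignored. This is an assumption, not a spec.
CONDITION = re.compile(r"(int)?S(mod|0)A(mod|0)")


def describe_condition(filename):
    """Return a dict describing the condition encoded in a file name.

    Returns None if no condition code is found in the name.
    """
    match = CONDITION.search(filename)
    if match is None:
        return None
    return {
        "intelligible": match.group(1) is not None,
        "spectrum_modulates": match.group(2) == "mod",
        "amplitude_modulates": match.group(3) == "mod",
    }


if __name__ == "__main__":
    # Hypothetical file names, made up for this example.
    for name in ["intSmodAmod_1.wav", "SmodA0_1.wav", "S0Amod_1.wav", "SmodAmod_1.wav"]:
        print(name, describe_condition(name))
```

Note that a name with both modulations but without the `int` prefix is the mixed case, where the spectrum and the amplitude come from different sentences, so ‘unintelligible’ there doesn’t mean anything was held constant.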