Psycholinguistics Geekout

Yesterday was an exciting day at the ICN for all sorts of reasons: I visited an MRI scanning suite for the first time, I tried reading some more poems under speech jamming conditions, and the BBC came and did some recordings and interviews with us (more on which anon…).

But even more exciting than all that was a comment Sophie made in passing about something called the MRC psycholinguistic database. This, I’ve discovered, is an absolute goldmine for someone with a fetish for words and is an even greater example of the internet adding to the sum of human knowledge than Lolcats.

Essentially, it’s an online dictionary for researchers who want to create lists of word-stimuli for experiments. It allows you to select from a database of more than 150,000 words, narrowing them down by criteria that are second nature for a psycholinguist but for a writer (or for this one, at least) are excitingly new ways of choosing vocabulary.

You can select words by their standardised scores on familiarity, concreteness, meaningfulness, age of acquisition and a host of other measures. You can filter the word sets by part of speech, irregular pluralisation or contextual status (Specialised, Archaic, Dialect, Nonsense, Rhetorical, Erroneous, Obsolete, Colloquial…). And once you’ve chosen your criteria, the output is presented to you in a beautifully stripped-back aesthetic:

This delightful list (ETHER – GAUNTLET – LANCER – LICHEN – LYRE – RAMROD – WHALEBONE – WICKET if you can’t read it above) was produced by specifying words of between 2 and 5 syllables, with a concreteness rating greater than 500 out of 700 and a meaningfulness rating (Colorado norms) of less than 300 out of 700, and then filtering for nouns.
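
Out of curiosity, here’s a rough sketch of what a query like that boils down to, assuming the database were exported to a CSV. The column names (word, nsyl, conc, meanc, wtype) are my own inventions for illustration; the real MRC field names and export format may well differ.

```python
# Hypothetical filter over an MRC-style export: the column names and the
# noun code "N" are assumptions, not the database's actual schema.
import csv

def query(path):
    hits = []
    with open(path, newline="") as f:
        for row in csv.DictReader(f):
            try:
                nsyl = int(row["nsyl"])    # number of syllables
                conc = int(row["conc"])    # concreteness rating (100-700 scale)
                meanc = int(row["meanc"])  # meaningfulness, Colorado norms
            except ValueError:
                continue                   # skip words with missing ratings
            if 2 <= nsyl <= 5 and conc > 500 and meanc < 300 and row["wtype"] == "N":
                hits.append(row["word"])
    return sorted(hits)

print(query("mrc_export.csv"))  # ideally: ETHER, GAUNTLET, LANCER, LICHEN...
```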

So how could sets like this be useful for poetry? The raw output could be used to create list poems, but the database could also be used to create differentiated vocabulary pools. Limiting your choice of words like this would, for a start, allow you to create a series of variations on a particular theme, in the manner of Raymond Queneau’s Exercises in Style. I’m sure it could be taken further than this though – constraint-based writing is more satisfying, for me, when it’s a means to get somewhere rather than an end.

“Don’t you understand trying to stammer?”

The phenomenon of DAF – Delayed Auditory Feedback – has been provoking interest lately, due to its use in a prototype “speech-jamming” gun invented by Japanese researchers. As Sophie has pointed out, the potential for using DAF to shut people up against their will needs to be strongly qualified, but I was interested to try it out on myself and see the effect.

So last week Zarinah Agnew kindly set me up with a pair of headphones in front of her computer, and loaded a programme which made everything I said repeat back in my ears at a variable delay. I’d brought a few different texts with me, to see if some material was harder to read than others. I had some Gerard Manley Hopkins – poems full of internal and end-rhymes, consonance and assonance – a short play by Gertrude Stein, wonderfully telegraphic (or should that be telephonic?), called ‘I like it to be a play’, and Maggie O’Sullivan’s Palace of Reptiles, with its poems rich in sound-play but not rhythmically constrained in the same way as Hopkins’ lines.
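
If you want to try the effect yourself, here’s a minimal sketch of the sort of programme involved, assuming Python with the sounddevice library, a microphone and a pair of headphones. It isn’t the software Zarinah used (I don’t know what that was), just the basic idea: everything you say is held in a short buffer and played back into your ears a fixed delay later.

```python
# A minimal delayed-auditory-feedback loop: mic input is queued and played
# back 200 ms later. Wear closed headphones or you'll get howling feedback.
import numpy as np
import sounddevice as sd

SAMPLE_RATE = 44100
DELAY_MS = 200                                      # roughly maximally disruptive
delay_samples = int(SAMPLE_RATE * DELAY_MS / 1000)

# queue holding the most recent DELAY_MS of microphone input
buffer = np.zeros(delay_samples, dtype="float32")

def callback(indata, outdata, frames, time, status):
    global buffer
    # play what was said DELAY_MS ago, then append what is being said now
    outdata[:, 0] = buffer[:frames]
    buffer = np.concatenate([buffer[frames:], indata[:, 0]])

with sd.Stream(samplerate=SAMPLE_RATE, channels=1,
               dtype="float32", callback=callback):
    input("Reading aloud with 200 ms of feedback... press Enter to stop.\n")
```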

Zarinah started me off on a delay of 200ms, which usually causes maximum interference. I opened the Hopkins, and started reading. I wasn’t stopped dead in my tracks, but listening back to myself, I do sound a bit out of my head. The vowels are drawn out and slurred, the rhythm is all over the place, and when I read “pride and crared for crown” for “pride and cared for crown”, I make a mistake similar to the typical ‘perseverative error’, with the ‘r’ getting carried over into the next word (Hashimoto and Sakai, 2003, give the example of “hypodermic nerdle”).

Here’s a snippet – low quality because recorded on my dictaphone, but you get the picture.

[Audio clip]

Next we tried the Stein, at the same level of delay, and I found it easier, though my reading isn’t particularly fluent and I do produce a little stutter as I read “she expected a distress”, which sets me off slurring the next few lines.

[Audio clip]

And again, here, one mistake causes ripples and wobbles which pass through the next few lines (which end, “Don’t you understand trying? / Don’t you understand trying to stammer? / No indeed I do not.”). I like this (unintentional) effect of a distorted sense of timing, as if I’m being played back on a faulty record player.

[Audio clip]

Overall, though, I seemed to manage the Stein better, getting through the short, clipped exchanges, with their well-marked pause points, more easily than the enjambed Hopkins.

The O’Sullivan poem (‘Now to the Ears’) was relatively easy as well, even with the 200ms delay: perhaps the well-spaced syllables of this particular poem are less pressurised and compressed than those of ‘The Sea and the Skylark’, even if equally sonically complex – though listening to this MP3 of her reading, my version is much too slow.

I’m not sure why I found it easier: perhaps I was just getting accustomed to the reverb, listening instead to my undelayed voice coming in through my cheekbones. The artist Charles Stankievech has written a very interesting article about the history of headphones, identifying them with a “bracketing of the world”, and tracing their genealogy from 19th-century stethoscopes. He cites Jonathan Sterne on how the stethoscope created new relations between doctor and patient, and turned the voice from a carrier of meaning, bearing patients’ self-descriptions of their illnesses, to a potential symptom in itself, a “kind of sound effect – a container of timbre and an index of the states that shaped it”. According to Stankievech, the invention of headphones which followed created a new, impossible space filled with floating “sound masses”, an “in-head” experience of sound “between the ears”.

Listening to your own voice replayed with delay imparts a dislocating twist to this perception of headspace, if that is what it can be called. Two forms of proprioception are set against each other, as your ears and your flesh return contrary signals about what you’ve just said. Zarinah told me that she had sound recordings of people who are completely knocked sideways by this, either reduced to making single sounds or trying to shout over their own voices, which only increases the feedback, producing an escalating loop of interference as they try to out-shout themselves.

Further reading:
Hashimoto, Y., & Sakai, K. (2003). Brain activations during conscious self-monitoring of speech production with delayed auditory feedback: an fMRI study. Human Brain Mapping, 20, 22-28.
Stankievech, C. (2007). From stethoscopes to headphones: an acoustic spatialization of subjectivity. Leonardo Music Journal, 17, 55-59.
Sterne, J. (2003). The Audible Past: Cultural Origins of Sound Reproduction. Durham: Duke University Press.
Takaso, H., Eisner, F., Wise, R., & Scott, S. (2010). The effect of delayed auditory feedback on activity in the temporal lobe while speaking: a positron emission tomography study. Journal of Speech, Language, and Hearing Research, 53, 226-236.

‘I am sitting in a room different from the one you are in now’

During last week’s group meeting, Pradheep Shanmugalingam, one of the lab’s PhD students, mentioned that he was working on an experiment which involved the use of sine wave speech. This is speech which has been treated so that its essential features, the formants which allow us to distinguish sounds, are replicated using synthetic sine waves. The result is a musical R2-D2 warbling which on first hearing is hardly recognisable as speech.
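
To get a feel for how that warbling is made, here’s a toy sketch of the underlying idea: each formant is replaced by a single sine wave that follows its frequency (and, crudely, its amplitude) over time. Real sine wave speech is built from formant tracks measured from a recording with proper analysis software; the three tracks below are invented purely for illustration.

```python
# Toy sine-wave-speech synthesis from made-up formant tracks.
import numpy as np

SAMPLE_RATE = 16000

def sine_from_track(freqs_hz, amps, duration_s):
    """One sine wave whose frequency and amplitude follow per-frame tracks,
    linearly interpolated up to the sample rate."""
    n = int(SAMPLE_RATE * duration_s)
    t_frames = np.linspace(0, duration_s, len(freqs_hz))
    t = np.linspace(0, duration_s, n)
    freq = np.interp(t, t_frames, freqs_hz)
    amp = np.interp(t, t_frames, amps)
    phase = 2 * np.pi * np.cumsum(freq) / SAMPLE_RATE   # integrate frequency
    return amp * np.sin(phase)

# Invented tracks for the first three formants of a half-second "syllable"
duration = 0.5
f1 = sine_from_track([300, 700, 500], [1.0, 0.8, 0.3], duration)
f2 = sine_from_track([2200, 1200, 1500], [0.5, 0.6, 0.2], duration)
f3 = sine_from_track([2900, 2500, 2600], [0.3, 0.3, 0.1], duration)

signal = f1 + f2 + f3            # the R2-D2-ish warble
signal /= np.abs(signal).max()   # normalise before writing out as a WAV
```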

The paradigm that Pradheep is using relies on this fact: that unless people are told they are listening to speech, they don’t hear any speech content in the sounds. Once they are told, however, a process of ‘tuning in’ takes place: a recognition of the speech as speech, which also transfers to novel sine wave sentences they have never heard.

You can try out a related process for yourself at this web page by Matt Davis at the Cambridge Cognition and Brain Sciences Unit. Listen to the sine wave speech first, then the unencoded message, and then return to the sine waves. When you hear them a second time, the speech ‘pops out’.

This reminded me, in a roundabout way, of Alvin Lucier’s I am Sitting in a Room. In this minimalist composition from 1969, Lucier set up a feedback loop, reciting a text – which is also a description of what he is about to do – in a room, recording the result, and then replaying the tape in the same room whilst recording this second performance. As the iterations continue, in Lucier’s words, “the resonant frequencies of the room reinforce themselves so that any semblance of my speech, with perhaps the exception of rhythm, is destroyed.”

What emerges instead, as Lucier puts it, are “the natural resonant frequencies of the room articulated by speech.” And yet, knowing this is speech, and with Lucier’s pacing and characteristic stammer resonating in our ears, the sense of it persists through its degradation, even as the sounds turn into deep bottle tones and high glass rubbings. Eventually though, sense vanishes, and you’re left just with a whistle and throb that sounds like water in the pipes, as Lucier’s formants disappear into the room’s. There’s an original recording of the piece here on Ubuweb.
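
One way of seeing why the room always wins is to treat each re-recording as passing the previous generation through the room all over again. Here’s a crude digital analogue of the process, assuming the room behaves as a linear filter with some impulse response; the two-resonance “room” below is synthetic, a stand-in rather than a recipe for reproducing the piece.

```python
# Each generation is convolved with the same (hypothetical) room impulse
# response, so the room's resonant frequencies reinforce themselves while
# everything else in the voice fades away.
import numpy as np

def lucier_iterations(voice, room_ir, generations=15):
    results = [voice]
    current = voice
    for _ in range(generations):
        current = np.convolve(current, room_ir)[: len(voice)]
        current = current / (np.abs(current).max() + 1e-12)  # renormalise each pass
        results.append(current)
    return results

# Synthetic demo: two seconds of noisy "voice" through a room with resonances
# at roughly 180 Hz and 310 Hz (both invented numbers).
fs = 16000
voice = np.random.randn(2 * fs)
t = np.arange(int(0.3 * fs)) / fs
room_ir = (np.exp(-8 * t) * np.sin(2 * np.pi * 180 * t)
           + 0.5 * np.exp(-10 * t) * np.sin(2 * np.pi * 310 * t))
generations = lucier_iterations(voice, room_ir)
```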

Variable Speechiness

Sophie Scott – my main point of contact at the ICN – mentioned the work of her colleague Stuart Rosen, so I looked him up. This paper by him and a number of others (including Sophie) is freely available online, and is really interesting. In broad terms – and I hope I’ve got this right – the experiment they describe was designed to test whether the areas of the left hemisphere which are specialised for language processing are responding to specific acoustic features of speech, or rather to its lexical properties.

What appealed to me was their invention of a set of stimuli which varied in two respects: in their ‘speechiness’, that is to say their speech-like acoustic features, and in their intelligibility as language.

To make these stimuli they took recordings of short sentences like ‘The clown had a funny face’ and ‘The wife helped her husband’ and reduced them to variations on two dimensions: spectrum and amplitude. You can make sentences which are neither speech-like nor intelligible by keeping either spectrum or amplitude constant; and you can make sentences which are speech-like but still unintelligible by mixing the spectra of one phrase with the amplitudes of another.
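
As a very loose illustration of the recombination idea – not the procedure actually used in the paper – here’s a sketch that separates a sentence’s slow amplitude envelope from its remaining spectral detail and swaps the envelopes between two sentences. sent_a, sent_b and fs are assumed to be two equal-length mono recordings and their shared sample rate.

```python
# Separate the slow amplitude envelope from the spectral detail and recombine
# across sentences: a rough stand-in for the paper's stimulus construction.
import numpy as np
from scipy.signal import butter, filtfilt, hilbert

def amplitude_envelope(x, fs, cutoff_hz=16.0):
    """Slow amplitude envelope: magnitude of the analytic signal, low-passed."""
    b, a = butter(2, cutoff_hz / (fs / 2))
    return filtfilt(b, a, np.abs(hilbert(x)))

def recombine(sent_a, sent_b, fs):
    """Sentence A's amplitude contour imposed on sentence B's spectral detail:
    modulation in both dimensions, so speech-like, but unintelligible."""
    env_a = amplitude_envelope(sent_a, fs)
    env_b = amplitude_envelope(sent_b, fs)
    carrier_b = sent_b / (env_b + 1e-9)       # flatten B's own amplitude contour
    return carrier_b * env_a

def constant_amplitude(sent, fs):
    """Spectrum still modulates, but the overall amplitude is held constant."""
    env = amplitude_envelope(sent, fs)
    return sent / (env + 1e-9) * env.mean()
```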

The subjective reports of what these stimuli sounded like are very suggestive. When both spectrum and amplitude are kept constant, the sound is like ‘wind in the trees’ or ‘electronic vowel sounds’. When the amplitude changes but the spectrum is held constant, the stimuli are ‘rhythmic’, ‘like a nursery rhyme’. If spectral variation is combined with a constant amplitude, the descriptions vary from ‘like speech with the bits taken out’ to ‘like an alien’ or ‘a lunatic raving’. Finally, the most speech-like but non-intelligible combination – the spectra and amplitudes of two different phrases run together – is described as ‘very much like speech’, like someone ‘with a regional accent’ or like ‘aliens again’.

It may not be the point of this experiment, but I’m really interested by this gradation of speechiness, from wind and electronics through nursery rhymes and pure rhythm, up through aliens and madness to a regional accent – almost speech but no gold watch! It’s interesting that animal noises don’t come up, though the authors do mention that they treated the stimuli to make them ‘sound less like bird calls’.

So what did the study find? The speech-like but unintelligible modulations were processed bilaterally, a finding at odds with the theory that the left temporal lobe is specialised to process all speech-related sounds. As the authors write, it seems that ‘crucial left temporal lobe systems involved in speech processing are not driven simply by low level acoustic features of the speech signal but require the presence of linguistic information to be activated.’

If you want to have a listen to some of the stimuli, a few WAV samples are here. This is the supplementary material for a subsequent paper by Carolyn McGettigan from the ICN; unfortunately the paper itself is only available to subscribers. The WAV files aren’t labelled, but you can tell the conditions from their names:

intSmodAmod = intelligible speech where spectrum and amplitude modulate

SmodA0 = unintelligible speech; spectrum modulates but amplitude is constant

S0Amod = unintelligible speech; spectrum is constant but amplitude modulates

SmodAmod = unintelligible speech; spectrum and amplitude modulate but from different sentences