Database Poem Performance

Here’s a walkthrough of an attempt to use the MRC psycholinguistic database to produce poetry. This work activates two dimensions of the database, familiarity and concreteness, sampling each dimension at 9 points from minimum to maximum to gather sets of words for performance.
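As a rough illustration of what "sampling at 9 points from minimum to maximum" could look like, here is a minimal Python sketch. It assumes the points are evenly spaced, which the description above does not actually specify, and it assumes both dimensions use the 100 to 700 scale mentioned further down. The function name `sample_points` is made up for this example.

```python
def sample_points(lowest, highest, count=9):
    """Return `count` evenly spaced rating values from lowest to highest.

    Even spacing is an assumption made for this sketch.
    """
    step = (highest - lowest) / (count - 1)
    return [round(lowest + i * step) for i in range(count)]


# Assumed scale limits of 100 and 700.
points = sample_points(100, 700)
print(points)
```

Each of these nine values would then be used as a query against the database, once for familiarity and once for concreteness.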

The words were then read out alternately by me (on familiarity) and Holly Pester (on concreteness). The sets are not of equal length, so when one performer runs out of words, the other carries on alone. First performed at Poetry Parnassus, Southbank Centre, 30 June 2012, for an event curated and filmed by SJ Fowler.
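The reading order described above can be sketched in a few lines of Python. This is only an illustration: the function name `alternate`, the performer labels and the placeholder words are all invented here, and the real sets came from the database.

```python
from itertools import zip_longest


def alternate(familiarity_words, concreteness_words):
    """Interleave two word lists; the longer list carries on alone."""
    running_order = []
    for fam, conc in zip_longest(familiarity_words, concreteness_words):
        if fam is not None:
            running_order.append(("familiarity", fam))
        if conc is not None:
            running_order.append(("concreteness", conc))
    return running_order


# Placeholder words, not real database output.
order = alternate(["one", "two", "three"], ["alpha"])
for voice, word in order:
    print(voice, word)
```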

Database Poem 1 from James Wilkes on Vimeo.

Playing with databases

Below is a recording of a poem-sketch made using the MRC psycholinguistic database. You can just go ahead and listen to it, or read on for more about the process and background.

vox lab poem sketch 1 by jwilkes

The poem consists of sets of words produced from the database by varying the level of “imageability”, or how easy a word is to visualise. Joanette, Goulet and Hannequin illustrate this by asking us to think about the difference between the words “anger” and “antitoxin”; the former is abstract but easy to visualise, whereas the latter is concrete but hard to visualise.

I set the imageability threshold to the highest value that still produced only one word. This was 660, on a scale which goes from 100 to 700, and it produced the word ‘BEACH’, repeated three times. I then decreased the threshold in steps of 5, from 655 down to 630. With each of these seven iterations (counting the initial 660), the set of words available increased, and in this work-in-progress I simply read them out in alphabetical order.
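For anyone who wants to try the same stepping-down procedure on their own word list, here is a minimal Python sketch. It assumes the setting acts as a minimum (words rated at or above it are returned), which fits the way the sets grow as the number drops. The function name `sweep`, the placeholder words and their ratings are all invented for illustration and are not taken from the database.

```python
def sweep(ratings, start=660, stop=630, step=5):
    """Lower a minimum-imageability threshold from start to stop.

    Returns a list of (threshold, words) pairs, with the words
    in alphabetical order, as they were read out.
    """
    sets = []
    for minimum in range(start, stop - 1, -step):
        words = sorted(w for w, score in ratings.items() if score >= minimum)
        sets.append((minimum, words))
    return sets


# Invented ratings on placeholder words, purely to show the shape of the output.
toy_ratings = {"ALPHA": 662, "BRAVO": 657, "CHARLIE": 641, "DELTA": 630, "ECHO": 600}

poem_sets = sweep(toy_ratings)
for minimum, words in poem_sets:
    print(minimum, " ".join(words))
```

With the default settings this gives seven sets, one for each threshold from 660 down to 630.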

Psycholinguistics Geekout

Yesterday was an exciting day at the ICN for all sorts of reasons: I visited an MRI scanning suite for the first time, I tried reading some more poems under speech jamming conditions, and the BBC came and did some recordings and interviews with us (more on which anon…)

But even more exciting than all that was a comment Sophie made in passing about something called the MRC psycholinguistic database. This, I’ve discovered, is an absolute goldmine for someone with a fetish for words and is an even greater example of the internet adding to the sum of human knowledge than Lolcats.

Essentially, it’s an online dictionary for researchers who want to create lists of word-stimuli for experiments. It allows you to select from a database of more than 150,000 words, narrowing them down by criteria that are second nature for a psycholinguist but for a writer (or for this one, at least) are excitingly new ways of choosing vocabulary.

You can select words by their standardised scores on familiarity, concreteness, meaningfulness, age of acquisition and a host of other measures. You can filter the word sets by part of speech, irregular pluralisation or contextual status (Specialised, Archaic, Dialect, Nonsense, Rhetorical, Erroneous, Obsolete, Colloquial…). And once you’ve chosen your criteria, the output is presented to you in a beautifully stripped-back aesthetic:

This delightful list (ETHER – GAUNTLET – LANCER – LICHEN – LYRE – RAMROD – WHALEBONE – WICKET if you can’t read it above) was produced by specifying words of between 2 and 5 syllables, with a concreteness rating of greater than 500 out of 700 and a meaningfulness rating (Colorado norms) of less than 300 out of 700, and then filtering for nouns.
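The same combination of criteria can be written out as a filter in Python. This is a sketch under stated assumptions: the `Entry` fields, the `"N"` code for nouns and the function name are hypothetical and are not the database's own field names, and the placeholder entries carry invented scores.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    # Hypothetical field names, chosen for readability.
    word: str
    syllables: int
    concreteness: int
    meaningfulness: int  # stands in for the Colorado norms score
    part_of_speech: str  # "N" is assumed here to mean noun


def concrete_low_meaning_nouns(entries):
    """Nouns of 2 to 5 syllables, concreteness > 500, meaningfulness < 300."""
    return sorted(
        e.word
        for e in entries
        if 2 <= e.syllables <= 5
        and e.concreteness > 500
        and e.meaningfulness < 300
        and e.part_of_speech == "N"
    )


# Invented entries: only the first satisfies every criterion.
toy_entries = [
    Entry("PLACEHOLDER_A", 2, 560, 280, "N"),
    Entry("PLACEHOLDER_B", 1, 600, 250, "N"),  # too few syllables
    Entry("PLACEHOLDER_C", 3, 480, 250, "N"),  # not concrete enough
    Entry("PLACEHOLDER_D", 2, 560, 350, "N"),  # too meaningful
    Entry("PLACEHOLDER_E", 2, 560, 280, "V"),  # not a noun
]

selected = concrete_low_meaning_nouns(toy_entries)
print(selected)
```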

So how could sets like this be useful for poetry? The raw output could be used to create list poems, but the database could also be used to create differentiated vocabulary pools. Limiting your choice of words like this would, for a start, allow you to create a series of variations on a particular theme, in the manner of Raymond Queneau’s Exercises in Style. I’m sure it could be taken further than this though – constraint-based writing is more satisfying, for me, when it’s a means to get somewhere rather than an end.