International Computer Archive of Modern and Medieval English
Short Descriptions
The Brown Corpus (The Standard Corpus of Present-Day Edited American English)
first computer-readable general corpus of texts for linguistic research on modern English
compiled by: W. Nelson Francis and Henry Kučera at Brown University
compiled: 1963 – 1964
period: 1961
over 1 million words (500 samples of 2000+ words each)
written American English
The LOB Corpus (The Lancaster-Oslo/Bergen Corpus), and tagged version
British English counterpart of the Brown Corpus
compiled by: Geoffrey Leech (project leader), Stig Johansson (project leader), Knut Hofland (head of computing), Roger Garside (head of computing, POS-tagged version)
compiled: original version 1970–1978, POS-tagged version 1981–1986
period: 1961
1 million words (500 texts of circa 2000 words each)
15 text categories, 9 informative and 6 imaginative
British English
The Freiburg-Brown Corpus of American English (FROWN), and tagged version
Freiburg update of the Brown corpus
intended to match the Brown as closely as possible in size and composition
language of the early 1990s
compiled by: Christian Mair, Geoffrey Leech, Nick Smith and their teams
compiled: 1992 – 1996
period: 1992
1 million words (500 texts of around 2000 words each, 15 text categories, 9 informative and 6 imaginative)
American English
The Freiburg-LOB Corpus of British English (FLOB), and tagged version
The Freiburg update of the LOB corpus (F-LOB)
intended to match LOB as closely as possible in size and composition
language of the early 1990s
compiled by: Christian Mair (original), Geoffrey Leech (POS-tagged version)
compiled: 1991 – 1996
period: 1991
1 million words (500 texts of around 2000 words each, 15 text categories)
British English
The London-Lund Corpus of Spoken English (LLC)
derived from two projects: the Survey of English Usage (SEU) at University College London, and
the Survey of Spoken English (SSE), started at Lund University in 1975 (sister project of the London Survey)
compiled by: Jan Svartvik (Survey of Spoken English (SSE), Lund University), Randolph Quirk (Survey of English Usage, University College London), Sidney Greenbaum (Survey of English Usage, University College London), Knut Hofland (Norwegian Computing Centre for the Humanities, Bergen)
compiled: 1959 – 1990
500,000 words (100 spoken texts of 5000 words each)
spoken British English
further references:
Svartvik, J. & R. Quirk, eds. 1980. A Corpus of English Conversation. Lund Studies in English, 56. Lund: Liber/Gleerups.
Svartvik, J., ed. 1990. The London Corpus of Spoken English: Description and Research. Lund Studies in English, 82. Lund: Lund University Press.