Corpus Manuals


The following corpus manuals are currently available for download (.pdf):



Manuals for the following corpora can be made available on request (in .html):

Written English

  • The Brown Corpus
  • The LOB Corpus
  • The Tagged LOB Corpus
  • The Freiburg-LOB Corpus of British English (FLOB)
  • The Freiburg-Brown Corpus of American English (FROWN)
  • The Kolhapur Corpus of Indian English
  • The Australian Corpus of English (ACE)
  • The Wellington Corpus of Written New Zealand English

Spoken English

  • The London-Lund Corpus of Spoken English
  • The Lancaster/IBM SEC Corpus, The Machine-Readable Corpus of Spoken English
  • The Wellington Corpus of Spoken New Zealand English

Historical English

  • The Diachronic Part of the Helsinki Corpus of English Texts
  • The Helsinki Corpus of Older Scots (bibliography: biblio.htm or biblio.doc)
  • Corpus of Early English Correspondence Sampler (CEECS)
  • The Newdigate Newsletters
  • The Lampeter Corpus of Early Modern English Tracts
  • Innsbruck Computer-Archive of Machine-Readable English Texts (ICAMET) Sampler (manual info)

Parsed Corpora

  • The Polytechnic of Wales Corpus