MANUAL OF INFORMATION TO ACCOMPANY THE SEC CORPUS

The Machine-Readable Corpus of Spoken English

 

 

L. J. Taylor
Dr. G. Knowles

Unit for Computer Research on the English Language
Bowland College
University of Lancaster
Bailrigg
Lancaster
January, 1988

 


 

CONTENTS

Introduction

1 Distribution of SEC Material

2 Composition of Corpus

2.1 Breakdown into categories

2.2 Speakers in the Corpus

2.3 Dates of Texts

2.4 Duration of Extracts

2.5 Source of material

2.6 SEC text details

3 Versions of SEC Material

3.1 Spoken Recording

3.2 Unpunctuated transcriptions

3.3 Orthographic transcriptions

3.4 Grammatically tagged versions

3.5 Prosodic transcriptions

3.5.1 Prosodic characters

4 Samples of different versions

4.1 Unpunctuated transcription

4.2 Orthographic transcription

4.3 Grammatically tagged version

4.3.1 Horizontal formats

4.3.2 Vertical format

4.4 Prosodic version

References

Appendix, CLAWS1 tagset