English corpus linguistics



The aim of this module is to introduce you to corpus linguistics and the use of corpora in the study of the English language. The first half of the spring term will introduce the theory and practice of corpus linguistics, and the second half will explore how corpora are currently used in linguistic research on English. The summer term is devoted to workshops related to individual project work. This module is largely practical and skills-driven, and the assessment is designed to test your skills in accessing the primary literature, data collection and analysis, descriptive adequacy, critical thinking, argumentation, and written presentation.

Learning outcomes

On completion of this module you should be able to:

  • understand and discuss the main issues and methodologies of corpus linguistics
  • understand the role of corpus data in linguistic research
  • carry out linguistic investigations using a variety of corpora and corpus methodologies
  • perform simple statistical tests

You will develop your competence in the following skills:

  • recognising and explaining complex patterns in linguistic data
  • forming valid generalisations about language from corpus data
  • expressing grammatical concepts clearly and concisely
  • designing and carrying out a small research project using corpus data
  • summarising and presenting findings in a style appropriate to the norms of the discipline
  • understanding and applying basic statistical concepts relevant to linguistic analysis

Students who wish to write a dissertation can take a Research Extension module alongside this module.

Enrolment on this module will be capped at 35 students.



  • E01C Understanding English Grammar



Contact hours

A minimum of 24 contact hours over two terms, made up of lectures, seminars, practicals and workshops. 

Teaching programme

Classes in the spring term consist of lectures, seminars, and related practical sessions. Seminars will be used primarily for discussion and presentation. Practical sessions involve working with corpora, and students should be prepared to learn the details of using them largely on their own with the help of a manual. Most of the data analysis involves simple statistical techniques; these will be taught, but may require a significant amount of self-study to master.

Suggestions for reading before the module starts

Reading the following will be good advance preparation for the module:

  • McEnery, Tony and Andrew Hardie (2012). Corpus linguistics: method, theory and practice. Cambridge: Cambridge University Press (Chapters 1, 2 and 5).
  • McEnery, Tony; Richard Xiao and Yukio Tono (2006). Corpus-based language studies: an advanced resource book. London: Routledge (Sections A1-A4, A6-A7 and B3-B4).

Assessment and feedback

Formative assessment

  • Group exercises throughout the term.
  • Written and oral feedback during teaching sessions and in surgery hour by appointment.

Summative assessment

  • Exercises
    • Weight: 30%
  • 3000-word project report
    • Weight: 70%


Transferable skills developed in this module

All modules provide an opportunity to work on general oral and written communication skills (in class and in assessments) and general self-management (organising your studies), alongside the specific skills in language or linguistics that the module teaches.

In addition to these, this module will allow you to develop IT and numeracy skills in particular. You will learn to extract linguistic data from large electronic corpora using various types of software. You will need to organise, manipulate, analyse and quantify this data electronically in order to investigate questions about real-world language use.

About this module

  • Module name
    English corpus linguistics
  • Course code
    L32H (LAN00032H)
  • Teacher
    Eva Zehentner 
  • Term(s) taught
  • Credits