Statistical Modelling & Practical Data Analysis with R - MAT00068M

  • Department: Mathematics
  • Module co-ordinator: Dr. Yue Zhao
  • Credit value: 10 credits
  • Credit level: M
  • Academic year of delivery: 2022-23
    • See module specification for other years: 2021-22

Module will run

Occurrence  Teaching period
A           Autumn Term 2022-23 to Spring Term 2022-23

Module aims

This module provides the essential skills for independently carrying out statistical analyses of real data, from the careful formulation of the question to be investigated through to the presentation of the results.

Participants first receive a general introduction to statistical modelling and practical data analysis, including an overview of a selected range of statistical methods. They then learn how to implement these methods in the statistical software environment R, before each carrying out and presenting two statistical analysis projects based on real data sets. These projects constitute the most significant part of the module. The first, smaller analysis is carried out individually at the end of the autumn term. The second, larger project is completed in groups of 3 or 4 participants at the end of the spring term. Each project involves the theoretical analysis of the problem under study, the design of the data analysis, its implementation in R, a written report summarizing the analysis, and a professional presentation of the results to fellow participants. Participants' mastery of the module content is also tested through three class tests spread across the two terms.

Module learning outcomes

Upon completion of this module, students should

  • have achieved a general understanding of statistical data analysis as a process of modelling the real-world situation underlying the data and of responsibly extracting information from it
  • understand the formal framework of a significant part of the discussed statistical models and tools
  • be able to use various statistical tools to analyse real data sets in the statistical software environment R
  • be able to carry out an independent and responsible statistical data analysis on the basis of a real data set and a research question
  • be able to neatly write up the results of a statistical data analysis, employing tables and graphs in an appropriate way
  • be able to present the results of a statistical data analysis to an audience with varying levels of statistical knowledge, supported appropriately by slides

General academic and graduate skills to be obtained:

  • Problem solving skills
  • Group working skills
  • Computer programming skills
  • Professional presentation skills

Module content

Indicative module content:

  • Introduction to statistical modelling and to challenges in practical data analyses
  • Introduction to practical data analysis with the statistical software environment R
  • Overview of a wide range of standard models for statistical data analysis: linear regression, generalised linear regression, advanced regression methods, cluster analysis, classification, advanced multivariate methods
  • Overview of statistical tools to address selected challenges and general needs in practical data analysis: missing values, sampling weights, variable selection
  • Implementation of the discussed statistical methods in the statistical software environment R
  • How to summarise and present the results of statistical data analyses (for statisticians and non-statisticians)
  • Practical statistical analysis of real data sets, reporting and presentation of the obtained results

Assessment

Task                                              Length  % of module mark
Coursework - extensions not feasible/practicable
  Class Tests: Statistical Modelling              N/A     45
Groupwork
  Group Coursework: Statistical Modelling         N/A     40
Oral presentation/seminar/exam
  Group Presentation: Statistical Modelling       N/A     15

Special assessment rules

None

Additional assessment information

The group project report cannot be reassessed, as it is a joint effort of the group members, whereas reassessment concerns only individual students who do not achieve a pass mark for the module as a whole.

Reassessment

Task                                              Length  % of module mark
Essay/coursework
  Reassessment: Data Analysis Project             N/A     100

Module feedback

The current Department policy on feedback is available in the undergraduate student handbook. Coursework and examinations will be marked and returned in accordance with this policy.

Indicative reading

Fahrmeir L, Kneib T, Lang S and Marx B (2013). Regression: Models, Methods and Applications. Springer

Hastie T, Tibshirani R and Friedman J (2009). The Elements of Statistical Learning: Data Mining, Inference, and Prediction. 2nd edition. Springer

Haerdle W and Simar L (2007). Applied Multivariate Statistical Analysis. 2nd edition. Springer

Everitt B and Hothorn T (2011). An Introduction to Applied Multivariate Analysis with R. Springer

Kleiber C and Zeileis A (2008). Applied Econometrics with R. Springer



The information on this page is indicative of the module that is currently on offer. The University is constantly exploring ways to enhance and improve its degree programmes and therefore reserves the right to make variations to the content and method of delivery of modules, and to discontinue modules, if such action is reasonably considered to be necessary by the University. Where appropriate, the University will notify and consult with affected students in advance about any changes that are required in line with the University's policy on the Approval of Modifications to Existing Taught Programmes of Study.