
Central Hall, York University

Analysing Patient-Level Data using Hospital Episode Statistics (HES)


The current issues with COVID-19 have produced a fast-moving situation which continues to challenge the organisation of many events internationally.  We have therefore had to carefully consider the best way of dealing with this short course.  With many countries now facing lockdowns for unknown lengths of time, it is with great regret that we have taken the decision to postpone this course. 

Participants who have already registered and paid their course registration fees will be contacted soon and given the option either to defer their place until the next running of the course (dates to be confirmed) or to receive a full refund of registration fees.

We sincerely apologise for any inconvenience caused in these extraordinary circumstances, and we look forward to continuing to provide our short course offerings to you in the future.




I could not do what I do for the GIRFT programme without having done the course; it has proved to be invaluable.

Peter Lanyon,
Consultant Rheumatologist, Nottingham University Hospitals NHS Trust

Hospital episode statistics (HES) contains details of all admissions to NHS hospitals and all NHS outpatient appointments in England and is a main data source for a wide range of healthcare analyses for the NHS, government and many other organisations and individuals. There is also an increasing role for this observational dataset in providing evidence-based parameters which are not collectable in trials for the economic evaluation of new technologies. Admitted patient care data is available from 1989 onwards with about 20.8 million new inpatient records recorded in 2018, a 28% increase in a decade. Outpatient attendance data has been collected since 2003, with more than 123 million new outpatient appointments recorded in 2018, a 65% increase from ten years ago. Accident & Emergency (A&E) data is available from 2007 onwards with 24.8 million new A&E attendances recorded in 2018, a 21% increase in a decade.

However, because of its size and complexity, HES is one of the most challenging datasets to get to grips with. Complex coding of data items, data provided at a level that is not immediately amenable to analysis, missing data, duplicates, the costing of episodes via HRGs and other data issues mean that the analyst faces significant upfront investment in learning the data before being able to produce meaningful analyses that are free from common errors.

I really liked the fact that the course gave a really good insight into HES (HES is amazingly complicated!)

Previous participant

Taught by academics with extensive experience in using HES for a wide range of outputs, this intensive workshop introduces participants to HES data and how to handle, manipulate and begin to analyse these very large datasets using computer software. Participants will engage in problem-solving exercises, analysing the information in highly interactive sessions. By the end of the course, participants should understand the complex nature of the HES datasets, appreciate the importance of approaching HES with a disciplined programming structure and have the tools required to manipulate and re-code data from its raw form to the form required for analysis. Participants will be provided with Stata code and artificial datasets resembling HES data, which they can copy and take away.

HES data can be linked to other datasets held by NHS Digital, such as the Mental Health Services Dataset (MHSDS), which records all contacts with secondary and specialist mental health services in England. In the same week, the Centre for Health Economics (CHE) will be offering a course on analysing MHSDS data. For more details please follow:


This course includes instruction on how to:

  • understand, manage and manipulate the data
  • construct and analyse key variables such as waiting times or length of stay
  • analyse individual patient records defined as Finished Consultant Episodes, Provider Spells and Continuous Inpatient Spells
  • monitor emergency readmissions
  • link inpatient data to A&E, Outpatient and Critical Care data
  • aggregate data by Healthcare Resource Group or providers/commissioners
  • cost data by HRG and reference costs
  • use the data for benchmarking and policy evaluation
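As a rough illustration of two of the tasks listed above (constructing length of stay and monitoring emergency readmissions), the following is a minimal sketch in Python rather than the Stata used on the course. It is not the course material: the records are made up, the field names (`patient_id`, `admitted`, `discharged`, `emergency`) are hypothetical rather than real HES field names, and the 30-day readmission window is an assumption chosen for the example.

```python
from datetime import date

# Artificial spell-level records (NOT real HES data).
# Field names are hypothetical and chosen for readability only.
spells = [
    {"patient_id": "A", "admitted": date(2018, 1, 1), "discharged": date(2018, 1, 5), "emergency": False},
    {"patient_id": "A", "admitted": date(2018, 1, 20), "discharged": date(2018, 1, 22), "emergency": True},
    {"patient_id": "B", "admitted": date(2018, 3, 1), "discharged": date(2018, 3, 1), "emergency": True},
    {"patient_id": "B", "admitted": date(2018, 6, 1), "discharged": date(2018, 6, 4), "emergency": True},
]

# Assumed readmission window for this sketch; not taken from the course.
READMISSION_WINDOW_DAYS = 30


def length_of_stay(spell):
    """Length of stay in days: discharge date minus admission date."""
    return (spell["discharged"] - spell["admitted"]).days


def flag_emergency_readmissions(records, window_days=READMISSION_WINDOW_DAYS):
    """Add length of stay and an emergency-readmission flag to each spell.

    A spell is flagged when it is an emergency admission that starts within
    `window_days` of the same patient's previous discharge.
    """
    # Sort so that each patient's spells are adjacent and in date order.
    ordered = sorted(records, key=lambda s: (s["patient_id"], s["admitted"]))
    flagged = []
    previous = None
    for spell in ordered:
        is_readmission = False
        if previous is not None and previous["patient_id"] == spell["patient_id"]:
            gap = (spell["admitted"] - previous["discharged"]).days
            is_readmission = spell["emergency"] and 0 <= gap <= window_days
        flagged.append(
            {**spell, "los_days": length_of_stay(spell), "emergency_readmission": is_readmission}
        )
        previous = spell
    return flagged


results = flag_emergency_readmissions(spells)
for row in results:
    print(row["patient_id"], row["admitted"], row["los_days"], row["emergency_readmission"])
```

Real HES extracts need considerably more care than this (duplicate records, missing dates, and episodes that must first be linked into spells), which is the kind of pitfall the course is designed to address.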

The tutors have worked extensively with HES data and will guide participants through the potential pitfalls using case studies, practical examples and problem-solving exercises.


This workshop is offered to people working in the public sector, academia and the private sector. It is suitable for analysts who wish to harness the power of non-randomised episode-level patient data to shed further light on patient costs and pathways, readmissions and outcomes, and provider performance. The workshop is suitable for individuals working in NHS hospitals, commissioning organisations, the Department of Health, pharmaceutical companies or consultancy companies, and for health care researchers and PhD students. Overseas applicants may find that the tuition can be applied to similar scenarios in their own country, but should be aware that the tuition and exercises relate directly to HES data, which is created for, and used in, England.

Participants should have some knowledge of introductory statistics and familiarity with computer software (e.g. SAS, Stata, SPSS). In this workshop we will be using Stata, but users of other software should not be discouraged. All the code we show relies on the same logic and can be easily translated from one software package to another. In addition, for those who are new to Stata or want to brush up their Stata knowledge, we offer an optional Stata introduction session covering the basic Stata commands that we will use to demonstrate HES data manipulation methods. For more information about Stata and Stata resources please visit the Stata tab.

Please don’t forget to indicate on the registration form whether you will attend the Stata introduction session.

Course dates

  • To be announced


Course programme

Day 1:   

  • 09:10 - 09:30 Registration
  • 09:30 - 12:45 Introduction to computer software: Stata. (Optional: indicate attendance (or not) on the online registration form)
  • 12:35 - 12:45 only: Registration for those not attending optional Stata session
  • Introduction to HES datasets. Data-generating process. HES inpatient records: episodes, provider spells, and continuous inpatient spells. Examining an example HES extract. Linkage to other datasets
  • HES data manipulation: key variables, missing values, dates, duplicate records, linking patient episodes, constructing continuous inpatient spells
  • 17:30 - 18:30 Drinks reception

Day 2:

  • HES data manipulation: practical session on the computer
  • Linking inpatient data to other data (A&E, outpatient, Critical Care) 
  • Analysing Patient Reported Outcome Measures (PROMs) 
  • Using HES to measure hospital performance
  • 19:00 - 22:00 Workshop dinner in York's city centre

Day 3:

  • Introduction to Healthcare Resource Groups (HRGs) and grouping process
  • The HRG Grouper: how to assign patients to HRGs and use of HRGs for costing purposes
  • Using HES to measure hospital performance
  • 16:00 Workshop finishes



The tutors for this course are researchers in the Centre for Health Economics, University of York

Nils Gutacker (Senior Research Fellow)
Maria Jose Aragon (Research Fellow)
James Gaughan (Research Fellow)
Katja Grasic (Research Fellow)
Panos Kasteridis (Research Fellow)
Rita Santos (Research Fellow)

and Professor Chris Bojke from the Academic Unit of Health Economics, Leeds Institute of Health Sciences, University of Leeds



Before you register for this workshop, please ensure you have secured the appropriate funding from your organisation.
Registration is done online by credit/debit card for instant payment and a guaranteed place on this workshop (please note the University of York does not accept American Express cards).
If you or your organisation cannot pay by credit/debit card, please email the workshop co-ordinator: 
We regret that we cannot reserve or hold workshop places in advance of booking or payment.


Fees are fully inclusive of tuition, lunches, drinks reception, course dinner and course materials, but do not include accommodation. VAT is not payable. Transferring between courses is not possible.

Public/academic sector Private/commercial sector 
Course fee 



Full-time PhD students can apply for a subsidised place at £250. Please email for an application form. These places are allocated at the discretion of the organisers, and you will be contacted within a few days of returning your completed form.

Cancellations and alterations

A full refund of course fees (less a 10% administrative charge) will be made for cancellations received in writing at least one month prior to the workshop. Substitutions can be made, but please email the new delegate's details when known. Cancellations made less than one month prior to the workshop are non-refundable and non-changeable.

In the unlikely event that, due to unforeseen circumstances, the course has to be cancelled by the University of York, our liability is limited to refund of workshop fees. We recommend delegates have adequate insurance cover to claim any travel or personal expenses.


Once you have registered, the course administrator will give further information about accommodation available on campus and in York in your registration confirmation. There are a large number of hotels and guest houses in York, and workshop participants are personally responsible for making their own accommodation arrangements.


To make the most of the introductory session and enhance your hands-on experience you may also read a Stata introductory tutorial prior to the workshop. There is an abundance of Stata resources including the official Stata documentation, Stata books, the Stata Journal, the Stata Blog and Web resources. All can be accessed through 

Different tutorials and lecture notes emphasise different coding aspects (data analysis, statistical analysis, etc.). While you can choose the one that looks most suitable for your programming background, we recommend the following because it emphasises data analysis and is simple and relevant to the workshop material.

For a book reference one of the simplest to follow is:

Who to contact
