
Patient data vital in understanding Covid-19 and its mutations


Posted on Monday 16 November 2020

A new study has found that 95.5 per cent of current entries in the world’s largest novel coronavirus genome database do not contain relevant patient information, a critical piece of the puzzle for understanding how the virus is evolving.
Report author Professor Vasan says it is critical to gather patient data

Researchers - led by a York virologist - have used this finding to develop a standardised data collection template that can be implemented on repositories such as GISAID without identifying the patient, making it easier for clinical teams treating patients to share more of their knowledge.

This enables the scientific community to access important information, including symptoms, vaccine status and travel history, and in doing so build a more complete picture of the impact of Covid-19 on each patient.

SARS-CoV-2, the virus that causes Covid-19, is one of the most sequenced viruses in history, with over 200,000 sequences on GISAID as of 16 November 2020. The last 100,000 sequences of the virus were uploaded in the past two months, a global record.

Vital information

The study - led by Australia’s national science agency, the Commonwealth Scientific and Industrial Research Organisation (CSIRO), in collaboration with GISAID and other academic partners - proposes a standardised data collection method to help scientists and clinicians around the world gather and share vital information in the fight against Covid-19.

CSIRO researcher and senior author of the paper Professor S.S. Vasan, who is also Honorary Professor at the University of York, UK, said it is critical to collect the ‘patient journey’ in as much detail as possible to understand the impact of virus evolution on the disease and its consequences.

Professor Vasan added: “We urgently need de-identified patient data associated with these virus genome sequences in order to decipher whether disease outcomes are due to a mutation, or multiple mutations, in the virus or host factors such as age, gender and co-morbidities.

“It’s very likely this information is known to the clinical teams who treated the patient but does not make its way to public repositories such as GISAID, due to the number of steps involved.”

Recognising this need for clinical data, GISAID has made ‘patient status’ a compulsory field for uploading virus sequences since 27 April 2020.

However, the study showed that a lack of digital infrastructure for collecting clinical information has hampered progress.

Health systems

It also identified the need for a standardised vocabulary and mechanism for linking in with health systems as key factors for capturing the necessary information.

Lead author and CSIRO researcher Dr Denis Bauer, who is also Honorary Associate Professor at Macquarie University, Sydney, said that, with the adoption of the study’s proposed data collection template, future sequences shared through the GISAID initiative could contain more meaningful de-identified patient information.

Dr Bauer added: “We have identified steps in the clinical health data acquisition cycle and workflows that likely have the biggest impact in the data-driven understanding of this virus.

“Following the ‘Fast Healthcare Interoperable Resource’ implementation guide, we have introduced an ontology-based standard questionnaire consistent with the World Health Organization’s recommendations.”

Genome sequences

Barwon Health’s Director of Infectious Diseases Professor Eugene Athan welcomed the new data collection template.

Professor Athan said: “Barwon Health is leading a study on the long-term biological, physiological and psychological effects of Covid-19, in partnership with CSIRO and Deakin University, and we intend to implement this mechanism for our data collection and reporting.

“Having a simplified and standardised approach to sharing relevant patient information alongside genome sequences will enable critical research into Covid-19 and comparisons between different studies and population sets.

“I encourage clinicians and scientists around the world to share, wherever possible, de-identified patient information and clinical outcomes using this template to support ongoing research efforts.”


