Our hackathon has now completed its first stage (gathering academic literature and global BCG vaccination data) and is entering the second stage (establishing whether there is a causal link between BCG vaccination and COVID-19 infection and case fatality). We are calling for data scientists to volunteer to join this exciting initiative.
Background
The initiative is prompted by the suggestion that countries recommending the BCG vaccine for all may have lower COVID-19 infection rates and case fatality rates than countries that recommend BCG only for specific high-risk groups. We hope that the analysis done as part of this task might help discover useful information relevant to the BCG – COVID-19 clinical trials. For example, the analysis may show whether factors such as the strain of BCG, the age at which people were vaccinated, revaccination, or the time elapsed since vaccination are important.
Using machine learning and other technologies, we hope to provide evidence either for the role of BCG vaccination or for an alternative hypothesis. We would like to get answers to the following questions:
- Is BCG vaccination causally related to reduced COVID‐19 mortality, or are other factors, such as lockdowns and the average age of the population, responsible for the different mortality rates?
- If BCG vaccination reduces COVID-19 mortality, what are the key factors? For example:
  - How long does the immunity engendered by BCG last after vaccination?
  - Which BCG strain has been used?
  - What is the optimal time to vaccinate?
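To illustrate why the first question matters, here is a minimal sketch of how a confounder such as population age can create an apparent BCG effect, and how adjusting for it changes the picture. It is not the hackathon's method. The data are entirely synthetic and in arbitrary units: the variable names, the 200 made-up "countries", and the assumption that mortality depends only on median age are all invented for illustration. A real analysis would need actual country data and far more careful causal modelling than a simple regression.

```python
import random

def ols(X, y):
    """Least-squares coefficients via the normal equations (X'X) b = X'y."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y]
    A = [[sum(row[i] * row[j] for row in X) for j in range(k)]
         + [sum(row[i] * yi for row, yi in zip(X, y))]
         for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                f = A[r][col] / A[col][col]
                A[r] = [a - f * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]

def mean(values):
    return sum(values) / len(values)

# Synthetic, made-up data: NOT real country statistics.
# Assumption: universal-BCG "countries" happen to be younger, and
# mortality is driven by median age only (true BCG effect is zero).
rng = random.Random(0)
bcg, age, mortality = [], [], []
for i in range(200):
    b = i % 2                                    # 1 = universal BCG policy
    a = (30.0 if b else 40.0) + rng.uniform(0, 8)  # median age
    m = 0.1 * a + rng.gauss(0, 0.05)             # mortality, arbitrary units
    bcg.append(b)
    age.append(a)
    mortality.append(m)

# Naive comparison: mean mortality with vs. without universal BCG.
naive_diff = (mean([m for b, m in zip(bcg, mortality) if b == 1])
              - mean([m for b, m in zip(bcg, mortality) if b == 0]))

# Adjusted comparison: regress mortality on BCG policy AND median age.
X = [[1.0, float(b), a] for b, a in zip(bcg, age)]
intercept, bcg_effect, age_effect = ols(X, mortality)

print(f"naive BCG difference:    {naive_diff:.3f}")
print(f"age-adjusted BCG effect: {bcg_effect:.3f}")
print(f"age effect:              {age_effect:.3f}")
```

In this toy setup the naive comparison shows clearly lower mortality in the BCG group, while the age-adjusted coefficient is close to zero, because age alone drives the outcome. Adjustment of this kind only handles confounders that are measured and included, which is one reason randomized trials remain necessary.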
Establishing an evidence-backed link
The only way to truly understand the relationship between the BCG vaccine and COVID-19 is to conduct randomized trials combined with deep analysis of existing data. To that end, Estafet and Elsevier have initiated a two-stage hackathon. The groups are working together with the BCG World Atlas team, which is led by Alice Zwerling, an infectious disease specialist at the University of Ottawa. The BCG Atlas is an open-source database of global BCG vaccination policies and practices, founded in 2011.
Many of the ecological studies behind this suggestion were based on data from the BCG Atlas, so the first stage of the hackathon aimed to augment and improve the Atlas with additional data and health records on BCG vaccinations, found through natural language processing (NLP) methods. Thirty volunteers took part globally, including judges, organizers, and data gatherers, and prizes were awarded to those deemed to have extended the data most. The winner was Dimitrina Zlatkova of Sofia University, who contributed 57 additional entries, followed by developer Marouane Benmeida of Morocco with 33 additional entries.
Join the hackathon
The hackathon now moves to stage two, where the volunteers will seek to answer a series of questions. Is BCG vaccination causally related to reduced COVID‐19 mortality, or are other factors, such as lockdowns and the average age of the population, responsible for the different mortality rates? If BCG vaccination does reduce COVID-19 mortality, what are the key factors? For example, how long does the immunity engendered by BCG last after vaccination, and does the strain of BCG vaccine affect immunity? The team is now looking for more volunteers to get involved as the hackathon progresses.
When it comes to something like COVID-19, one of the biggest challenges of our lifetimes, data science will certainly be critical, but it is the blend of scientific understanding and technical acumen in data science that is vital.
It must be the job of all of us engaged in data science projects, whether in academia, commercial research, or public and government research, to stem the hype and assess the veracity of a claim before accepting the conclusion. This is not just important in the development of treatments and vaccinations against COVID-19; it is paramount in establishing broad public trust in data-led decision making.