How to use data to combat the Covid-19 pandemic
How are we getting infected? Through family and friends, or at the supermarket? How do we feel after weeks of confinement? The Covid19Impact Survey by Nuria Oliver aims to answer questions like these.
We live in a world of data, and today, more than ever, data is an integral part of our existence during this horrible pandemic. We wake up eager to know the latest figures for confirmed cases, hospitalizations, people in intensive care and deaths; we have lunch while listening with concern to the terrible economic data derived from the COVID-19 pandemic; and we go to bed worried that there might not be enough tests, masks, respirators or hydroxychloroquine. Not to mention our daily, almost constant, fight against the huge amount of false data and fake news that floods social networks and WhatsApp groups.
Data, without a doubt, is essential to help us understand where we are, make predictions about where we are going and make decisions based on the evidence, supposedly captured by such data. In fact, several weeks ago I co-authored an article with a team of experts and practitioners about the value of aggregate data from the mobile phone network to help us fight pandemics. Fortunately, this type of data will soon be used at a European level. I hope and wish that we do indeed realize its potential.
Undoubtedly, decisions based on data are superior to decisions based on opinions, motivations, intuitions, or personal or collective interests. Such is the importance of and interest in data that the so-called data economy could exceed 700 billion euros this year, and a new profession has even emerged that is certainly one of the most in demand today: data scientist, a guild to which I belong.
Evaluating the Quality of Data
However, just as important as having access to data, if not more so, is making sure that the data is of good quality and that it captures the underlying reality without bias. Do we really think that there are only 102,136 cases of COVID-19 in Spain? What is not measured does not exist; or rather, it does exist, but we do not know about it, and its apparent existence depends on how it is measured.
Taking the number of deaths from COVID-19 in Spain (9,053 as I write these lines) and applying the mortality rates reported in the scientific literature, there should be more than 1.3 million infected people in Spain right now to explain that death toll, that is, more than ten times the reported number of cases. And it might be even more: an article recently published by Imperial College estimates that there may be up to 11 million people infected with the coronavirus in Spain right now.
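The back-of-the-envelope reasoning above can be sketched in a few lines of Python. The death toll and the reported case count come from this article; the infection fatality rate of 0.66% is an illustrative assumption of mine (the article does not say which rate it applied), and the function name `implied_infections` is hypothetical.

```python
def implied_infections(deaths: int, ifr: float) -> float:
    """Estimate total infections as deaths divided by the infection fatality rate."""
    if not 0 < ifr <= 1:
        raise ValueError("ifr must be a fraction in (0, 1]")
    return deaths / ifr


REPORTED_DEATHS = 9_053    # figure quoted in the article
REPORTED_CASES = 102_136   # figure quoted in the article
ASSUMED_IFR = 0.0066       # ASSUMPTION: illustrative rate, not stated in the article

estimated = implied_infections(REPORTED_DEATHS, ASSUMED_IFR)
underreporting = estimated / REPORTED_CASES

print(f"Implied infections: {estimated:,.0f}")
print(f"Ratio to reported cases: {underreporting:.1f}x")
```

Under that assumed rate the result lands above 1.3 million infections and above ten times the reported cases, in line with the figures quoted. The division ignores the lag between infection and death, so it is a rough consistency check rather than an epidemiological estimate.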
If we make decisions based on data, but that data is not accurate and does not reflect reality well, those decisions may not be the most appropriate for the situation. We cannot forget that data often becomes part of the historical record, so that its biases, errors and inaccuracies are preserved for posterity and have the power to define what did or did not happen at a certain historical moment.
I have been designing and applying Artificial Intelligence algorithms for the analysis of all types of data for more than 25 years, with the aim of having a positive social impact. In my personal data crusade, I have learned the importance of not only evaluating the quality of the data but also identifying the gaps in it that may lead us to the wrong conclusions or simply prevent us from answering important questions.
The Covid-19 Impact Survey – One of the Largest Citizen Surveys in the World
For the past four weeks, I have been working tirelessly with data related to COVID-19, building predictive models of the number of cases, hospitalizations and ICU admissions, but also thinking about what information we lack to be able to make the best decisions in the shortest possible time. In this process of reflection, I have concluded that we are unaware of some important aspects of this pandemic, such as: how are we getting infected, despite social containment measures? Is it through our family and friends? Is it from going to work? Or from going to the supermarket? How many people are infected with the virus? How do we feel after being confined for many weeks in our homes?
During the weekend of March 28th, we launched a large-scale survey in Spain aimed at answering these questions, covid19impactsurvey.org. We shared it on social networks with our friends and colleagues. The survey was received with great enthusiasm by thousands of people, professional and civic associations, city councils, universities and the media, who helped it go viral. Thus, by the afternoon of March 30th we had already collected more than 146,000 responses, making it one of the largest citizen surveys in the world in the context of COVID-19.
Such a large sample allows us to obtain a unique perspective on people’s concerns and situation during the current COVID-19 pandemic. From the analysis we derived 11 implications for the design of public policies related to the management of the COVID-19 pandemic. Among others: involving citizens in data capture is valuable; close contacts (family, friends, clients and patients) play a large role in virus transmission, especially among those who had tested positive in a coronavirus test; more than 28% of the respondents lack the necessary resources to implement effective quarantine measures in their homes; gender and age matter, since we find significant differences across age groups and between genders; a large share of citizens (46%) demand more measures, and respondents say they could maintain social distance for one month (44.4%) or for between two and six months (29%), reflecting great citizen solidarity towards the measures; the economic impact is evident: 15% of SMEs face bankruptcy according to the participants and 19% of them had lost a significant part of their savings; 17% of the respondents reported having at least one symptom of COVID-19 and 6.5% reported having at least one of the most serious symptoms; and, finally, there is a need for more testing, since 6% of the participants indicated that they could not get tested despite their doctor’s recommendation.
This project is part of the Data Science for COVID-19 TaskForce within the Valencian Government, which works directly with the President of the region. The goal is to assist in data-driven public policy decision-making. It is an inspiring example of collaboration between policy makers, citizens and experts. The task force is composed of data science experts from the University Jaume I, the University of Valencia, the Polytechnical University of Valencia, the University Miguel Hernandez, the University of Alicante, the CEU Cardenal Herrera University and Fisabio.
The COVID-19 pandemic changes each day, and so do our perceptions and sentiments about it. Thus, I invite you to help us, with your responses, to generate valuable data that will help us make better decisions for us all. The positive power of data is in our hands, in the hands of all of us. Let’s work together to harness it.
The Covid19ImpactSurvey: Assessing the Pulse of the COVID-19 Pandemic in Spain via 24 questions (PDF)