This white paper summarizes key learnings from a four-part seminar, “Safe, Fair, Equitable and Responsible Use of Human Mobility Data,” held in March and April 2021 at the Radcliffe Institute for Advanced Study at Harvard University. The seminar was attended by over 40 domain experts representing academia, industry, law, humanitarian relief, and disaster response. The interdisciplinary exchange sought to map areas of convergence among four groups: technology companies that produce human mobility data; epidemiologists and public health practitioners who incorporate these novel data streams into their models and research; lawyers, ethicists, and data scientists who are concerned with responsible data management, including privacy protection for individuals and groups; and health agencies and disaster responders that use insights from such data for decision making. Seminar deliberations also identified the technical, regulatory, and translational gaps that preclude the effective integration of such data into field response.
The advent of mobile phones and internet-connected devices has generated enormous amounts of data on individual and group mobility patterns, collected by telecommunications companies, smartphone apps, data aggregators, and brokers. For the past several years, these data have helped researchers estimate population movement patterns to inform epidemiological modeling, situational awareness, and resource allocation in crisis settings. Though these data are routinely collected by telecom and other companies for business analytics, they are shared with researchers or policymakers only on an ad hoc and limited basis. Strict national and regional legal frameworks govern the re-use of these data globally, and when the data are shared with researchers, the sharing takes place in accordance with local law and after prolonged contractual negotiations. Data use agreements take a long time to formulate because regulatory bodies, ethics review boards, and data providers are unfamiliar with the applications of these novel data streams to public health issues, and with the risks and necessary protections associated with these data sets. During a public health emergency, small pools of academic researchers and policymakers have access to these data through preexisting relationships with technology companies. With the development of “differential privacy” technology for producing aggregated data with strong de-identification, companies have begun sharing their datasets with a wider community of researchers; some data are now available publicly.
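To make the idea of differentially private aggregation concrete, the following is a minimal sketch of one textbook approach, the Laplace mechanism, applied to per-region device counts. It is not a description of any particular company's pipeline. The function name `privatize_counts`, the region labels, the counts, and the choice of epsilon are all hypothetical, and the sketch assumes each device contributes to at most one region's count (a sensitivity of 1).

```python
import random


def privatize_counts(counts, epsilon, sensitivity=1.0, seed=None):
    """Return noisy per-region counts using the Laplace mechanism.

    Assumption (hypothetical setup): each device is counted in at most
    `sensitivity` regions, so adding or removing one device changes the
    vector of counts by at most `sensitivity` in total.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    rng = random.Random(seed)
    scale = sensitivity / epsilon  # Laplace scale parameter b
    noisy = {}
    for region, count in counts.items():
        # The difference of two i.i.d. exponential draws with mean `scale`
        # follows a zero-mean Laplace distribution with that scale.
        noise = rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)
        # Rounding and clamping at zero are post-processing steps and do
        # not weaken the privacy guarantee.
        noisy[region] = max(0, round(count + noise))
    return noisy


# Hypothetical counts of devices observed in three regions.
observed = {"region_a": 1200, "region_b": 35, "region_c": 4}
released = privatize_counts(observed, epsilon=0.5, seed=42)
for region in observed:
    print(region, observed[region], released[region])
```

A smaller epsilon adds more noise, which strengthens the privacy guarantee but makes counts for sparsely populated regions less reliable. This is one way to see the tension between the risk and the utility of shared data.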
Access to human mobility data increased sharply in 2020 during the COVID-19 pandemic, when non-pharmaceutical interventions, like travel bans and stay-at-home orders, became the mainstay of public health response. Around the world, researchers used location or movement information to study the impact of non-pharmaceutical interventions on population movement and on the evolving circumstances of the pandemic. This information was derived from telecommunication data, such as Call Data Records (CDRs) or x-Data Records (xDRs), the latter generated with a mobile device connected to the internet; first- or third-party Software Development Kit data collated from smartphone apps; and vehicle GPS devices, Bluetooth exchanges, or geotagged social media data. Publicly accessible “scorecards” attempted to rank counties and neighborhoods based on their mobility patterns.
Given the urgency of the moment, the efforts to leverage these data were laudable but fraught with limitations and potential for inadvertent harm. The provenance of these data was poorly understood by many who sought to apply the data to inform public policy. The owners of mobile devices who generate data sets are not usually a random or representative sample of the population of interest, and not all data providers have the same spatial or temporal coverage. Information on the representativeness and coverage of these data is rarely available and must be inferred by researchers themselves. Additionally, the methods to collect, de-identify, and share data vary widely across companies. Robust analysis needs to consider the uncertainty and bias associated with these data.
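As an illustration of why representativeness matters, the sketch below compares a naive average computed from device data with a post-stratified average that reweights each group by its share of the population. Post-stratification is only one of several possible adjustments and is not presented here as the method used in the seminar. The group labels, sample sizes, population sizes, and trip rates are all hypothetical.

```python
def naive_and_poststratified_mean(group_means, sample_sizes, population_sizes):
    """Compare a sample-weighted mean with a population-weighted mean.

    All three arguments are dicts keyed by the same group labels.
    """
    if not (group_means.keys() == sample_sizes.keys() == population_sizes.keys()):
        raise ValueError("all inputs must share the same group labels")
    total_sample = sum(sample_sizes.values())
    total_population = sum(population_sizes.values())
    # Naive estimate: each group counts in proportion to its devices.
    naive = sum(group_means[g] * sample_sizes[g] for g in group_means) / total_sample
    # Post-stratified estimate: each group counts in proportion to its
    # share of the population of interest.
    adjusted = sum(
        group_means[g] * population_sizes[g] / total_population for g in group_means
    )
    return naive, adjusted


# Hypothetical example: younger residents are over-represented among devices.
means = {"younger": 4.0, "older": 2.0}   # average daily trips per device
devices = {"younger": 800, "older": 200}  # devices in the data set
census = {"younger": 500, "older": 500}   # residents in the population
naive, adjusted = naive_and_poststratified_mean(means, devices, census)
print(naive, adjusted)
```

In this hypothetical example the naive average is 3.6 trips per day and the reweighted average is 3.0. The adjustment is only possible when the composition of the device sample is known, which is rarely the case when data providers do not report representativeness and coverage.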
The technical expertise to conduct such an analysis is often inaccessible to policymakers and is only available to researchers who have previously worked with mobility data. The capacity, expertise, and mandate of these researchers are highly variable and poorly mapped, and they are relied on in the absence of common reporting and recording requirements from regulatory agencies. Even if the analysis produced is robust, there remains a wide translational gap between the complex methodological questions that interest researchers and the simple, actionable information that policymakers need in times of crisis. Information generated from these data is often presented in ways that do not align with existing ways of working within the domain of emergency response. Under-resourced public health and disaster response agencies often do not have the necessary internal capacity to engage with a complex analysis during a crisis.
The “Safe, Fair, Equitable and Responsible Use of Human Mobility Data” seminars sought to identify the aforementioned challenges in accessing, understanding, analyzing, and applying human mobility data before and during the pandemic to develop a shared roadmap of priorities for scientists, policymakers, and technology companies. The technical, regulatory, and societal challenges explored during the seminar and in this white paper are organized around three clusters: Data Readiness, Methods Readiness, and Translational Readiness.
The seminar series resulted in this white paper, which summarizes key points of consensus and recommendations. This document will in turn set the agenda for a consultative process (in collaboration with CrisisReady and the Global Facility for Disaster Reduction and Recovery, GFDRR) to develop guidance for governments and response agencies seeking to use these novel data streams for emergency preparedness and response.
The section on Data Readiness examines the technical, regulatory, and ethical issues concerning access to human mobility data generated from mobile phones. Key themes that emerged include the criteria and eligibility for access, the granularity of the data that can be shared, the tension between risk and utility of the data shared, the means of and barriers to sharing data across institutions or jurisdictions, and finally, the question of who should arbitrate these decisions.
The section on Methods Readiness examines issues of representativeness, uncertainty, privacy, and epidemiological applications of these data. The section outlines advances in the application of multiple large data streams generated by mobile digital advertising (AdTech) companies, social media platforms, and telecom companies to public health response planning and modeling. We list potential technical and regulatory solutions to mitigate potential harm from the use, reuse, and recombination of these data.
The section on Translational Readiness examines global approaches that seek to improve the integration of novel data streams, namely human mobility data, into response planning by researchers, policymakers, and response agencies. Early efforts to address these challenges include the socialization of end products, the promotion of data “bilinguals” who can navigate both the science and regulatory realms, the creation of regional hubs, networks, and multistakeholder “assemblies,” and direct training and capacity building within response agencies. Incentives to use these data within academic institutions, governments, nonprofit organizations, and technology companies are nascent and unaligned, despite the potential for analysis to be of use in a disaster or public health emergency. More evidence on the utility of these data is needed to improve incentives. In order to develop such evidence and enhance the approaches for integrating novel data streams, more data sharing is also needed.
The “Data-Methods-Translational Readiness” framework presented in this white paper brings together key issues around preparing the data for timely use, applying the data meaningfully and purposefully, and nurturing local capacity to receive and act on the analysis. The paper presents a broad view of the state-of-the-art and lists key domains of inquiry to be pursued by technology companies, scientists, lawmakers, and response agencies for the responsible use of these and other novel data streams to maximize public good without causing or exacerbating harm.
We thank Abhishek Bhatia for synthesizing the notes and insights from the seminars that are featured in this document, Sraavya Sambara for her research support, Navin Vembar and Nishant Kishore for their technical inputs, and Joseph Nallen for editing and designing this white paper. We thank Maham Khan and Takahiro Abe for diligently documenting the seminar deliberations in their role as rapporteurs, and Nick Jones for facilitating GFDRR’s participation and organizational support. We also thank the Radcliffe Institute for Advanced Study at Harvard University for supporting the seminars.
We are deeply grateful to the seminar participants and featured discussants for their time and valuable insights, which made the seminar series possible in the first place. We also appreciate their feedback and suggestions, which are reflected in this white paper.