Genomic epidemiology of SARS-CoV-2 in Guangdong Province, China

Our take —

This study, available as a preprint and thus not yet peer-reviewed, analyzed epidemiological data from 1,388 positive COVID-19 cases in Guangdong Province, China, and generated 53 SARS-CoV-2 viral genomes that were analyzed alongside previously published sequences. The authors found that most infections in Guangdong between January and March 2020 resulted from virus importations from elsewhere, and that chains of local transmission were limited in size and duration. These findings suggest that the large-scale surveillance and intervention measures implemented in Guangdong were effective in interrupting community transmission.

Study design


Study population and setting

The study population includes 1,388 confirmed cases (out of 1.6 million tests) of SARS-CoV-2 in Guangdong Province, China between January 30 and March 19, 2020. A total of 53 SARS-CoV-2 genomes were generated from these 1,388 positives and analyzed alongside 177 previously published sequences from GISAID. These 177 published genomes included 73 sequences from China, 17 of which were also from Guangdong. The purpose of the study was to investigate the timing and relative contributions of imported cases versus local transmission and how transmission patterns reflected emergency response measures in Guangdong.

Summary of Main Findings

The authors found that approximately 25% of the 1,388 confirmed cases in Guangdong were due to local transmission, and two-thirds were linked to travel, specifically from Hubei Province, where COVID-19 was first identified. Over half of the locally acquired cases were linked to household transmission. Analysis of the 53 virus sequences generated as part of this study showed that the sequences were similar to genomes from all over China and from other countries, which supports the conclusion that most of the cases in Guangdong were linked to travel rather than local community transmission. Analyzing the mutations in these genomes (through Bayesian techniques) also showed that the virus was imported multiple times into Guangdong during the second half of January 2020.

Study Strengths

The availability of both epidemiological and genetic data for the 53 sequenced cases is a strength of this study, as the two data types can be compared and often support each other’s findings. Additionally, the methodology is clear and very detailed, and the authors provided the code used for the analyses. The study also clearly outlines the strengths and limitations of phylogenetic analysis, which helps readers interpret the strength of the conclusions.


Limitations

Because surveillance in Guangdong was targeted towards travelers, the data may overestimate the proportion of travel-associated cases. Additionally, limited sampling of SARS-CoV-2 genomes from other Chinese provinces during this period may have led to underestimating the number of introductions into the region. Finally, the limited number of mutations and the high similarity of virus genomes, especially early in the outbreak, make it difficult to determine the exact patterns of viral spread.

Value added

By showing that most cases of COVID-19 from January to March 2020 were due to travel rather than local transmission, this paper highlights the effectiveness of surveillance and intervention measures implemented in Guangdong in reducing local transmission.