
Evolutionary origins of the SARS-CoV-2 sarbecovirus lineage responsible for the COVID-19 pandemic

Our take

In this peer-reviewed study, the authors analyzed genomes of SARS-CoV-2 and related viruses (from the Sarbecovirus subgenus) to assess the history of recombination in this group and to estimate the timing of SARS-CoV-2 divergence from its ancestors. The results indicate that, while recombination is common in sarbecoviruses, the receptor binding region of SARS-CoV-2 does not appear to have been acquired through recent recombination with pangolin coronaviruses and likely derives from ancestral viruses in bats. SARS-CoV-2 was estimated to have diverged from its nearest ancestor in bats between 1948 and 1982, indicating that evolutionary ancestors of the virus had been circulating in bats for many years before spillover into humans.

Study design

Ecological, Modeling/Simulation, Other

Study population and setting

The study used 68 full coronavirus genomes from the subgenus Sarbecovirus (containing SARS-CoV and SARS-CoV-2) collected from human cases, bats, and other intermediate hosts in northern, central, and southern China since 2002. The goal of the study was to determine the evolutionary history of SARS-CoV-2, specifically to understand the likely source of SARS-CoV-2 in humans (bats, pangolins, or another species), and to identify how long the virus had been circulating in that animal host.

Summary of Main Findings

The authors found that recombination is common among sarbecoviruses, with 67/68 genomes showing evidence of genomic exchange. They found that SARS-CoV-2 and bat-associated RaTG13 are part of a single lineage separate from SARS-CoV and related sarbecoviruses, suggesting that SARS-CoV-2 is the result of a direct (or nearly direct) zoonotic transmission from bats. Specifically, SARS-CoV-2 did not acquire the variable loop region of its spike protein (containing the receptor binding domain that interfaces with human ACE2) through a recent recombination event with related sarbecoviruses in pangolins. Rather, RaTG13 is the recombinant virus, having acquired its variable loop domain from an as yet unsampled SARS-related coronavirus. Using three different methods, the authors estimated that SARS-CoV-2 diverged from a common ancestor in bats between 1948 and 1982. The estimated divergence time between the closest pangolin coronavirus to SARS-CoV-2 and the lineage containing SARS-CoV-2 and RaTG13 was between 1851 and 1877, indicating that pangolins likely acquired coronaviruses independently from bats and were probably not an intermediate host that facilitated adaptation of SARS-CoV-2 to humans. These results indicate that a direct progenitor of SARS-CoV-2 had been circulating in horseshoe bats for decades before spillover into humans.

Study Strengths

The authors used a robust approach to deal with the issue of recombination in phylogenetic inference, which, if unaddressed, can lead to longer branch lengths and inflated divergence times. They also estimated evolutionary divergence times under multiple prior distributions, thereby capturing some of the uncertainty that is inherent in time-measured phylogenetic analysis and improving upon previously published results.


Limitations

As with other phylogenetic analyses of sarbecoviruses, inferences about the evolutionary origin of SARS-CoV-2 and the diversification of sarbecoviruses generally are limited by the current availability of genomes related to SARS-CoV-2 in bats and potential intermediate hosts. Additional sampling of bats and potential intermediate hosts around Wuhan and other areas of central China could reveal sarbecoviruses that represent a closer ancestor of SARS-CoV-2 and that would provide more information on when and how the virus spilled over into humans.

Value added

This study provides a detailed explanation of the recombination history among sarbecoviruses and in the lineage containing SARS-CoV-2 and its closest relative in bats, RaTG13. The authors demonstrate that SARS-CoV-2 is not a recent recombinant of pangolin and bat viruses, and instead shares features with bat-associated sarbecoviruses. This suggests that spillover from bats to humans may have been direct or near-direct (i.e., with at most a brief residence in an intermediate host). Based on the currently available data, pangolins do not appear to have been intermediate hosts.

This review was posted on: 27 August 2020