
A comprehensive survey of bat sarbecoviruses across China for the origin tracing of SARS-CoV and SARS-CoV-2

Our take

In this study, available as a preprint and thus not yet peer reviewed, the authors reported a massive effort to describe the diversity of coronaviruses in bats across 14 provinces of China sampled between 2016 and 2021. In 13,064 samples from bats, 146 sarbecoviruses were detected, predominantly in Rhinolophus species. While no relatives of SARS-CoV-2 were detected, the closest known bat relative of SARS-CoV was discovered in R. sinicus sampled in 2020, with 95.8% similarity to SARS-CoV at the genome level. Swab samples taken from the Huanan Seafood Market in Wuhan in February 2020 contained animal coronaviruses consistent with the animals previously reported as sold in the market, but none of the viruses were related to SARS-CoV-2. This study highlights that China possesses a diverse assemblage of SARS-related coronaviruses in bats, with southern China being a notable hotspot. Additional sampling in this region and in neighboring areas of Southeast Asia could shed more light on the evolutionary origin of SARS coronaviruses.

Study design


Study population and setting

In an effort to describe the diversity of coronaviruses in bats across China, the authors collected 13,064 samples from 56 bat species in 703 locations across 14 Chinese provinces between 2016 and January 2021. Pharyngeal and anal swabs were collected from live bats and then pooled by collection date, species, and site. The field team also had brief access to the Huanan Seafood Market in Wuhan in February 2020, a location that some of the earliest COVID-19 patients had visited and that may have been a site where spillover of SARS-CoV-2 from animals occurred. Environmental swab samples (n = 22) were collected from cold storage areas that contained animal products, and 80 swab samples were taken from the environment around stalls selling animal products (ground, walls, sewers, door handles, chopping blocks, knives, and scissors). All samples from bats and the market were tested for the presence of coronavirus RNA using PCR targeting the RdRp gene and with next-generation sequencing. Phylogenetic analysis was then performed to identify different clades of sarbecoviruses in the samples, identify evidence of recombination, and infer whether the identified viruses could use human ACE2, based on the similarity of their spike proteins to those of viruses with a known ability to enter human cells.

Summary of Main Findings

In the samples collected from bats, 199 of 372 pools were positive for coronavirus RNA: 113 with alphacoronaviruses, 64 with betacoronaviruses, and 22 with both genera. Samples within the 44 pools containing sarbecovirus RNA (n = 1,068) were rescreened individually, yielding 146 positive samples, mainly from seven Rhinolophus species; 69 of the positive samples produced full genomes from next-generation sequencing. Phylogenetic analysis showed that none of the sarbecoviruses from bats were related to SARS-CoV-2; instead, they fell into multiple clades more closely related to SARS-CoV. Six identical genomes (YN2020B-G) from R. sinicus collected in Yunnan Province in 2020 shared the highest sequence identity with SARS-CoV of any bat virus detected to date (95.8% across the genome, 93.3% within spike); these viruses and another, YN2020H from the same species and year, were predicted to be capable of using human ACE2 based on their phylogenetic clustering with SARS-CoV. In the samples collected from the Huanan Market, three of 11 pools were positive for coronavirus RNA and four coronaviruses were detected, but none were related to SARS-CoV-2 or other sarbecoviruses. The viruses included hedgehog HKU31-related coronavirus in the subgenus Merbecovirus, rabbit HKU14-related coronavirus in the subgenus Embecovirus, canine coronavirus in the subgenus Tegacovirus, and rat coronavirus in the subgenus Embecovirus; these findings were consistent with the animal species reported as sold in the market up to 2019 (e.g., hedgehog, rabbit, bamboo rat).
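
For readers who want to see the reported counts as proportions, the arithmetic can be sketched as follows. This is only an illustrative calculation from the numbers quoted in this summary, not an analysis from the preprint; the variable names and the `pct` helper are made up for the example. Note that pool-level positivity is not an individual-level prevalence, and the individual-level figure applies only to samples from pools that had already tested positive for sarbecovirus RNA.

```python
# Counts as quoted in the summary above; no new data are introduced.
pools_tested = 372
pools_alpha_only = 113   # pools with alphacoronaviruses only
pools_beta_only = 64     # pools with betacoronaviruses only
pools_both = 22          # pools with both genera

rescreened_samples = 1068   # individual samples from the 44 sarbecovirus-positive pools
sarbecovirus_positive = 146
full_genomes = 69

market_pools = 11
market_pools_positive = 3


def pct(numerator: int, denominator: int) -> float:
    """Hypothetical helper: return a percentage rounded to one decimal place."""
    return round(100 * numerator / denominator, 1)


# The three pool categories should add up to the reported 199 positive pools.
pools_positive = pools_alpha_only + pools_beta_only + pools_both

pool_positivity = pct(pools_positive, pools_tested)
# Positivity among rescreened samples only (not a population-wide prevalence).
rescreen_positivity = pct(sarbecovirus_positive, rescreened_samples)
genome_yield = pct(full_genomes, sarbecovirus_positive)
market_pool_positivity = pct(market_pools_positive, market_pools)

print(f"Bat pools positive for coronavirus RNA: {pools_positive}/{pools_tested} ({pool_positivity}%)")
print(f"Rescreened samples positive for sarbecovirus: {rescreen_positivity}%")
print(f"Positive samples yielding full genomes: {genome_yield}%")
print(f"Market pools positive for coronavirus RNA: {market_pool_positivity}%")
```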

Study Strengths

This study reports an enormous sampling effort to describe coronaviruses circulating in bats in China, including during 2020 and 2021, and the team had rare access to the Huanan Market shortly after it was closed to the public.

Limitations

Due to the pooling strategy, the authors know the individual-level prevalence only for samples from sarbecovirus-positive pools that were rescreened individually; for most other bat species, which were not screened individually, these data are unavailable. Such information would have been useful for assessing whether prevalence changed over time, especially in commonly sampled species over the long time period of the study. The authors' inference about whether viruses could enter human cells was based solely on phylogenetic clustering with other spike sequences that were evaluated in previous studies. In vitro experiments would be needed to evaluate whether factors beyond ACE2 binding, such as the presence of key proteases, influence infectivity of human cells. Finally, a very small area of the Huanan Market was sampled in the study, so it is unclear what proportion of the animal-selling stalls were sampled in February 2020, and whether or not they were representative of the animal species sold prior to the start of the pandemic.

Value added

This study provides additional data that highlight southern China as a hotspot for sarbecovirus diversity. The absence of relatives of SARS-CoV-2 in the sampled bats, despite sampling in the same areas where previous studies have detected such relatives, suggests that this lineage may not be common in most parts of China and may be more restricted to the southern provinces of China and neighboring countries in Southeast Asia.

This review was posted on: 15 November 2021