Ecological, Modeling/simulation, Other
Study Population and Setting
The authors reanalyze data from three recent lines of work: 1) reported similarities between genetic sequences within the spike protein of SARS-CoV-2 and human immunodeficiency virus (HIV-1), published by Pradhan et al. (https://www.biorxiv.org/content/10.1101/2020.01.30.927871v2, now withdrawn); 2) identification of potential intermediate hosts of SARS-CoV-2 by comparing how the virus and candidate host animals use synonymous codons to encode proteins, specifically their relative synonymous codon usage (RSCU), published by Ji et al. (https://doi.org/10.1002/jmv.25682); and 3) assembly of a draft coronavirus genome from metagenomic reads from Malayan pangolins, produced by multiple research groups.
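For readers unfamiliar with relative synonymous codon usage, the sketch below shows one common way to compute it: each codon's observed count is divided by the count expected if all synonymous codons for that amino acid were used equally. This is an illustration only, not the code used by the authors or by Ji et al. The function names `rscu` and `rscu_distance` are hypothetical, the standard genetic code is assumed, and the Euclidean distance is chosen here purely as an example of how two codon usage profiles might be compared.

```python
from collections import Counter
from math import sqrt

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = [a + b + c for a in BASES for b in BASES for c in BASES]
CODON_TABLE = dict(zip(CODONS, AMINO))

# Group synonymous codons by amino acid, skipping stop codons.
SYNONYMS = {}
for codon, aa in CODON_TABLE.items():
    if aa != "*":
        SYNONYMS.setdefault(aa, []).append(codon)


def rscu(cds):
    """Return {codon: RSCU} for an in-frame coding sequence (hypothetical helper).

    RSCU = observed count * (number of synonymous codons) / (total count for
    that amino acid). Amino acids absent from the sequence are omitted.
    """
    cds = cds.upper().replace("U", "T")
    usable = len(cds) - len(cds) % 3
    counts = Counter(cds[i:i + 3] for i in range(0, usable, 3))
    values = {}
    for aa, codons in SYNONYMS.items():
        total = sum(counts[c] for c in codons)
        if total == 0:
            continue
        for c in codons:
            values[c] = counts[c] * len(codons) / total
    return values


def rscu_distance(profile_a, profile_b):
    """Euclidean distance over codons present in both profiles (an assumption)."""
    shared = set(profile_a) & set(profile_b)
    return sqrt(sum((profile_a[c] - profile_b[c]) ** 2 for c in shared))


if __name__ == "__main__":
    toy = rscu("GCTGCTGCCGCA")  # four alanine codons: GCT x2, GCC x1, GCA x1
    print(toy["GCT"], toy["GCC"], toy["GCG"])
```

A value of 1.0 means a codon is used as often as expected under equal usage; values above or below 1.0 indicate over- or under-representation. Because a profile of this kind contains only 59 numbers per organism, similar profiles can arise in unrelated species.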
Summary of Main Findings
The authors found that the genetic sequences within the spike protein share no significant similarity with HIV-1 (contradicting Pradhan et al.); rather, all four sequences were close matches to other viruses, and three of the four matched exactly with sequences in a bat coronavirus. The reanalysis of codon usage between SARS-CoV-2 and potential intermediate hosts used a more complete database than that used by Ji et al. and included additional coronaviruses for comparison. The authors found that, based on codon usage, the most probable intermediate hosts for SARS-CoV-2, SARS-CoV, and MERS-CoV are frogs, which are not known to play any role in the life history of these viruses; this result calls into question the biological validity of using codon usage to identify intermediate hosts. Finally, they assembled the available pangolin coronavirus sequences into a draft genome with 73% coverage and 91% sequence identity (92% for the spike protein) relative to the SARS-CoV-2 genome.
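The coverage and identity figures above can be read as follows. The sketch is a minimal illustration under stated assumptions, not the authors' pipeline: it assumes a pairwise alignment is already available as two equal-length gapped strings, defines coverage as the fraction of reference positions aligned to a called query base, and defines identity as the fraction of those aligned positions that match. The function name `coverage_and_identity` is hypothetical, and the preprint may define either quantity differently.

```python
def coverage_and_identity(ref_aln, qry_aln):
    """Return (coverage, identity) for two aligned, gapped sequences.

    ref_aln: reference row of the alignment (e.g. SARS-CoV-2), '-' for gaps.
    qry_aln: query row (e.g. a draft genome), '-' for gaps, 'N' for no call.
    Both definitions are assumptions made for this illustration.
    """
    if len(ref_aln) != len(qry_aln):
        raise ValueError("aligned sequences must have equal length")

    ref_len = aligned = matches = 0
    for r, q in zip(ref_aln.upper(), qry_aln.upper()):
        if r == "-":
            continue  # insertion in the query; not a reference position
        ref_len += 1
        if q in ("-", "N"):
            continue  # reference position not covered by the query
        aligned += 1
        if r == q:
            matches += 1

    coverage = aligned / ref_len if ref_len else 0.0
    identity = matches / aligned if aligned else 0.0
    return coverage, identity


if __name__ == "__main__":
    cov, ident = coverage_and_identity("ACGTACGTAC", "ACGTTC--NN")
    print(f"coverage={cov:.2f} identity={ident:.2f}")
```

Under these definitions the two numbers are independent: a draft genome can match the reference closely where it aligns while still leaving a substantial part of the reference uncovered.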
The authors make a clear and well-supported argument against the claims presented in the Pradhan et al. and Ji et al. studies. They reexamine the spike protein sequences with broader search parameters than those used in the original Pradhan et al. study. For the codon usage analysis, they consider a broader diversity of viruses (adding SARS-CoV and MERS-CoV) and of potential intermediate hosts, and they use a more recently updated codon usage database than the one used by Ji et al.
Regarding the analysis of pangolin coronavirus metagenomes, the phylogenetic distance between the pangolin coronavirus and SARS-CoV-2 is still too great to implicate pangolins as the intermediate hosts of the virus. More surveying of bats, pangolins, and other mammals is needed to identify the zoonotic source of SARS-CoV-2 in humans.
This study discredits two controversial hypotheses regarding the origin of SARS-CoV-2 that emerged early in the outbreak and generated significant media attention.