Introductions and early spread of SARS-CoV-2 in the New York City area

Our take —

This study analyzes SARS-CoV-2 virus genome sequences from patients in New York City and compares them to others that were publicly available at that time. The study found evidence for multiple introductions of the virus from domestic and international sources. Most of the introductions were from Europe, with limited evidence of direct transmissions from China. The study also provides evidence for community transmission of COVID-19 within the New York City area as early as mid-March 2020.

Study design


Study population and setting

The study involved the generation and analysis of 90 virus genome sequences from 84 COVID-19 patients seeking care at the Mount Sinai Health System in New York City (NYC) between February 29 and March 18, 2020. These sequences were analyzed together with 2363 publicly available sequences from GISAID deposited up to April 1, 2020. The purpose of the study was to identify the early events underlying the rapid spread of the virus in the New York City metropolitan area.

Summary of Main Findings

The study noted that expansion of COVID-19 screening led to a surge in newly diagnosed cases in NYC at the end of March 2020. Sequencing of isolates from NYC patients infected during this time period generated 90 complete or near-complete genome sequences. Analysis of these NYC sequences in the context of other published sequences revealed multiple viral introductions into the city, most of which likely came from Europe: many of the NYC sequences fall within a clade on the tree that is made up primarily of sequences from different parts of Europe. A close relationship between other NYC viral sequences and those from the Washington State outbreak also suggested domestic introductions of the virus into NYC, and a cluster of highly similar NYC sequences provided evidence of community transmission in the region. The sequence data also suggest that a period of global COVID-19 transmission between late January and mid-February went undetected and is not represented by any sequences on the tree.

Study Strengths

The samples used in this study come from a discrete place and time and are placed in the context of all the published sequences available at GISAID at that time, which may have strengthened the authors' ability to identify introductions and to comment on the NYC situation specifically. The methodology is clear and detailed, which makes the study reproducible, and the authors are careful not to over-interpret the data.


Limitations

There were relatively few sequences from New York City (90) considering the large size of the epidemic in the area. Therefore, there may have been additional introductions into NYC from both domestic and international sources that are unaccounted for in this analysis. It is also important to remember that sampling biases and uncertainty in the phylogenetic analysis can affect the inferred relationships between samples, which could lead to wrongly assuming a direct connection between locations that are actually separated by unsampled cases.

Value added

The paper finds evidence for specific introductions of the SARS-CoV-2 virus into NYC, and discusses how periods of untracked global transmission led to rapid spread of the virus to the NYC area and the rest of the world. The clustering of sequences that were exclusively from NYC provided molecular evidence of community transmission in the city even early in the outbreak (prior to March 18, 2020).