
Mapping the human genetic architecture of COVID-19

Our take —

This case-control genome-wide association study (GWAS) meta-analysis used COVID-19 cases of varying severity and population-based controls to identify host genetic variants associated with SARS-CoV-2 infection and COVID-19 hospitalization. Using data from 49,562 cases and 2,000,000 population-based controls representing 46 unique studies from 19 countries, researchers identified 13 distinct loci associated with SARS-CoV-2 infection or COVID-19. Many of the identified loci map to genes already known to be associated with elevated risk for interstitial lung disease (DPP9 and FOXP4) or with protective effects on autoimmune-related diseases (TYK2). The identified genes provide potential therapeutic targets for COVID-19; however, heterogeneity in case ascertainment, sample sizes, and phenotyping warrants additional, more detailed studies.

Study design


Study population and setting

This case-control meta-analysis included summary statistics from 46 different studies. The sample comprised 49,562 cases of European (77%), Middle Eastern (4.9%), East Asian (3.6%), South Asian (3.6%), African (4.9%), and Admixed American (7.0%) ancestry. Three main categories of COVID-19 disease were defined: SARS-CoV-2-infected individuals who were hospitalized for COVID-19 and either died or required respiratory support; cases with lab-confirmed SARS-CoV-2 infection who were hospitalized with moderate to severe COVID-19; and all cases with lab-confirmed SARS-CoV-2 infection or physician- or self-reported COVID-19. Each study ran its GWAS using SAIGE or PLINK, and meta-analyses were performed using the summary statistics from each study. A phenome-wide association study (PheWAS) was conducted to examine previously reported phenotypes for 15 index variants associated with risk of developing COVID-19. Finally, GWAS summary statistics for 43 complex disease, behavioral, neuropsychiatric, and biomarker phenotypes were chosen for genetic correlation and Mendelian randomization analyses.
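To illustrate how per-study summary statistics can be pooled, the sketch below applies a fixed-effect, inverse-variance-weighted meta-analysis to a single variant. This is a common approach for GWAS meta-analysis, but the review does not state which method the authors used, so the choice of method is an assumption here. The function name `ivw_meta` and the per-study effect sizes and standard errors are hypothetical and do not come from the study.

```python
import math

def ivw_meta(betas, ses):
    """Fixed-effect inverse-variance-weighted pooling of per-study log odds ratios.

    betas: per-study effect estimates (log odds ratios) for one variant
    ses:   matching standard errors
    """
    # Each study is weighted by the inverse of its variance.
    weights = [1.0 / se ** 2 for se in ses]
    total = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / total
    se = math.sqrt(1.0 / total)
    z = beta / se
    # Two-sided p-value under a standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return beta, se, z, p

# Hypothetical estimates for one variant from three studies (not real data).
betas = [0.20, 0.30, 0.25]
ses = [0.10, 0.08, 0.12]

beta, se, z, p = ivw_meta(betas, ses)
print(f"pooled OR = {math.exp(beta):.2f}, SE = {se:.3f}, z = {z:.2f}, p = {p:.2e}")
```

The pooled standard error is smaller than that of any single study, which is why combining many cohorts increases the power to detect variants with modest effects.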

Summary of Main Findings

Thirteen distinct loci associated with SARS-CoV-2 infection or COVID-19 were identified. The strongest signal for increased susceptibility to SARS-CoV-2 infection was at the ABO locus, with variants in two additional loci (PPP1R15A and SLC6A20) also showing associations with higher infection susceptibility. Nine of the 13 loci were associated with an increased risk of developing severe COVID-19 symptoms, including variants in DPP9 (OR 1.29, p = 2.0×10⁻¹²) and FOXP4 (OR 1.2, p = 6.0×10⁻¹³), which were previously identified as increasing lung disease risk. Variants in TYK2 previously identified as protective against autoimmune disease conferred an increased risk of hospitalization due to COVID-19 (OR 1.43, 95% CI: 1.29-1.59, p = 9.71×10⁻¹²), and a variant in KANSL1 (OR 0.96, p = 1.00×10⁻²⁰) was associated with protection against COVID-19-related hospitalization. Interestingly, heritability of SARS-CoV-2 infection was enriched in genes expressed in the lung (p = 5×10⁻⁴). Overall, this meta-analysis suggests a polygenic architecture (that is, one influenced by more than one gene) of SARS-CoV-2 infection and COVID-19 severity.
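As a reading aid, the reported odds ratio and 95% confidence interval for TYK2 can be used to roughly recover the standard error, z-score, and p-value. The sketch below assumes a Wald-type interval that is symmetric on the log-odds scale and a two-sided normal test. These are assumptions for illustration, and the helper name `p_from_or_ci` is hypothetical. Because the published OR and CI are rounded, the recovered p-value only approximates the reported one.

```python
import math

def p_from_or_ci(odds_ratio, ci_low, ci_high, z_crit=1.959964):
    """Approximate SE, z, and two-sided p from an odds ratio and its 95% CI.

    Assumes the interval is exp(beta +/- z_crit * SE) on the log-odds scale.
    """
    beta = math.log(odds_ratio)
    # The CI spans 2 * z_crit standard errors on the log scale.
    se = (math.log(ci_high) - math.log(ci_low)) / (2.0 * z_crit)
    z = beta / se
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return se, z, p

# TYK2 values as reported in the summary: OR 1.43, 95% CI 1.29-1.59.
se, z, p = p_from_or_ci(1.43, 1.29, 1.59)
print(f"SE = {se:.4f}, z = {z:.2f}, p = {p:.2e}")
```

With these rounded inputs, the result should fall within about an order of magnitude of the reported p-value and well below the conventional genome-wide significance threshold of 5×10⁻⁸.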

Study Strengths

The study population is large and drawn from multiple studies with global ancestry representation, although it is primarily of European ancestry (77%). The large sample size increases the statistical power to identify associations across a range of effect sizes.


Limitations

Although genetic variants associated with infection and disease were identified, the association signals in some regions, particularly at the SLC6A20 locus, may be driven by untagged genetic variation, as suggested by linkage disequilibrium structure and physical proximity. Variability in case ascertainment, sample sizes, and case phenotyping across the 46 included studies may bias the associations. Inclusion of untested population controls, who were assumed not to have been infected, may bias effect sizes. Lower socioeconomic status and other socio-demographic variables associated with higher SARS-CoV-2 infection risk, COVID-19 disease severity, and study sampling are likely to have introduced selection bias, which may further distort effect sizes.

Value added

This study brings together the largest number of COVID-19 host genetics studies to date using standardized methods. It provides valuable insight into the putative genes that may be involved in infection and severity, and it warrants further investigation of these genes with refined phenotypes and more diverse populations.

This review was posted on: 11 September 2021