Researchers have sequenced the most comprehensive map of the SARS-CoV-2 genome yet

Authors: Anne Trafton MIT News First Published May 17, 2021

In early 2020, a few months after the Covid-19 pandemic began, scientists were able to sequence the full genome of SARS-CoV-2, the virus that causes the Covid-19 infection. While many of its genes were already known at that point, the full complement of protein-coding genes was unresolved.

Now, after performing an extensive comparative genomics study, MIT researchers have generated what they describe as the most accurate and complete gene annotation of the SARS-CoV-2 genome. In their study, which appears today in Nature Communications, they confirmed several protein-coding genes and found that a few others that had been suggested as genes do not code for any proteins.

“We were able to use this powerful comparative genomics approach for evolutionary signatures to discover the true functional protein-coding content of this enormously important genome,” says Manolis Kellis, who is the senior author of the study and a professor of computer science in MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) as well as a member of the Broad Institute of MIT and Harvard.

The research team also analyzed nearly 2,000 mutations that have arisen in different SARS-CoV-2 isolates since it began infecting humans, allowing them to rate how important those mutations may be in changing the virus’ ability to evade the immune system or become more infectious.

Comparative genomics

The SARS-CoV-2 genome consists of nearly 30,000 RNA bases. Scientists have identified several regions known to encode protein-coding genes, based on their similarity to protein-coding genes found in related viruses. A few other regions were suspected to encode proteins, but they had not been definitively classified as protein-coding genes.

To nail down which parts of the SARS-CoV-2 genome actually contain genes, the researchers performed a type of study known as comparative genomics, in which they compare the genomes of similar viruses. The SARS-CoV-2 virus belongs to a subgenus of viruses called Sarbecovirus, most of which infect bats. The researchers performed their analysis on SARS-CoV-2, SARS-CoV (which caused the 2003 SARS outbreak), and 42 strains of bat sarbecoviruses.

Kellis has previously developed computational techniques for doing this type of analysis, which his team has also used to compare the human genome with genomes of other mammals. The techniques are based on analyzing whether certain DNA or RNA bases are conserved between species, and comparing their patterns of evolution over time.
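As a rough illustration of what "conserved between species" can mean in practice, the sketch below scores each column of a small alignment by the fraction of sequences that share the most common base. This is not the method used in the study; the function name `column_conservation` and the toy alignment are assumptions made only for this example, and real comparative genomics tools model evolution along a phylogenetic tree rather than counting identities.

```python
from collections import Counter


def column_conservation(aligned_seqs):
    """Return, per alignment column, the fraction of sequences sharing the most common base."""
    if len({len(seq) for seq in aligned_seqs}) != 1:
        raise ValueError("all aligned sequences must have the same length")
    scores = []
    for column in zip(*aligned_seqs):
        # Count of the single most frequent base in this column.
        top_count = Counter(column).most_common(1)[0][1]
        scores.append(top_count / len(column))
    return scores


# Hypothetical toy alignment of four short RNA fragments (not real viral sequence).
TOY_ALIGNMENT = ["AUGGCU", "AUGGCC", "AUGGCA", "AUGACU"]
print(column_conservation(TOY_ALIGNMENT))
```

Columns with a score of 1.0 are identical in every sequence, while lower scores mark positions that vary across the toy strains.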

Using these techniques, the researchers confirmed six protein-coding genes in the SARS-CoV-2 genome in addition to the five that are well established in all coronaviruses. They also determined that the region encoding a gene called ORF3a encodes an additional gene, which they named ORF3c. The gene has RNA bases that overlap with ORF3a but occur in a different reading frame. This gene-within-a-gene is rare in large genomes, but common in many viruses, whose genomes are under selective pressure to stay compact. The role of this new gene, as well as several other SARS-CoV-2 genes, is not known yet.
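The idea of one stretch of RNA carrying two genes in different reading frames can be sketched with a short made-up sequence. The example below translates the same bases starting at offset 0 and at offset 2 using the standard genetic code. The sequence `TOY_RNA` is invented for illustration and is not taken from ORF3a or ORF3c; the helper `translate` is likewise an assumption of this sketch.

```python
from itertools import product

# Standard genetic code, with codons ordered by base U, C, A, G at each position.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(rna, frame):
    """Translate an RNA string starting at offset `frame` (0, 1 or 2); '*' marks a stop codon."""
    codons = (rna[i:i + 3] for i in range(frame, len(rna) - 2, 3))
    return "".join(CODON_TABLE[codon] for codon in codons)


# Hypothetical 21-base sequence built so that two frames each contain a short open reading frame.
TOY_RNA = "AUGGCAUGCCUGAAUAAGUGA"

print(translate(TOY_RNA, 0))  # MACLNK*  (start codon at base 0, stop at the end)
print(translate(TOY_RNA, 2))  # GMPE*V   (a second start codon, read in a different frame)
```

The same 21 bases yield two different peptides depending on where reading begins, which is the sense in which one gene can sit inside another.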

The researchers also showed that five other regions that had been proposed as possible genes do not encode functional proteins, and they also ruled out the possibility that there are any more conserved protein-coding genes yet to be discovered.

“We analyzed the entire genome and are very confident that there are no other conserved protein-coding genes,” says Irwin Jungreis, lead author of the study and a CSAIL research scientist. “Experimental studies are needed to figure out the functions of the uncharacterized genes, and by determining which ones are real, we allow other researchers to focus their attention on those genes rather than spend their time on something that doesn’t even get translated into protein.”

The researchers also recognized that many previous papers used not only incorrect gene sets, but sometimes also conflicting gene names. To remedy the situation, they brought together the SARS-CoV-2 community and presented a set of recommendations for naming SARS-CoV-2 genes, in a separate paper published a few weeks ago in Virology.

Understanding COVID-19 through genome-wide association studies

Authors: Tom H. Karlsen  Nature Genetics volume 54, pages 368–369 (2022)

Defining the most appropriate phenotypes in genome-wide association studies of COVID-19 is challenging, and two new publications demonstrate how case-control definitions critically determine outcomes and downstream clinical utility of findings.

Exploring self-reported data from more than 700,000 participants in a direct-to-consumer ancestry genetics company, in this issue of Nature Genetics, Roberts et al. report how several commonly used phenotype definitions in COVID-19 genetics studies converge to represent either susceptibility to infection by the SARS-CoV-2 virus or risk of severe COVID-19 disease1. For pragmatic reasons, early genome-wide association studies (GWAS) in COVID-19 focused on hospitalized cases compared with unscreened and often previously genotyped controls2,3. While allowing for rapid assessments during the first and very challenging wave of the pandemic, such study designs are biased towards the biology of complications in COVID-19. The emphasis on patients with mild or no symptoms, including identification of household COVID-19 exposure as a high-risk measure, allowed the authors to conduct a deep investigation of susceptibility to SARS-CoV-2 infection through comparisons such as exposed individuals who tested positive for COVID-19 versus exposed individuals who tested negative. Not only did these assessments corroborate the controversial ABO locus as a bona fide susceptibility gene for SARS-CoV-2 infection2,4, they also suggested the presence of a hitherto unexplored pool of protective variants.

In a dedicated query of rare variants (minor allele frequency (MAF) < 0.005), also reported in this issue of Nature Genetics, Horowitz et al. identified an association signal between a non-coding X chromosome variant (rs190509934) upstream of angiotensin-converting enzyme 2 (ACE2) and protection against SARS-CoV-2 infection5. The authors go on to substantiate their finding using RNA-sequencing data from liver tissue, showing that the protective allele leads to an almost 40% reduction in ACE2 expression levels in carriers. The association inherently holds considerable plausibility, with the membrane-bound ACE2 serving as the binding site for the SARS-CoV-2 spike glycoprotein, initiating virus cell entry6. Furthermore, Horowitz et al.5 and Roberts et al.1 utilize rich phenotype data to dissect the chromosome 3p21.31 association into a susceptibility signal and a severity signal, which localize to SLC6A20 and LZTFL1, respectively, as also observed by others7. SLC6A20 encodes the sodium–imino-acid (proline) transporter 1 (SIT1), which functionally interacts with ACE2 (ref. 8), and the risk allele has been shown to associate with increased expression of SLC6A20 (ref. 2). Along with data suggesting that the receptor-binding domain of the SARS-CoV-2 spike protein preferentially interacts with blood group A9, which is encoded by the risk variant at the ABO locus, genetics of the susceptibility to SARS-CoV-2 infection appear to converge on the cell entry apparatus for the virus.
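To make the rare-variant threshold concrete, the sketch below computes a minor allele frequency from genotype counts and checks it against the MAF < 0.005 cutoff quoted above. The genotype counts are hypothetical, and the calculation assumes an autosomal variant in diploid individuals; an X chromosome variant such as rs190509934 would need males counted as carrying a single allele, which this simplified example does not do.

```python
RARE_MAF_THRESHOLD = 0.005  # cutoff quoted in the text


def minor_allele_frequency(n_ref_hom, n_het, n_alt_hom):
    """Minor allele frequency from diploid genotype counts (autosomal assumption)."""
    total_alleles = 2 * (n_ref_hom + n_het + n_alt_hom)
    alt_alleles = n_het + 2 * n_alt_hom
    alt_freq = alt_alleles / total_alleles
    # The minor allele is whichever of the two alleles is less common.
    return min(alt_freq, 1 - alt_freq)


# Hypothetical cohort of 10,000 people: 9,920 non-carriers, 79 heterozygotes, 1 homozygote.
maf = minor_allele_frequency(9920, 79, 1)
print(maf, maf < RARE_MAF_THRESHOLD)
```

With these invented counts, 81 of 20,000 alleles are the alternate allele, so the variant falls under the rare-variant cutoff.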

Critical illness in COVID-19 develops in fewer than 10% of individuals infected with SARS-CoV-2 (ref. 10). Given the window from the first symptoms of COVID-19 to onset of severe disease with respiratory failure (typically about one week)10, prediction of a severe disease course following SARS-CoV-2 infection is of considerable interest from both a clinical and a therapeutic point of view. Reliable risk stratification may guide therapeutic interventions during this lead-in period, characterized by enhanced viral replication. These interventions potentially include antiviral therapies, convalescent plasma, neutralizing monoclonal antibodies or (possibly more important for hospitalized patients) immunomodulating drugs.

Horowitz et al. found that a high genetic risk score (top 10%) based on six established severity variants was associated with a 1.65-fold and 1.75-fold higher risk of severe disease, in individuals with or without the presence of clinical risk factors such as age and diabetes, respectively5. Others have found an odds ratio of 2.0 for the impact of the rs10490770 risk allele at the 3p21.31 locus on the combined end-point of death or severe respiratory failure in an overall COVID-19 patient population11, with almost double the effect size in individuals 60 years or younger (odds ratio of 3.5). These magnitudes are comparable with those associated with clinical risk factors. Findings of lower age in individuals homozygous for the chromosome 3p21.31 risk variant support enhanced utility of genetic risk stratification in the young patient population2.
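For readers less familiar with odds ratios, the sketch below shows how one is computed from a two-by-two table of carriers and non-carriers, together with an approximate 95% confidence interval (Woolf's log-based method). All counts are hypothetical and were chosen only so that the result comes out at 2.0; they are not data from any of the cited studies, and published estimates are typically adjusted for covariates such as age and sex, which this example ignores.

```python
import math


def odds_ratio(carrier_cases, carrier_controls, noncarrier_cases, noncarrier_controls):
    """Unadjusted odds ratio from a 2x2 table."""
    return (carrier_cases * noncarrier_controls) / (carrier_controls * noncarrier_cases)


def woolf_confidence_interval(a, b, c, d, z=1.96):
    """Approximate 95% confidence interval for the odds ratio on the log scale."""
    log_or = math.log(odds_ratio(a, b, c, d))
    standard_error = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return math.exp(log_or - z * standard_error), math.exp(log_or + z * standard_error)


# Hypothetical counts: severe vs. non-severe outcomes among risk-allele carriers and non-carriers.
counts = (40, 160, 100, 800)
print(odds_ratio(*counts))                 # 2.0
print(woolf_confidence_interval(*counts))  # interval around 2.0
```

An interval that excludes 1.0 is the usual informal sign that the association is unlikely to be explained by chance alone in the sampled data.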

The execution of GWAS in COVID-19 has been remarkably nimble, due in part to robust collaborative networks set up during past GWAS12, as well as the utilization of previously genotyped study populations such as the UK Biobank, AncestryDNA and 23andMe1,3,4,5. The rapid phenotyping undertaken by several biobanks and direct-to-consumer genetics companies during the COVID-19 pandemic is unprecedented, and the resulting publications deserve acknowledgement as a form of ‘population-level testing’ for genetic clues in emerging diseases. The orchestration of projects by the COVID-19 Host Genetics Initiative has also been an important catalyst of activities13. Figure 1 summarizes published and peer-reviewed GWAS articles on COVID-19. However, even at the time of writing, the meta-analysis of the sixth data freeze of the COVID-19 Host Genetics Initiative has been released online, reporting on a total of 23 loci involved in COVID-19 susceptibility (7 loci) and severity (15 loci), adding 10 new loci to the consortium’s own publication only 3 months ago7. The 22-month period that has passed since the publication of the first COVID-19 GWAS2 appears even more impressive in comparison with the 7 years of Crohn’s disease genetics (spanning from the 2001 nucleotide-binding oligomerization domain 2 (NOD2) susceptibility gene discovery to a 2008 meta-analysis14,15) that it took to achieve the same amount of insight. Further exemplified by the 20-year history of genetics of Crohn’s disease, translational studies of GWAS findings take time, but may reveal new and unexpected aspects of pathophysiology. It is in this context that the rapid unravelling of COVID-19 genetics becomes important. Some of the loci hold immediate biological plausibility (for example, ACE2 and some of the chemokines), whereas the underlying mechanisms of others remain obscure.

Following this recent sprint of COVID-19 GWAS to which Horowitz et al.5 and Roberts et al.1 significantly contribute, the subsequent translational ultramarathon of biological studies can begin — and with this a deeper understanding of the pathophysiology of SARS-CoV-2 infection and its complications will emerge. Vaccination has proven the ultimate protection against SARS-CoV-2 infection. The hope is that the biological insights provided by COVID-19 GWAS will facilitate identification and development of novel treatment options of not only hospitalized and critically ill COVID-19 patients, but also treatment modalities that can prevent hospitalization.

Fig. 1: Genetic loci from COVID-19 GWAS in peer-reviewed publications to date.

References

  1. Roberts, G. H. L. et al. Nat. Genet. https://doi.org/10.1038/s41588-022-01042-x (2022).
  2. Ellinghaus, D. et al. N. Engl. J. Med. 383, 1522–1534 (2020).
  3. Pairo-Castineira, E. et al. Nature 591, 92–98 (2021).
  4. Shelton, J. F. et al. Nat. Genet. 53, 801–808 (2021).
  5. Horowitz, J. E. et al. Nat. Genet. (in the press).
  6. Yan, R. et al. Science 367, 1444–1448 (2020).
  7. COVID-19 Host Genetics Initiative. Nature https://doi.org/10.1038/s41586-021-03767-x (2021).
  8. Kuba, K. et al. Pharmacol. Ther. 128, 119–128 (2010).
  9. Wu, S. C. et al. Blood Adv. 5, 1305–1309 (2021).
  10. Berlin, D. A., Gulick, R. M. & Martinez, F. J. N. Engl. J. Med. 383, 2451–2460 (2020).
  11. Nakanishi, T. et al. J. Clin. Invest. https://doi.org/10.1172/jci152386 (2021).
  12. Bulik-Sullivan, B. K. & Sullivan, P. F. Nat. Genet. 44, 113 (2012).
  13. The COVID-19 Host Genetics Initiative. Eur. J. Hum. Genet. 28, 715–718 (2020).
  14. Hugot, J. P. et al. Nature 411, 599–603 (2001).
  15. Barrett, J. C. et al. Nat. Genet. 40, 955–962 (2008).

COVID-19 outcomes and the human genome

Authors: Michael F. Murray, MD; Eimear E. Kenny, PhD; Marylyn D. Ritchie, PhD; Daniel J. Rader, MD; Allen E. Bale, MD; Monica A. Giovanni, MS, CGC; & Noura S. Abul-Husn, MD, PhD  Genetics in Medicine volume 22, pages 1175–1177

BACKGROUND

In the COVID-19 pandemic, the opportunity to link host genomic factors to the highly variable clinical manifestations of SARS-CoV-2 infection has been widely recognized.1,2 The overt motivation for this research is the clinical implementation of any new insights to improve clinical management and foster better patient outcomes.

Human infection is a complex interaction between the microbe, the environment, and the human host.3 Variation in the human genome has only rarely been linked to complete resistance to infection by a specific microbe; far more commonly host genomic variability has been linked to complications associated with infections (see Table 1).3,4,5 In this pandemic, the ability to identify host genomic factors that increase susceptibility or resistance to the complications of COVID-19 and to translate these findings to improved patient care should be the goal.

Table 1 Sample characteristics.

Several approaches can be taken to uncover relevant host genomic factors. Familial and population-based linkage analyses and analyses of extreme phenotypes can uncover monogenic variants contributing to COVID-19 clinical outcomes.6 Genome-wide association studies (GWAS)7,8 and multiomic-based approaches can be used to uncover common variants and biological networks underlying host-pathogen interactions. Likewise, data derived from genomes, such as HLA haplotypes, ABO blood groups, and polygenic risk scores (PRS),9 can be used to understand COVID-19 susceptibility, resistance, and complications. Furthermore, biobanks linking genomic data to electronic health records (EHRs)10 can be leveraged to investigate the impact of these genomic factors on the clinical course of SARS-CoV-2 infected patients.

Many recognize that this area of research needs to go forward in a manner that is proactively inclusive of traditionally underserved populations to both avoid the exacerbation of existing health-care disparities and to optimize discovery. Past efforts have demonstrated the value of this type of inclusion, as was seen in the extension of a CCR5-associated delta 32 correlation to HIV-1 infection in individuals with European ancestry to a promoter variant in CCR5 linked to perinatal HIV-1 transmission in individuals with African ancestry.11,12,13

As host genomic factors are discovered, new strategies supporting rapid clinical implementation should be trialed to realize improvement in outcomes for SARS-CoV-2 infected patients. Implementation will require an infrastructure to deliver relevant genomic results to infected patients and their health-care providers to guide clinical management. This commentary examines the types of genomic factors that might be identified in emerging COVID-19 discovery and implementation research, based on decades of genomic discovery, research into other human infections, and advances in genomic medicine.

PHASES OF PHENOTYPE ASCERTAINMENT IN THE COVID PANDEMIC

In this fast-moving pandemic, we believe there will be at least two phases to defining COVID-19 related phenotypes. Currently in the United States, we are in an initial phase when important limitations influence the ability of research teams to ascertain and appropriately define phenotypes of interest. These limitations include (1) the absence of widespread viral and serologic testing to accurately distinguish those who have been infected from those who have not, (2) the lack of knowledge about infection exposure at a community level, and (3) institutional limits to recruiting human subjects in a time of social distancing. Heterogeneity of testing strategies and their sensitivity, and nascent regulatory oversight may pose challenges in clear and reproducible definitions of COVID-19-related phenotypes. In the second phase, adequate serologic testing may allow for increased numbers and more accurate discrimination of cases and controls, as well as the ability to define additional clinical phenotypes of interest (e.g., asymptomatic seropositive individuals). The use of telemedicine, which has expanded for health-care delivery during the pandemic, in addition to community outreach efforts, can overcome barriers to recruitment in this infectious disease outbreak.

To find important genotype–phenotype correlations, there will need to be phenotypes that are ascertained in a manner that is clear, quantitative, and reproducible, and there will need to be adequate sampling from well-defined cases and controls. One rubric that can be used for phenotyping during this initial phase of COVID-19 host genomic research is the Ordinal Scale for Clinical Improvement proposed by the World Health Organization (WHO) in their blueprint for therapeutic trials (see Supplemental Table 1).14 For instance, this scale can be applied across research groups and across health systems in order to allow phenotypic groupings of COVID-19 patients based on (1) need for hospitalization, (2) need for oxygen supplementation, (3) progression to respiratory failure, or (4) mortality, and these phenotypes could be readily extracted from EHRs. In the current initial phase of the COVID-19 pandemic, difficulties with the enrollment and appropriate scoring of uninfected, asymptomatic, or mildly affected patients (categories 0–2 in Supplemental Table 1) are anticipated. Specifically, asymptomatic positives will be mistakenly scored as 0 instead of 1 without either viral screening or serologic testing. In addition, patients who would be scored 0–2 are difficult to recruit and consent given the social distancing limitations that are currently in place. As serologic testing becomes more sophisticated, widespread, and robust, it is anticipated that COVID-19-related phenotyping will become more standard, facilitating reproducible and scalable COVID-19 research.

CANDIDATE GENES AND PATHWAYS

At least three lines of inquiry might inform the nomination of candidate genes for intensive interrogation with COVID-19 phenotypes: (1) what do we know about the microbial life cycle, (2) what do clinical observations in patients suggest with regard to biological pathways that are likely being triggered, and (3) what does the literature teach us about host genetics in infection that could apply to this novel infection. For example, the cellular surface receptor for SARS-CoV-2 virus is encoded by the ACE2 gene, and critical amino acid residues in the binding interaction have been described.15,16 This and other insights into host–pathogen interactions will elucidate specific variants, genes, and pathways underlying interindividual COVID-19 susceptibility and response. Genes and pathways related to COVID-19 could also include other viral receptor genes (e.g., TMPRSS2) (unpublished data: https://doi.org/10.1101/2020.03.30.20047878), inflammatory and immune response pathways (e.g., IL-6 pathway), and genes involved in hypercoagulability and acute respiratory distress syndrome.17 Other genes that may be of interest include genes associated with ABO blood group (e.g., FUT2) in light of a report on an association between blood groups and COVID-19 in China (unpublished data: https://doi.org/10.1101/2020.03.11.20031096) as well as similar associations in the past.18 Research into the genetics of the interplay between viral infection and common diseases (e.g., diabetes and heart disease) is also of interest to many investigators. As our understanding of genes underlying SARS-CoV-2 infectivity and biological mechanisms grows, we will better elucidate their potential involvement in disease susceptibility and clinical outcomes.

GENOME-SCALE APPROACHES FOR DISCOVERY AND RISK PREDICTION

In tandem, the global scientific community has rapidly mobilized collaborative efforts to advance unbiased genome-wide COVID-19 host genomic discovery through large-scale genomic studies. For example, the COVID-19 Host Genetics Initiative is organizing analytical activities across a growing network of over 120 studies to identify genomic determinants of COVID-19 susceptibility and severity.1 It is difficult at this stage to estimate the number of research participants needed to identify host genomic factors related to the COVID-19 novel pathogenic exposure. If we assume that the effect size and allele frequency of genetic variants important for COVID-19 susceptibility, resistance, and/or complications are as variable as other host factors in infectious conditions (i.e., Supplemental Table 1), then the number of cases and controls needed to have statistical power to identify associations could vary widely. Collaborative efforts like the COVID-19 Host Genetics Initiative should be well-powered for the unbiased discovery of novel genes and pathways. Such efforts foster data aggregation and sharing broadly among the research community and are likely to greatly impact the speed with which COVID-19 discoveries can be made and disseminated worldwide.
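The point that required sample sizes "could vary widely" with effect size and allele frequency can be illustrated with a textbook two-proportion power calculation. The sketch below is a simplified approximation and not the consortium's method: it assumes equal numbers of cases and controls, treats each person as contributing two independent alleles, uses a genome-wide significance threshold of 5e-8, and targets 80% power. The function names and the example allele frequencies and odds ratios are hypothetical.

```python
import math
from statistics import NormalDist


def case_allele_frequency(control_freq, odds_ratio):
    """Allele frequency in cases implied by a per-allele odds ratio."""
    odds = odds_ratio * control_freq / (1 - control_freq)
    return odds / (1 + odds)


def cases_needed(control_freq, odds_ratio, alpha=5e-8, power=0.8):
    """Approximate number of cases (with as many controls) for an allelic test."""
    p0 = control_freq
    p1 = case_allele_frequency(p0, odds_ratio)
    p_bar = (p0 + p1) / 2
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    numerator = (
        z_alpha * math.sqrt(2 * p_bar * (1 - p_bar))
        + z_beta * math.sqrt(p0 * (1 - p0) + p1 * (1 - p1))
    ) ** 2
    alleles_per_group = numerator / (p1 - p0) ** 2
    # Each individual contributes two alleles under the stated assumption.
    return math.ceil(alleles_per_group / 2)


# Hypothetical scenarios: common vs. rare allele, modest vs. large effect.
for freq, effect in [(0.30, 1.2), (0.30, 2.0), (0.01, 1.2), (0.01, 2.0)]:
    print(freq, effect, cases_needed(freq, effect))
```

Under these assumptions, a rare allele with a modest effect needs far more participants than a common allele with a large effect, which is one reason pooled efforts across many studies are attractive.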

In aggregate, knowledge of host genomic factors could lead to improved care for patients with COVID-19, through risk stratification, as well as targeted prevention and treatment options. For example, GWAS discovery efforts could yield PRS for COVID-19 clinical outcomes, which could be used in the context of other clinical data to risk stratify patients early in the disease course. Host genomic factors could be linked to variability in the protective immune response and have implications for vaccination strategies, or could be used to optimally select patients for novel therapeutic treatments and trials. However, as it can take many years for genomic discoveries to directly benefit patients,10 in parallel we need to prepare our health systems with infrastructure to rapidly integrate high quality, clinically relevant COVID-19 host genomic findings into the care of individuals with SARS-CoV-2 infection.

CONCLUSIONS

The COVID-19 pandemic currently threatens to overwhelm health-care systems and undermine economies. There is no proven therapeutic and no vaccine for the novel coronavirus causing this pandemic. In this moment, we emphasize the sentiments voiced by the COVID-19 Host Genetics Initiative, namely that “[i]nsights into how to better understand and treat COVID-19 are desperately needed. Given the importance and urgency in obtaining these insights, it is critical for the scientific community to come together around this shared purpose.” 1

As the community works together to develop a COVID-19 host genomics research engine, we are poised for novel discovery and advances in genomic medicine. A model to understand human genomic variants linked to COVID-19 outcomes can be conceived as a continuum from ultrarare to common. We offer Supplemental Table 2 as a way to think about findings that can be expected from this research.19,20 It is imperative that the research community prioritize high-quality and reproducible findings, even under the pressure for expediency, and be mindful of ethical, legal, or social issues that could emerge related to the COVID-19 impact among different groups within society.

References

  1. The COVID-19 Host Genetics Initiative. https://www.covid19hg.org. Accessed 15 April 2020.
  2. The COVID Human Genetic Effort. https://www.covidhge.com. Accessed 15 April 2020.
  3. Murray MF. Susceptibility and response to infection. In: Rimoin DL, Connor JM, Pyeritz RE, et al., editors. Emery and Rimoin’s principles and practice of medical genetics. 6th ed. London: Churchill Livingstone; 2013.
  4. Gabriel SE, Brigman KN, Koller BH, et al. Cystic fibrosis heterozygote resistance to cholera toxin in the cystic fibrosis mouse model. Science. 1994;266:107–109.
  5. Ahuja SK, He W. Double-edged genetic swords and immunity: lesson from CCR5 and beyond. J Infect Dis. 2010;201:171–174.
  6. Ciancanelli MJ, Huang SX, Luthra P, et al. Infectious disease. Life-threatening influenza and impaired interferon amplification in human IRF7 deficiency. Science. 2015;348:448–453.
  7. European Bioinformatics Institute. The NHGRI-EBI catalog of published genome-wide association studies (GWAS). https://www.ebi.ac.uk/gwas/. Accessed 15 April 2020.
  8. Visscher PM, Brown MA, McCarthy MI, Yang J. Five years of GWAS discovery. Am J Hum Genet. 2012;90:7–24.
  9. Khera AV, Chaffin M, Aragam KG, et al. Genome-wide polygenic scores for common diseases identify individuals with risk equivalent to monogenic mutations. Nat Genet. 2018;50:1219–1224.
  10. Abul-Husn NS, Kenny EE. Personalized medicine and the power of electronic health records. Cell. 2019;177:58–69.
  11. Samson M, Libert F, Doranz BJ, et al. Resistance to HIV-1 infection in Caucasian individuals bearing mutant alleles of the CCR-5 chemokine receptor gene. Nature. 1996;382:722–725.
  12. Liu R, Paxton WA, Choe S, et al. Homozygous defect in HIV-1 coreceptor accounts for resistance of some multiply-exposed individuals to HIV-1 infection. Cell. 1996;86:367–377.
  13. Kostrikis LG, Neumann AU, Thomson B, et al. A polymorphism in the regulatory region of the CC-chemokine receptor 5 gene influences perinatal transmission of human immunodeficiency virus type 1 to African-American infants. J Virol. 1999;73(Dec):10264–10271. PMID: 1055934
  14. World Health Organization. R&D blueprint novel coronavirus COVID-19 therapeutic trial synopsis. 18 February 2020. https://www.who.int/blueprint/priority-diseases/key-action/COVID-19_Treatment_Trial_Design_Master_Protocol_synopsis_Final_18022020.pdf. Accessed 15 April 2020.
  15. Yan R, Zhang Y, Li Y, et al. Structural basis for the recognition of SARS-CoV-2 by full-length human ACE2. Science. 2020;367:1444–1448.
  16. Shang J, Ye G, Shi K, et al. Structural basis of receptor recognition by SARS-CoV-2. Nature. 2020 Mar 30; https://doi.org/10.1038/s41586-020-2179-y [Epub ahead of print].
  17. Hernández-Beeftink T, Guillen-Guio B, Villar J, Flores C. Genomics and the acute respiratory distress syndrome: current and future directions. Int J Mol Sci. 2019;20:E4004.
  18. Cooling L. Blood groups in infection and host susceptibility. Clin Microbiol Rev. 2015;28:801–870.
  19. Yang Y, Muzny DM, Xia F, et al. Molecular findings among patients referred for clinical whole-exome sequencing. JAMA. 2014;312:1870–1879.
  20. Farwell KD, Shahmirzadi L, El-Khechen D, et al. Enhanced utility of family-centered diagnostic exome sequencing with inheritance model-based analysis: results from 500 unselected families with undiagnosed genetic conditions. Genet Med. 2015;17:578–586

COVID-19 alters human genes, explaining mystery behind coronavirus ‘long haulers’

Authors: Chris Melore Published APRIL 28, 2021 Study Finds

For some COVID-19 patients, getting over their infection is just the beginning of the recovery. Over the last year, COVID “long haulers” have continued experiencing a variety of symptoms months after the virus clears. These include anything from skin problems, to shortness of breath, to losing the sense of taste or smell. Now, researchers say they may know why this is happening. A new study finds coronavirus actually causes long-term changes to an infected patient’s genes.

Specifically, scientists reveal the spike protein of SARS-CoV-2, the virus causing COVID-19, creates long-lasting changes to human gene expression. These tiny spikes cover the surface of coronavirus particles. They allow the virus to bind to certain receptors on human cells and hijack their functions, leading to COVID infection. Once the spike cuts into a patient’s cells, the virus releases its own genetic material into the cell so it can replicate.

“We found that exposure to the SARS-CoV-2 spike protein alone was enough to change baseline gene expression in airway cells,” explains Nicholas Evans, a master’s student at the Texas Tech University Health Sciences Center, in a media release. “This suggests that symptoms seen in patients may initially result from the spike protein interacting with the cells directly.”

Spikes make long-term changes to human lung cells

Researchers examined how exposure to spike protein impacts cultured human airway cells in lab experiments. They also compared the results to studies using cell samples from actual COVID-19 patients.

The team notes culturing human airway cells requires time and specific conditions which help the cells mature. This allows the lab cells to develop into the different cells living in a real human airway. To do this, study authors refined a culturing technique called air-liquid interface so they could more closely simulate the conditions in an actual patient’s lungs.

After culturing, scientists exposed the cells to low and high concentrations of purified spike protein. The results reveal differences in gene expression which remained in the cells even after the infection passed. The most affected genes include ones controlling the body’s inflammatory response.

“Our work helps to elucidate changes occurring in patients on the genetic level, which could eventually provide insight into which treatments would work best for specific patients,” Evans explains.

Study authors now plan to use this approach to examine how long these genetic changes last. They also hope to reveal what other long-term consequences a COVID infection will have on a patient’s health.

The team is presenting their findings at Experimental Biology (EB) 2021, a virtual meeting of the American Society for Biochemistry and Molecular Biology.

Chair of Lancet COVID-19 Commission: Investigate Origins of COVID

Authors: Michelle Edwards, May 24, 2022

The Chairman of The Lancet’s COVID-19 Commission has called for an independent inquiry into the origins of the SARS-CoV-2 virus. Jeffrey Sachs, a world-renowned economics professor, stated on May 19 that U.S. laboratory experiments may have contributed to the emergence of COVID-19. In an argument published in PNAS, a peer-reviewed journal of the National Academy of Sciences (NAS), Sachs has called on universities to open up their databases for close examination amid fears that laboratories were genetically modifying viruses.

Sachs maintains that “there is much important information that can be gleaned from U.S.-based research institutions, information not yet made available for independent, transparent, and scientific scrutiny.” He insists that critical data available in the U.S. from these institutions “would explicitly include, but are not limited to, viral sequences gathered as part of the PREDICT project and other funded programs, as well as sequencing data and laboratory notebooks from U.S. laboratories.” He wrote:

“We call on U.S. government scientific agencies, most notably the NIH, to support a full, independent, and transparent investigation of the origins of SARS-CoV-2. This should take place, for example, within a tightly focused science-based bipartisan Congressional inquiry with full investigative powers, which would be able to ask important questions—but avoid misguided witch-hunts governed more by politics than by science.”

Sachs, who wrote the article with Neil L. Harrison, said it was apparent scientists from the University of North Carolina (UNC) and New York-based EcoHealth Alliance (EHA) had been working with the Wuhan Institute of Virology (WIV) to manipulate viruses. The authors note that the bulk of the work done at WIV “was part of an active and highly collaborative U.S.-China scientific research program funded by the U.S. Government (NIH, Defense Threat Reduction Agency [DTRA], and U.S. Agency for International Development [USAID]).”

Furthermore, they point out that although the work was coordinated by researchers at EcoHealth Alliance (EHA), it also involved researchers at several other U.S. institutions. The article states, “For this reason, it is important that U.S. institutions be transparent about any knowledge of the detailed activities that were underway in Wuhan and the United States,” adding, “The evidence may also suggest that research institutions in other countries were involved, and those too should be asked to submit relevant information (e.g., with respect to unpublished sequences).”

Indeed, in addition to EHA, participating U.S. institutions include the University of North Carolina (UNC), the University of California at Davis (UCD), the NIH, and the USAID. Under a series of NIH grants and USAID contracts, the authors note that EHA coordinated the collection of SARS-like bat CoVs from the field in Southwest China and Southeast Asia. Researchers then “coordinated the sequencing of these viruses, the archiving of these sequences (involving UCD), and the analysis and manipulation of these viruses (notably at UNC).” Undoubtedly, a large part of the research was done in the United States. The authors point out:

“The exact details of the fieldwork and laboratory work of the EHA-WIV-UNC partnership, and the engagement of other institutions in the United States and China, has not been disclosed for independent analysis. The precise nature of the experiments that were conducted, including the full array of viruses collected from the field and the subsequent sequencing and manipulation of those viruses, remains unknown.

Instead of disclosing their research activities to the U.S. scientific community and the general public, the EHA, UNC, NIH, USAID, and other research partners have insisted they were not involved in any experiments that could have resulted in the emergence of SARS-CoV-2. Specifically, the NIH has stated there is “a significant evolutionary distance between the published viral sequences and that of SARS-CoV-2 and that the pandemic virus could not have resulted from the work sponsored by NIH.”

The authors argue this assertion by the NIH is only as good as the limited data on which it is based, adding the validation of this assertion relies upon gaining access to any other unpublished viral sequences that are deposited in relevant U.S. and Chinese databases. They remarked:

“On May 11, 2022, Acting NIH Director Lawrence Tabak testified before Congress that several such sequences in a U.S. database were removed from public view, and that this was done at the request of both Chinese and U.S. investigators.”

Sachs and Harrison insist that even though the NIH and USAID have “strenuously resisted” full disclosure of the details of the EHA-WIV-UNC work program, several documents leaked to the public or released through the Freedom of Information Act (FOIA) have raised concerns. The experts refer to particular circumstances surrounding the presence of an “unusual furin cleavage site (FCS)” in SARS-CoV-2 that augments the pathogenicity and transmissibility of the virus relative to related viruses like SARS-CoV-1. Describing their concern in more depth, they explain:

“SARS-CoV-2 is, to date, the only identified member of the subgenus sarbecovirus that contains an FCS, although these are present in other coronaviruses. A portion of the sequence of the spike protein of some of these viruses is illustrated in the alignment shown in Fig. 1, illustrating the unusual nature of the FCS and its apparent insertion in SARS-CoV-2. From the first weeks after the genome sequence of SARS-CoV-2 became available, researchers have commented on the unexpected presence of the FCS within SARS-CoV-2—the implication being that SARS-CoV-2 might be a product of laboratory manipulation. In a review piece arguing against this possibility, it was asserted that the amino acid sequence of the FCS in SARS-CoV-2 is an unusual, nonstandard sequence for an FCS and that nobody in a laboratory would design such a novel FCS.”

This alignment of the amino acid sequences of coronavirus spike proteins in the region of the S1/S2 junction illustrates the sequence of SARS-CoV-2 (Wuhan-Hu-1) and some of its closest relatives. The furin cleavage site (FCS) is indicated (PRRAR’SVAS), and furin cuts the spike protein between R and S, as indicated by the red arrowhead. Adapted from Chan & Zhan (15).

Emphatically, the duo insists the argument that the FCS in SARS-CoV-2 is “an unusual, nonstandard amino acid sequence” is false. Offering an in-depth explanation in their paper, Sachs and Harrison say they “do not know whether the insertion of the FCS was the result of evolution, perhaps via a recombination event in an intermediate mammal or human, or was the result of deliberate introduction of the FCS into a SARS-like virus as part of a laboratory experiment.” Noting that the researchers were already familiar “with several experiments involving the successful insertion of an FCS into SARS-CoV-1 and other coronaviruses,” they added:

“We do know that the insertion of such FCS sequences into SARS-like viruses was a specific goal of work proposed by the EHA-WIV-UNC partnership within a 2018 grant proposal (“DEFUSE”) that was submitted to the U.S. Defense Advanced Research Projects Agency (DARPA). The 2018 proposal to DARPA was not funded, but we do not know whether some of the proposed work was subsequently carried out in 2018 or 2019, perhaps using another source of funding.”

Amino acid alignment of the furin cleavage sites of SARS-CoV-2 spike protein with (Top) the spike proteins of other viruses that lack the furin cleavage site and (Bottom) the furin cleavage sites present in the α subunits of human and mouse ENaC. Adapted from Anand et al. (16).

Harrison and Sachs write that the EHA-WIV-UNC research team would also be familiar with the FCS sequence and the FCS-dependent activation mechanism of human ENaC, which was extensively characterized at UNC. They insist while the “molecular mimicry of ENaC within the SARS-CoV-2 spike protein might be a mere coincidence,” it is unlikely that is the case. Indeed, they explain the exact FCS sequence present in SARS-CoV-2 was recently introduced into the spike protein of SARS-CoV-1 in the laboratory in a series of “elegant” experiments with predictable consequences in terms of improved viral transmissibility and pathogenicity.

Reflecting on the fact several researchers raised genuine concerns in Feb. 2020 over the possibility that SARS-CoV-2 emerged from a research-associated event, Sachs and Harrison maintain transparency from the federal government is essential. They explain, based on the previous work executed by these government-funded researchers, the probability of a lab producing and releasing a novel pathogen like COVID-19 is high, adding:

“These simple experiments show that the introduction of the 12 nucleotides that constitute the FCS insertion in SARS-CoV-2 would not be difficult to achieve in a lab. It would therefore seem reasonable to ask that electronic communications and other relevant data from U.S. groups should be made available for scrutiny.”
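As a quick arithmetic check on the “12 nucleotides” figure: three nucleotides make one codon, so 12 nucleotides encode four amino acids. The Python sketch below is illustrative only. It assumes the inserted codons are CCT CGG CGG GCA (as in the published Wuhan-Hu-1 reference genome, a detail the article does not state) and it searches for the minimal furin motif R-X-X-R, which is a simplification of real furin specificity. The names `CODON_TABLE`, `INSERTION`, `translate`, and `find_minimal_furin_motif` are hypothetical and do not come from the paper.

```python
import re

# Only the codons needed here, taken from the standard genetic code.
CODON_TABLE = {"CCT": "P", "CGG": "R", "GCA": "A"}

# Assumption: the 12 inserted nucleotides, read in frame (not given in the article).
INSERTION = "CCTCGGCGGGCA"


def translate(nucleotides: str) -> str:
    """Translate an in-frame nucleotide string into one-letter amino acids."""
    codons = [nucleotides[i:i + 3] for i in range(0, len(nucleotides), 3)]
    return "".join(CODON_TABLE[codon] for codon in codons)


def find_minimal_furin_motif(protein: str):
    """Return (start index, motif) for the first R-X-X-R match, or None."""
    match = re.search(r"R..R", protein)
    return (match.start(), match.group()) if match else None


inserted_residues = translate(INSERTION)      # 12 nucleotides -> 4 residues
site = find_minimal_furin_motif("PRRARSVAS")  # sequence from the figure caption
print(inserted_residues, site)
```

Under these assumptions the four inserted residues are P, R, R, and A, and the motif ends at the arginine immediately before the serine, which matches the cleavage position between R and S described in the figure caption.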

Whole genome sequencing reveals host factors underlying critical Covid-19

Authors: Athanasios Kousathanas, Erola Pairo-Castineira, J. Kenneth Baillie

Published: Nature

We are providing an unedited version of this manuscript to give early access to its findings. Before final publication, the manuscript will undergo further editing. Please note there may be errors present which affect the content, and all legal disclaimers apply.

Abstract

Critical Covid-19 is caused by immune-mediated inflammatory lung injury. Host genetic variation influences the development of illness requiring critical care [1] or hospitalisation [2–4] following SARS-CoV-2 infection. The GenOMICC (Genetics of Mortality in Critical Care) study enables the comparison of genomes from critically-ill cases with population controls in order to find underlying disease mechanisms. Here, we use whole genome sequencing in 7,491 critically-ill cases compared with 48,400 controls to discover and replicate 23 independent variants that significantly predispose to critical Covid-19. We identify 16 new independent associations, including variants within genes involved in interferon signalling (IL10RB, PLSCR1), leucocyte differentiation (BCL11A), and blood type antigen secretor status (FUT2). Using transcriptome-wide association and colocalisation to infer the effect of gene expression on disease severity, we find evidence implicating multiple genes, including reduced expression of a membrane flippase (ATP11A), and increased mucin expression (MUC1), in critical disease. Mendelian randomisation provides evidence in support of causal roles for myeloid cell adhesion molecules (SELE, ICAM5, CD209) and coagulation factor F8, all of which are potentially druggable targets. Our results are broadly consistent with a multi-component model of Covid-19 pathophysiology, in which at least two distinct mechanisms can predispose to life-threatening disease: failure to control viral replication, or an enhanced tendency towards pulmonary inflammation and intravascular coagulation. We show that comparison between critically-ill cases and population controls is highly efficient for detection of therapeutically-relevant mechanisms of disease.

Author information

Author notes

  1. These authors contributed equally: Athanasios Kousathanas, Erola Pairo-Castineira
  2. These authors jointly supervised this work: Sara Clohisey Hendry, Loukas Moutsianas, Andy Law, Mark J Caulfield, J. Kenneth Baillie
  3. A list of authors and their affiliations appears in the Supplementary Information

Affiliations

  1. Genomics England, London, UK: Athanasios Kousathanas, Alex Stuckey, Christopher A. Odhams, Susan Walker, Daniel Rhodes, Afshan Siddiq, Peter Goddard, Sally Donovan, Tala Zainy, Fiona Maleady-Crowe, Linda Todd, Shahla Salehi, Greg Elgar, Georgia Chan, Prabhu Arumugam, Christine Patch, Augusto Rendon, Tom A. Fowler, Richard H. Scott, Loukas Moutsianas & Mark J. Caulfield
  2. Roslin Institute, University of Edinburgh, Easter Bush, Edinburgh, UK: Erola Pairo-Castineira, Konrad Rawlik, Clark D. Russell, Jonathan Millar, Fiona Griffiths, Wilna Oosthuyzen, Bo Wang, Marie Zechner, Nick Parkinson, Albert Tenesa, Sara Clohisey Hendry, Andy Law & J. Kenneth Baillie
  3. MRC Human Genetics Unit, Institute of Genetics and Cancer, University of Edinburgh, Western General Hospital, Crewe Road, Edinburgh, UK: Erola Pairo-Castineira, Lucija Klaric, Albert Tenesa, Chris P. Ponting, Veronique Vitart, James F. Wilson, Andrew D. Bretherick & J. Kenneth Baillie
  4. Centre for Inflammation Research, The Queen’s Medical Research Institute, University of Edinburgh, 47 Little France Crescent, Edinburgh, UK: Clark D. Russell & J. Kenneth Baillie
  5. Wellcome Centre for Human Genetics, University of Oxford, Roosevelt Drive, Oxford, UK: Tomas Malinauskas, Katherine S. Elliott & Julian Knight
  6. Institute for Molecular Bioscience, The University of Queensland, Brisbane, Australia: Yang Wu
  7. Biostatistics Group, Greater Bay Area Institute of Precision Medicine (Guangzhou), Fudan University, Guangzhou, China: Xia Shen
  8. Centre for Global Health Research, Usher Institute of Population Health Sciences and Informatics, Teviot Place, Edinburgh, UK: Xia Shen, Albert Tenesa & James F. Wilson
  9. Edinburgh Clinical Research Facility, Western General Hospital, University of Edinburgh, Edinburgh, UK: Kirstie Morrice, Angie Fawkes & Lee Murphy
  10. Intensive Care Unit, Royal Infirmary of Edinburgh, 54 Little France Drive, Edinburgh, UK: Sean Keating, Timothy Walsh & J. Kenneth Baillie
  11. Department of Critical Care Medicine, Queen’s University and Kingston Health Sciences Centre, Kingston, ON, Canada: David Maslove
  12. Clinical Research Centre at St Vincent’s University Hospital, University College Dublin, Dublin, Ireland: Alistair Nichol
  13. NIHR Health Protection Research Unit for Emerging and Zoonotic Infections, Institute of Infection, Veterinary and Ecological Sciences, University of Liverpool, Liverpool, UK: Malcolm G. Semple
  14. Respiratory Medicine, Alder Hey Children’s Hospital, Institute in The Park, University of Liverpool, Alder Hey Children’s Hospital, Liverpool, UK: Malcolm G. Semple
  15. Illumina Cambridge, 19 Granta Park, Great Abington, Cambridge, UK: David Bentley & Clare Kingsley
  16. Regeneron Genetics Center, 777 Old Saw Mill River Rd., Tarrytown, USA: Jack A. Kosmicki, Julie E. Horowitz, Aris Baras, Goncalo R. Abecasis & Manuel A. R. Ferreira
  17. Geisinger, Danville, PA, USA: Anne Justice, Tooraj Mirshahi & Matthew Oetjens
  18. Department of Genetics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA: Daniel J. Rader, Marylyn D. Ritchie & Anurag Verma
  19. Test and Trace, the Health Security Agency, Department of Health and Social Care, Victoria St, London, UK: Tom A. Fowler
  20. Department of Intensive Care Medicine, Guy’s and St. Thomas NHS Foundation Trust, London, UK: Manu Shankar-Hari
  21. Department of Medicine, University of Cambridge, Cambridge, UK: Charlotte Summers
  22. William Harvey Research Institute, Barts and the London School of Medicine and Dentistry, Queen Mary University of London, London, UK: Charles Hinds
  23. Centre for Tropical Medicine and Global Health, Nuffield Department of Medicine, University of Oxford, Old Road Campus, Roosevelt Drive, Oxford, UK: Peter Horby
  24. Department of Anaesthesia and Intensive Care, The Chinese University of Hong Kong, Prince of Wales Hospital, Hong Kong, China: Lowell Ling
  25. Wellcome-Wolfson Institute for Experimental Medicine, Queen’s University Belfast, Belfast, Northern Ireland, UK: Danny McAuley
  26. Department of Intensive Care Medicine, Royal Victoria Hospital, Belfast, Northern Ireland, UK: Danny McAuley
  27. UCL Centre for Human Health and Performance, London, UK: Hugh Montgomery
  28. National Heart and Lung Institute, Imperial College London, London, UK: Peter J. M. Openshaw
  29. Imperial College Healthcare NHS Trust: London, London, UK: Peter J. M. Openshaw
  30. Imperial College, London, UK: Paul Elliott
  31. Intensive Care National Audit & Research Centre, London, UK: Kathy Rowan
  32. School of Life Sciences, Westlake University, Hangzhou, Zhejiang, China: Jian Yang
  33. Westlake Laboratory of Life Sciences and Biomedicine, Hangzhou, Zhejiang, China: Jian Yang
  34. Great Ormond Street Hospital, London, UK: Richard H. Scott
  35. William Harvey Research Institute, Queen Mary University of London, Charterhouse Square, London, UK: Mark J. Caulfield

Consortia

GenOMICC Investigators

23andMe

Covid-19 Human Genetics Initiative

Corresponding authors

Correspondence to Mark J. Caulfield or J. Kenneth Baillie.

STUDY: PFIZER VACCINE FUSES WITH HUMAN DNA IN THE LIVER: CAUTION

Authors: Jules Gomes  •  ChurchMilitant.com  •  March 2, 2022 

The mRNA from Pfizer’s COVID-19 vaccine converts into DNA within six hours of entering cultured human liver cells, according to a groundbreaking study by Swedish researchers at Lund University.

The explosive peer-reviewed research demolishes countless assertions by the “fact-checkers” and global health organizations.

Fact-Checkers Caught

UNICEF, for example, asserted that “mRNA vaccines are not live viral vaccines and do not interfere with human DNA.”

The Bill Gates–funded Global Alliance for Vaccines and Immunizations similarly claimed that “mRNA isn’t the same as DNA, and it can’t combine with our DNA to change our genetic code,” further stating that “there is no reason to think that they [COVID vaccines] will have a lasting effect on our biology.” 

“The genetic material delivered by mRNA vaccines never enters the nucleus of your cells — which is where your DNA is kept,” the U.S. Centers for Disease Control and Prevention categorically states on its webpage titled “Myths and Facts About COVID-19 Vaccines.”

UNICEF fact-checkers also quoted Prof. Jeffrey Almond of Oxford University, who said, “Injecting RNA into someone does nothing to human-cell DNA.” 

However, the new study, published Friday in the journal Current Issues in Molecular Biology, shows that the Pfizer COVID-19 mRNA vaccine “is able to enter the human liver cell line Huh7” and that “mRNA is reverse-transcribed intracellularly into DNA in as fast as six hours.”

The process of converting RNA to DNA is called “reverse transcription,” since it runs in the opposite direction to ordinary transcription, in which DNA is copied into RNA. It was once widely assumed that this flow of genetic information ran only one way.

“If there is a viral genomic integration from these vaccines, there is the potential risk of subsequent autoimmune disease and cancer,” scientist Dr. Alan Moy, founder of the John Paul II Medical Research Institute, told Church Militant.

“The implication of this data would suggest that the assertion that mRNA vaccines do not cause viral nucleic-acid integration is incorrect,” Moy emphasized, adding, “Did Pfizer and Moderna perform preclinical studies to evaluate this possibility? Did the FDA require these studies from Pfizer and Moderna in the first place?”

“This information is relevant from the recent revelation that the FDA and Pfizer have tried to delay the research findings to the public over 75 years,” Dr. Moy said, after consulting with peers on the paper titled “Intracellular Reverse Transcription of Pfizer BioNTech COVID-19 mRNA Vaccine BNT162b2 In Vitro in Human Liver Cell Line.”

Dr. Peter McCullough, the world’s leading expert on COVID-19, said that the discovery has “enormous implications of permanent chromosomal change and long-term constitutive spike synthesis driving the pathogenesis of a whole new genre of chronic disease.”

In February, McCullough, a widely published internist, cardiologist and epidemiologist, warned the Vatican to end its vaccine advocacy and mandates and “immediately apologize” for the “grievous error” of “violating a critical code of bioethics,” Church Militant reported.

Reminding Pope Francis that life is a gift, McCullough accused the pontiff of “giving a gift of the loss of life” and speculated that the Vatican “will have to account for potentially hundreds of thousands of lives lost due to the vaccine.”

The new study vindicates Church Militant’s publication of two academic articles by Dr. Massimo Citro Della Riva (with an introduction by Abp. Carlo Maria Viganò). The articles responded to the polemic against Viganò published in Corrispondenza Romana by pediatrician Gwyneth A. Spaeder. 

In her diatribe against Viganò, Spaeder, a senior Catholic doctor whose family profits from the pharmaceutical industry and has links to vaccine oligarchs, asserted:

Basic biology teaches us that messenger RNA (or mRNA) is a molecule that tells the body how to make proteins. Once the protein has been produced, the mRNA is broken down and exits the body with other waste products. This process occurs only in one direction. 

There is never a possibility of modification of the vaccine recipient’s genome, since the process that involves transcription and translation, by which proteins are produced, proceeds forward from DNA to RNA to protein. It can’t work the other way around.

Citing peer-reviewed research from 2020, Dr. Citro noted that “the reverse transcription of viral RNA [into DNA] is also possible (as occurs with other RNA viruses), which is then capable of triggering long-term chronic diseases.”

“The reverse transcription of the vaccine mRNA is, for now, only hypothetical, just as it is for the DNA of the adenovirus vector. It is, however, plausible due to the presence of retrotransposons,” Citro noted, adding, “I would not be so sure that vaccine mRNA cannot reverse transcribe itself into our DNA.”

Citro’s hypothesis is now supported by the researchers, who have shown in vitro (that is, in a cultured human liver cell line) that the Pfizer vaccine mRNA is reverse-transcribed into DNA.

Over the last year, fact-checkers and health regulators quoted pro-vaccination scientists to attack the claim of reverse transcription as a “conspiracy theory.” However, earlier studies had already anticipated the possibility that the Pfizer vaccine’s mRNA could integrate into our DNA.

“A 2021 study by the Whitehead Institute, the National Cancer Center and MIT reported in the Proceedings of the National Academy of Sciences journal that the RNA from the COVID virus resulted in genetic integration in cultured human cells and in patients who were infected with COVID,” Dr. Moy told Church Militant.

“This background information led to the impetus for the more recent report to test whether the mRNA vaccine from Pfizer could similarly result in integration of the mRNA gene therapy into the human genome,” Moy explained.

Close relatives of MERS-CoV in bats use ACE2 as their functional receptors

Authors: Qing Xiong, Lei Cao, Chengbao Ma, Chen Liu, Junyu Si, Peng Liu, Mengxue Gu, Chunli Wang, Lulu Shi, Fei Tong, Meiling Huang, Jing Li, Chufeng Zhao, Chao Shen, Yu Chen, Huabin Zhao, Ke Lan, Xiangxi Wang, Huan Yan

Summary

Middle East Respiratory Syndrome coronavirus (MERS-CoV) and several bat coronaviruses employ Dipeptidyl peptidase-4 (DPP4) as their functional receptors [1–4]. However, the receptor for NeoCoV, the closest MERS-CoV relative yet discovered in bats, remains enigmatic [5]. In this study, we unexpectedly found that NeoCoV and its close relative, PDF-2180-CoV, can efficiently use some types of bat Angiotensin-converting enzyme 2 (ACE2) and, less favorably, human ACE2 for entry. The two viruses use their spikes’ S1 subunit carboxyl-terminal domains (S1-CTD) for high-affinity and species-specific ACE2 binding. Cryo-electron microscopy analysis revealed a novel coronavirus-ACE2 binding interface and a protein-glycan interaction, distinct from other known ACE2-using viruses. We identified a molecular determinant close to the viral binding interface that restricts human ACE2 from supporting NeoCoV infection, especially around residue Asp338. Conversely, NeoCoV efficiently infects human ACE2-expressing cells after a T510F mutation on the receptor-binding motif (RBM). Notably, the infection could not be cross-neutralized by antibodies targeting SARS-CoV-2 or MERS-CoV. Our study demonstrates the first case of ACE2 usage in MERS-related viruses, shedding light on a potential bio-safety threat of the human emergence of an ACE2-using “MERS-CoV-2” with both high fatality and transmission rates.

Introduction

Coronaviruses (CoVs) are a large family of enveloped positive-strand RNA viruses classified into four genera: Alpha-, Beta-, Gamma- and Delta-CoV. Generally, Alpha- and Beta-CoV can infect mammals such as bats and humans, while Gamma- and Delta-CoV mainly infect birds and occasionally mammals [6–8]. It is thought that the origins of most coronaviruses infecting humans can be traced back to their close relatives in bats, the most important animal reservoir of mammalian coronaviruses [9,10]. Coronaviruses are well recognized for their recombination and host-jumping ability, which has led to the three major outbreaks in the past two decades caused by SARS-CoV, MERS-CoV, and the most recent SARS-CoV-2, respectively [11–14].

MERS-CoV belongs to lineage C of Beta-CoV (Merbecoviruses) and poses a great threat considering its high case-fatality rate of approximately 35% [15]. Merbecoviruses have also been found in several animal species, including camels, hedgehogs, and bats. Although camels are confirmed intermediate hosts of MERS-CoV, bats, especially species in the family Vespertilionidae, are widely considered to be the evolutionary source of MERS-CoV or its immediate ancestor [16].

Specific receptor recognition of coronaviruses is usually determined by the receptor-binding domains (RBDs) on the carboxyl-terminus of the S1 subunit (S1-CTD) of the spike proteins [17]. Among the four well-characterized coronavirus receptors, three are S1-CTD binding ectopeptidases, including ACE2, DPP4, and aminopeptidase N (APN) [1,18,19]. By contrast, the fourth receptor, carcinoembryonic antigen-related cell adhesion molecule 1 (CEACAM1a), interacts with the amino-terminal domain (NTD) of the spike S1 subunit of the murine hepatitis virus [20,21]. Interestingly, the same receptor can be shared by distantly related coronaviruses with structurally distinct RBDs. For example, NL63-CoV (an alpha-CoV) uses ACE2, an entry receptor also widely used by many sarbecoviruses (beta-CoV lineage B) [22]. A similar phenotype of cross-genera receptor usage has also been found in APN, which is shared by many alpha-CoVs and a delta-CoV (PDCoV) [7]. In comparison, DPP4 usage has only been found in merbecoviruses (beta-CoV lineage C) such as HKU4, HKU25, and related strains [24].

Intriguingly, many other merbecoviruses do not use DPP4 for entry and their receptor usage remains elusive, such as the bat coronaviruses NeoCoV, PDF-2180-CoV, and HKU5-CoV, and the hedgehog coronavirus EriCoV-HKU31 [15,23,25]. Among them, NeoCoV, which infects Neoromicia capensis in South Africa, represents a bat merbecovirus that happens to be the closest relative of MERS-CoV (85% identity at the whole genome level) [26,27]. PDF-2180-CoV, another coronavirus most closely related to NeoCoV, infects Pipistrellus hesperidus native to Southwest Uganda [23,28]. Indeed, NeoCoV and PDF-2180-CoV share sufficient similarity with MERS-CoV across most of the genome, rendering them taxonomically the same viral species [27,29]. However, their S1 subunits are highly divergent compared with MERS-CoV (around 43-45% amino acid similarity), in agreement with their different receptor preference [23].

In this study, we unexpectedly found that NeoCoV and PDF-2180-CoV use bat ACE2 as their functional receptor. The cryo-EM structure of NeoCoV RBD bound with the ACE2 protein from Pipistrellus pipistrellus revealed a novel ACE2 interaction mode that is distinct from how human ACE2 (hACE2) interacts with the RBDs from SARS-CoV-2 or NL63. Although NeoCoV and PDF-2180-CoV cannot efficiently use hACE2 based on their current sequences, the spillover events of this group of viruses should be closely monitored, considering their human emergence potential after gaining fitness through antigenic drift.

Results

Evidence of ACE2 usage

To shed light on the relationship between merbecoviruses, especially NeoCoV and PDF-2180-CoV, we conducted a phylogenetic analysis of the sequences of a list of human and animal coronaviruses. Maximum likelihood phylogenetic reconstructions based on complete genome sequences showed that NeoCoV and PDF-2180-CoV formed a sister clade with MERS-CoV (Fig. 1a). In comparison, the phylogenetic tree based on amino acid sequences of the S1 subunit demonstrated that NeoCoV and PDF-2180-CoV showed a divergent relationship with MERS-CoV but are closely related to the hedgehog coronaviruses (EriCoVs) (Fig. 1b). A sequence similarity plot analysis (Simplot) queried by MERS-CoV highlighted a more divergent region encoding S1 for NeoCoV and PDF-2180-CoV compared with HKU4-CoV (Fig. 1c). We first tested whether human DPP4 (hDPP4) could support the infection of several merbecoviruses through a pseudovirus entry assay [30]. The result revealed that only MERS-CoV and HKU4-CoV showed significantly enhanced infection of 293T-hDPP4. Unexpectedly, we detected a significant increase of entry of NeoCoV and PDF-2180-CoV in 293T-hACE2 but not 293T-hAPN, both of which were initially set up as negative controls (Fig. 1d, Extended Data Fig. 1).
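The idea behind a similarity plot can be sketched in a few lines of Python: slide a window along two aligned sequences and report the percent identity in each window, so that a divergent region such as S1 shows up as a dip. This is a simplified illustration, not the Simplot software or the authors' pipeline; the function name `window_identity`, the toy sequences, and the window and step sizes are all hypothetical, and real analyses use whole-genome alignments with far larger windows.

```python
def window_identity(ref: str, query: str, window: int, step: int):
    """Return (start, percent identity) for each window of two aligned sequences.

    Columns where either sequence has a gap ("-") are ignored. A window made
    up entirely of gap columns is reported with an identity of None.
    """
    if len(ref) != len(query):
        raise ValueError("sequences must be aligned to the same length")
    results = []
    for start in range(0, len(ref) - window + 1, step):
        pairs = [
            (a, b)
            for a, b in zip(ref[start:start + window], query[start:start + window])
            if a != "-" and b != "-"
        ]
        if not pairs:
            results.append((start, None))
            continue
        matches = sum(a == b for a, b in pairs)
        results.append((start, 100.0 * matches / len(pairs)))
    return results


# Toy aligned sequences (hypothetical): the middle window is the divergent one.
reference = "ACGTACGTACGT"
relative = "ACGTTTTTACGT"
for start, identity in window_identity(reference, relative, window=4, step=4):
    print(start, identity)
```

Plotting the identity values against the window start positions gives the kind of curve shown in a similarity plot, with the drop marking the divergent region.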

Extended Data Figure 1

Expression level of coronaviruses spike proteins used for pseudotyping.

Fig. 1: A clade of bat merbecoviruses can use ACE2 but not DPP4 for efficient entry.

a-b, Phylogenetic analysis of merbecoviruses (gray) based on whole genomic sequences (a) and S1 amino acid sequences (b). NL63 and 229E were set as outgroups. Hosts and receptor usage are indicated. c, Simplot analysis showing the whole genome similarity of three merbecoviruses compared with MERS-CoV. The regions that encode MERS-CoV proteins are indicated on the top. Dashed box: S1 divergent region. d, Entry efficiency of six merbecoviruses in 293T cells stably expressing hACE2, hDPP4, or hAPN. e-f, Entry efficiency of NeoCoV in cells expressing ACE2 from different bats. EGFP intensity (e); firefly luciferase activity (f). g-h, Cell-cell fusion assay based on dual-split proteins showing the NeoCoV spike protein mediated fusion in BHK-21 cells expressing the indicated receptors. EGFP intensity (g); live-cell Renilla luciferase activity (h). i, Entry efficiency of six merbecoviruses in 293T cells stably expressing the indicated bat ACE2 or DPP4. Mean±SEM for d and i; Mean±SD for f and h (n=3). RLU: relative light unit.

To further validate the possibility of more efficient usage of bat ACE2, we screened a bat ACE2 cell library individually expressing ACE2 orthologs from 46 species across the bat phylogeny, as described in our previous study [31] (Extended Data Figs. 2-3, Supplementary Table 1). Interestingly, NeoCoV and PDF-2180-CoV, but not HKU4-CoV or HKU5-CoV, showed efficient entry in cells expressing ACE2 from most bat species belonging to Vespertilionidae (vesper bats). In contrast, no entry or very limited entry was observed in cells expressing ACE2 of humans or bats from the Yinpterochiroptera group (Fig. 1e-f, Extended Data Fig. 4). Consistent with previous reports, the infection of NeoCoV and PDF-2180-CoV could be remarkably enhanced by an exogenous trypsin treatment [28] (Extended Data Fig. 5). As indicated by the dual split protein (DSP)-based fusion assay [32], Bat37ACE2 triggers more efficient cell-cell membrane fusion than hACE2 in the presence of NeoCoV spike protein expression (Fig. 1g-h). Notably, the failure of the human or hedgehog ACE2 to support entry of EriCoV-HKU31 indicates that these viruses have a different receptor usage (Extended Data Fig. 6). In agreement with previous studies [23,28], our results argue against the possibility that bat DPP4 acts as a receptor for NeoCoV and PDF-2180-CoV, as none of the tested DPP4 orthologs from the vesper bats whose ACE2 is highly efficient in supporting viral entry could support a detectable entry of NeoCoV and PDF-2180-CoV (Fig. 1i, Extended Data Fig. 7). Infection assays were also conducted using several other cell types from different species, including a bat cell line, Tb 1 Lu, ectopically expressing ACE2 or DPP4 from Bat40 (Antrozous pallidus), and each test yielded similar results (Extended Data Fig. 8).

Extended Data Figure 2

Receptor function of ACE2 from 46 bat species in supporting NeoCoV and PDF-2180-CoV entry.

Extended Data Figure 3

The expression level of 46 bat ACE2 orthologs in 293T cells as indicated by immunofluorescence assay detecting the C-terminal 3×FLAG Tag.

Extended Data Figure 4

Entry efficiency of PDF-2180-CoV (a-b), HKU4-CoV (c), and HKU5-CoV (d) pseudoviruses in 293T cells expressing different bat ACE2 orthologs.

Extended Data Figure 5

TPCK-trypsin treatment significantly boosted the entry efficiency of NeoCoV and PDF-2180-CoV on 293T cells expressing different ACE2 orthologs.

Extended Data Figure 6

Hedgehog ACE2 (hgACE2) cannot support the entry of Ea-HedCoV-HKU31. (a) The expression level of ACE2 was evaluated by immunofluorescence detecting the C-terminal fused Flag tag. (b) Viral entry of SARS-CoV-2 and HKU31 into cells expressing hACE2 or hgACE2.

Extended Data Figure 7

ACE2 and DPP4 receptor usage of different merbecoviruses. a, Western blot detecting the expression levels of ACE2 and DPP4 orthologs in 293T cells. b, The intracellular bat ACE2 expression level by immunofluorescence assay detecting the C-terminal 3×FLAG-tag. c-d, Viral entry (c) and RBD binding (d) of different coronaviruses on 293T cells expressing different ACE2 and DPP4 orthologs.

Extended Data Figure 8

NeoCoV and PDF-2180-CoV infection of different cell types expressing either Bat40ACE2 or Bat40DPP4. BHK-21, 293T, Vero E6, A549, Huh-7, and Tb 1 Lu cells were transfected with either Bat40ACE2 or Bat40DPP4. The expression and viral entry (GFP) (a) and luciferase activity (c) were detected at 16 hpi.

S1-CTD mediated species-specific binding

The inability of NeoCoV and PDF-2180-CoV to use DPP4 is consistent with their highly divergent S1-CTD sequences compared with those of MERS-CoV and HKU4-CoV. We produced S1-CTD-hFc proteins (putative RBD fused to the human IgG Fc domain) to verify whether their S1-CTDs are responsible for ACE2 receptor binding. The live-cell binding assay based on cells expressing various bat ACE2 orthologs showed a species-specific utilization pattern in agreement with the results of the pseudovirus entry assays (Fig. 2a). The specific binding of several representative bat ACE2 orthologs was also verified by flow cytometry (Fig. 2b). We further determined the binding affinity by Bio-Layer Interferometry (BLI) analysis. The results indicated that both viruses bind to the ACE2 from Pipistrellus pipistrellus (Bat37) with the highest affinity (KD = 1.98 nM for NeoCoV and 1.29 nM for PDF-2180-CoV). In contrast, their affinities for hACE2 were below the detection limit of our BLI analysis (Fig. 2c, Extended Data Fig. 9). An enzyme-linked immunosorbent assay (ELISA) also demonstrated strong binding between the NeoCoV/PDF-2180-CoV S1-CTDs and Bat37ACE2, but not hACE2 (Fig. 2d). Notably, as the ACE2 sequences of the hosts of NeoCoV and PDF-2180-CoV are unknown, Bat37 represents the closest relative of the host of PDF-2180-CoV (Pipistrellus hesperidus) in our study. The binding affinity was further verified by competitive neutralization assays using soluble ACE2-ectodomain proteins or viral S1-CTD-hFc proteins. Again, the soluble Bat37ACE2 showed the highest activity in neutralizing infection by both viruses (Fig. 2e-f). Moreover, NeoCoV-S1-CTD-hFc could also potently neutralize NeoCoV and PDF-2180-CoV infections of cells expressing Bat37ACE2 (Fig. 2g). We further demonstrated the pivotal role of S1-CTD in receptor usage by constructing chimeric viruses and testing them for altered receptor usage. As expected, bat ACE2 usage was changed to hDPP4 usage for a chimeric NeoCoV with CTD, but not NTD, sequences replaced by their MERS-CoV counterparts (Fig. 2h). These results confirmed that the S1-CTDs of NeoCoV and PDF-2180-CoV are the RBDs for their species-specific interaction with ACE2.
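To give a feel for what the reported dissociation constants imply, the following minimal sketch computes equilibrium receptor occupancy under a simple 1:1 binding model. The model choice, the function name `fraction_bound`, and the 10 nM test concentration are illustrative assumptions and are not taken from the BLI analysis itself; only the KD values come from the text.

```python
def fraction_bound(ligand_nm: float, kd_nm: float) -> float:
    """Equilibrium occupancy for 1:1 binding: [L] / ([L] + KD).

    Assumes the free ligand concentration is not depleted by binding.
    """
    if ligand_nm < 0 or kd_nm <= 0:
        raise ValueError("concentration must be >= 0 and KD must be > 0")
    return ligand_nm / (ligand_nm + kd_nm)


# KD values (nM) reported in the text for binding to Bat37ACE2.
KD_NM = {
    "NeoCoV S1-CTD": 1.98,
    "PDF-2180-CoV S1-CTD": 1.29,
}

if __name__ == "__main__":
    ligand_nm = 10.0  # hypothetical S1-CTD concentration
    for name, kd in KD_NM.items():
        print(f"{name}: {fraction_bound(ligand_nm, kd):.3f} of receptor occupied")
```

At a ligand concentration equal to KD the occupancy is one half by definition, which is why a lower KD corresponds to tighter binding.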

Extended Data Figure 9

BLI analysis of the binding kinetics of PDF-2180-CoV S1-CTD interacting with different ACE2 orthologs.

Fig. 2. S1-CTD of NeoCoV and PDF-2180-CoV was required for species-specific ACE2 binding.

a, Binding of NeoCoV-S1-CTD-hFc with 293T bat ACE2 cells via immunofluorescence detecting the hFc. b, Flow cytometry analysis of NeoCoV-S1-CTD-hFc binding with 293T cells expressing the indicated ACE2. The positive ratio is indicated based on the threshold (dashed line). c, BLI assays analyzing the binding kinetics between NeoCoV-S1-CTD-hFc and selected ACE2-ecto proteins. d, ELISA showing the binding efficiency of NeoCoV and PDF-2180-CoV S1-CTD to human and Bat37ACE2-ecto proteins. e, The inhibitory activity of soluble ACE2-ecto proteins against NeoCoV infection in 293T-Bat37ACE2 cells. f, Dose-dependent competition of NeoCoV infection by Bat37ACE2-ecto proteins in 293T-Bat37ACE2 cells. g, The inhibitory effect of NeoCoV, PDF-2180-CoV S1-CTD-hFc and MERS-CoV RBD-hFc proteins on NeoCoV infection in 293T-Bat37ACE2 cells. h, Receptor preference of chimeric viruses with S1-CTD or S1-NTD swap mutations in cells expressing the indicated receptors. Mean ± SD for d, e, g, and h (n=3).

Structural basis of ACE2 binding

To unveil the molecular details of virus-ACE2 binding, we carried out structural investigations of Bat37ACE2 in complex with the NeoCoV and PDF-2180-CoV RBDs. 3D classification revealed that the NeoCoV-Bat37ACE2 complex primarily adopts a dimeric configuration with two copies of ACE2 bound to two RBDs, whereas only a monomeric conformation was observed in the PDF-2180-CoV-Bat37ACE2 complex (Fig. 3a-b, Extended Data Figs. 10-11). We determined the structures of these two complexes at resolutions of 3.5 Å and 3.8 Å, respectively, and performed local refinement to further improve the densities around the binding interface, enabling reliable analysis of the interaction details (Fig. 3a-b, Extended Data Figs. 12-13 and Tables 1-2). Despite existing in different oligomeric states, the structures revealed that both NeoCoV and PDF-2180-CoV recognize Bat37ACE2 in a very similar way. We used the NeoCoV-Bat37ACE2 structure for detailed analysis (Fig. 3a-b and Extended Data Fig. 14). Like the structures of its homologs, the NeoCoV RBD structure comprises a core subdomain located far away from the engaging ACE2 and an external subdomain recognizing the receptor (Fig. 3c and Extended Data Fig. 15). The external subdomain is a strand-enriched structure with four anti-parallel β strands (β6–β9) and exposes a flat four-stranded sheet-tip for ACE2 engagement (Fig. 3c). By contrast, the MERS-CoV RBD recognizes the side surface of the DPP4 β-propeller via its four-stranded sheet-blade (Fig. 3c). The structural basis for the differences in receptor usage can be inferred from two features: i) the local configuration of the four-stranded sheet in the external subdomain of NeoCoV shows a conformational shift of η3 and β8 that disrupts the flat sheet-face needed for DPP4 binding, and ii) the relatively longer 6-7 and 8-9 loops observed in MERS-CoV impair their binding in the shallow cavity of bat ACE2 (Fig. 3c and Extended Data Fig. 15).

Extended Data Figure 10

Flowcharts for cryo-EM data processing of the NeoCoV RBD-Bat37ACE2 complex.

Extended Data Figure 11

Flowcharts for cryo-EM data processing of PDF-2180-CoV RBD-Bat37ACE2 complex.

Extended Data Figure 12

Resolution estimation of the EM maps, density maps, and atomic models of the NeoCoV RBD-Bat37ACE2 complex.

Extended Data Figure 13

Resolution estimation of the EM maps, density maps, and atomic models of the PDF-2180-CoV RBD-Bat37ACE2 complex.

Extended Data Figure 14

Superimposition of the overall structures of the NeoCoV RBD-Bat37ACE2 complex (red) and the PDF-2180-CoV RBD-Bat37ACE2 complex (blue).

Extended Data Figure 15

Structures and sequence comparison of RBDs from different merbecoviruses.

Fig. 3. Structure of the NeoCoV RBD-Bat37ACE2 and PDF-2180-CoV RBD-Bat37ACE2 complexes.

a-b, Cryo-EM density map and cartoon representation of the NeoCoV RBD-Bat37ACE2 complex (a) and the PDF-2180-CoV RBD-Bat37ACE2 complex (b). The NeoCoV RBD, PDF-2180-CoV RBD, and Bat37ACE2 are colored red, yellow, and cyan, respectively. c, Structure comparison between the NeoCoV RBD-Bat37ACE2 complex (left) and the MERS-CoV RBD-hDPP4 complex (right). The NeoCoV RBD, MERS-CoV RBD, NeoCoV RBM, MERS-CoV RBM, Bat37ACE2, and hDPP4 are colored red, light green, light yellow, gray, cyan, and blue, respectively. d, Details of the NeoCoV RBD-Bat37ACE2 complex interface. All structures are shown as ribbons with the key residues shown as sticks. The salt bridges and hydrogen bonds are presented as red and yellow dashed lines, respectively. e-f, Verification of the critical residues on the NeoCoV RBD affecting viral binding (e) and entry efficiency (f) in 293T-Bat37ACE2 cells. g-h, Verification of the critical residues on Bat37ACE2 affecting NeoCoV RBD binding (g) and viral entry efficiency (h). Mean ± SD for f (n=3) and h (n=4).

In the NeoCoV-Bat37ACE2 complex structure, relatively smaller surface areas (498 Å² in NeoCoV RBD and 439 Å² in Bat37ACE2) are buried by the two binding entities compared to their counterparts in the MERS-CoV-DPP4 complex (880 Å² in MERS-CoV RBD and 812 Å² in DPP4) and the SARS-CoV-2-hACE2 complex (956 Å² in SARS-CoV-2 RBD and 893 Å² in hACE2). The NeoCoV RBD inserts into an apical depression formed by the α11 and α12 helices and a loop connecting α12 and β4 of Bat37ACE2 through its four-stranded sheet tip (Fig. 3d and Extended Data Table 2). Further examination of the binding interface revealed a group of hydrophilic residues at the site, forming a network of polar contacts (hydrogen bonds and salt bridges) together with hydrophobic interactions. These polar interactions are predominantly mediated by residues N504, N506, N511, K512, and R550 from the NeoCoV RBM and residues T53, E305, T334, D338, and R340 from Bat37ACE2 (Fig. 3d, Extended Data Table 2). Notably, the methyl groups of residues A509 and T510 of the NeoCoV RBM are partially involved in forming a hydrophobic pocket with residues F308, W328, L333, and I358 from Bat37ACE2 at the interface. A substitution of T510 with F in the PDF-2180-CoV RBM further improves hydrophobic interactions, which is consistent with an increased binding affinity observed for this point mutation (Fig. 3d, Extended Data Table 2). Apart from protein-protein contacts, the glycans of bat ACE2 at positions N54 and N329 sandwich the strands (β8–β9), forming π-π interactions with W540 and hydrogen bonds with N532, G545, and R550 from the NeoCoV RBD, underpinning virus-receptor associations (Fig. 3d and Extended Data Table 2).

The critical residues were verified by introducing mutations and testing their effect on receptor binding and viral entry. As expected, mutations N504F/N506F, N511Y, and R550N in the NeoCoV RBD, which abolish the polar contacts or introduce steric clashes, resulted in a significant reduction of RBD binding and viral entry (Fig. 3e-f). Similarly, the E305K mutation in Bat37ACE2, which eliminates the salt bridge, also significantly impaired the receptor function. Moreover, the loss-of-function effect of the N54A mutation in Bat37ACE2, which abolishes the N-glycosylation at residue 54, confirmed the importance of this particular protein-glycan interaction in virus-receptor recognition. In comparison, N329A, which abolishes the N-glycosylation at site N329 located far away from the binding interface, had no significant effect on receptor function (Fig. 3g-h).

Evaluation of zoonotic potential

A major concern is whether NeoCoV and PDF-2180-CoV could jump the species barrier and infect humans. As mentioned above, NeoCoV and PDF-2180-CoV cannot efficiently interact with human ACE2. Here we first examined the molecular determinants restricting hACE2 from supporting the entry of these viruses. By comparing the binding interfaces of the other three hACE2-using coronaviruses, we found that SARS-CoV, SARS-CoV-2, and NL63 share similar interaction regions that barely overlap with the region engaged by NeoCoV (Fig. 4a). Analysis of the overlapping binding interfaces reveals a commonly used hot spot around residues 329-330 (Fig. 4b). Through sequence alignment and structural analysis of hACE2 and Bat37ACE2, we predicted that the inefficient use of hACE2 for entry by the viruses could be attributed to incompatible residues located around the binding interfaces, especially the sequence differences between residues 337-342 (Fig. 4c). We replaced these residues of hACE2 with their Bat37ACE2 counterparts to test this hypothesis (Fig. 4c-d). The substitution led to an approximately 15-fold and 30-fold increase in entry efficiency of NeoCoV and PDF-2180-CoV, respectively, confirming that this region is critical for determining the host range. Further fine-grained dissection revealed that N338 is the most crucial residue in restricting human receptor usage (Fig. 4e-g).

Fig. 4. Molecular determinants affecting hACE2 recognition by the viruses.

a, Binding modes of ACE2-adapted coronaviruses. The SARS-CoV RBD, SARS-CoV-2 RBD, NL63-CoV RBD, and NeoCoV RBD are colored purple, light purple, green, and red, respectively. b, A common virus-binding hot spot on ACE2 for the four viruses. The per-residue frequencies of recognition by the coronavirus RBDs were calculated and are shown. c, Schematic illustration of the hACE2 swap mutants with Bat37ACE2 counterparts. d-e, The expression level of the hACE2 mutants by Western blot (d) and immunofluorescence (e). f-g, Receptor function of hACE2 mutants evaluated by virus RBD binding assay (f) and pseudovirus entry assay (g). h, Molecular dynamics (MD) analysis of the effect of critical residue variations on the interaction between NeoCoV and Bat37ACE2 by mCSM-PPI2. i, Structure of the NeoCoV RBD-hACE2 complex modeled by superposition in COOT. The NeoCoV RBD and hACE2 are colored red and sky blue, respectively. Details of the NeoCoV RBD key mutation T510F are shown. All structures are presented as ribbons with the key residues shown as sticks. j-k, The effect of NeoCoV and PDF-2180-CoV RBM mutations on hACE2 fitness as demonstrated by binding (j) and entry efficiency (k) on 293T-hACE2 and 293T-Bat37ACE2 cells. l, hACE2-dependent entry of NeoCoV-T510F in Caco2 cells in the presence of 50 μg/ml of Anti-ACE2 (H11B11) or Anti-VSVG (I1). m, Neutralizing activity of SARS-CoV-2 vaccinated sera against the infection by SARS-CoV-2, NeoCoV, and PDF-2180-CoV. n, Neutralizing activity of MERS-RBD targeting nanobodies against the infections by MERS-CoV, NeoCoV, and PDF-2180-CoV. Mean ± SD for g and k-n; g (n=4), k-l (n=3), m-n (n=10).

We further assessed the zoonotic potential of NeoCoV and PDF-2180-CoV by identifying the molecular determinants in the viral RBM that might allow cross-species transmission through engaging hACE2. After meticulously examining the critical residues based on the complex structures and the computational prediction tool mCSM-PPI2 33 (Fig. 4h, Extended Data Table 4), we predicted that increasing hydrophobicity around residue T510 of NeoCoV might enhance the virus-receptor interaction on hACE2 (Fig. 4i). Interestingly, PDF-2180-CoV already has an F511 (corresponding to site 510 of NeoCoV), which is consistent with its slightly higher affinity to human ACE2 (Extended Data Fig. 16). As expected, the T510F substitution in NeoCoV remarkably increased its binding affinity for hACE2 (KD = 16.9 nM) and conferred a significant gain of infectivity in 293T-hACE2 cells (Fig. 4j-k, Extended Data Figs. 17-18). However, PDF-2180-CoV showed much lower efficiency in using hACE2 than NeoCoV-T510F, indicating that other unfavorable residues are restricting its efficient interaction with hACE2. Indeed, a G to A (corresponding to A509 in NeoCoV) mutation at site 510, increasing the local hydrophobicity, partially restored its affinity to hACE2 (Fig. 4j-k). In addition, NeoCoV-T510F can enter the human colon cell line Caco-2 with much higher efficiency than wild-type NeoCoV. It enters Caco-2 cells exclusively through ACE2, as the infection can be neutralized by an ACE2-targeting neutralizing antibody, H11B11 34 (Fig. 4l). Humoral immunity triggered by prior infection with or vaccination against other coronaviruses might be inadequate to protect humans from NeoCoV and PDF-2180-CoV infections because neither SARS-CoV-2 anti-sera nor ten tested anti-MERS-CoV nanobodies can cross-inhibit the infection caused by these two viruses35 (Fig. 4m-n).

Extended Data Figure 16

Comparison of the binding affinity of NeoCoV and PDF-2180-CoV RBD with hACE2 using SARS-CoV-2 RBD as a positive control.

Extended Data Figure 17

Expression level of the NeoCoV and PDF-2180-CoV spike proteins and their mutants.

Extended Data Figure 18

BLI analysis of the binding kinetics of NeoCoV S1-CTD WT and T510F interacting with human ACE2.

Discussion

The lack of knowledge of the receptors of bat coronaviruses has greatly limited our understanding of these high-risk pathogens. Our study provides evidence that relatives of potential MERS-CoV ancestors, such as NeoCoV and PDF-2180-CoV, engage bat ACE2 for efficient cellular entry. However, HKU5-CoV and EriCoV do not seem to use bat DPP4 or hedgehog ACE2 for entry, highlighting the complexity of coronavirus receptor utilization. It was unexpected that NeoCoV and PDF-2180-CoV use ACE2 rather than DPP4 as their entry receptor, since their RBD core structures resemble that of MERS-CoV more than those of other ACE2-using viruses (Fig. 4a, Extended Data Fig. 15).

Different receptor usage can affect the transmission rate of the viruses. Although it remains unclear whether ACE2 usage outweighs DPP4 usage for more efficient transmission, MERS-CoV appears to have lower transmissibility, with an estimated R0 of around 0.69. Comparatively, ACE2 usage has been proven able to achieve high transmissibility. The estimated R0 of SARS-CoV-2 is around 2.5 for the original strain, 5.08 for the delta variant, and even higher for the omicron variant3638. This unexpected ACE2 usage by these close relatives of MERS-CoV highlights a latent biosafety risk, considering the combination of two potentially damaging features: the high fatality observed for MERS-CoV and the high transmission rate noted for SARS-CoV-2. Furthermore, our studies show that the current COVID-19 vaccinations are inadequate to protect humans from any eventuality of the infections caused by these viruses.

Many sarbecoviruses, alpha-CoV NL63, and a group of merbecoviruses reported in this study share ACE2 for cellular entry. Our structural analysis indicates that NeoCoV and PDF-2180-CoV bind to an apical side surface of ACE2, which is different from the surface engaged by other ACE2-using coronaviruses (Fig. 4a). The interaction features inter-molecular protein-glycan bonds formed by the glycosylation at N54, which is not found in the RBD-receptor interactions of other coronaviruses. The different interaction modes of the three ACE2-using coronaviruses indicate a history of multiple independent receptor acquisition events during evolution22. The evolutionary advantage of ACE2 usage in different CoVs remains enigmatic.

Our results support the previous hypothesis that the origin of MERS-CoV might be the result of an intra-spike recombination event between a NeoCoV-like virus and a DPP4-using virus26. RNA recombination can occur during the co-infection of different coronaviruses, giving rise to a new virus with different receptor usage and host tropisms3940. It remains unclear whether the event took place in bats or camels, and where the host switching happened. Although bat merbecoviruses are geographically widespread, the two known ACE2-using merbecoviruses are both found in Africa. Moreover, most camels in the Arabian Peninsula showing serological evidence of previous MERS-CoV infection are imported from the Greater Horn of Africa, where several Neoromicia species are found41. Considering that both viruses are inefficient in infecting human cells in their current form, the acquisition of the hDPP4 binding domain would be a critical event driving the emergence of MERS-CoV. Further studies will be necessary to obtain more evidence about the origin of MERS-CoV.

The host range determinants on ACE2 are barriers for cross-species transmission of these viruses. Our results show that NeoCoV and PDF-2180-CoV favor ACE2 from bats of the Yangochiroptera group, especially vesper bats (Vespertilionidae), to which their hosts belong, but not ACE2 orthologs from bats of the Yinpterochiroptera group. Interestingly, most merbecoviruses were found in species belonging to the Vespertilionidae, a highly diverse and widely distributed family9. Although the two viruses could not use hACE2 efficiently, our study also reveals that a single residue substitution increasing local hydrophobicity around site 510 could enhance their affinity for hACE2 and enable them to infect human cells expressing ACE2. Considering the extensive mutations in the RBD regions of the SARS-CoV-2 variants, especially the heavily mutated omicron variant, these viruses may hold a latent potential to infect humans through further adaptation via antigenic drift4243. It is also very likely that their relatives with human emergence potential are circulating somewhere in nature.

Overall, we identified ACE2 as a long-sought functional receptor of the potential MERS-CoV ancestors in bats, facilitating the in-depth research of these important viruses with zoonotic emergence risks. Our study adds to the knowledge about the complex receptor usage of coronaviruses, highlighting the importance of surveillance and research on these viruses to prepare for potential outbreaks in the future.

Supplementary Information

Methods

Receptor and virus sequences

The acquisition of the sequences of 46 bat ACE2 orthologs and hACE2 was described in our previous study31. The five bat DPP4 and hDPP4 sequences were directly retrieved from the GenBank database (human DPP4, NM_001935.3; Bat37, Pipistrellus pipistrellus, KC249974.1) or extracted from whole genome sequence assemblies of the bat species retrieved from the GenBank database (Bat25, Sturnira hondurensis, GCA_014824575.2; Bat29, Mormoops blainvillei, GCA_004026545.1; Bat36, Aeorestes cinereus, GCA_011751065.1; Bat40, Antrozous pallidus, GCA_007922775.1). The whole genome sequences of different coronaviruses were retrieved from the GenBank database. The accession numbers are as follows: MERS-CoV (JX869059.2), Camel MERS-CoV KFU-HKU 19Dam (KJ650296.1), HKU4 (NC_009019.1), HKU5 (NC_009020.1), ErinaceusCoV/HKU31 strain F6 (MK907286.1), NeoCoV (KC869678.4), PDF-2180-CoV (NC_034440.1), ErinaceusCoV/2012-174 (NC_039207.1), BtVs-BetaCoV/SC2013 (KJ473821.1), BatCoV/H.savii/Italy (MG596802.1), BatCoV HKU25 (KX442564.1), BatCoV ZC45 (MG772933.1), SARS-CoV-2 (NC_045512.2), NL63 (JX504050.1), and 229E (MT797634.1).

All gene sequences used in this study were commercially synthesized by Genewiz. The sources, accession numbers, and sequences of the receptors and viruses were summarized in Supplementary Table 1.

SARS-CoV-2 anti-sera collection

All the vaccinated sera were collected from volunteers at about 21 days after the third dose of the WHO-approved inactivated SARS-CoV-2 vaccine (CoronaVac, Sinovac, China). All volunteers provided written informed consent, and the whole study was conducted following the requirements of Good Clinical Practice of China.

Bioinformatic analysis

Protein sequence alignment was performed using the MUSCLE algorithm in MEGA-X software (version 10.1.8). For phylogenetic analysis, nucleotide or protein sequences of the viruses were first aligned using the Clustal W and the MUSCLE algorithm, respectively. Then, the phylogenetic trees were generated using the maximum likelihood method in MEGA-X (1000 bootstraps). The model and the other parameters used for phylogenetic analysis were applied following the recommendations given by the software after finding the best DNA/protein models. The nucleotide similarity of coronaviruses was analyzed by SimPlot software (version 3.5.1) with a sliding window size of 1000 nucleotides and a step size of 100 nucleotides using gap-stripped alignments and the Kimura (2-parameter) distance model.
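The sliding-window similarity analysis can be illustrated with a minimal Python sketch. It is not SimPlot's implementation: it works on a single pairwise alignment, assumes the sequences contain only A, C, G, T, and gap characters, and reports similarity as one minus the Kimura 2-parameter distance, which is an assumption about how the plotted value is defined. All function names are hypothetical. The distance follows the standard formula d = -1/2 ln[(1 - 2P - Q) sqrt(1 - 2Q)], where P and Q are the proportions of transitions and transversions.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def strip_gaps(query: str, reference: str) -> tuple[str, str]:
    """Drop every alignment column in which either sequence has a gap."""
    pairs = [(q, r) for q, r in zip(query.upper(), reference.upper())
             if q != "-" and r != "-"]
    return "".join(q for q, _ in pairs), "".join(r for _, r in pairs)


def k2p_distance(a: str, b: str) -> float:
    """Kimura 2-parameter distance between two aligned, gap-free sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    transitions = transversions = 0
    for x, y in zip(a.upper(), b.upper()):
        if x == y:
            continue
        if {x, y} <= PURINES or {x, y} <= PYRIMIDINES:
            transitions += 1      # A<->G or C<->T
        else:
            transversions += 1    # purine <-> pyrimidine
    p = transitions / len(a)
    q = transversions / len(a)
    if 1 - 2 * p - q <= 0 or 1 - 2 * q <= 0:
        raise ValueError("sequences too divergent for the K2P correction")
    d = -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))
    return d if d > 0 else 0.0


def sliding_similarity(query: str, reference: str,
                       window: int = 1000, step: int = 100):
    """Return (window midpoint, 1 - K2P distance) for each window."""
    q, r = strip_gaps(query, reference)
    results = []
    for start in range(0, len(q) - window + 1, step):
        d = k2p_distance(q[start:start + window], r[start:start + window])
        results.append((start + window // 2, 1.0 - d))
    return results
```

The defaults mirror the window and step sizes stated above; for a multi-sequence analysis, gaps would normally be stripped across the whole alignment rather than pairwise.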

Plasmids

Human codon-optimized sequences of various ACE2 or DPP4 orthologs and their mutants were cloned into a lentiviral transfer vector (pLVX-IRES-puro) with a C-terminal 3×Flag tag (DYKDHD-G-DYKDHD-I-DYKDDDDK). The DNA sequences of human codon-optimized NeoCoV S protein (AGY29650.2), PDF-2180-CoV S protein (YP_009361857.1), HKU4-CoV S protein (YP_001039953.1), HKU5-CoV S protein (YP_001039962.1), HKU31 S protein (QGA70692.1), SARS-CoV-2 S protein (YP_009724390.1), and MERS-CoV S protein (YP_009047204.1) were cloned into the pCAGGS vector with a C-terminal 13-15-amino-acid deletion (corresponding to 18 amino acids in SARS-CoV-2) or replacement by an HA tag (YPYDVPDYA) for higher VSV pseudotyping efficiency44. The plasmids expressing coronavirus RBD-IgG-hFc fusion proteins were generated by inserting the coding sequences of NeoCoV RBD (aa380-585), PDF-2180-CoV RBD (aa381-586), HKU4-CoV RBD (aa382-593), HKU5-CoV RBD (aa385-586), HKU31-CoV RBD (aa366-575), SARS-CoV-2 RBD (aa331-524) and MERS-CoV RBD (aa377-588) into the pCAGGS vector with an N-terminal CD5 secretion leader sequence (MPMGSLQPLATLYLLGMLVASVL). The plasmids expressing soluble bat ACE2 and DPP4 proteins were constructed by inserting the ectodomain coding sequences into the pCAGGS vector with an N-terminal CD5 leader sequence and C-terminal twin-strep tag and 3×Flag tag tandem sequences (WSHPQFEKGGGSGGGSGGSAWSHPQFEK-GGGRS-DYKDHDGDYKDHDIDYKDDDDK).

Virus spike proteins or receptor-related mutants or chimeras were generated by overlapping PCR. For the dual split protein (DSP)-based cell-cell fusion assay, the dual reporter split proteins were expressed by pLVX-IRES-puro vectors expressing the RLucaa1-155-GFP1-7(aa1-157) and GFP8-11(aa158-231)-RLuc-aa156-311 plasmids, which were constructed in the lab based on a previous study3245. The plasmids expressing the codon-optimized anti-ACE2 antibody (H11B11; GenBank accession codes MZ514137 and MZ514138) were constructed by inserting the heavy-chain and light-chain coding sequences into the pCAGGS vector with N-terminal CD5 leader sequences, respectively34. For anti-MERS-CoV nanobody-hFc fusion proteins, nanobody coding sequences were synthesized and cloned into the pCAGGS vector with N-terminal CD5 leader sequences and C-terminal hFc tags 35.

Protein expression and purification

The RBD-hFc (S1-CTD-hFc) fusion proteins of SARS-CoV-2, MERS-CoV, HKU4-CoV, HKU5-CoV, HKU31-CoV, NeoCoV, and PDF-2180-CoV, and the soluble ACE2 proteins of human, Bat25, Bat29, Bat36, Bat37, Bat38, and Bat40 were expressed in 293T cells by transfecting the corresponding plasmids with GeneTwin reagent (Biomed, TG101-01) following the manufacturer's instructions. Four hrs post-transfection, the culture medium of the transfected cells was replaced with SMM 293-TII Expression Medium (Sino Biological, M293TII). The supernatant of the culture medium containing the proteins was collected every 2-3 days. The recombinant RBD-hFc proteins were captured by Pierce Protein A/G Plus Agarose (Thermo Scientific, 20424), washed with wash buffer W (100 mM Tris/HCl, pH 8.0, 150 mM NaCl, 1 mM EDTA), eluted with pH 3.0 glycine buffer (100 mM in H2O), and then immediately neutralized with UltraPure 1M Tris-HCl, pH 8.0 (15568025, Thermo Scientific). The twin-strep tag containing proteins were captured by Strep-Tactin XT 4Flow high capacity resin (IBA, 2-5030-002), washed with buffer W, and eluted with buffer BXT (100 mM Tris/HCl, pH 8.0, 150 mM NaCl, 1 mM EDTA, 50 mM biotin). The eluted proteins were concentrated and buffer-exchanged into PBS through ultra-filtration. Protein concentrations were determined with the Omni-Easy Instant BCA Protein Assay Kit (Epizyme, ZJ102). The purified proteins were aliquoted and stored at -80℃. For Cryo-EM analysis, NeoCoV RBD (aa380-588), PDF-2180-CoV RBD (aa381-589), and Bat37ACE2 (aa21-730) were synthesized and subcloned into the vector pCAGGS with a C-terminal twin-strep tag. Briefly, these proteins were expressed by transient transfection of 500 ml HEK Expi 293F cells (Gibco, Thermo Fisher, A14527) using Polyethylenimine Max Mw 40,000 (Polysciences). The resulting protein samples were further purified by size-exclusion chromatography using a Superdex 75 10/300 Increase column (GE Healthcare) or a Superdex 200 10/300 Increase column (GE Healthcare) in 20 mM HEPES, 100 mM NaCl, pH 7.5. For the RBD-receptor complexes (NeoCoV RBD-Bat37ACE2 / PDF-2180-CoV RBD-Bat37ACE2), NeoCoV RBD or PDF-2180-CoV RBD was mixed with Bat37ACE2 at a ratio of 1.2:1 and incubated for 30 mins on ice. The mixture was then subjected to gel filtration chromatography. Fractions containing the complex were collected and concentrated to 2 mg/ml.

Cell culture

293T (CRL-3216), Vero E6 (CRL-1586), A549 (CCL-185), BHK-21 (CCL-10), Huh-7 (PTA-4583), and Caco2 (HTB-37) cells and the epithelial bat cell line Tb 1 Lu (CCL-88) were purchased from American Type Culture Collection (ATCC) and cultured in Dulbecco's Modified Eagle Medium (DMEM, Monad, China) supplemented with 10% fetal bovine serum (FBS), 2.0 mM of L-glutamine, 110 mg/L of sodium pyruvate and 4.5 g/L of D-glucose. An I1-Hybridoma (CRL-2700) cell line secreting a neutralizing mouse monoclonal antibody against the VSV glycoprotein (VSVG) was cultured in Minimum Essential Medium with Earle's balanced salts and 2.0 mM of L-glutamine (Gibco) and 10% FBS. All cells were cultured at 37℃ in 5% CO2 with regular passage every 2-3 days. 293T stable cell lines overexpressing ACE2 or DPP4 orthologs were maintained in a growth medium supplemented with 1 μg/ml of puromycin.

Stable cell line generation

Stable cell lines overexpressing ACE2 or DPP4 orthologs were generated by lentivirus transduction and antibiotic selection. Specifically, the lentivirus carrying the target gene was produced by cotransfection of lentiviral transfer vector (pLVX-EF1a-Puro, Genewiz) and packaging plasmids pMD2G (Addgene, plasmid no.12259) and psPAX2 (Addgene, plasmid no.12260) into 293T cells through Lip2000 Transfection Reagent (Biosharp, BL623B). The lentivirus-containing supernatant was collected and pooled at 24 and 48 hrs post-transfection. 293T cells were transduced by the lentivirus after 16 hrs in the presence of 8 μg/ml polybrene. Stable cells were selected and maintained in the growth medium with puromycin (1-2 μg/ml). Cells selected for at least ten days were considered stable cell lines and used in different experiments.

Cryo-EM sample preparation and data collection

For Cryo-EM sample preparation, the NeoCoV RBD-Bat37ACE2 or PDF-2180-CoV RBD-Bat37ACE2 complex was diluted to 0.5 mg/ml. Holey-carbon gold grids (C-flat R1.2/1.3 mesh 300) were freshly glow-discharged with a Solarus 950 plasma cleaner (Gatan) for 30 s. A 3 μL aliquot of the complex was transferred onto the grids, blotted with filter paper at 16℃ and 100% humidity, and plunged into ethane using a Vitrobot Mark IV (FEI). For these complexes, micrographs were collected at 300 kV using a Titan Krios microscope (Thermo Fisher) equipped with a K2 detector (Gatan, Pleasanton, CA), using SerialEM automated data collection software. Movies (32 frames, each 0.2 s, total dose 60 e⁻ Å⁻²) were recorded at a final pixel size of 0.82 Å with a defocus of between -1.2 and -2.0 μm.
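The movie parameters above fix several derived quantities that are often quoted alongside them. The short sketch below works them out; the function and key names are hypothetical, and the per-pixel dose rate assumes the stated 0.82 Å value is the pixel size at which the dose was measured, which the text does not confirm.

```python
def exposure_summary(frames: int, frame_time_s: float,
                     total_dose: float, pixel_size_a: float) -> dict:
    """Derive exposure time and dose figures from movie acquisition settings.

    total_dose is in electrons per square angstrom; pixel_size_a is in angstrom.
    """
    exposure_s = frames * frame_time_s
    dose_rate_area = total_dose / exposure_s          # e-/A^2/s
    return {
        "exposure_s": exposure_s,
        "dose_per_frame": total_dose / frames,        # e-/A^2 per frame
        "dose_rate_per_area": dose_rate_area,
        # Assumption: one detector pixel covers pixel_size_a squared.
        "dose_rate_per_pixel": dose_rate_area * pixel_size_a ** 2,
    }


if __name__ == "__main__":
    # Values taken from the acquisition description in the text.
    summary = exposure_summary(frames=32, frame_time_s=0.2,
                               total_dose=60.0, pixel_size_a=0.82)
    for key, value in summary.items():
        print(f"{key}: {value:.4f}")
```

With the stated settings this corresponds to a 6.4 s exposure and 1.875 e⁻ Å⁻² per frame.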

Image processing

For the NeoCoV RBD-Bat37ACE2 complex, a total of 4,234 micrographs were recorded. For the PDF-2180-CoV RBD-Bat37ACE2 complex, a total of 3,298 micrographs were recorded. Both data sets were processed similarly. First, the raw movies were processed with MotionCor2, which aligned and averaged the frames into motion-corrected summed images. Then, the defocus value for each micrograph was determined using Gctf. Next, particles were picked and extracted for two-dimensional alignment. Well-defined particles were selected for initial model reconstruction in Relion46. The initial model was used as a reference for three-dimensional classification. After refinement and post-processing, the overall resolution of the PDF-2180-CoV RBD-Bat37ACE2 complex was up to 3.8 Å based on the gold-standard Fourier shell correlation (threshold = 0.143) 47. For the NeoCoV RBD-Bat37ACE2 complex, the C2 symmetry was expanded before the 3D refinement. Finally, the resolution of the NeoCoV RBD-Bat37ACE2 complex was up to 3.5 Å. The quality of the local resolution was evaluated by ResMap48.
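The gold-standard criterion mentioned above compares two independently refined half maps shell by shell in Fourier space. The NumPy sketch below shows the idea only; it is not Relion's implementation, it omits masking and any interpolation between shells, and the function names are hypothetical.

```python
import numpy as np


def fourier_shell_correlation(half1: np.ndarray, half2: np.ndarray) -> np.ndarray:
    """Return the FSC per integer-radius shell for two cubic half maps."""
    if half1.shape != half2.shape or half1.ndim != 3 or len(set(half1.shape)) != 1:
        raise ValueError("expected two cubic 3D maps of identical shape")
    n = half1.shape[0]
    f1 = np.fft.fftshift(np.fft.fftn(half1))
    f2 = np.fft.fftshift(np.fft.fftn(half2))
    # Integer shell index (distance from the zero-frequency voxel).
    axis = np.arange(n) - n // 2
    zz, yy, xx = np.meshgrid(axis, axis, axis, indexing="ij")
    shell = np.rint(np.sqrt(xx ** 2 + yy ** 2 + zz ** 2)).astype(int).ravel()
    # Sum the cross term and both power terms within each shell.
    cross = np.bincount(shell, weights=(f1 * np.conj(f2)).real.ravel())
    power1 = np.bincount(shell, weights=(np.abs(f1) ** 2).ravel())
    power2 = np.bincount(shell, weights=(np.abs(f2) ** 2).ravel())
    denom = np.sqrt(power1 * power2)
    fsc = np.divide(cross, denom, out=np.zeros_like(cross), where=denom > 0)
    return fsc[: n // 2]  # keep shells up to the Nyquist frequency


def resolution_at_threshold(fsc, box_size: int, pixel_size: float,
                            threshold: float = 0.143):
    """Resolution (angstrom) at the first shell whose FSC drops below threshold."""
    for k in range(1, len(fsc)):
        if fsc[k] < threshold:
            return box_size * pixel_size / k
    return None  # the curve never crosses the threshold


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    signal = rng.normal(size=(16, 16, 16))       # synthetic stand-in for a map
    noisy1 = signal + rng.normal(size=signal.shape)
    noisy2 = signal + rng.normal(size=signal.shape)
    print(fourier_shell_correlation(noisy1, noisy2))
```

Shell k corresponds to a spatial frequency of k / (box size × pixel size), so the reported resolution is simply the inverse of the frequency at which the curve first falls below 0.143.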

Model building and refinement

The NeoCoV RBD-Bat37ACE2 complex structures were manually built into the refined maps in COOT47. The atomic models were further refined by positional and B-factor refinement in real space using Phenix48. For the PDF-2180-CoV RBD-Bat37ACE2 complex model building, the refined NeoCoV RBD-Bat37ACE2 complex structures were manually docked into the refined maps using UCSF Chimera and further corrected manually by real-space refinement in COOT. Likewise, the atomic models were further refined using Phenix. Validation of the final model was performed with Molprobity48. The data sets and refinement statistics are shown in Extended Data Table 1.

Immunofluorescence assay

The expression levels of ACE2 or DPP4 receptors were evaluated by immunofluorescence assay detecting the C-terminal 3×FLAG-tags. The cells expressing the receptors were seeded in 96-well plates (poly-lysine pretreated plates for 293T-based cells) at a cell density of 1-5×10^5/ml (100 μl per well) and cultured for 24 hrs. Cells were fixed with 100% methanol at room temperature for 10 mins, and then incubated with a mouse monoclonal antibody (M2) targeting the FLAG-tag (Sigma-Aldrich, F1804) diluted in 1% BSA/PBS at 37℃ for 1 hour. After one wash with PBS, cells were incubated with 2 μg/ml of the Alexa Fluor 594-conjugated goat anti-mouse IgG (Thermo Fisher Scientific, A32742) diluted in 1% BSA/PBS at room temperature for 1 hour. The nucleus was stained blue with Hoechst 33342 (1:5,000 dilution in PBS). Images were captured with a fluorescence microscope (Mshot, MI52-N).

Pseudovirus production and titration

Coronavirus spike protein pseudotyped viruses (CoV-psV) were packaged according to a previously described protocol based on a replication-deficient VSV-based rhabdoviral pseudotyping system (VSV-dG). The VSV-G glycoprotein-deficient VSV coexpressing GFP and firefly luciferase (VSV-dG-GFP-fLuc) was rescued by a reverse genetics system in the lab and helper plasmids from Kerafast. For CoV-psV production, 293T or Vero E6 cells were transfected with the plasmids overexpressing the coronavirus spike proteins through the Lip2000 Transfection Reagent (Biosharp, BL623B). After 36 hrs, the transfected cells were transduced with VSV-dG-GFP-fLuc diluted in DMEM for 4 hrs at 37℃ with a 50% tissue culture infectious dose (TCID50) of 1×10^6 TCID50/ml. Transduced cells were washed once with DMEM and then incubated with culture medium and I1-hybridoma-cultured supernatant (1:10 dilution) containing VSV neutralizing antibody to eliminate the infectivity of the residual input viruses. The CoV-psV-containing supernatants were collected at 24 hrs after the medium change, clarified at 4,000 rpm for 5 mins, aliquoted, and stored at -80℃. The TCID50 of pseudovirus was determined by a serial-dilution based infection assay on 293T-bat40ACE2 cells for NeoCoV and PDF-2180-CoV or 293T-hDpp4 cells for MERS-CoV and HKU4-CoV. The TCID50 was calculated according to the Reed-Muench method49,50. A relative luminescence unit (RLU) value ≥ 1000 was considered positive. The viral titer (genome equivalents) of HKU5-CoV and HKU31-CoV, which lack an ideal infection system, was determined by quantitative PCR with reverse transcription (RT–qPCR). The RNA copies in the virus-containing supernatant were detected using primers in the VSV-L gene sequences (VSV-L-F: 5’-TTCCGAGTTATGGGCCAGTT-3’; VSV-L-R: 5’-TTTGCCGTAGACCTTCCAGT-3’).
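The Reed-Muench endpoint calculation mentioned above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' analysis script: the function names, the well counts, and the inoculum volume are hypothetical, and a well is simply taken as "infected" when it meets the positivity cutoff (RLU ≥ 1000).

```python
def reed_muench_log10_endpoint(log10_dilutions, infected, total):
    """Return log10 of the dilution at which 50% of wells are infected.

    log10_dilutions must run from most concentrated to most dilute,
    e.g. [-1, -2, -3, -4, -5]; infected/total are per-dilution well counts.
    """
    n = len(log10_dilutions)
    uninfected = [t - i for i, t in zip(infected, total)]
    # Reed-Muench pooling: a well infected at a high dilution is assumed
    # infected at every lower dilution, and vice versa for uninfected wells.
    cum_infected = [sum(infected[k:]) for k in range(n)]
    cum_uninfected = [sum(uninfected[:k + 1]) for k in range(n)]
    percent = [100.0 * ci / (ci + cu)
               for ci, cu in zip(cum_infected, cum_uninfected)]
    for k in range(n - 1):
        if percent[k] >= 50.0 > percent[k + 1]:
            # Proportionate distance between the two bracketing dilutions.
            distance = (percent[k] - 50.0) / (percent[k] - percent[k + 1])
            step = log10_dilutions[k + 1] - log10_dilutions[k]
            return log10_dilutions[k] + distance * step
    raise ValueError("the 50% endpoint is not bracketed by the dilution series")


def tcid50_per_ml(log10_endpoint, inoculum_ml):
    """Convert the endpoint dilution into TCID50 per ml of undiluted stock."""
    return 10 ** (-log10_endpoint) / inoculum_ml


# Hypothetical plate: 4 wells per dilution, 10-fold series, 0.1 ml per well.
endpoint = reed_muench_log10_endpoint([-1, -2, -3, -4, -5],
                                      [4, 4, 3, 1, 0],
                                      [4, 4, 4, 4, 4])
titer = tcid50_per_ml(endpoint, inoculum_ml=0.1)
print(endpoint, titer)
```

For this made-up plate the endpoint falls halfway between the 10^-3 and 10^-4 dilutions, so the stock titer is 10^3.5 TCID50 per 0.1 ml inoculum.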

Pseudovirus entry assay

Cells for infection were trypsinized and incubated with different pseudoviruses (1×10^5 TCID50/well, or same genome equivalent) in a 96-well plate (5×10^4 cells/well) to allow attachment and viral entry simultaneously. For TPCK-trypsin treatment for infection boosting, NeoCoV and PDF-2180-CoV pseudovirus in serum-free DMEM were incubated with 100 μg/ml TPCK-treated trypsin (Sigma-Aldrich, T8802) for 10 mins at 25℃, and then treated with 100 μg/ml soybean trypsin inhibitor (Sigma-Aldrich, T6414) in DMEM+10% FBS to stop the proteolysis. At 16 hours post-infection (hpi), GFP images of infected cells were acquired with a fluorescence microscope (Mshot, MI52-N), and intracellular luciferase activity was determined by a Bright-Glo Luciferase Assay Kit (Promega, E2620) and measured with a SpectraMax iD3 Multi-well Luminometer (Molecular Devices) or a GloMax 20/20 Luminometer (Promega).

Pseudovirus neutralization assay

For antibody neutralization assays, the viruses (2×10^5 TCID50/well) were incubated with the sera (50-fold diluted) or 10 μg/ml MERS-specific nanobodies at 37℃ for 30 mins, and then mixed with trypsinized BHK-21-Bat37ACE2 cells at a density of 2×10^4/well. After 16 hrs, the medium of the infected cells was removed, and the cells were lysed with 1×Bright-Glo Luciferase Assay reagent (Promega) for chemiluminescence detection with a SpectraMax iD3 Multi-well Luminometer (Molecular Devices).

Western blot

After one wash with PBS, the cells were lysed with 2% TritonX-100/PBS containing 1 mM freshly prepared PMSF (Beyotime, ST506) on ice for 10 mins. The cell lysate was then clarified by centrifugation at 12,000 rpm at 4℃ for 5 mins, mixed with SDS loading buffer, and incubated at 95℃ for 5 mins. After SDS-PAGE electrophoresis and PVDF membrane transfer, the membrane was blocked with 5% skim milk/PBST at room temperature for one hour, then incubated with primary antibodies against Flag (Sigma, F1804), HA (MBL, M180-3), or glyceraldehyde-3-phosphate dehydrogenase (GAPDH) (AntGene, ANT011) at 1:10000 dilution in 1% milk/PBS overnight on a shaker at 4℃. After extensive washing, the membrane was incubated for one hour with the horseradish peroxidase (HRP)-conjugated secondary antibody AffiniPure Goat Anti-Mouse IgG (H+L) (Jackson ImmunoResearch, 115-035-003) in 1% skim milk in PBST. The blots were visualized using Omni-ECL Femto Light Chemiluminescence Kit (EpiZyme, SQ201) by ChemiDoc MP (Bio-Rad).

Coronavirus RBD-hFc live-cell binding assay

Recombinant coronavirus RBD-hFc proteins (1-16 μg/ml) were diluted in DMEM and then incubated with the cells for one hour at 37℃. Cells were washed once with DMEM and then incubated with 2 μg/ml of Alexa Fluor 488-conjugated goat anti-human IgG (Thermo Fisher Scientific; A11013) diluted in Hanks’ Balanced Salt Solution (HBSS) with 1% BSA for 1 hour at 37 ℃. Cells were washed twice with PBS and incubated with Hoechst 33342 (1:5,000 dilution in HBSS) for nucleus staining. Images were captured with a fluorescence microscope (MI52-N). For flow cytometry analysis, cells were detached by 5mM of EDTA/PBS and analyzed with a CytoFLEX Flow Cytometer (Beckman).

Biolayer interferometry (BLI) binding assay

The protein binding affinities were determined by BLI assays performed on an Octet RED96 instrument (Molecular Devices). Briefly, 20 μg/mL human Fc-tagged RBD-hFc recombinant proteins were loaded onto Protein A (ProA) biosensors (ForteBio, 18-5010) for 30 s. The loaded biosensors were then dipped into the kinetic buffer (PBST) for 90 s to wash out unbound RBD-hFc proteins. Subsequently, the biosensors were dipped into the kinetic buffer containing soluble ACE2 at concentrations ranging from 0 to 500 nM for 120 s to record association kinetics and then dipped into kinetic buffer for 300 s to record dissociation kinetics. Kinetic buffer without ACE2 was used to define the background. The corresponding binding affinity (KD) was calculated with Octet Data Analysis software 12.2.0.20 using curve-fitting kinetic analysis or steady-state analysis with global fitting.
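To make the relation between the recorded association and dissociation phases and the reported KD concrete, the sketch below uses a simple 1:1 binding model. This is an assumption for illustration: the source states only that curve-fitting kinetic or steady-state analysis was used, not which model the Octet software applied. The rate constants, Rmax, and function names are hypothetical; only the 120 s and 300 s phase lengths and the 500 nM top concentration come from the protocol.

```python
import math


def kd_from_rates(k_on, k_off):
    """Equilibrium dissociation constant (M) from k_on (1/M/s) and k_off (1/s)."""
    return k_off / k_on


def association_response(t, conc, k_on, k_off, r_max):
    """Sensor response after t seconds in analyte at molar concentration conc."""
    k_obs = k_on * conc + k_off            # observed rate of approach to equilibrium
    r_eq = r_max * k_on * conc / k_obs     # plateau response at this concentration
    return r_eq * (1.0 - math.exp(-k_obs * t))


def dissociation_response(t, r_start, k_off):
    """Sensor response t seconds after moving the sensor into analyte-free buffer."""
    return r_start * math.exp(-k_off * t)


def steady_state_response(conc, kd, r_max):
    """Plateau response used in steady-state analysis."""
    return r_max * conc / (kd + conc)


# Hypothetical constants, not values from the study.
k_on, k_off, r_max = 1.0e5, 1.0e-3, 1.0
kd = kd_from_rates(k_on, k_off)                               # 1e-8 M, i.e. 10 nM
r_120 = association_response(120.0, 500e-9, k_on, k_off, r_max)
r_end = dissociation_response(300.0, r_120, k_off)
print(kd, r_120, r_end)
```

In a kinetic fit, k_on and k_off are adjusted until these curves match the sensorgrams at all ACE2 concentrations at once; in a steady-state fit, only the plateau responses are fitted against concentration.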

Enzyme-linked immunosorbent assay (ELISA)

To evaluate the binding between viral RBD and ACE2 in vitro, 96-well Immuno-plates were coated with ACE2 soluble proteins at 27 μg/ml in BSA/PBS (100 μl/well) overnight at 4℃. After three washes with PBS containing 0.1% Tween-20 (PBST), the wells were blocked with 3% skim milk/PBS at 37℃ for 2 hrs. Next, varying concentrations of RBD-hFc proteins (1-9 μg/ml) diluted in 3% milk/PBST were added to the wells and incubated for one hour at 37℃. After extensive washing, the wells were incubated with 1:2000 diluted HRP-conjugated goat anti-human Fc antibody (Sigma, T8802) in PBS for one hour. Finally, the substrate solution (Solarbio, PR1210) was added to the plates, and the absorbance at 450 nm was measured with a SpectraMax iD3 Multi-well Luminometer (Molecular Devices).

Cell-cell fusion assay

Cell-cell fusion assay based on Dual Split proteins (DSP) was conducted on BHK-21 cells stably expressing different receptors32. The cells were separately transfected with Spike and RLucaa1-155-GFP1-10(aa1-157) expressing plasmids, and Spike and GFP11(aa158-231) RLuc-Caa156-311 expressing plasmids, respectively. At 12 hrs after transfection, the cells were trypsinized and mixed into a 96-well plate at 8×10^4/well. At 26 hrs post-transfection, cells were washed once with DMEM and then incubated with DMEM with or without 12.5 μg/ml TPCK-trypsin for 25 mins at RT. Five hrs after treatment, the nucleus was stained blue with Hoechst 33342 (1:5,000 dilution in HBSS) for 30 mins at 37℃. GFP images were then captured with a fluorescence microscope (MI52-N; Mshot). For live-cell luciferase assay, the EnduRen live cell substrate (Promega, E6481) was added to the cells (a final concentration of 30 μM in DMEM) for at least 1 hour before detection by a GloMax 20/20 Luminometer (Promega).

Statistical Analysis

Most experiments were repeated 2-5 times with 3-4 biological repeats, each yielding similar results. Data are presented as mean ± SD or mean ± SEM as specified in the figure legends. All statistical analyses were conducted using GraphPad Prism 8. Differences between independent samples were evaluated by unpaired two-tailed t-tests; differences between two related samples were evaluated by paired two-tailed t-tests. P<0.05 was considered significant. * p<0.05, ** p<0.01, *** p<0.005, and **** p<0.001.

Author contributions

H.Y. and X.X.W. conceived and designed the study. Q.X., L.C., C.B.M., C.L., J.Y.S., P.L., and F.T. performed the experiments. Q.X., L.C., C.B.M., C.L., C.F.Z., H.Y., and X.X.W. analyzed the data. H.Y., X.X.W., Q.X., L.C., C.B.M., and C.L. interpreted the results. H.Y. and X.X.W. wrote the initial drafts of the manuscript. H.Y., X.X.W., L.C., and Q.X. revised the manuscript. C.B.M., C.L., P.L., M.X.G., C.L.W., L.L.S., F.T., M.L.H., J.L., C.S., Y.C., H.B.Z., and K.L. commented on the manuscript.

Competing interests

The authors declare no competing interests.

Data availability

The cryo-EM maps have been deposited at the Electron Microscopy Data Bank (www.ebi.ac.uk/emdb) and are available under accession numbers: EMD-32686 (NeoCoV RBD-Bat37ACE2 complex) and EMD-32693 (PDF-2180-CoV RBD-Bat37ACE2 complex). Atomic models corresponding to EMD-32686, EMD-32693 have been deposited in the Protein Data Bank (www.rcsb.org) and are available under accession numbers, PDB ID 7WPO, PDB ID 7WPZ, respectively. The authors declare that all other data supporting the findings of this study are available with the paper and its supplementary information files.

Additional Information

Supplementary Information is available for this paper.

Correspondence and requests for materials should be addressed to H.Y. (huanyan{at}whu.edu.cn)

The Epidemiology, Transmission, and Diagnosis of COVID-19

Authors: Neesha C. Siriwardane & Rodney Shackelford, DO, Ph.D. April 15, 2020

Introduction to COVID-19

Coronaviruses are enveloped single-stranded RNA viruses of the Coronaviridae family and order Nidovirales (1). The viruses are named for their “crown” of club-shaped S glycoprotein spikes, which surround the viruses and mediate viral attachment to host cell membranes (1-3). Coronaviruses are found in domestic and wild animals, and four coronaviruses commonly infect the human population, causing upper respiratory tract infections with mild common cold symptoms (1,4). Generally, animal coronaviruses do not spread within human populations; rarely, however, zoonotic coronaviruses evolve into strains that infect humans, often causing severe or fatal illnesses (4). Recently, three coronaviruses with zoonotic origins have entered the human population: the viruses causing severe acute respiratory syndrome (SARS) in 2003, Middle East respiratory syndrome (MERS) in 2012, and, most recently, coronavirus disease 2019 (COVID-19), whose causative virus is termed SARS-CoV-2 and which the World Health Organization declared a Public Health Emergency of International Concern on January 30th, 2020 (4,5).

COVID-19 Biology, Spread, and Origin

COVID-19 replicates within epithelial cells, where the COVID-19 S glycoprotein attaches to the ACE2 receptor on type 2 pneumocytes and ciliated bronchial epithelial cells of the lungs. Following this, the virus enters the cells and rapidly uses host cell biochemical pathways to replicate viral proteins and RNA, which assemble into viruses that in turn infect other cells (3,5,6). Following these cycles of replication and re-infection, the infected cells show cytopathic changes, followed by various degrees of pulmonary inflammation, changes in cytokine expression, and disease symptoms (5-7). The ACE2 receptor also occurs throughout most of the gastrointestinal tract and a recent analysis of stool samples from COVID-19 patients revealed that up to 50% of those infected with the virus have a COVID-19 enteric infection (8).

COVID-19 was first identified on December 31st, 2019 in Wuhan, China, when twenty-seven patients presented with pneumonia of unknown cause. Some of the patients worked in the Huanan seafood market, which sold both live and recently slaughtered wild animals (4,9). Clusters of cases found in individuals in contact with the patients (family members and healthcare workers) indicated a human-to-human transmission pattern (9,10). Initial efforts to limit the spread of the virus were insufficient and the virus soon spread throughout China. Presently COVID-19 occurs in 175 countries, with 1,309,439 cases and 72,638 deaths worldwide as of April 6th, 2020 (4). The most affected countries are the United States, Italy, Spain, and China. The United States is showing a rapid increase in cases, and as of April 6th, 2020 it has 351,890 COVID-19 infected, 10,377 dead, and 18,940 recovered (4). In the US the first case presented on January 19th, 2020, when an otherwise healthy 35-year-old man presented to an urgent care clinic in Washington State with a four-day history of a persistent dry cough and a two-day history of nausea and vomiting. The patient had a recent travel history to Wuhan, China. On January 20th, 2020 the patient tested positive for COVID-19. The patient developed pneumonia and pulmonary infiltrates, and was treated with supplemental oxygen, vancomycin, and remdesivir. By day eight of hospitalization, the patient showed significant improvement (11).

Sequence analyses of the COVID-19 genome revealed that it has a 96.2% similarity to a bat coronavirus collected in Yunnan province, China. These analyses furthermore showed no evidence that the virus is a laboratory construct (12-14). A recent sequence analysis also found that COVID-19 shows significant variations in its functional sites, and has evolved into two major types (termed L and S). The L type is more prevalent, is likely derived from the S type, and may be more aggressive and spread more easily (14,15). 

Transmission

While sequence analyses strongly suggest an initial animal-to-human transmission, COVID-19 is now a worldwide pandemic spread by human-to-human contact (4,9-11). Three main transmission routes are identified: 1) transmission by respiratory droplets, 2) contact transmission, and 3) aerosol transmission (16). Transmission by droplets occurs when respiratory droplets are expelled by an infected individual by coughing and are inhaled or ingested by individuals in relatively close proximity. Contact transmission occurs when respiratory droplets or secretions are deposited on a surface and another individual picks up the virus by touching the surface and transfers it to their face (nose, mouth, or eyes), propagating the infection. The exact time that COVID-19 remains infective on contaminated surfaces is unknown, although it may be up to several days (4,16). Aerosol transmission occurs when respiratory droplets from an infected individual mix with air and initiate an infection when inhaled (16). Transmission by respiratory droplets appears to be the most common mechanism for new infections and even normal breathing and speech can transmit the virus (4,16,17). The observation that COVID-19 can cause enteric infections also suggests that it may be spread by fecal-oral transmission; however, this has not been verified (8). A recent study has also demonstrated that about 30% of COVID-19 patients present with diarrhea, with 20% having diarrhea as their first symptom. These patients are more likely to have COVID-19 positive stool upon testing and a longer, but less severe disease course (18). Recently, possible COVID-19 transmission from mother to newborn (vertical transmission) has been documented. The significance of this in terms of newborn health and possible birth defects is currently unknown (19).

The basic reproductive number, or R0, measures the expected number of cases generated by one infectious case within a population where all the individuals can become infected. Any number over 1.0 means that the infection can propagate throughout a susceptible population (4). For COVID-19, this value appears to be between 2.2 and 4.6 (4,20,21). Unpublished studies have stated that the COVID-19 R0 value may be as high as 6.6; however, these studies are still in peer review.
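A small sketch can show why the threshold of 1.0 matters and how much the quoted range of R0 values changes early growth. It assumes the idealized setting in the definition above (a fully susceptible population, no interventions, discrete generations of infection); the function names are hypothetical, and this is not a forecast.

```python
def cases_in_generation(r0, generation):
    """Expected new cases in a given generation, starting from one case."""
    return r0 ** generation


def cumulative_cases(r0, generations):
    """Expected total cases from generation 0 through `generations`."""
    return sum(r0 ** g for g in range(generations + 1))


# Compare the low and high ends of the range quoted above,
# plus a value below 1.0 where the chain of transmission dies out.
for r0 in (0.8, 2.2, 4.6):
    print(r0, round(cases_in_generation(r0, 5), 1), round(cumulative_cases(r0, 5), 1))
```

With R0 below 1.0 each generation is smaller than the last, while at the two ends of the published range the fifth generation differs by more than an order of magnitude.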

COVID-19 Prevention

There is no vaccine available to prevent COVID-19 infection, and thus prevention presently centers on limiting COVID-19 exposures as much as possible within the general population (22). Recommendations to reduce transmission within the community include: 1) hand hygiene with simultaneous avoidance of touching the face, 2) respiratory hygiene, 3) utilizing personal protective equipment (PPE) such as facemasks, 4) disinfecting surfaces and objects that are frequently touched, and 5) limiting social contacts, especially with infected individuals (4,9,17,22). Hand hygiene includes frequent hand-washing with soap and water for twenty seconds, especially after contact with respiratory secretions produced by activities such as coughing or sneezing. When soap and water are unavailable, hand sanitizer that contains at least 60% alcohol is recommended (4,17,22). PPE such as N95 respirators are routinely used by healthcare workers during droplet precaution protocols when caring for patients with respiratory illnesses. One retrospective study done in Hunan, China demonstrated N95 masks were extremely efficient at preventing COVID-19 transfer from infected patients to healthcare workers (4,22-24). It is also likely that wearing some form of mask protection is useful to prevent COVID-19 spread, and this is now recommended by the CDC (25).

Although transmission of COVID-19 is primarily through respiratory droplets, well-studied human coronaviruses such as HCoV, SARS, and MERS coronaviruses have been determined to remain infectious on inanimate surfaces at room temperature for up to nine days. They are less likely to persist for this amount of time at a temperature of 30°C or more (26). Therefore, contaminated surfaces can remain a potential source of transmission. The Environmental Protection Agency has produced a database of appropriate agents for COVID-19 disinfection (27). Limiting social contact usually has three levels: 1) isolating infected individuals from the non-infected, 2) isolating individuals who are likely to have been exposed to the disease from those not exposed, and 3) social distancing. The latter includes community containment, where all individuals limit their social interactions by avoiding group gatherings, school closures, social distancing, workplace distancing, and staying at home (28,29). An adapted influenza epidemic simulation model that compared scenarios with no intervention to social distancing estimated a 99.3% reduction in the number of infections (28). In a similar study, social distancing was estimated to be able to reduce COVID-19 infections by 92% (29). Presently, these measures are being applied in many countries throughout the world and have been shown to be at least partially effective if given sufficient time (4,17,30). Such measures proved effective during the 2003 SARS outbreak in Singapore (30).

Symptoms, Clinical Findings, and Mortality 

On average COVID-19 symptoms appear 5.2 days following exposure and, in fatal cases, death follows fourteen days later, with these time periods being shorter in individuals 70 years old or older (31,32). People of any age can be infected with COVID-19, although infections are uncommon in children and most common between the ages of 30-65 years, with men more affected than women (32,33). The symptoms vary from asymptomatic/paucisymptomatic to respiratory failure requiring mechanical ventilation, septic shock, multiple organ dysfunction, and death (4,9,32,33). The most common symptoms include a dry cough which can become productive as the illness progresses (76%), fever (98%), myalgia/fatigue (44%), dyspnea (55%), and pneumonia (81%), with less common symptoms being headache, diarrhea (26%), and lymphopenia (44%) (4,32,33). Rare events such as COVID-19 acute hemorrhagic necrotizing encephalopathy have been documented and one paper describes conjunctivitis, including conjunctival hyperemia, chemosis, epiphora, or increased secretions in 30% of COVID-19 patients (34,35). Interestingly, about 30-60% of those infected with COVID-19 also experience a loss of their ability to taste and smell (36).

The clinical features of COVID-19 include bilateral lung involvement showing patchy shadows or ground-glass opacities identified by chest X-ray or CT scanning (34). Patients can develop atypical pneumonia with acute lung injury and acute respiratory distress syndrome (33). Additionally, patients show elevations of aspartate aminotransferase and/or alanine aminotransferase (41%), C-reactive protein (86%), and serum ferritin (63%), as well as increased pro-inflammatory cytokines, whose levels correlate positively with the severity of the symptoms (4,31-33,37-39).

About 81% of COVID-19 infections are mild and the patients make complete recoveries (38). Older patients and those with comorbidities such as diabetes, cardiovascular disease, hypertension, and chronic obstructive pulmonary disease have a more difficult clinical course (31-33,37-39). In one study, 72% of patients requiring ICU treatment had some of these concurrent comorbidities (40). According to the WHO, 14% of COVID-19 cases are severe and require hospitalization, 5% are very severe and will require ICU care and likely ventilation, and 4% will die (41). Severity is increased by older age and comorbidities (4,40,41). If effective treatments and vaccines are not found, the pandemic may cause slightly less than one-half billion deaths, or 6% of the world’s population (41). Since many individuals infected with COVID-19 appear to show no symptoms, the actual mortality rate of COVID-19 is likely much lower than 4% (42). An accurate understanding of the typical clinical course and mortality rate of COVID-19 will require time and large-scale testing.
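The reasoning behind the last point, that undetected infections lower the true mortality rate, can be illustrated with a short calculation. The 4% figure is the one quoted above; the fractions of infections that are documented are hypothetical values chosen only to show the effect, and the function name is an assumption.

```python
def infection_fatality_rate(case_fatality_rate, documented_fraction):
    """Deaths per infection, given deaths per documented case.

    documented_fraction is the share of all infections that are detected
    and counted as cases (a hypothetical input here, not a measured value).
    Assumes all deaths occur among documented cases.
    """
    if not 0.0 < documented_fraction <= 1.0:
        raise ValueError("documented_fraction must be in (0, 1]")
    return case_fatality_rate * documented_fraction


# If every infection were documented, the two rates would coincide;
# the fewer infections are detected, the lower the true rate.
for fraction in (1.0, 0.5, 0.2):
    print(fraction, infection_fatality_rate(0.04, fraction))
```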

COVID-19 Diagnosis

COVID-19 symptoms are nonspecific and a definitive diagnosis requires laboratory testing, combined with a thorough patient history. Two common molecular diagnostic methods for COVID-19 are real-time reverse transcription polymerase chain reaction (RT-PCR) and high-throughput whole-genome sequencing. RT-PCR is used more often as it is more cost-effective, less complex, and has a short turnaround time. Blood and respiratory secretions are analyzed, with bronchoalveolar lavage fluid giving the best test results (43). Although the technique has worked on stool samples, as yet stool is less often tested (8,43). RT-PCR involves the isolation and purification of the COVID-19 RNA, followed by using an enzyme called “reverse transcriptase” to copy the viral RNA into DNA. The DNA is amplified through multiple rounds of PCR using viral nucleic acid-specific DNA primer sequences. This allows the targeted COVID-19 sequences to be amplified millions of times in a short time and then easily analyzed (43). RT-PCR COVID-19 testing is FDA approved and the testing volume in the US is rapidly increasing (44,45). The FDA has also recently approved a COVID-19 diagnostic test that detects anti-COVID-19 IgM and IgG antibodies in patient serum, plasma, or venipuncture whole blood (43). As anti-COVID-19 antibody formation takes time, a negative result does not completely preclude a COVID-19 infection, especially an early infection. Last, as COVID-19 often causes bilateral pulmonary infiltrates, correlating diagnostic testing results with chest CT or X-ray results can be helpful (4,31-33,37-39).
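The claim that PCR amplifies the target "millions of times" follows from near-doubling in each cycle, which the sketch below makes explicit. It assumes an idealized reaction with a constant per-cycle efficiency (1.0 means perfect doubling); real reactions plateau, and the function names and the 90% efficiency value are hypothetical.

```python
import math


def copies_after_cycles(initial_copies, cycles, efficiency=1.0):
    """Copies of the target after a number of PCR cycles.

    efficiency is the fraction of templates copied per cycle (1.0 = doubling).
    """
    return initial_copies * (1.0 + efficiency) ** cycles


def cycles_for_fold(fold, efficiency=1.0):
    """Smallest whole number of cycles giving at least `fold` amplification."""
    return math.ceil(math.log(fold) / math.log(1.0 + efficiency))


print(cycles_for_fold(1e6))                  # cycles for a million-fold gain at perfect doubling
print(cycles_for_fold(1e6, efficiency=0.9))  # a less efficient reaction needs more cycles
print(copies_after_cycles(1, 20))
```

With perfect doubling, 20 cycles already exceed a million-fold amplification, which is why a small amount of viral RNA becomes detectable within a single run.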

Testing for COVID-19 is based on a high clinical suspicion and current recommendations suggest testing patients with a fever and/or acute respiratory illness. These recommendations are categorized into priority levels, with high priority individuals being hospitalized patients and symptomatic healthcare facility workers. Low priority individuals include those with mild disease, asymptomatic healthcare workers, and symptomatic essential infrastructure workers. The latter group will receive testing as resources become available (41,46,47). 

COVID-19 Possible Treatments

Presently, research on possible COVID-19 infection treatments and vaccines is underway (48). At the writing of this article many different drugs are being examined; however, any data supporting the use of any specific drug for treating COVID-19 are thin at best. A few drugs that might have promise are:

Hydroxychloroquine

Hydroxychloroquine has been used to treat malarial infections for seventy years and in cell cultures it has anti-viral effects against COVID-19 (49). In one small non-randomized clinical trial in France, twenty individuals infected with COVID-19 who received hydroxychloroquine showed a reduced COVID-19 viral load, as measured on nasopharyngeal viral carriage, compared to untreated controls (50). Six individuals who also received azithromycin with hydroxychloroquine had their viral load lessened further (50). In one small study in China, a similar drug (chloroquine) was superior in reducing COVID-19 viral levels in treated individuals compared to untreated control individuals (51).  These results are preliminary, but promising. 

Remdesivir

Remdesivir is a drug that showed value in treating patients infected with SARS (52). COVID-19 and SARS show about 80% sequence similarity and since remdesivir has been used to treat SARS, it might have value in treating COVID-19 (52). Trials to test this are underway (48). Remdesivir was also used to treat the first case of COVID-19 identified within the US (11). There are many other drugs being examined to treat COVID-19 infections; however, the data on all of them are presently slight to none, and research has only begun. There is an enormous research effort underway, and progress should be rapid (48).

Conclusion

Our understanding of COVID-19 is changing extremely rapidly and new findings come out daily. Combating COVID-19 effectively will require multiple steps, including slowing the spread of the virus through social isolation and measures such as hand washing. The development of effective drug treatments and vaccines is already a priority and rapid progress is being made (48). Additionally, many areas of the world, such as South America and sub-Saharan Africa, will be affected by the COVID-19 pandemic and are likely to have their economies and healthcare systems put under extreme stress. Dealing with the healthcare crisis in these countries will be very difficult. Lastly, several recent viral pandemics (SARS, MERS, and COVID-19) have come from areas where wildlife is regularly traded, butchered, and eaten in conditions that favor the spread of dangerous viruses between species, and eventually into human populations. The prevention of new viral pandemics will require improved handling of wild species, better separation of wild animals from domestic animals, and a better regulated and reduced trade in wild animals, such as bats, which are known to be a risk for carrying potentially deadly viruses to human populations (53).

References

  1. Origin and evolution of pathogenic coronaviruses. Nat Rev Microbiol 2019;17:181-92. 
  2. Cryo-EM structure of the 2019-nCoV spike in the prefusion conformation. Science 2020;367:1260-3.
  3. Structural basis for the recognition of SARS-CoV-2 by full-length human ACE2. Science 2020;367:1444-8. 
  4. CDC. 2019 Novel coronavirus, Wuhan, China. 2020.  https://www.cdc.gov/coronvirus/2019-nCoV/summary.html. Accessed 6 April 2020. 
  5. SARS and MERS: recent insights into emerging coronaviruses. Nat Rev Microbiol 2016;14:523-34. 
  6. Coronaviruses post-SARS: update on replication and pathogenesis. Nat Rev Microbiol. 2009;7:439-50.
  7. The Novel Coronavirus: A Bird’s Eye View. Int J Occup Environ Med. 2020;11:65-71. 
  8. Evidence for Gastrointestinal Infection of SARS-CoV-2. Gastroenterology 2020;S0016-5085:30282-1. 
  9. Early Transmission Dynamics in Wuhan, China, of Novel Coronavirus-Infected Pneumonia. N Engl J Med. 2020;382:1199-1207. 
  10. A familial cluster of pneumonia associated with the 2019 novel coronavirus indicating a person-to-person transmission: a study of a family cluster. Lancet 2020;395:514-23. 
  11. First Case of 2019 Novel Coronavirus in the United States. N Engl J Med. 2020;382:929-36. 
  12. A pneumonia outbreak associated with a new coronavirus of probable bat origin. Nature 2020;579:270-3. 
  13. Full-genome evolutionary analysis of the novel coronavirus (2019-nCoV) rejects the hypothesis of emergence as a result of a recent recombination event. Infect Genet Evol. 2020;79:104212. 
  14. The proximal origin of SARS-CoV-2. Nat Med. 2020. https://doi.org/10.1038/s41591-020-0820-9
  15. On the origin and continuing evolution of SARS-CoV-2. Natl Sci Review 2020. https://doi.org/10.1093/nsr/nwaa036 
  16. National Health Commission of People’s Republic of China. Prevent guidelines of 2019-nCoV. 2020. http://www.nhc.gov.cn/xcs/yqfkdt/202001/bc661e49bc487dba182f5c49ac445b.shtml. Accessed 6 April 2020.
  17. Transmission Potential of SARS-CoV-2 in Viral Shedding Observed at the University of Nebraska Medical Center https://doi.org/10.1101/2020.03.23.20039446
  18. Digestive symptoms in COVID-19 patients with mild disease severity: Clinical presentation, stool viral RNA testing, and outcomes. https://journals.lww.com/ajg/Documents/COVID19_Han_et_al_AJG_Preproof.pdf 
  19. Neonatal Early-Onset Infection With SARS-CoV-2 in 33 Neonates Born to Mothers With COVID-19 in Wuhan, China. JAMA Pediatr.  doi:10.1001/jamapediatrics.2020.0878. 
  20. Early Transmission Dynamics in Wuhan, China, of Novel Coronavirus-Infected Pneumonia. N Engl J Med. 2020;382:1199-1207. 
  21. Data-based analysis, modelling and forecasting of the COVID-19 outbreak. PLoS One 2020;15:e0230405. 
  22. Covid-19 – Navigating the Uncharted. N Engl J Med. 2020;382:1268-9. 
  23. Rational use of face masks in the COVID-19 pandemic. Lancet Respir Med. 2020;S2213-2600(20)30134-X. 
  24. Association between 2019-nCoV transmission and N95 respirator use. J Hosp Infect. 2020;S0195-6701(20)30097-9. 
  25. Recommendation Regarding the Use of Cloth Face Coverings, Especially in Areas of Significant Community-Based Transmission. https://www.cdc.gov/coronavirus/2019-ncov/prevent-getting-sick/cloth-face-cover.html
  26. COVID-19 outbreak on the Diamond Princess cruise ship: estimating the epidemic potential and effectiveness of public health countermeasures. J Travel Med. 2020 Feb 28. 
  27. https://www.epa.gov/pesticide-registration/list-n-disinfectants-use-against-sars-cov-2
  28. Interventions to mitigate early spread of SARS-CoV-2 in Singapore: a modelling study. Lancet Infect Dis. 2020;S1473-3099(20)30162-6.
  29. The effect of control strategies to reduce social mixing on outcomes of the COVID-19 epidemic in Wuhan, China: a modelling study. Lancet Public Health. https://doi.org/10.1016/S2468-2667(20)30073-6
  30. SARS in Singapore–key lessons from an epidemic. Ann Acad Med Singapore 2006;35:301-6. 
  31. Early Transmission Dynamics in Wuhan, China, of Novel Coronavirus-Infected Pneumonia. N Engl J Med. 2020;382:1199-207. 
  32. Updated understanding of the outbreak of 2019 novel coronavirus (2019-nCoV) in Wuhan, China. J Med Virol. 2020;92:441-7. 
  33. Clinical features of patients infected with 2019 novel coronavirus in Wuhan, China. Lancet 2020;395:497-506.
  34. COVID-19-associated Acute Hemorrhagic Necrotizing Encephalopathy: CT and MRI Features. Radiology 2020:201187. 
  35. Characteristics of Ocular Findings of Patients With Coronavirus Disease 2019 (COVID-19) in Hubei Province, China. JAMA Ophthalmol. 2020. doi: 10.1001/jamaophthalmol.2020.1291.
  36. A New Symptom of COVID-19: Loss of Taste and Smell. Obesity. 2020. doi: 10.1002/oby.22809.
  37. Clinical Characteristics of Coronavirus Disease 2019 in China. N Engl J Med. 2020 DOI: 10.1056/NEJMoa2002032.
  38. Clinical course and mortality risk of severe COVID-19. Lancet 2020;395:507-13. 
  39. The cytokine release syndrome (CRS) of severe COVID-19 and Interleukin-6 receptor (IL-6R) antagonist Tocilizumab may be the key to reduce the mortality. Int J Antimicrob Agents 2020;28:105954. 
  40. Clinical Characteristics of 138 Hospitalized Patients with 2019 Novel Coronavirus–Infected Pneumonia in Wuhan, China.  JAMA 2020; 323:1061-9. 
  41. Unknown unknowns – COVID-19 and potential global mortality. Early Hum Dev. 2020;144:105026. 
  42. Substantial undocumented infection facilitates the rapid dissemination of novel coronavirus (SARS-CoV2). Science 2020 Mar 16. pii: eabb3221.
  43. Detection of 2019 novel coronavirus (2019-nCoV) by real-time RT-PCR. Euro Surveill. 2020;25. 44. https://www.fda.gov/media/136598/download
  44. https://www.fda.gov/media/136622/download
  45. Centers for Disease Control and Prevention. Evaluating and Testing Persons for Coronavirus Disease 2019 (COVID-19) https://www.cdc.gov/coronavirus/2019-nCoV/hcp/clinical-criteria.html
  46. Infectious Diseases Society of America. COVID-19 Prioritization of Diagnostic Testing. https://www.idsociety.org/globalassets/idsa/public-health/covid-19-prioritization-of-dx-testing.pdf
  47. Race to find COVID-19 treatments accelerates. Science 2020;367:1412-3.
  48. Hydroxychloroquine, a less toxic derivative of chloroquine, is effective in inhibiting SARS-CoV-2 infection in vitro. Cell Discov. 2020;6:16.
  49. Hydroxychloroquine and azithromycin as a treatment of COVID-19: results of an open-label non-randomized clinical trial. Int J Antimicrob Agents 2020:105949. 
  50. Breakthrough: Chloroquine phosphate has shown apparent efficacy in treatment of COVID-19 associated pneumonia in clinical studies. Biosci Trends 2020;14(1):72-73.
  51. Remdesivir for severe acute respiratory syndrome coronavirus 2 causing COVID-19: An evaluation of the evidence. Travel Med Infect Dis. 2020 2:101647.
  52. Permanently ban wildlife consumption. https://science.sciencemag.org/content/367/6485/1434.2

The origin of SARS-CoV-2 furin cleavage site remains a mystery

Authors: Dr. Liji Thomas, MD Feb 17, 2021

The ongoing pandemic of coronavirus disease 2019 (COVID-19), caused by the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), has largely defied attempts to contain its spread by non-pharmaceutical interventions (NPIs). With the massive loss of life and economic damage, the only way out, in the absence of specific antiviral therapeutics, has been the development of vaccines to achieve population immunity.

A new study on the Preprints server discusses the origin of the furin cleavage site on the SARS-CoV-2 spike protein, which is responsible for the virus’s relatively high infectivity compared to its relatives in the betacoronavirus genus.

The furin cleavage site

SARS-CoV-2 is a betacoronavirus, and is most closely related to the bat SARS-related coronavirus (SARSr-CoV) represented by the genome sequence RaTG13, which shares 96% identity with it. This has made the bat virus the most probable precursor of the virus in current circulation.
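
A figure such as "96% identity" is typically computed from a sequence alignment. The following is a minimal sketch of one common definition (matching positions divided by aligned, non-gap positions). The function name and the toy sequences are hypothetical and are not taken from the study, which may have used a different definition or tool.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of aligned, gap-free columns in which both sequences agree.

    Assumes the two sequences are already aligned (equal length, '-' for gaps).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # skip columns where either sequence has a gap
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


# Toy aligned sequences (illustrative only, not real viral data).
toy_a = "AUGGCUUCAGGA-UCC"
toy_b = "AUGGCAUCAGGACUCC"
print(round(percent_identity(toy_a, toy_b), 1))
```

Other conventions count gap columns as mismatches, which gives a lower figure for the same alignment, so identity values are only comparable when the definition is the same.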

The origin of this strain is linked to the emergence of the novel furin cleavage site in the viral spike glycoprotein. Furin is a serine protease, widely expressed in human cells, that cleaves the SARS-CoV-2 spike at the interface of its two subunits. It is encoded by a gene on chromosome 15.

Furin acts on substrates with single or paired basic residues during the processing of proteins within cells. Such a polybasic furin cleavage site is found in various proteins from many viruses, including betacoronaviruses of the Embecovirus and Merbecovirus subgenera. However, within the sarbecoviruses (betacoronavirus lineage B), this type of site is unique to SARS-CoV-2.

The study took a bioinformatic approach, using the genomic data available in the National Center for Biotechnology Information (NCBI) databases to identify the origin of the furin cleavage site.

Same ancestor

They found three groups of coronaviruses that were very similar to SARS-CoV-2 at the genomic level. These are Pangolin-CoVs (2017, 2019), Bat-SARS-like (CoVZC45, CoVZXC21) and bat RaTG13.

The three genomic fingerprints used to identify these matches are fingerprint 1, in the orf1a RNA polymerase gene, including the nsp2 and nsp3 genes; fingerprint 2, at the beginning of the S gene, covering the part encoding the N-terminal domain and the receptor-binding domain (RBD) that mediates attachment to the host cell receptor, the angiotensin-converting enzyme 2 (ACE2); and fingerprint 3, the orf8 gene.

These fingerprints are distinctive to the three closely related coronaviruses only at the RNA level, but the amino acid sequences in the translated proteins are similar to other sarbecoviruses.

The sharing of these genomic sequences indicates their common ancestry, which is supported by other short sequence features: one deletion and three insertions. All three strains show the same deletion-insertion pattern at the same four locations in the spike gene.

Spike gene recombination in a common ancestor

The analysis of the phylogeny of these three strains showed that the first to diverge was the pangolin coronavirus, with RaTG13 being the closest to SARS-CoV-2. However, when only the spike is analyzed, there is a high similarity between the pangolin CoV, RaTG13 and SARS-CoV-2.

This may indicate the occurrence of recombination events between the Pangolin-CoV (2017) and RaTG13 ancestors. This was followed by the shift of the pangolin CoV to pangolin hosts.

Figure: Phylogenetic tree of the closely related SARS-CoV-2 coronaviruses based on complete genomes.

Unique codons encoding arginines in the furin cleavage site

The furin cleavage site consists of four amino acids, PRRA, which are encoded by 12 inserted nucleotides in the S gene (three nucleotides per amino acid). A characteristic feature of this site is an arginine doublet.
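
The relationship between 12 nucleotides and four amino acids can be sketched in a few lines of Python. This is an illustrative example only: the CGGCGG arginine codons come from the article, while the proline (CCT) and alanine (GCA) codons are assumptions chosen for the example, and the codon table is deliberately partial.

```python
# Partial standard genetic code: only the codons needed for this example.
CODON_TABLE = {"CCT": "P", "CGG": "R", "GCA": "A"}


def translate(dna: str) -> str:
    """Translate a DNA-alphabet coding sequence, read in frame, to one-letter amino acids."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


# Hypothetical 12-nucleotide insert; the P and A codons are assumed.
insert_nt = "CCTCGGCGGGCA"
print(len(insert_nt), translate(insert_nt))
```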

This insertion could have occurred by random insertion mutation, by recombination or by laboratory insertion. The researchers say the probability of random insertion is too low to explain the origin of this motif.

Surprisingly, the CGGCGG codons encoding the two arginines of the doublet in SARS-CoV-2 are not found in any of the furin sites in other viral proteins expressed by a wide range of viruses.

Even within SARS-CoV-2, where arginine can be encoded by any of six codons, only a minority of arginine residues are encoded by the CGG codon. Again, only two of the 42 arginines in the SARS-CoV-2 spike are encoded by this codon, and these are in the PRRA motif.
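
The kind of codon-usage tally described here can be sketched as follows. This is a minimal illustration, not the researchers' pipeline: the function name and the toy coding sequence are hypothetical, and a real analysis would run on the annotated spike coding sequence read in the correct frame.

```python
from collections import Counter

# The six arginine codons of the standard genetic code (DNA alphabet).
ARGININE_CODONS = {"CGT", "CGC", "CGA", "CGG", "AGA", "AGG"}


def arginine_codon_usage(coding_seq: str) -> Counter:
    """Count how often each arginine codon occurs in an in-frame coding sequence."""
    seq = coding_seq.upper().replace("U", "T")  # accept RNA or DNA input
    if len(seq) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    codons = (seq[i:i + 3] for i in range(0, len(seq), 3))
    return Counter(c for c in codons if c in ARGININE_CODONS)


# Toy coding sequence (illustrative only): ATG AGA CCT CGG CGG GCA AGA TAA
toy_cds = "ATGAGACCTCGGCGGGCAAGATAA"
usage = arginine_codon_usage(toy_cds)
for codon, count in sorted(usage.items()):
    print(codon, count)
```

Counting by codon rather than by substring matters: a plain search for "CGG" would also match occurrences that straddle two codons.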

For recombination to occur, there must be a donor, from another furin site and probably from another virus. In the absence of a known virus containing this arginine doublet encoded by the CGGCGG codons, the researchers discount the recombination theory as the mechanism underlying the emergence of PRRA in SARS-CoV-2.

For More Information: https://www.news-medical.net/news/20210217/The-origin-of-SARS-CoV-2-furin-cleavage-site-remains-a-mystery.aspx