Coronavirus gene sequences deleted from research archive recovered

To date, the novel coronavirus has caused 3.9 million deaths worldwide. Since the beginning of the pandemic, investigations into the origins of SARS-CoV-2 have been hampered by a lack of access to information from China, where cases first emerged.

Now, a Seattle-based researcher has recovered deleted files from Google Cloud that contain 13 partial genetic sequences from some of the earliest cases of COVID-19 in Wuhan.

The sequences do not tip the balance toward or away from any of the many theories about how SARS-CoV-2 originated. For example, they do not support the theory that the virus leaked from a high-security laboratory in Wuhan. Yet the evidence suggests the novel coronavirus was circulating even before the first major outbreak was discovered at a seafood market in China.

To pinpoint exactly how and where the virus originated, scientists need to find the so-called progenitor virus from which all other strains descended. So far, most of the earliest sequences have come from cases at the Huanan Seafood Market in Wuhan. It was originally thought that SARS-CoV-2 first appeared there in late December 2019. However, earlier cases, dating to early December and even November of that year, had no link to the market. This indicates that the virus originated somewhere else.

Viruses from cases found at the market carry three mutations that are absent from virus samples detected weeks later outside the market. The viruses without these mutations more closely match the coronaviruses found in horseshoe bats. Scientists are confident that the new coronavirus somehow originated in bats, so it is logical to assume that the progenitor did not have these mutations either.
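The reasoning above can be illustrated with a minimal sketch: count the positions at which each sample differs from a bat coronavirus reference, and treat the sample with fewer differences as the one closer to the presumed progenitor. The sequences below are short invented strings used purely for illustration, not real viral data, and the function name `count_differences` is a hypothetical helper. Real analyses work on aligned genomes of about 30,000 bases and use more careful methods.

```python
def count_differences(seq_a: str, seq_b: str) -> int:
    """Count positions where two aligned, equal-length sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(a != b for a, b in zip(seq_a, seq_b))


# Toy, made-up sequences (assumptions for illustration only).
bat_reference = "ACGTACGTAC"      # stands in for a bat coronavirus relative
market_sample = "ATGTACCTAT"      # carries three extra mutations
non_market_sample = "ACGTACGTAC"  # lacks those three mutations

for name, sample in [("market", market_sample), ("non-market", non_market_sample)]:
    print(name, count_differences(bat_reference, sample))
```

In this toy setup the non-market sample sits closer to the bat reference, which mirrors the argument that viruses lacking the three mutations are more likely to resemble the progenitor.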

Now Jesse Bloom of the Howard Hughes Medical Institute in Seattle has discovered that the deleted sequence data, some of which probably come from the earliest samples of the virus, also lack these mutations.

About a year ago, 241 genetic sequences from coronavirus patients went missing from the online Sequence Read Archive, which is maintained by the National Institutes of Health (NIH).

Bloom noticed the missing sequences when he stumbled upon a spreadsheet in a study published in May 2020 in PeerJ. The sequences were part of the Wuhan University project PRJNA612766 and had supposedly been uploaded to the archive. When the scientist searched the archive database for them, he received the message "Items not found."

His investigation revealed that the deleted sequences had been collected by Wuhan University Hospital. A preprint of a study based on these sequences indicates that they came from nasal swab samples taken from outpatients with suspected COVID-19 at the beginning of the epidemic.

Bloom was unable to find any explanation as to why the sequences were removed, and his emails to the study authors went unanswered.

The scientist notes that "there is no convincing scientific reason for deleting the data." The sequences fully correspond to the samples described in the paper, and no corrections have been issued to it. Additionally, the study emphasizes that the samples were obtained voluntarily from individuals, and the sequencing shows no evidence of plasmid or other contamination of the samples. "It seems likely that the sequences were removed to conceal their existence," Bloom concludes.

A paper describing his findings was published on the bioRxiv preprint server.
