Rare genetic diseases can now be detected in patients, and tumor-specific mutations identified -- a milestone made possible by DNA sequencing, which transformed biomedical research decades ago. In recent years, the introduction of new sequencing technologies (next-generation sequencing) has driven a wave of breakthroughs. During 2020 and 2021, for instance, these methods enabled the rapid decoding and worldwide monitoring of the SARS-CoV-2 genome.

At the same time, an increasing number of researchers are making their sequencing results publicly accessible. This has led to an explosion of data, stored in major databases such as the American SRA (Sequence Read Archive) and the European ENA (European Nucleotide Archive). Together, these archives now hold about 100 petabytes of information -- roughly equivalent to the total amount of text found across the entire internet, with a single petabyte equaling one million gigabytes.

Until now, biomedical scientists needed enormous computing resources to search through these vast genetic repositories and compare them with their own data, making comprehensive searches nearly impossible. Researchers at ETH Zurich have now developed a way to overcome that limitation.

Full-text search instead of downloading entire datasets

The team created a tool called MetaGraph, which dramatically streamlines and accelerates the process. Instead of downloading entire datasets, MetaGraph enables direct searches within the raw DNA or RNA data -- much like using an internet search engine. Scientists simply enter a genetic sequence of interest into a search field and, within seconds or minutes depending on the query, can see where that sequence appears in global databases.

"It's a kind of Google for DNA," explains Professor Gunnar Rätsch, a data scientist in ETH Zurich's Department of Computer Science. Previously, researchers could only search for descriptive metadata and then had to download the full datasets to access raw sequences. That approach was slow, incomplete, and expensive.

According to the study authors, MetaGraph is also remarkably cost-efficient. Representing all publicly available biological sequences would require only a few computer hard drives, and large queries would cost no more than about 0.74 dollars per megabase.
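To give a feel for the quoted cost ceiling, here is a small back-of-the-envelope sketch. It assumes the upper bound of about 0.74 dollars per megabase scales linearly with query size, which the article does not state explicitly; the function name and the example query size are illustrative only.

```python
# Upper-bound figure quoted in the article: about 0.74 US dollars per megabase.
MAX_USD_PER_MEGABASE = 0.74
BASES_PER_MEGABASE = 1_000_000


def max_query_cost_usd(query_bases: int) -> float:
    """Return a rough upper bound, in US dollars, for querying `query_bases` bases.

    Assumption: cost grows linearly with the number of bases queried.
    """
    if query_bases < 0:
        raise ValueError("query_bases must be non-negative")
    return query_bases / BASES_PER_MEGABASE * MAX_USD_PER_MEGABASE


# Hypothetical example: a query of 250 million bases (250 megabases).
example_cost = max_query_cost_usd(250_000_000)
print(f"Upper-bound cost: about ${example_cost:.2f}")
```

Under that linear assumption, a 250-megabase query would stay below roughly 185 dollars.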

Because the new DNA search engine is both fast and accurate, it could significantly accelerate research -- particularly in identifying emerging pathogens or analyzing genetic factors linked to antibiotic resistance. The system may even help locate beneficial viruses that destroy harmful bacteria (bacteriophages) hidden within these massive databases.

Compression by a factor of 300

In their study published on October 8 in Nature, the ETH team demonstrated how MetaGraph works. The tool organizes and compresses genetic data using advanced mathematical graphs that structure information more efficiently, similar to how spreadsheet software arranges values. "Mathematically speaking, it is a huge matrix with millions of columns and trillions of rows," Rätsch explains.

Creating indexes to make large datasets searchable is a familiar concept in computer science, but the ETH approach stands out for how it connects raw data with metadata while compressing the data by a factor of about 300. This reduction works much like summarizing a book -- it removes redundancies while preserving the essential narrative and relationships, retaining all relevant information in a much smaller form.
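The general idea of an index that links raw sequence to metadata can be illustrated with a toy example. The sketch below is not MetaGraph's actual data structure or API; it is a deliberately simple stand-in that splits sequences into short overlapping fragments (k-mers), stores each distinct fragment once, and records which datasets contain it. Each fragment plays the role of a row and each dataset label the role of a column in the matrix Rätsch describes. All names and sequences are hypothetical.

```python
from collections import defaultdict


def kmers(sequence: str, k: int):
    """Yield every overlapping substring of length k."""
    for i in range(len(sequence) - k + 1):
        yield sequence[i:i + k]


def build_index(datasets: dict[str, str], k: int) -> dict[str, set[str]]:
    """Map each distinct k-mer (a 'row') to the dataset labels (its 'columns')."""
    index = defaultdict(set)
    for label, sequence in datasets.items():
        for kmer in kmers(sequence, k):
            # A k-mer shared by many datasets is stored only once.
            index[kmer].add(label)
    return dict(index)


def search(index: dict[str, set[str]], query: str, k: int) -> dict[str, float]:
    """Return, per dataset, the fraction of the query's k-mers found in it."""
    query_kmers = list(kmers(query, k))
    if not query_kmers:
        return {}
    hits = defaultdict(int)
    for kmer in query_kmers:
        for label in index.get(kmer, ()):
            hits[label] += 1
    return {label: count / len(query_kmers) for label, count in hits.items()}


# Hypothetical toy data standing in for sequencing datasets.
datasets = {
    "sample_A": "ACGTACGTGA",
    "sample_B": "TTACGTCCAA",
}
index = build_index(datasets, k=4)
result = search(index, "ACGTGA", k=4)
for label, fraction in sorted(result.items()):
    print(f"{label}: {fraction:.2f} of query k-mers matched")
```

Storing each shared fragment only once hints at where redundancy can be removed, although a real index at petabyte scale would rely on far more compact representations than Python dictionaries and sets.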

"We are pushing the limits of what is possible in order to keep the data sets as compact as possible without losing necessary information," says Dr. André Kahles, who, like Rätsch, is a member of the Biomedical Informatics Group at ETH Zurich. Unlike other DNA search tools currently being researched, the ETH researchers' approach is scalable: the larger the amount of data queried, the less additional computing power the tool requires.

Half of the data is already available now

First introduced in 2020, MetaGraph has been steadily refined. The tool is now publicly accessible for searches (https://metagraph.ethz.ch/search) and already indexes millions of DNA, RNA, and protein sequences from viruses, bacteria, fungi, plants, animals, and humans. Currently, nearly half of all available global sequence datasets are included, with the remainder expected to follow by the end of the year. Since MetaGraph is open source, it could also attract interest from pharmaceutical companies managing large volumes of internal research data.

Kahles even believes it is possible that the DNA search engine will one day be used by private individuals: "In the early days, even Google didn't know exactly what a search engine was good for. If the rapid development in DNA sequencing continues, it may become commonplace to identify your balcony plants more precisely."


Two types of processed hard fats commonly found in foods like baked goods, margarines, and spreads appear to have little impact on heart health when eaten in realistic amounts.

Researchers from King's College London and Maastricht University conducted the investigation, which was published in the American Journal of Clinical Nutrition. The study focused on interesterified (IE) fats that are high in either palmitic acid (sourced from palm oil) or stearic acid (derived from other plant fats).

These fats are frequently used in place of trans fats and animal fats, both of which are known to raise the risk of heart disease.

Testing the Health Effects of Processed Fats

In the experiment, forty-seven healthy adults participated in a double-blind randomized crossover trial. This design ensured that neither participants nor researchers knew which type of fat was being consumed during each phase.

Each participant followed two separate six-week diets that included muffins and spreads made with either palmitic acid-rich fats or stearic acid-rich fats. These fats provided about 10% of the participants' total daily energy intake.

The researchers then evaluated a range of cardiometabolic health indicators, including cholesterol, triglycerides, insulin sensitivity, liver fat levels, inflammation, and blood vessel function.

Results showed no meaningful differences between the two types of fats in blood cholesterol or triglyceride levels, including the ratio of total to HDL cholesterol, a key measure of cardiovascular risk.

The study also found no signs of harm related to inflammation, insulin resistance, liver fat accumulation, or vascular health.

"Not All Food Processing Is Bad for Us"

Professor Sarah Berry, senior author and Professor of Nutritional Sciences at King's College London, explained: "With the current demonization of everything processed, this research highlights that not all food processing is bad for us! The process of interesterification allows the generation of hard fats in place of harmful trans fats, whilst also enabling manufacturers to reduce the saturated fat content of spreads and foods. Given the widespread use of the process of interesterification of fats and the fearmongering around food processing, this research is timely."

The results indicate that both palmitic acid-rich and stearic acid-rich interesterified fats, when consumed in normal dietary amounts, do not appear to raise short-term risk factors linked to heart disease.

Professor Wendy Hall, lead author and Professor of Nutritional Sciences at King's College London, said: "Our findings provide reassuring evidence that industrially processed fats currently used in everyday foods, whether rich in palmitic or stearic acid, are unlikely to have harmful effects on cardiovascular health when consumed in amounts that people could achieve in their everyday diets. This is important given the widespread use of these fats in processed foods such as margarines, pastries, and confectionery."

More Research Needed for Long-Term Effects

Although the six-week study was long enough to detect important changes in cholesterol and related markers, the researchers note that longer studies are needed to explore potential long-term effects.

This research was conducted jointly by King's College London and Maastricht University and was supported by the Malaysian Palm Oil Board.


When you open the refrigerator and find a wedge of cheese flecked with green mold, or a package of chicken that smells faintly sour, it can be tempting to gamble with your stomach rather than waste food.

But the line between harmless fermentation and dangerous spoilage is sharp. Consuming spoiled foods exposes the body to a range of microbial toxins and biochemical by-products, many of which can interfere with essential...
