1/2/2024

Revo speed adult

Foldseek allows researchers to identify proteins whose shape resembles that of other proteins.

When you discover a protein, how do you determine what it does? That’s the problem Gregory Gloor was facing.

A biochemist at the University of Western Ontario in London, Canada, Gloor was studying bacterial communities at an oil-refinery wastewater treatment plant, hoping to identify the proteins that help them to degrade toxic substances. As a proof of concept, he started looking at the proteins expressed by viruses called bacteriophages that infect those bacteria. Unfortunately, a search of databases of known proteins for matches came up empty.

Then Gloor learned of a search tool called Foldseek, first shared by its creators in 2021 and described in May in Nature Biotechnology¹. His project “went from basically impossible to possible”.

Proteins are built of chains of amino acids, and their folded shape dictates their function. In the past few years, artificial-intelligence tools that predict a protein’s 3D structure from its amino-acid sequence alone - as opposed to determining that structure experimentally - have improved drastically. Researchers have used AlphaFold 2, from Google DeepMind in London; RoseTTAFold, from a team at the University of Washington, Seattle; and other such tools to compile databases containing hundreds of millions of structures. Foldseek makes it possible to quickly search those databases for proteins that have similar shapes - and presumably, similar functions - to a protein of interest.

The conventional computational approach to determining the function of an unfamiliar protein is to look for proteins with similar amino-acid sequences. If the functions of those related proteins are known, researchers can make a guess as to what the new protein might do. Sequence searches are fast, like searching a hard drive for a file name. But they often miss good matches because proteins with similar shapes can have vastly different sequences. Structure-based search methods look for shapes instead of sequences, but this can take thousands of times longer, because it’s computationally difficult to compare complex 3D objects. With Foldseek, researchers got the best of both worlds: the software represents a protein’s shape as a string of letters - a ‘structural alphabet’ - thereby offering the sensitivity of shape-based searches but at the speed of sequence-based ones.

“One of the key ideas was that in order to produce a good structural search, it is important to get the encoding right,” says Martin Steinegger, a biologist at Seoul National University and one of the Foldseek paper’s lead authors.

Gloor used ColabFold, a cloud-based computational-notebook interface to AlphaFold 2, to predict the structures of the bacteriophage proteins he found, and then Foldseek to match them to known proteins. Some of the proteins, he found, formed the viruses’ outer shells; others were enzymes². His assessment: Foldseek is “amazingly clever”.

Foldseek is not the first algorithm to reduce protein structure to an alphabet. Other search tools typically assign each amino acid a letter on the basis of its orientation relative to the amino acids immediately before and after it in the protein sequence. However, that approach overlooks interactions between amino acids that are far apart in the linear chain, but nearby in 3D space. Foldseek assigns each amino acid one of 20 letters, on the basis of its distance from, and orientation relative to, the amino acid that’s closest in the folded-up protein. By focusing on these spatial bridges, Steinegger says, Foldseek’s ‘3D-interaction alphabet’ better captures global structure.

“Biology occurs in three dimensions,” says Janet Thornton, a computational biologist at the European Molecular Biology Laboratory’s European Bioinformatics Institute in Hinxton, UK. The ability to compare proteins on the basis of their shape “allows you to see much farther back in evolutionary time, which allows you to identify very distant relatives that evolved from the same precursor” protein, she says.

To test Foldseek, Steinegger’s team used a database of 365,000 proteins whose shapes had been predicted using AlphaFold 2. They fed 100 of these shapes into Foldseek and asked it to rank, for each one, the most similar proteins in the database. The score was based on how many ‘true positives’ the algorithm retrieved (that is, proteins scoring above a certain similarity threshold according to atomic modelling) before retrieving a false positive.
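The benchmark score described in this article, the number of true positives retrieved before the first false positive, can be illustrated with a minimal Python sketch. This is not Foldseek's evaluation code: the function names, the threshold value in the usage note and the averaging over queries are all assumptions made for illustration, since the article says only that the score was "based on" this count.

```python
from typing import List, Sequence


def label_hits(reference_similarities: Sequence[float], threshold: float) -> List[bool]:
    """Mark each ranked hit as a true positive if its reference similarity
    (e.g. from atomic modelling) is above the threshold.

    The threshold is a hypothetical parameter; the article gives no value.
    """
    return [score > threshold for score in reference_similarities]


def true_positives_before_first_false_positive(ranked_is_match: Sequence[bool]) -> int:
    """Count the true positives ranked ahead of the first false positive."""
    count = 0
    for is_match in ranked_is_match:
        if not is_match:
            break  # stop at the first false positive
        count += 1
    return count


def mean_score(queries: Sequence[Sequence[bool]]) -> float:
    """Average the per-query counts (an assumed way to aggregate queries)."""
    if not queries:
        raise ValueError("need at least one query")
    total = sum(true_positives_before_first_false_positive(q) for q in queries)
    return total / len(queries)
```

For example, if the hits returned for one query have reference similarities of 0.9, 0.8, 0.3 and 0.7 in ranked order, and the assumed threshold is 0.5, `label_hits` gives `[True, True, False, True]` and the score for that query is 2: the fourth hit does not count because it comes after the first false positive.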