Sheffield scientists and AstraZeneca develop new machine learning model for inverse protein folding

In a study published in Nature Machine Intelligence, researchers from the University of Sheffield, in collaboration with AstraZeneca and the University of Southampton, have developed a machine learning framework that demonstrates improved accuracy in inverse protein folding compared to existing methods.

Inverse protein folding involves identifying amino acid sequences that form a desired 3D protein structure. The process is essential in protein engineering, particularly in drug development, where proteins must bind to specific biological targets. Because protein folding is complex, predicting which amino acid sequences will fold into stable, functional structures remains a challenge.

Machine learning models trained on known protein sequences and structures have become critical tools in addressing this challenge. The new model, called MapDiff, was tested in simulated environments and showed improved prediction performance over current state-of-the-art artificial intelligence approaches.

Haiping Lu, Professor of Machine Learning at the University of Sheffield and corresponding author of the study, said, “This work represents a significant step forward in using AI to design proteins with desired structures. By learning how to generate amino acid sequences that are likely to fold into specific 3D structures, our method opens new possibilities for designing new therapeutic proteins, which can be used in various therapeutic applications. It’s exciting to see AI helping us tackle such a fundamental challenge in biology.”

Peizhen Bai, Senior Machine Learning Scientist at AstraZeneca, developed MapDiff during his PhD at the University of Sheffield’s School of Computer Science. He said, “During my PhD, I was motivated by the potential of AI to accelerate biological discovery. I’m proud that our method, MapDiff, helps design protein sequences that are more likely to fold into desired 3D structures — a key step towards advancing next-generation therapeutics.”

The study builds on prior work between the University of Sheffield and AstraZeneca, including the development of DrugBAN, an AI model that predicts drug-target binding. That research also appeared in Nature Machine Intelligence and became one of its most cited papers in 2023.

The latest paper, titled “Mask-prior-guided denoising diffusion improves inverse protein folding”, is now available in Nature Machine Intelligence.

