Beyond Static Snapshots: How AI Is Now Capturing Proteins in Motion
Researchers at EPFL have created an artificial intelligence system that captures not just the shape of proteins, but how they move and dance in three dimensions. While DeepMind's AlphaFold revolutionized protein structure prediction by determining static shapes with near-atomic accuracy, a new framework called Latent Diffusion for Full Protein Generation (LD-FPG) goes further, generating complete all-atom models that show proteins in motion. This represents a meaningful shift in computational biology, moving from snapshots to movies of how proteins actually function inside cells.
What's Missing From Today's Protein Prediction Tools?
AlphaFold and similar systems excel at predicting where atoms sit in space, but they miss something crucial: the subtle rearrangements that happen constantly as proteins work. These tiny movements, particularly in structures called side chains, influence how proteins interact with drugs and other molecules. For drug developers, this is a significant limitation. A drug candidate binds to a protein much as a key fits into a lock, but the protein does not stay frozen in one position. It shifts, flexes, and adapts. Understanding these dynamics is essential for designing medicines that actually work.
The challenge is computational. Systems like AlphaFold require vast amounts of computing power and expertise because they try to predict the exact spatial position of every single atom. EPFL researchers took a different approach, simplifying the problem in a way that makes capturing motion possible.
How Does LD-FPG Generate Protein Dynamics?
- Graph Neural Networks: The system treats each protein as a mathematical graph in which atoms are nodes and bonds are edges, then compresses this complex structural data into a simplified map that is easier to work with.
- Latent Space Learning: An AI model studies this simplified map and learns the patterns of how protein structures change and move, then generates new latent data for entirely new structures.
- High-Resolution Reconstruction: The simplified data are converted back into high-resolution proteins complete with side chains and dynamic movements that reveal how the protein actually behaves.
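The three steps above can be sketched on a toy example. This is a minimal illustration under loud assumptions, not the LD-FPG implementation: the "protein" is a hypothetical four-atom chain in 2D, a single principal component stands in for the learned graph encoder and decoder, and Gaussian sampling stands in for the latent diffusion model. All names (`BONDS`, `make_frame`, `encode`, `decode`) are invented for the sketch.

```python
import math
import random

# Graph view of a toy 4-atom chain: atoms are nodes 0..3, bonds are edges.
BONDS = [(0, 1), (1, 2), (2, 3)]


def make_frame(angle):
    """Flattened 2D coordinates of the chain with its last bond rotated by `angle`."""
    atoms = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0),
             (2.0 + math.cos(angle), math.sin(angle))]
    return [c for atom in atoms for c in atom]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


# A small "trajectory": 50 conformations with the last bond swinging +/-0.3 rad.
frames = [make_frame(-0.3 + 0.6 * i / 49) for i in range(50)]
dim = len(frames[0])
mean = [sum(f[k] for f in frames) / len(frames) for k in range(dim)]
centered = [[f[k] - mean[k] for k in range(dim)] for f in frames]

# Stand-in encoder (assumption): the top principal direction of the motion,
# found by power iteration on the covariance of the centered frames.
axis = [1.0 / math.sqrt(dim)] * dim
for _ in range(50):
    w = [0.0] * dim
    for x in centered:
        proj = dot(x, axis)
        for k in range(dim):
            w[k] += proj * x[k]
    norm = math.sqrt(dot(w, w))
    axis = [c / norm for c in w]


def encode(frame):
    """Compress a full-coordinate frame to a single latent number."""
    return dot([frame[k] - mean[k] for k in range(dim)], axis)


def decode(z):
    """Expand a latent number back to per-atom (x, y) coordinates."""
    flat = [mean[k] + z * axis[k] for k in range(dim)]
    return [(flat[k], flat[k + 1]) for k in range(0, dim, 2)]


def bond_lengths(atoms):
    """Measure every edge of the bond graph in a decoded structure."""
    return [math.dist(atoms[i], atoms[j]) for i, j in BONDS]


# Reconstruction check: encode then decode each original frame.
max_err = max(
    abs(a - b)
    for f in frames
    for a, b in zip(f, [c for atom in decode(encode(f)) for c in atom])
)

# Stand-in generator (assumption): sample new latents from a Gaussian
# fitted to the encoded frames, then decode them into new structures.
latents = [encode(f) for f in frames]
mu = sum(latents) / len(latents)
sigma = math.sqrt(sum((z - mu) ** 2 for z in latents) / len(latents))
rng = random.Random(0)
generated = [decode(rng.gauss(mu, sigma)) for _ in range(5)]

print(f"max reconstruction error: {max_err:.3f}")
for atoms in generated:
    print([round(length, 3) for length in bond_lengths(atoms)])
```

In this toy, the two rigid bonds come back unchanged, while the flexible bond stretches slightly because a straight-line latent cannot follow a rotation exactly. That gap is one reason a real system would need a learned, nonlinear decoder rather than the linear map used here.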
The team demonstrated the framework's power by generating high-fidelity, dynamic representations of the dopamine D2 receptor in both its active and inactive states. This protein detects the neurotransmitter dopamine and controls key cellular responses. It is one of the most-studied G protein-coupled receptors (GPCRs), a protein family that is a major focus of the global drug development industry. The researchers published this dataset with open access to facilitate further research.
"Proteins are like tiny machines that dance and switch on and off to work but generating this 'movie' in full detail has been an unsolved challenge. Our LD-FPG framework is the first to do this. Instead of trying to predict the exact coordinates of atoms in space, our model learns a low-dimensional map of the protein's shape changes. This conceptual shift is what makes generating all-atom dynamics possible," explained Aditya Sengar, researcher at EPFL's Laboratory of Protein and Cell Engineering.
Why Should Drug Developers Care About Protein Movement?
The practical implications are substantial. Virtual screening, a core part of drug discovery, currently involves significant trial and error as researchers test thousands of potential compounds against protein targets. By understanding how proteins actually move and change shape, researchers can design drugs that interact more effectively with their targets. This could meaningfully accelerate the drug discovery process, reducing both time and cost.
The work also opens the door to a new approach to medicine design. Rather than targeting only a protein's static shape, researchers could design drugs that target its dynamic behavior: its movements and its transitions between different states. This represents a fundamental shift in how we think about drug-protein interactions.
"LD-FPG opens the door to designing new medicines that target a protein's dynamic behavior, not just its shape. Our work represents a new paradigm for computational biology, and a meaningful step forward at the interface of AI and structural biology," stated Patrick Barth, head of the Laboratory of Protein and Cell Engineering at EPFL.
What Are the Limitations and Next Steps?
The EPFL team acknowledges that their work is not without constraints. The framework currently works best with small to medium-sized proteins, and the team aims to streamline it for greater accuracy and realism while enabling it to model larger proteins. The researchers also emphasize that the framework depends critically on the quality of its data.
"Many assume that feeding massive datasets to AI models will automatically solve scientific problems or replace researchers. However, much of that data is noisy or poorly evaluated. We need human scientists to produce the clean data and rigorous benchmarks AI requires, much like we need journalists to safeguard against disinformation," noted Pierre Vandergheynst, researcher at EPFL's Signal Processing Laboratory.
This insight reflects a broader point in AI-driven science: a sophisticated algorithm cannot compensate for poor underlying data. The LD-FPG framework is only as good as the biological data it learns from, which means human expertise remains essential for validating and curating that data.
The research was published in the Proceedings of NeurIPS 2025, one of the world's premier machine learning conferences, signaling its significance to both the AI and structural biology communities. As the team continues refining the framework, the potential applications extend beyond drug discovery to antibody design, protein engineering, and understanding disease mechanisms at the molecular level.