Computer scientist Jonathan De Vita used specialist programming languages as part of his degree studies, specialising in coding and AI. This article looks at MDGen, a new system that uses generative AI to emulate the dynamics of molecules.
The capabilities of generative AI models continue to grow at a staggering pace, transforming simple text prompts into realistic images and even video clips. More recently, biologists and chemists have applied AI to the study of static molecules such as proteins and DNA. Models like AlphaFold can predict molecular structures, potentially accelerating drug discovery, while the MIT-assisted RFdiffusion is being used to design new proteins.
One key challenge for scientists is that molecules are constantly moving, and modelling that motion is vital when developing new proteins and drugs. Molecular dynamics is a technique that enables researchers to simulate these movements using physics. However, it can be prohibitively expensive, requiring billions of time steps on supercomputers.
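To give a feel for where those billions of time steps come from, here is a minimal sketch in Python. It is an illustration only and is not taken from the MIT work: it treats a single chemical bond as a spring and advances it with the velocity Verlet scheme, a standard integrator in molecular dynamics. The function names, the toy units, the 2 femtosecond time step and the 1 microsecond target are all assumptions chosen for the example.

```python
def steps_needed(simulated_fs, timestep_fs):
    """Number of integration steps needed to cover a simulated time span.

    Both arguments are in femtoseconds (hypothetical helper for illustration).
    """
    return simulated_fs // timestep_fs


def simulate_bond(n_steps, dt=0.01, k=1.0, mass=1.0, x0=1.0, v0=0.0):
    """Velocity Verlet integration of a 1D harmonic 'bond' in toy units.

    Returns the list of positions, one per time step (plus the start).
    """
    x, v = x0, v0
    a = -k * x / mass                        # acceleration from the spring force
    trajectory = [x]
    for _ in range(n_steps):
        x = x + v * dt + 0.5 * a * dt * dt   # advance the position
        a_new = -k * x / mass                # force at the new position
        v = v + 0.5 * (a + a_new) * dt       # advance the velocity
        a = a_new
        trajectory.append(x)
    return trajectory


if __name__ == "__main__":
    # Assumed example: 1 microsecond of motion at a 2 femtosecond time step.
    one_microsecond_fs = 1_000_000_000
    print(steps_needed(one_microsecond_fs, 2), "steps")

    # One full oscillation of the toy bond (period is about 6.28 toy time units).
    positions = simulate_bond(628)
    print(len(positions), "positions; final position", positions[-1])
```

Every step depends on the one before it, so the steps cannot simply be computed side by side. A real simulation also has to evaluate the forces between thousands of atoms at each step, rather than one spring, which is where the cost mounts up.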
Enter a team from MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) and Department of Mathematics, who have developed a generative AI model that learns from existing simulation data. The system, MDGen, takes a frame of a 3D molecule and simulates what happens next, much like a video. It can also connect separate frames and fill in missing ones.
By enabling researchers to effectively hit the play button on molecules, MDGen could help chemists design new molecules. The technology could also allow researchers to study how well drug prototypes for diseases like cancer interact with the molecular structures they are designed to affect.
Bowen Jing, co-lead author on the MIT paper, describes MDGen as an early proof of concept that marks the beginning of an exciting new research direction. Early generative AI models could produce videos, but these were somewhat rudimentary, for example featuring a dog wagging its tail or a person blinking. Jing, a PhD student at MIT’s Computer Science and Artificial Intelligence Laboratory, suggests that more sophisticated models such as Veo or Sora can be useful in all sorts of interesting new ways. He says the team hopes to instil a similar vision for the molecular world: removing noise from molecular video, animating what lies between frames and guessing what is hidden.
The MIT team’s findings suggest a shift away from comparable earlier work, with MDGen opening up generative AI to a much broader range of applications. Previous approaches were ‘autoregressive’: they started from the very first frame and built each new frame from the one before it. MDGen, on the other hand, generates frames in parallel using diffusion. In experiments, Bowen Jing and his team found that MDGen’s output was comparable to running the physical simulations directly, while producing trajectories 10 to 100 times faster.
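The difference between the two approaches can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not MDGen’s method: each ‘frame’ is a single number rather than a 3D molecule, the function names are hypothetical, and the joint update is a simple neighbour-averaging relaxation standing in for a real diffusion model. What it does show is the dependency structure: the autoregressive loop must produce frames one after another, whereas the joint version starts all in-between frames from noise and refines them together between two known key frames.

```python
import random


def autoregressive_rollout(first_frame, n_frames, step):
    """Generate frames one at a time; frame t cannot start until t-1 exists."""
    frames = [first_frame]
    for _ in range(n_frames - 1):
        frames.append(step(frames[-1]))
    return frames


def joint_refinement(first_frame, last_frame, n_frames, n_sweeps, seed=0):
    """Fill in the frames between two key frames by refining them all together.

    Interior frames start as random noise. Each sweep replaces every interior
    frame with the average of its neighbours' values from the previous sweep
    (a Jacobi relaxation), so the updates within a sweep are independent of
    each other and could be computed in parallel.
    """
    rng = random.Random(seed)
    interior = [rng.gauss(0.0, 1.0) for _ in range(n_frames - 2)]
    frames = [first_frame] + interior + [last_frame]
    for _ in range(n_sweeps):
        previous = frames[:]                 # snapshot of the last sweep
        for t in range(1, n_frames - 1):
            frames[t] = 0.5 * (previous[t - 1] + previous[t + 1])
    return frames


if __name__ == "__main__":
    # Sequential: four dependent calls to reach the fifth frame.
    print(autoregressive_rollout(0.0, 5, lambda x: x + 1.0))

    # Joint: all three missing frames are refined at once from noise.
    print(joint_refinement(0.0, 4.0, 5, n_sweeps=200))
```

In this toy case the relaxation simply settles on a smooth path between the two key frames. A trained generative model would instead draw on patterns learned from simulation data, but the practical appeal is the same: the work per refinement pass can be spread across all frames at once.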
Molecular dynamics plays a vital role in drug discovery by providing detailed insights into the movement of molecules over time. Unlike static snapshots, MD simulations capture the dynamic behaviour of proteins, revealing binding pathways, conformational changes and interaction forces. These insights are vital to understanding how drugs interact with their targets, particularly in complex biological environments where motion and flexibility have a considerable impact on binding affinity and specificity. By generating molecular dynamics trajectories, MDGen could pave the way for significant advances in biophysics, chemistry and AI-driven molecular design.
MDGen’s molecular inpainting capability could provide invaluable insights into molecular structures, facilitating more precise design and repair. In the realm of synthetic biology, the technology could be used to engineer novel molecular pathways, while in applications like mutation repair, MDGen could play a pivotal role in predicting the impact of mutations and suggesting compensatory structures to restore cell function.