Equivariant Generative Models for Protein Structure

Machine learning has transformed protein structure prediction and protein design in the past few years, as recognized by the 2024 Nobel Prize in Chemistry. Driven by advances in generative modeling [1,2] and the development of powerful equivariant architectures [3], it is now possible to generate complex, biophysically plausible protein structures at scale. Conditioning the generation on desired properties enables the design of new drugs, therapies and materials.

However, many limitations remain: the generated structures tend to be too rigid, the properties available for conditioning are restricted, inference is expensive, and the samples are often structurally redundant. We recently addressed the last of these problems by proposing a new model based on Geometric Algebra (https://www.h-its.org/projects/gafl) [4].

Building on this line of research, we are working to overcome more of these limitations and, more generally, to extend the capabilities of state-of-the-art methods for protein structure generation. Master's thesis projects in computer science or physics range from improving existing methods, for example by modifying loss functions, to developing entirely new approaches, such as conditioning on dynamical properties.

Project type:

Master's thesis in computer science or physics at Heidelberg University / KIT

Prerequisites:

  • Familiarity with deep learning and PyTorch
  • Basic understanding of group theory, especially of the rotation group SO(3)
  • No prior knowledge of proteins required

Contact:

References:

[1] Yaron Lipman et al. Flow matching for generative modeling. International Conference on Learning Representations, 2023.
[2] Jonathan Ho et al. Denoising diffusion probabilistic models. Advances in Neural Information Processing Systems, 2020.
[3] John Jumper et al. Highly accurate protein structure prediction with AlphaFold. Nature, 596(7873):583–589, August 2021.
[4] Simon Wagner et al. Generating Highly Designable Proteins with Geometric Algebra Flow Matching. Advances in Neural Information Processing Systems, 2024.
