A Digital Locksmith Has Decoded Biology’s Molecular Keys


Neural networks have been taught to quickly read the surfaces of proteins, molecules critical to many biological processes. The advance is already being used to create defenses against the virus responsible for COVID-19.

The computational biologist Bruno Correia used to have a rule in his lab: No machine learning allowed. He didn’t consider it real science. Now Correia has used it to detect potential interactions between proteins — the complex folded molecules responsible for many biological processes — 40,000 times faster than conventional methods. The journal Nature Methods featured his system on its cover in February 2020. Correia said of his early reluctance to embrace machine learning, “I was wrong, and I’m glad I was wrong.”

What changed his mind? Geometric deep learning: an emerging subfield of artificial intelligence that can learn patterns on curved surfaces.

Proteins interact by fitting their bumpy, irregular shapes together like three-dimensional puzzle pieces. Researchers have spent decades trying to figure out how they do so. Work on the well-known protein folding problem, which has challenged scientists since the mid-20th century, tries to understand protein interaction by decoding the link between a protein’s constituent amino acids and its final 3D shape. In 1999, IBM began developing its line of Blue Gene supercomputers to tackle the folding problem; 20 years later, DeepMind applied state-of-the-art deep learning algorithms to it.
