Midway Tutors

Machine learning in protein structure prediction



Nathnael Bekele


Proteins are central to understanding how different parts of our body operate. Proteins are made up of smaller molecules called amino acids, which link together to form chains. These chains fold into specific three-dimensional structures, and those structures determine the function of each protein. The structures arise through protein folding, during which a linear chain of amino acids arranges itself into a functional shape. If a protein does not fold properly, it ends up being nonfunctional or, in the worst case, toxic. Protein misfolding is believed to contribute to diseases such as Alzheimer's disease, Parkinson's disease, Huntington's disease, and more (Chaudhuri and Paul).


“Having a protein structure provides a greater level of understanding of how a protein works, which can allow us to create hypotheses about how to affect it, control it, or modify it. For example, knowing a protein’s structure could allow you to design site-directed mutations with the intent of changing function” (Darnell).


Hence, knowing the structure of a protein can be very beneficial in medicine. It helps in understanding diseases caused by changes in proteins and in developing treatments for them. In biochemistry, knowing protein structures improves our understanding of molecular biology: it enables scientists to “establish structure-functional relationships in molecules” (Dokholyan).


It is difficult to predict the structure of a protein from its polypeptide chain alone, because the number of possible conformations that would have to be considered is enormous. A protein is generally expected to settle into a stable, low-energy state. However, many different conformations can have low energy, and it is very hard to determine which one the protein will actually adopt.
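
To give a sense of scale, here is a rough back-of-the-envelope sketch in Python. The numbers in it are illustrative assumptions and do not come from the sources above: it supposes each amino acid in the chain can take 3 distinct local conformations and that a computer could check 10^13 conformations per second. Real proteins and real computers do not work this simply, so treat it only as an order-of-magnitude illustration.

```python
# Rough estimate of how long it would take to check every conformation
# of a chain one by one. All constants are illustrative assumptions.
STATES_PER_RESIDUE = 3        # assumed number of local conformations per amino acid
SAMPLES_PER_SECOND = 1e13     # assumed number of conformations checked per second
SECONDS_PER_YEAR = 3.156e7    # approximate length of a year in seconds


def conformations(n_residues: int) -> int:
    """Number of conformations if every residue varies independently."""
    return STATES_PER_RESIDUE ** n_residues


def years_to_enumerate(n_residues: int) -> float:
    """Years needed to check every conformation at the assumed rate."""
    return conformations(n_residues) / SAMPLES_PER_SECOND / SECONDS_PER_YEAR


if __name__ == "__main__":
    for n in (10, 50, 100):
        print(f"{n} residues: {years_to_enumerate(n):.3g} years")
```

Under these assumptions, a chain of only 100 amino acids would take more than 10^27 years to check exhaustively, which is why trying every possibility is not a realistic strategy.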


Currently, protein structures are determined experimentally using techniques such as nuclear magnetic resonance (NMR), negative staining and cryo-electron microscopy, and X-ray crystallography (Dokholyan). Structures obtained with these techniques underpin much of molecular biology, where scientists investigate how molecular structures affect biological processes.


However, amino acids can be combined into an enormous number of different sequences when forming proteins. Hence, it is impractical to determine the structure of every protein experimentally. As a result, scientists are using machine learning to predict the structure of a protein from its amino acid sequence.
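
The size of this sequence space is easy to compute. The short Python sketch below assumes the 20 standard amino acids and places no other restriction on which sequences are allowed, so it counts every possible chain of a given length, not only the ones that occur in nature. The function name is made up for this illustration.

```python
# Count the distinct amino acid sequences of a given length,
# assuming the 20 standard amino acids and no other constraints.
STANDARD_AMINO_ACIDS = 20


def count_sequences(length: int) -> int:
    """Return how many distinct chains of `length` amino acids are possible."""
    if length < 0:
        raise ValueError("length must be non-negative")
    return STANDARD_AMINO_ACIDS ** length


if __name__ == "__main__":
    for n in (2, 10, 100):
        print(f"length {n}: {count_sequences(n):.3e} possible sequences")
```

A chain of just two amino acids already has 400 possible sequences, and a chain of 100 has roughly 1.27 x 10^130, far more than could ever be studied one at a time in a lab.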


Since proteins fold into low-energy states in order to be stable, predicting a structure amounts to finding the stable arrangement among a vast number of possibilities. Machine learning models learn patterns from proteins whose structures are already known and use them to predict the structure of a new amino acid sequence. One of the most accurate machine learning tools for protein structure prediction is AlphaFold (Ornes).


“AlphaFold achieved scores above 90% on many of the target proteins. Other AI-driven entrants in the contest reached accuracies above 70%, which just two years earlier would have been unimaginable” (Ornes).


The contest in question is the Critical Assessment of protein Structure Prediction (CASP). In it, different AI-based programs are asked to predict the structures of proteins that have already been determined experimentally, and they are scored on how closely their predictions match. As knowledge of machine learning improves, software such as AlphaFold is becoming more and more accurate. In the future, we will hopefully be able to predict protein structures even more accurately.
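
To make the idea of scoring a prediction concrete, here is a simplified Python sketch modeled on GDT_TS, a score commonly reported in CASP. It averages the percentage of amino acids whose predicted position lies within 1, 2, 4, and 8 angstroms of the experimental position. The sketch assumes the two structures have already been superimposed and uses one 3D point per amino acid; the real score also searches for the best superposition, so this is an illustration and not the official CASP procedure. The function name and the coordinates are made up for the example.

```python
import math

# Distance cutoffs in angstroms used by the GDT_TS score.
CUTOFFS = (1.0, 2.0, 4.0, 8.0)


def simplified_gdt_ts(predicted, experimental):
    """Average percentage of residues within each cutoff distance.

    Both arguments are lists of (x, y, z) coordinates, one per residue,
    and are assumed to be already superimposed on each other.
    """
    if len(predicted) != len(experimental) or not predicted:
        raise ValueError("structures must be non-empty and of equal length")
    # Distance between each predicted residue and its experimental position.
    distances = [math.dist(p, e) for p, e in zip(predicted, experimental)]
    # Fraction of residues that fall within each cutoff.
    fractions = [
        sum(d <= cutoff for d in distances) / len(distances)
        for cutoff in CUTOFFS
    ]
    return 100.0 * sum(fractions) / len(CUTOFFS)


if __name__ == "__main__":
    # Hypothetical four-residue example; the coordinates are invented.
    experimental = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
    predicted = [(0.0, 0.0, 0.5), (3.8, 1.5, 0.0), (7.6, 0.0, 3.0), (11.4, 10.0, 0.0)]
    print(simplified_gdt_ts(predicted, experimental))  # prints 56.25
```

In the example, the four residues are off by 0.5, 1.5, 3.0, and 10 angstroms, so 25%, 50%, 75%, and 75% of them fall within the four cutoffs, giving a score of 56.25. A perfect prediction would score 100.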


Sources:


Chaudhuri, Tapan K., and Subhankar Paul. “Protein-Misfolding Diseases and Chaperone-Based Therapeutic Approaches.” The FEBS Journal, U.S. National Library of Medicine, pubmed.ncbi.nlm.nih.gov/16689923/.


Darnell, Steve. “Why Structure Prediction Matters.” DNASTAR, 21 July 2020, www.dnastar.com/blog/structural-biology/why-structure-prediction-matters/.


Dokholyan, Nikolay V. “Experimentally-Driven Protein Structure Modeling.” Journal of Proteomics, U.S. National Library of Medicine, 30 May 2020, www.ncbi.nlm.nih.gov/pmc/articles/PMC7214187/.


Ornes, Stephen. “Researchers Turn to Deep Learning to Decode Protein Structures.” Proceedings of the National Academy of Sciences, 2 Mar. 2022, www.pnas.org/doi/10.1073/pnas.2202107119.

