Nathnael Bekele
Machine learning is a highly sought-after field because it enables scientists to analyze large amounts of data quickly and thoroughly. The term generally describes algorithms that can sort, predict, and recognize patterns based on existing data (Tarca et al.). Some algorithms require the user to supply data that has already been labeled so that the algorithm can learn the pattern. These are supervised algorithms, and the labeled data set is used to train them.
On the other hand, there are algorithms that don’t require preorganized data sets. These are unsupervised algorithms. “These algorithms discover hidden patterns in data without the need for human intervention” (Delua).
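To make the difference concrete, here is a minimal Python sketch of both ideas. It is only an illustration under stated assumptions: the measurements are made-up toy numbers, the function names are hypothetical, and a nearest-centroid classifier and a two-cluster k-means stand in for the far more capable algorithms used in practice.

```python
from statistics import mean

# Toy, made-up measurements (for illustration only).
samples = [1.0, 1.2, 0.8, 5.0, 5.5, 4.8]
labels = ["low", "low", "low", "high", "high", "high"]

# Supervised: the labels are given, and training summarizes each labeled group.
def train_centroids(samples, labels):
    """Return the mean value of each labeled group."""
    groups = {}
    for value, label in zip(samples, labels):
        groups.setdefault(label, []).append(value)
    return {label: mean(values) for label, values in groups.items()}

def predict(centroids, value):
    """Assign the label whose group mean is closest to the new value."""
    return min(centroids, key=lambda label: abs(centroids[label] - value))

# Unsupervised: no labels are given; a 1-D k-means with two clusters
# discovers the two groups from the numbers alone.
def two_means(samples, iterations=10):
    low, high = min(samples), max(samples)
    cluster_low, cluster_high = list(samples), []
    for _ in range(iterations):
        cluster_low = [s for s in samples if abs(s - low) <= abs(s - high)]
        cluster_high = [s for s in samples if abs(s - low) > abs(s - high)]
        if not cluster_high:  # all values identical: nothing to split
            break
        low, high = mean(cluster_low), mean(cluster_high)
    return sorted(cluster_low), sorted(cluster_high)

centroids = train_centroids(samples, labels)
print(predict(centroids, 4.0))  # uses the labels learned during training
print(two_means(samples))       # finds two groups without any labels
```

The supervised part can name its answer ("low" or "high") because it was trained on labeled examples. The unsupervised part only reports that two groups exist and leaves their interpretation to the scientist.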
Modern biology produces far more data than a human can analyze in a reasonable amount of time. This is especially true in genetics. Machine learning is often used to help humans understand what can be concluded from this avalanche of data.
In genetics, machine learning is being used for gene sequencing and gene editing.
“Sequencing DNA means determining the order of the four chemical building blocks - called ‘bases’ - that make up the DNA molecule. The sequence tells scientists the kind of genetic information that is carried in a particular DNA segment” (National Human Genome Research Institute).
This knowledge helps scientists understand which genetic variations lead to disease. Machine learning is used in processes such as Next Generation Sequencing (NGS). NGS enables scientists to “identify gene coding regions in genomes” (Rahul). With NGS, genes are sequenced faster and more cheaply than with traditional methods.
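As a rough illustration of how a sequence of bases can become input for a learning algorithm, the Python sketch below counts short overlapping substrings (often called k-mers) in a DNA string. The function name and the example sequence are hypothetical, and this is not a description of how any particular NGS pipeline works.

```python
from collections import Counter

def kmer_counts(sequence, k=2):
    """Count every overlapping substring of length k in a DNA sequence."""
    sequence = sequence.upper()
    return Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))

# Made-up example sequence built from the four bases A, C, G, and T.
counts = kmer_counts("ATGCGATG", k=2)
print(counts)
```

Counts like these turn a string of letters into numbers, which is the form most machine learning algorithms expect as input.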
As discussed in previous blog posts, gene editing is a growing field because it offers hope of treating many diseases. “CRISPR/Cas9 edits genes by precisely cutting DNA and then letting natural DNA repair processes to take over” (CRISPR). However, this requires finding the genes that need to be edited, which is a time-consuming process. Machine learning allows scientists to “identify the correct target audience, significantly reducing the cost and time required to perform gene editing” (Lisowski).
Other applications of machine learning include analyzing protein structures in order to understand diseases, analyzing medical images, and predicting neurobiological conditions such as strokes.
Sources
“CRISPR/Cas9.” CRISPR, www.crisprtx.com/gene-editing/crispr-cas9.
Delua, Julianna. “Supervised vs. Unsupervised Learning: What's the Difference?” IBM, 12 Mar. 2021, www.ibm.com/cloud/blog/supervised-vs-unsupervised-learning.
“DNA Sequencing Fact Sheet.” Genome.gov, www.genome.gov/about-genomics/fact-sheets/DNA-Sequencing-Fact-Sheet.
Lisowski, Edwin. “The Role of Machine Learning in Bioinformatics and Biology.” Addepto, 21 Sept. 2021, addepto.com/the-role-of-machine-learning-in-bioinformatics-and-biology/.
Rahul. “How Machine Learning Is Transforming the Field of Biology.” IndustryWired, 1 Mar. 2022, industrywired.com/how-machine-learning-is-transforming-the-field-of-biology/.
Tarca, Adi L, et al. “Machine Learning and Its Applications to Biology.” PLOS Computational Biology, Public Library of Science, 29 June 2007, journals.plos.org/ploscompbiol/article?id=10.1371%2Fjournal.pcbi.0030116.