Machine Learning Prediction of Enzymes’ Optimal Catalytic Temperatures

Finlinson Porter, Camille
Göteborgs universitet/Institutionen för data- och informationsteknik
University of Gothenburg/Department of Computer Science and Engineering
2023-03-22T14:07:42Z
2023-03-22
Enzymes that have been genetically engineered to withstand high temperatures are used by industry to make products with less waste and pollution. Different features of protein structure affect the optimal catalytic temperature ("topt") at which enzymes catalyze reactions most efficiently. We sought to use information from protein structures to predict the topt. To do this, we analyzed the structures and optimal catalytic temperatures of 1379 proteins in seven different ways. For a set of analyses based on Delaunay atomic interactions, the atoms of each protein were categorized by their Tsai atomic group, Popelier atomic group, or amino acid residue, and the nearest neighbors of each atom were then found by Delaunay triangulation. Next, the neighbors were classified by their atomic group and their frequencies were calculated. For a separate analysis of atomic interactions ("threshold residue atomic interactions"), the residues of each protein were represented by the beta carbons of their amino acids, and any two beta carbons within 8 Å of each other were considered to be interacting. A third set of analyses was based on the frequencies of each category of atom on the protein interior and surface, with each atom again categorized by Tsai atomic group, Popelier atomic group, or amino acid residue. The frequencies from each of these seven analyses were used separately as the predictor variables in regression, with the optimal catalytic temperature as the response variable. Four kinds of regression were tried: elastic net, sparse group lasso, decision tree, and support vector. The maximum testing R² value was 0.4. These results are similar to those of previous work by Ulfenborg (2020). We found that defining interactions and categories in greater detail did not give better results.
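The threshold residue analysis described in the abstract can be illustrated with a minimal sketch. It is not the thesis code: the input layout, the function name contact_frequencies, the inclusive cutoff, and the normalization of counts to frequencies are all assumptions made for illustration. Only the 8 Å beta-carbon threshold comes from the abstract.

```python
from collections import Counter
from itertools import combinations
from math import dist

CUTOFF_ANGSTROM = 8.0  # threshold stated in the abstract


def contact_frequencies(residues, cutoff=CUTOFF_ANGSTROM):
    """Return relative frequencies of residue-type pairs in contact.

    `residues` is a list of (residue_name, (x, y, z)) tuples, one per
    beta carbon. This layout is hypothetical, chosen for illustration.
    Two residues count as interacting when their beta carbons are at
    most `cutoff` apart (treating the cutoff as inclusive is an assumption).
    """
    counts = Counter()
    for (name_a, xyz_a), (name_b, xyz_b) in combinations(residues, 2):
        if dist(xyz_a, xyz_b) <= cutoff:
            # Sort the names so ALA-LEU and LEU-ALA share one key.
            counts[tuple(sorted((name_a, name_b)))] += 1
    total = sum(counts.values())
    # Normalizing by the total contact count is an assumption.
    return {pair: n / total for pair, n in counts.items()} if total else {}


# Toy beta-carbon coordinates (made up, not taken from the thesis data).
toy = [
    ("ALA", (0.0, 0.0, 0.0)),
    ("LEU", (5.0, 0.0, 0.0)),
    ("SER", (20.0, 0.0, 0.0)),
    ("ALA", (5.0, 6.0, 0.0)),
]
freqs = contact_frequencies(toy)
```

In a pipeline like the one the abstract outlines, a dictionary of this kind, computed per protein, would become one row of predictor variables for the regression models.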
https://hdl.handle.net/2077/75679
eng
Technology
enzyme
protein
amino acid
protein structure
optimal catalytic temperature
text
Student essay
H2

Files

Original bundle

Name: CSE 22-02 Porter.pdf
Size: 4.45 MB
Format: Adobe Portable Document Format
Description: Thesis
