Evaluating and optimizing Transformer models for predicting chemical reactions
| Manohar Koki, Siva | ||
| Kancharla, Supriya | ||
| Göteborgs universitet/Institutionen för data- och informationsteknik | swe | |
| University of Gothenburg/Department of Computer Science and Engineering | eng | |
| 2023-10-23T11:43:58Z | ||
| 2023-10-23T11:43:58Z | ||
| 2023-10-23 | ||
| In this thesis, we assess the effectiveness of a transformer model specifically trained to predict chemical reactions. The model, named Chemformer, is a sequence-to-sequence model that uses the transformer’s encoder and decoder stacks. Here, we employ a pre-trained Chemformer model to predict single-step retrosynthesis and evaluate its performance across diverse chemical reaction categories using metrics such as Top-k accuracy and Tanimoto similarity. We compare and analyse the results of these evaluations against those of an existing template-based model. Based on the findings of this analysis, we fine-tuned the Chemformer model for specific chemical reaction classes, namely Ugi, Suzuki-Coupling, Rearrangement, Diels-Alder and Ring-Forming. In this project, we address five research questions, including whether the Chemformer model achieves higher accuracy than the template-based model, which reaction classes it performs better or worse on in terms of Top-k accuracy, the level of diversity in its predictions, and what fine-tuning strategies should be employed to enhance its performance. Using attention-based explainable AI, we examine the input features that influence the transformation in the produced molecule. The results presented here may be used in the future to design fine-tuning strategies. The evaluation of the pre-trained Chemformer model yields only moderate Top-k accuracies across most reaction classes, suggesting that the model struggles to accurately predict reactions on in-house test data. When evaluating the model’s performance on USPTO data, we found similar results. While the results demonstrate that the pre-trained model outperforms the template-based model, there is still potential for enhancing its performance. This potential for further improvement motivates the fine-tuning process. By applying fine-tuning to specific sub-tasks such as Ugi and Suzuki-Coupling, we managed to significantly enhance the model’s performance. 
The fine-tuned model consistently outperforms both the pre-trained and template-based models, exhibiting a notable 50% improvement in accuracy over the pre-trained model. This substantial improvement reinforces the effectiveness of transfer learning as a powerful approach for enhancing Chemformer models. | en | |
| https://hdl.handle.net/2077/78918 | ||
| eng | en | |
| Technology | ||
| Chemformer | en | |
| transformer | en | |
| evaluation | en | |
| explainable AI | en | |
| fine-tuning | en | |
| machine learning | en | |
| Evaluating and optimizing Transformer models for predicting chemical reactions | en | |
| text | ||
| Student essay | ||
| H2 |
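For reference, the two evaluation metrics named in the abstract (Top-k accuracy and Tanimoto similarity) can be sketched in plain Python. The function names, the ranked-list prediction format, and the set-of-bits fingerprint representation are illustrative assumptions, not the thesis's actual implementation (which would typically use cheminformatics fingerprints, e.g. via RDKit):

```python
def top_k_accuracy(ranked_predictions, targets, k):
    """Fraction of examples whose ground-truth target appears among the
    top-k ranked candidate predictions (e.g. candidate reactant SMILES)."""
    hits = sum(
        1
        for preds, target in zip(ranked_predictions, targets)
        if target in preds[:k]
    )
    return hits / len(targets)


def tanimoto_similarity(fp_a, fp_b):
    """Tanimoto (Jaccard) similarity between two binary fingerprints,
    represented here as sets of the indices of the bits that are on."""
    a, b = set(fp_a), set(fp_b)
    if not a and not b:
        # Two empty fingerprints are conventionally treated as identical.
        return 1.0
    return len(a & b) / len(a | b)


# Illustrative usage with made-up predictions and fingerprints:
preds = [["CCO", "CCN"], ["c1ccccc1", "CCBr"]]
truth = ["CCN", "CCCl"]
print(top_k_accuracy(preds, truth, k=2))        # 0.5: target found in 1 of 2
print(tanimoto_similarity({1, 2, 3}, {2, 3, 4}))  # 0.5: |{2,3}| / |{1,2,3,4}|
```

A Top-k accuracy near the Top-1 value indicates that extra candidates add little, while a large gap between Top-1 and Top-k suggests the correct answer is often ranked low rather than missing entirely.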