Key Sentence Extraction From CRISPR-Cas9 Articles Using Sentence Transformers
Abstract
The annotation of CRISPR-related articles and the extraction of their key content have traditionally relied on manual effort, which is error-prone and time-consuming. This thesis presents an alternative approach using transfer learning and pre-trained models based on the Transformer architecture. Specifically, Sentence Transformer models are fine-tuned on a CRISPR-related dataset. The dataset contains articles and their key sentences, enabling automatic extraction of keyphrases. The study explores various modifications to the models and data to enhance performance on this task.
The results demonstrate the effectiveness of fine-tuning Sentence Transformer models for keyphrase extraction, achieving an Average R-precision of 90.4%. Future research could focus on alternative approaches or further automation to identify entities and relations within key sentences. Key sentence extraction is complex due to the varying definitions of key content, content location, and specific use cases. However, the potential benefits of time savings and improved workflow efficiency make this approach highly valuable.
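The Average R-precision reported above can be illustrated with a minimal sketch. It assumes the standard definition of the metric: for an article with R annotated key sentences, R-precision is the fraction of the top R ranked sentences that are annotated key sentences, and the average is taken over articles. The abstract does not describe the thesis's exact evaluation protocol, so the function names, sentence identifiers, and toy data below are hypothetical.

```python
from typing import Hashable, Sequence, Set, Tuple


def r_precision(ranked: Sequence[Hashable], relevant: Set[Hashable]) -> float:
    """Fraction of the top-R ranked items that are relevant, where R = len(relevant).

    Assumes `ranked` contains no duplicate items.
    """
    r = len(relevant)
    if r == 0:
        raise ValueError("R-precision is undefined when there are no relevant items")
    top_r = ranked[:r]
    hits = sum(1 for item in top_r if item in relevant)
    return hits / r


def average_r_precision(
    queries: Sequence[Tuple[Sequence[Hashable], Set[Hashable]]]
) -> float:
    """Mean R-precision over (ranking, relevant set) pairs, one pair per article."""
    if not queries:
        raise ValueError("at least one query is required")
    scores = [r_precision(ranked, relevant) for ranked, relevant in queries]
    return sum(scores) / len(scores)


# Hypothetical toy data: sentence IDs ranked by a model, plus annotated key sentences.
article_a = (["s3", "s1", "s7", "s2"], {"s1", "s3"})        # top 2 both relevant
article_b = (["s4", "s9", "s5", "s6"], {"s4", "s5", "s6"})  # 2 of top 3 relevant

print(round(average_r_precision([article_a, article_b]), 4))
```

Because the cutoff R adapts to the number of annotated key sentences per article, the metric does not require choosing a fixed number of sentences to extract.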
Degree
Student essay
Date
2023-11-09
Author
Henningsson, Sandra
Stranden Lae, Brage
Keywords
NLP
Transformers
CRISPR
semantic search
keyphrase extraction
Language
eng