  • Student essays / Studentuppsatser
  • Department of Philosophy, Linguistics and Theory of Science / Institutionen för filosofi, lingvistik och vetenskapsteori
  • Masteruppsatser / Master in Language Technology
From Abstract Syntax to Natural Language: Addressing Natural Language Generation Challenges in Arabic Using GFWordnet as Lexical Resources

Abstract
This thesis explores the development and evaluation of Arabic natural language generation using the Grammatical Framework (GF) within GFPedia. GFPedia is a framework that generates multilingual content from predefined abstract syntax trees (ASTs) with dynamic placeholders for lexical entries drawn from GFWordNet. The primary goal is to assess how effectively GF can generate grammatically correct sentences from the ASTs available in GFPedia. The research involves building Arabic lexical resources and integrating them into GFPedia. The system’s output is evaluated (a) automatically, using Levenshtein distance to measure deviation from reference texts, and (b) manually, by analyzing grammatical and morphological correctness. The results highlight significant challenges in Arabic sentence generation, including issues with word structure, definiteness, and syntactic alignment, as well as the need for context-aware translations. To address these challenges, the thesis proposes introducing a semantic layer into the GFPedia framework. By leveraging ontological and contextual information from resources such as Wikidata, the semantic layer can select appropriate words, word order, sentence types, and other linguistic features based on the semantic content of the information. This approach aims to reduce the dependency on deep knowledge of the Resource Grammar Library (RGL) and of language-specific grammar, facilitating a more efficient and scalable content development process. Additionally, the thesis suggests using Large Language Models (LLMs) with Retrieval-Augmented Generation (RAG) to assist in generating lexical resources.
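
The automatic evaluation mentioned in the abstract relies on Levenshtein distance. The abstract does not say whether the distance is computed over characters or tokens, or whether it is normalized, so the following is only a minimal sketch that assumes a character-level comparison; the function names `levenshtein` and `normalized_distance` and the sample strings are illustrative and are not taken from the thesis.

```python
def levenshtein(a: str, b: str) -> int:
    """Character-level edit distance: insertions, deletions, substitutions."""
    # Keep the shorter string in the inner loop so each row stays small.
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute, or match at no cost
            ))
        previous = current
    return previous[-1]


def normalized_distance(generated: str, reference: str) -> float:
    """Distance scaled to [0, 1] by the longer string (an assumed normalization)."""
    longest = max(len(generated), len(reference))
    return levenshtein(generated, reference) / longest if longest else 0.0


# Illustrative pair only: a generated noun missing the definite article "ال".
generated = "ولد"
reference = "الولد"
print(levenshtein(generated, reference))
print(normalized_distance(generated, reference))
```

A character-level distance of this kind registers small morphological deviations such as a missing definite article, but it cannot tell a grammatical error from a valid paraphrase, which is consistent with the abstract pairing it with manual analysis.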
Degree level
Student essay
URL:
https://hdl.handle.net/2077/84379
Collections
  • Masteruppsatser / Master in Language Technology
File(s)
Master Thesis (719.1Kb)
Date
2024-11-28
Author
Zarzoura, Mohamed
Keywords
Language technology, Natural language generation, NLG
Language
eng