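Corpus Christi – En korpuslingvistisk studie av latinets semantiska utveckling i kristendomens spår [Corpus Christi: A corpus-linguistic study of the semantic development of Latin in the wake of Christianity]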

Lafage, David
University of Gothenburg / Department of Languages and Literatures
2025-10-14T08:01:19Z
2025-10-14
The objective of this master's thesis is to measure the effect of Christianity on Latin semantics using a specially trained large language model and a corpus-driven approach. First, I investigate whether the words singled out in the literature on Christian Latin did in fact undergo a measurable semantic shift in the Christian age, and whether this list can be enriched with previously unnoticed words. Next, I examine whether the results differ significantly depending on how Christian Latin is defined. The methodology is based on theories of distributional semantics, in particular the Distributional Hypothesis, and follows other works in the field. First, an existing BERT model (LatinBERT) is trained on the Patrologia Latina corpus, under the assumption that this corpus is representative of Christian Latin. An algorithm is then selected from a metastudy to perform Graded Change Detection, and three different tests are performed to evaluate the model's performance. Finally, the results are computed and analyzed quantitatively and qualitatively, and inferential statistics are applied to the data. The results show that the new model, XPLatinBERT (XPL), outperforms the SemEval-2020 models for Latin on a benchmark based on a similar task. By and large, all the words in the literature on Christian Latin are confirmed, and further words are proposed using both a corpus-based approach (the third quantile) and a corpus-driven approach. Due to lemmatization issues in the corpora under investigation, some words are false positives, which calls for a deeper qualitative investigation of the results. Although a difference can be observed in the dataset as a whole, as well as for specific words, it is not strong enough to be statistically significant. It is therefore possible to consider Christian Latin a register and to regard deviations as an effect of other factors, such as Late and Medieval Latin, but more work remains to be done.
XPL is now available on GitHub (Lafage, 2025b).
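The change-detection step described in the abstract can be illustrated with a minimal sketch. The abstract does not name the algorithm chosen from the metastudy, so this example assumes average pairwise cosine distance between contextual embeddings, a common measure for Graded Change Detection, and it assumes that "the third quantile" refers to the 75th percentile of the change scores. The function names `average_pairwise_distance` and `flag_shifted` are hypothetical and are not taken from the thesis or from the XPL repository.

```python
import numpy as np


def average_pairwise_distance(emb_a, emb_b):
    """Mean cosine distance between every usage vector in emb_a and emb_b.

    emb_a, emb_b: arrays of shape (n_usages, dim), e.g. contextual
    embeddings of one lemma in two corpora (assumed input format).
    """
    a = np.asarray(emb_a, dtype=float)
    b = np.asarray(emb_b, dtype=float)
    # Normalise rows so that the dot product equals cosine similarity.
    a = a / np.linalg.norm(a, axis=1, keepdims=True)
    b = b / np.linalg.norm(b, axis=1, keepdims=True)
    return float(np.mean(1.0 - a @ b.T))


def flag_shifted(scores, q=0.75):
    """Return the words whose change score lies above the q-th quantile.

    q=0.75 is an assumption about what "third quantile" means.
    """
    threshold = np.quantile(list(scores.values()), q)
    return sorted(word for word, score in scores.items() if score > threshold)


if __name__ == "__main__":
    # Toy vectors only; real input would come from a model such as XPL.
    rng = np.random.default_rng(0)
    classical = rng.normal(size=(5, 8))
    christian = rng.normal(size=(6, 8))
    print(average_pairwise_distance(classical, christian))
```

A higher score means that a word's usages in the two corpora are, on average, further apart in embedding space. Thresholding on a quantile only ranks candidates; as the abstract notes, lemmatization errors can produce false positives, so flagged words still need qualitative review.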
https://hdl.handle.net/2077/89891
swe
SPL 2025-048
Humanities, Theology
NLP
LatinBERT
XPLatinBERT
corpus linguistics
distributive semantics
Christian Latin
Sondersprache
word embeddings
MLM
machine-learning
Text
Student essay
H2

Files

Original bundle

Name: H24_LAT240_DL.pdf
Size: 2.35 MB
Format: Adobe Portable Document Format
Description: Student essay