CORPUS EXPLORATION AND DIALOGUE SYSTEM DESIGN FOR A VIRTUAL LIBRARIAN
Korpusutforskning och design av dialogsystem för en virtuell bibliotekarie
This thesis is part of the virtual librarian project for the City Library Gothenburg (Stadsbibliotek Göteborg), a public city library. The objective of the project is to develop a virtual librarian using machine learning and AI approaches to replace the current webchat solution, in order to reduce the workload of human librarians and increase patron satisfaction. This thesis offers a systematic approach to development practice based on small existing corpora, aimed at small and medium-sized institutions in which resources, especially technical development resources, are limited. The methods significantly reduce the workload on the principal's side. They comprise requirement analysis with a narrative interview; topic-session-based annotation with an expandable tag set and without detailed annotation guidelines, which requires less prior linguistic knowledge and training; and intent identification through corpus analysis with the assignment of priorities. Furthermore, this thesis offers a classification of intents based on patterns of system behavior, which simplifies the formation of a complete intent list. Since Rasa is the preliminarily prioritized platform for implementing the virtual librarian, this thesis also includes a short competitive product analysis of the dialogue systems in the Rasa showcase. Finally, some technical suggestions for the Rasa implementation are given, reflecting the requirements of the City Library Gothenburg.