Machine Learning for Detecting Hate Speech in Low Resource Languages

Rodriguez, David; Saynova, Denitsa

dc.contributor.author	Rodriguez, David
dc.contributor.author	Saynova, Denitsa
dc.date.accessioned	2020-07-08T11:39:08Z
dc.date.available	2020-07-08T11:39:08Z
dc.date.issued	2020-07-08
dc.identifier.uri	http://hdl.handle.net/2077/65590
dc.description.abstract	This work examines the role of both cross-lingual zero-shot learning and data augmentation in detecting hate speech online in low-resource settings. For situations where labeled data is scarce, the proposed solutions are to train on a language with more resources or to create synthetic data points. Cross-lingual zero-shot results suggest that some knowledge transfer occurs. However, the results appear to be strongly influenced by the specific training data set selected. This is further supported by cross-data set experiments within a single language, where results also fluctuated with the training data even though no cross-lingual transfer was involved. Meanwhile, data augmentation methods show an improvement, especially for small amounts of data. The work also presents a detailed discussion of how the proposed data augmentation techniques affect the data.	sv
dc.language.iso	eng	sv
dc.relation.ispartofseries	CSE 20-16	sv
dc.subject	machine learning	sv
dc.subject	natural language processing	sv
dc.subject	BERT	sv
dc.subject	cross-lingual zero-shot learning	sv
dc.subject	data augmentation	sv
dc.subject	hate speech	sv
dc.subject	classification	sv
dc.subject	Twitter	sv
dc.title	Machine Learning for Detecting Hate Speech in Low Resource Languages	sv
dc.title.alternative	Machine Learning for Detecting Hate Speech in Low Resource Languages	sv
dc.type	text
dc.setspec.uppsok	Technology
dc.type.uppsok	H2
dc.contributor.department	Göteborgs universitet/Institutionen för data- och informationsteknik	swe
dc.contributor.department	University of Gothenburg/Department of Computer Science and Engineering	eng
dc.type.degree	Student essay

Files in this item

Name: gupea_2077_65590_1.pdf
Size: 6.395Mb
Format: PDF
Description: Master thesis

This item appears in the following Collection(s)

Masteruppsatser
