University of Limerick Institutional Repository

Towards accurate detection of offensive language in online communication in Arabic

dc.contributor.author Alakrot, Azalden
dc.contributor.author Murray, Liam
dc.contributor.author Nikolov, Nikola S.
dc.date.accessioned 2019-06-10T09:40:15Z
dc.date.available 2019-06-10T09:40:15Z
dc.date.issued 2018
dc.description peer-reviewed en_US
dc.description.abstract We present the results of predictive modelling for the detection of anti-social behaviour in online communication in Arabic, such as comments which contain obscene or offensive words and phrases. We collected and labelled a large dataset of YouTube comments in Arabic which contains a broad range of both offensive and inoffensive comments. We used this dataset to train a Support Vector Machine classifier and experimented with combinations of word-level features, N-gram features and a variety of pre-processing techniques. We summarise the pre-processing steps and features that allow training a classifier which is more precise, with 90.05% accuracy, than classifiers reported by previous studies on Arabic text. en_US
dc.language.iso eng en_US
dc.publisher Elsevier en_US
dc.relation.ispartofseries Procedia Computer Science;142, pp. 315-320
dc.relation.ispartofseries 4th International Conference on Arabic Computational Linguistics, 2018, Dubai
dc.subject Anti-social behaviour online en_US
dc.subject offensive language detection en_US
dc.subject harassment detection en_US
dc.subject Arabic dataset en_US
dc.subject text mining en_US
dc.subject SVM for offensive language detection in Arabic en_US
dc.title Towards accurate detection of offensive language in online communication in Arabic en_US
dc.type info:eu-repo/semantics/conferenceObject en_US
dc.type.supercollection all_ul_research en_US
dc.type.supercollection ul_published_reviewed en_US
dc.identifier.doi 10.1016/j.procs.2018.10.491
dc.rights.accessrights info:eu-repo/semantics/openAccess en_US
dc.internal.rssid 2905300
