THREAT: A Large Annotated Corpus for Detection of Violent Threats

Hammer, Hugo Lewi; Riegler, Michael Alexander; Øvrelid, Lilja; Velldal, Erik

Conference object

Accepted version

Open

CBMI2019_YouTube_Threat_Corpus.pdf (700.0Kb)

Permanent link

https://hdl.handle.net/10642/8506

Publication date

2019-10-21

Collections

TKD - Institutt for informasjonsteknologi

Original version

Hammer, Riegler, Øvrelid, Velldal: THREAT: A Large Annotated Corpus for Detection of Violent Threats. In: Gurrin CG, Jónsson BT, Peteri R. Proceedings of Content Based Multimedia Information (CBMI 2019), 2019. IEEE conference proceedings p. 1-5

Abstract

Understanding, detecting, moderating, and in extreme cases deleting hateful comments in online discussions and social media are well-known challenges. In this paper we present a dataset consisting of around 30,000 sentences from around 10,000 YouTube comments. Each sentence is manually annotated as either being a violent threat or not. Violent threats are the most extreme form of hateful communication and are of particular importance from an online radicalization and national security perspective. This is the first publicly available dataset with such an annotation. The dataset can further be used to develop automatic moderation tools, and may also be useful from a social science perspective for analyzing the characteristics of online threats and how hateful discussions evolve.

Publisher

Institute of Electrical and Electronics Engineers (IEEE)

Series

International Workshop on Content-Based Multimedia Indexing, CBMI;2019 International Conference on Content-Based Multimedia Indexing (CBMI)