dc.contributor.author | Hammer, Hugo Lewi | |
dc.contributor.author | Yazidi, Anis | |
dc.contributor.author | Bai, Aleksander | |
dc.contributor.author | Engelstad, Paal E. | |
dc.date.accessioned | 2017-02-03T12:16:36Z | |
dc.date.accessioned | 2017-02-14T10:09:44Z | |
dc.date.available | 2017-02-03T12:16:36Z | |
dc.date.available | 2017-02-14T10:09:44Z | |
dc.date.issued | 2016 | |
dc.identifier.citation | Hammer HL, Yazidi A, Bai A, Engelstad P.E.: Improving classification of tweets using word-word co-occurrence information from a large external corpus. In: Ossowski S. Proceedings of the 31st Annual ACM Symposium on Applied Computing (SAC '16), 2016. Association for Computing Machinery (ACM) p. 1174-1177 | language |
dc.identifier.uri | https://hdl.handle.net/10642/3723 | |
dc.description.abstract | Classifying tweets is an intrinsically hard task because tweets are short messages, which makes traditional bag-of-words approaches inefficient. In fact, bag-of-words approaches ignore relationships between important terms that do not co-occur literally. In this paper we resort to word-word co-occurrence information from a large corpus to expand the vocabulary of another corpus consisting of tweets. Our results show that we are able to reduce the number of erroneous classifications by 14% using co-occurrence information. | language |
dc.language.iso | en | language |
dc.publisher | Association for Computing Machinery (ACM) | language |
dc.subject | Classification | language |
dc.subject | Lasso regression | language |
dc.subject | Twitter | language |
dc.subject | Word-word co-occurrence | language |
dc.title | Improving classification of tweets using word-word co-occurrence information from a large external corpus | language |
dc.title.alternative | Proceedings of the 31st Annual ACM Symposium on Applied Computing (SAC '16) | language |
dc.type | Chapter | |
dc.type | Peer reviewed | language |
dc.date.updated | 2017-02-03T12:16:36Z | |
dc.description.version | acceptedVersion | language |
dc.identifier.doi | http://dx.doi.org/10.1145/2851613.2851986 | |
dc.identifier.cristin | 1383941 | |
dc.source.isbn | 978-1-4503-3739-7 | |