Building domain specific sentiment lexicons combining information from many sentiment lexicons and a domain specific corpus
Peer reviewed, Chapter
The final publication is available at Springer via http://dx.doi.org/10.1007/978-3-319-19578-0_17
Date
2015
Original version
Hammer, H., Yazidi, A., Bai, A., & Engelstad, P. (2015). Building Domain Specific Sentiment Lexicons Combining Information from Many Sentiment Lexicons and a Domain Specific Corpus. In A. Amine, L. Bellatreche, Z. Elberrichi, J. E. Neuhold, & R. Wrembel (Eds.), Computer Science and Its Applications: 5th IFIP TC 5 International Conference, CIIA 2015, Saida, Algeria, May 20-21, 2015, Proceedings (pp. 205-216). Cham: Springer International Publishing. https://doi.org/10.1007/978-3-319-19578-0_17
Abstract
Most approaches to sentiment analysis require a sentiment
lexicon in order to automatically predict sentiment or opinion in a text.
The lexicon is generated by selecting words and assigning scores to the
words, and the performance of the sentiment analysis depends on the quality
of the assigned scores. This paper addresses an aspect of sentiment
lexicon generation that has been overlooked so far, namely that the most
appropriate score assigned to a word in the lexicon is dependent on the
domain. The common practice, on the contrary, is that the same lexicon
is used without adjustments across different domains, ignoring the
fact that the scores are normally highly sensitive to the domain. Consequently,
the same lexicon might perform well on one domain while
performing poorly on another domain, unless some score adjustment is
performed. In this paper, we advocate that a sentiment lexicon needs
some further adjustments in order to perform well in a specific domain.
In order to cope with these domain specific adjustments, we adopt a
stochastic formulation of the sentiment score assignment problem instead
of the classical deterministic formulation. Thus, viewing a sentiment
score as a stochastic variable permits us to accommodate the
domain specific adjustments. Experimental results demonstrate the feasibility
of our approach and its superiority to generic lexicons without
domain adjustments.
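
The idea of treating a word's sentiment score as a stochastic variable can be illustrated with a minimal sketch. This is not the authors' method: it assumes a simple normal-normal conjugate update, in which the scores a word receives in several generic lexicons form a prior and hypothetical sentiment observations from a domain corpus shift that prior. The function names, the example word, the observation variance, and all numbers are assumptions made for illustration only.

```python
from statistics import mean, pvariance


def prior_from_lexicons(scores):
    """Summarise one word's scores across several lexicons as (mean, variance)."""
    m = mean(scores)
    v = pvariance(scores, mu=m)
    # Guard against a zero variance when all lexicons agree exactly.
    return m, max(v, 1e-6)


def posterior(prior_mean, prior_var, domain_obs, obs_var=1.0):
    """Normal-normal update of a word score, assuming a known observation variance."""
    n = len(domain_obs)
    if n == 0:
        return prior_mean, prior_var
    precision = 1.0 / prior_var + n / obs_var
    post_mean = (prior_mean / prior_var + sum(domain_obs) / obs_var) / precision
    return post_mean, 1.0 / precision


def text_score(tokens, lexicon):
    """Average the lexicon scores of the tokens found in the lexicon."""
    hits = [lexicon[t] for t in tokens if t in lexicon]
    return sum(hits) / len(hits) if hits else 0.0


# Hypothetical scores for one word in three generic lexicons.
generic_scores = [-2, -1, -3]
# Hypothetical sentiment evidence for the same word in a domain corpus
# (for example, film reviews in which the word is used approvingly).
domain_obs = [2.0, 3.0, 2.0, 3.0]

prior_mean, prior_var = prior_from_lexicons(generic_scores)
post_mean, post_var = posterior(prior_mean, prior_var, domain_obs)

generic_lexicon = {"unpredictable": float(prior_mean)}
domain_lexicon = {"unpredictable": post_mean}

tokens = ["an", "unpredictable", "plot"]
generic_result = text_score(tokens, generic_lexicon)
domain_result = text_score(tokens, domain_lexicon)
```

Under these assumed numbers the generic score of the word is negative while the domain-adjusted score is positive, so the same text is classified differently by the two lexicons. The more domain observations are available, the further the adjusted score moves away from the generic prior.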