Conference article

Twitter Topic Modeling by Tweet Aggregation

Asbjørn Ottesen Steinskog
Department of Computer Science, Norwegian University of Science and Technology, Trondheim, Norway

Jonas Foyn Therkelsen
Department of Computer Science, Norwegian University of Science and Technology, Trondheim, Norway

Björn Gambäck
Department of Computer Science, Norwegian University of Science and Technology, Trondheim, Norway


Published in: Proceedings of the 21st Nordic Conference on Computational Linguistics, NoDaLiDa, 22-24 May 2017, Gothenburg, Sweden

Linköping Electronic Conference Proceedings 57:10, p. 77-86

NEALT Proceedings Series 29:10, p. 77-86


Published: 2017-05-08

ISBN: 978-91-7685-601-7

ISSN: 1650-3686 (print), 1650-3740 (online)

Abstract

Conventional topic modeling schemes, such as Latent Dirichlet Allocation, are known to perform inadequately when applied to tweets, due to the sparsity of short documents. To alleviate this problem, we apply several pooling techniques that aggregate similar tweets into individual documents, and specifically study the aggregation of tweets sharing authors or hashtags. The results show that such aggregation significantly increases topic coherence.
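The pooling idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the tweet representation (a dict with hypothetical `author` and `text` keys), the function names, the hashtag pattern, and the choice to skip tweets without hashtags in the hashtag pooling are all assumptions made for illustration. Each pooled string would then serve as one document for a topic model such as Latent Dirichlet Allocation.

```python
import re
from collections import defaultdict

# Assumed hashtag pattern: '#' followed by word characters.
HASHTAG_RE = re.compile(r"#(\w+)")


def pool_by_author(tweets):
    """Concatenate all tweets by the same author into one pseudo-document."""
    pools = defaultdict(list)
    for tweet in tweets:
        pools[tweet["author"]].append(tweet["text"])
    return {author: " ".join(texts) for author, texts in pools.items()}


def pool_by_hashtag(tweets):
    """Concatenate all tweets sharing a hashtag into one pseudo-document.

    A tweet with several hashtags is added to every matching pool.
    Tweets without hashtags are skipped (an assumption of this sketch).
    """
    pools = defaultdict(list)
    for tweet in tweets:
        # Lowercase so that '#NLP' and '#nlp' end up in the same pool.
        tags = {tag.lower() for tag in HASHTAG_RE.findall(tweet["text"])}
        for tag in tags:
            pools[tag].append(tweet["text"])
    return {tag: " ".join(texts) for tag, texts in pools.items()}


if __name__ == "__main__":
    # Made-up example tweets, for illustration only.
    sample = [
        {"author": "alice", "text": "Training LDA on tweets #nlp"},
        {"author": "bob", "text": "Topic coherence results #NLP #lda"},
        {"author": "alice", "text": "No hashtag here"},
    ]
    print(pool_by_author(sample))
    print(pool_by_hashtag(sample))
```

Pooling trades document count for document length: fewer but longer documents give the topic model more word co-occurrence evidence per document, which is the motivation the abstract gives for aggregation.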
