Conference paper

Classifying business types from twitter posts using active learning

Document type

Text/Conference Paper

Date

2010

Publisher

Gesellschaft für Informatik e.V.

Abstract

Today, many companies have adopted Twitter as an additional marketing medium to advertise and promote their business activities. One way to organize a large number of posts is to classify them into predefined business-type categories. Applying standard text categorization techniques to Twitter is ineffective because each post is short (140-character limit) and most of the available data is unlabeled. In this paper, we propose a text categorization approach based on active learning for classifying Twitter posts into three business types: airline, food, and computer & technology. We start by constructing an initial text categorization model from a small set of labelled data. Using this model, we obtain more positive data instances for constructing a new model by selecting the test data that are predicted as positive. The experimental results show that the proposed approach based on active learning increases classification accuracy over the standard text categorization approach.
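The loop described in the abstract (train on a small labelled set, label further posts with the model, retrain on the enlarged set) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the naive Bayes classifier, the confidence threshold, the function names, and the toy posts are all assumptions, since the abstract specifies none of them. Adding a model's own predictions as labels is often called self-training; other forms of active learning would instead ask a human annotator for the labels.

```python
import math
from collections import Counter, defaultdict

def tokenize(text):
    return text.lower().split()

def train_nb(docs, labels):
    """Fit a multinomial naive Bayes model (assumed classifier)."""
    word_counts = defaultdict(Counter)
    class_counts = Counter(labels)
    vocab = set()
    for text, label in zip(docs, labels):
        tokens = tokenize(text)
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return word_counts, class_counts, vocab

def predict_nb(model, text):
    """Return (best_label, posterior probability of best_label)."""
    word_counts, class_counts, vocab = model
    total_docs = sum(class_counts.values())
    log_scores = {}
    for label, n_docs in class_counts.items():
        total_words = sum(word_counts[label].values())
        score = math.log(n_docs / total_docs)
        for tok in tokenize(text):
            if tok in vocab:  # words unseen in training are ignored
                # Add-one (Laplace) smoothing
                score += math.log((word_counts[label][tok] + 1)
                                  / (total_words + len(vocab)))
        log_scores[label] = score
    best = max(log_scores, key=log_scores.get)
    top = max(log_scores.values())
    norm = sum(math.exp(s - top) for s in log_scores.values())
    return best, math.exp(log_scores[best] - top) / norm

def expand_training_set(docs, labels, unlabeled, threshold=0.6, rounds=3):
    """Repeatedly add confidently predicted posts and retrain.

    `threshold` and `rounds` are hypothetical parameters chosen for
    illustration; they do not come from the paper.
    """
    docs, labels, pool = list(docs), list(labels), list(unlabeled)
    model = train_nb(docs, labels)
    for _ in range(rounds):
        remaining = []
        for text in pool:
            label, confidence = predict_nb(model, text)
            if confidence >= threshold:
                docs.append(text)
                labels.append(label)
            else:
                remaining.append(text)
        if len(remaining) == len(pool):  # nothing new was added
            break
        pool = remaining
        model = train_nb(docs, labels)
    return model, docs, labels

# Toy data (invented for this sketch), one seed post per business type.
seed_docs = ["cheap flight deals to bangkok",
             "new pizza menu and free dessert",
             "laptop sale with faster processor"]
seed_labels = ["airline", "food", "computer & technology"]
unlabeled = ["flight deals this weekend", "free pizza on friday",
             "faster laptop for students", "hello world"]

model, docs, labels = expand_training_set(seed_docs, seed_labels, unlabeled)
```

Posts whose words never appear in the training data, such as "hello world" here, stay below the threshold and remain unlabeled, which keeps low-confidence guesses out of the next model.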

Description

Thongsuk, Chanattha; Haruechaiyasak, Choochart; Meesad, Phayung (2010): Classifying business types from twitter posts using active learning. 10th International Conference on Innovative Internet Community Systems (I2CS) – Jubilee Edition 2010 –. Bonn: Gesellschaft für Informatik e.V. PISSN: 1617-5468. ISBN: 978-3-88579-259-8. pp. 180-189. Regular Research Papers. Bangkok, Thailand. June 3-5, 2010
