SIIA Público

Book title: KDIR: Proceedings of the 8th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management - Vol. 1
Chapter title: Efficient Social Network Multilingual Classification using Character, POS n-grams and Dynamic Normalization

UNAM authors:
JUAN MANUEL TORRES MORENO; AZUCENA MONTES RENDON; GERARDO SIERRA DIAZ;
External authors:

Language:
English
Year of publication:
2016
Keywords:

Text Mining; Machine Learning; Classification; n-grams; POS; Blogs; Tweets; Social Network


Abstract:

In this paper we describe a dynamic normalization process applied to multilingual social network documents (Facebook and Twitter) to improve the performance of the author profiling task for short texts. After normalization, character n-grams and POS-tag n-grams are obtained to extract as much as possible of the stylistic information encoded in the documents (emoticons, character flooding, capital letters, references to other users, hyperlinks, hashtags, etc.). Experiments with an SVM classifier reached up to 90% performance.
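
As a rough illustration of the kind of pipeline the abstract outlines, the sketch below normalizes a short text and counts its character n-grams. It is not the authors' implementation: the placeholder tags (`<URL>`, `<USER>`, `<HASHTAG>`), the regular expressions, the rule of capping repeated characters at two, and the function names `normalize` and `char_ngrams` are all assumptions made for this example. POS tagging and the SVM step are omitted because they need an external tagger and classifier.

```python
import re
from collections import Counter

# Hypothetical patterns and tags; the record does not specify the real ones.
URL_RE = re.compile(r"https?://\S+|www\.\S+")
MENTION_RE = re.compile(r"@\w+")
HASHTAG_RE = re.compile(r"#\w+")
FLOOD_RE = re.compile(r"(.)\1{2,}")  # any character repeated 3 or more times


def normalize(text: str) -> str:
    """Replace volatile tokens with fixed tags and cap character flooding."""
    text = URL_RE.sub("<URL>", text)
    text = MENTION_RE.sub("<USER>", text)
    text = HASHTAG_RE.sub("<HASHTAG>", text)
    # Keep two repetitions so flooding stays visible as a stylistic cue.
    return FLOOD_RE.sub(r"\1\1", text)


def char_ngrams(text: str, n: int = 3) -> Counter:
    """Count overlapping character n-grams of length n."""
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


if __name__ == "__main__":
    sample = "Sooooo cool!!! @ana see http://t.co/x #fun"
    clean = normalize(sample)
    print(clean)
    print(char_ngrams(clean, 3).most_common(5))
```

Mapping hyperlinks, user references, and hashtags to fixed tags keeps the fact that they occur while discarding their highly variable content, so the resulting n-gram counts stay comparable across documents.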


UNAM entities cited: