Building an Emotion-Annotated Corpus for Sentiment Analysis of Spanish-Language Twitter

Grigori Sidorov, Sofía Natalia Galicia Haro, Vanessa Alejandra Camacho Vázquez


Sentiment analysis (SA) is the computational treatment of opinions, sentiments, and subjectivity in text, and it is growing exponentially, both in social networks and across a variety of problems. We present the development of a manually annotated emotion corpus of Spanish tweets, intended as a basis for automatically building larger, higher-quality resources. We show that, at least in Spanish, Twitter is used to express positive emotions as much as negative ones. We also survey the most relevant studies of sentiment analysis on Twitter, giving an overview for future research that identifies gaps and possible directions.




