Public Opinion and Sentiment Tracking, Analysis and Research


This project has two main goals. The first is to design an opinion mining system capable of measuring, almost in real time, sentiment toward parties, political actors and the economy in the contents of both conventional web-based media (online newspapers) and the so-called social media (blogs and micro-blogs). The second is to use the data thus collected to explore and explain the relationship between trends in sentiment as expressed in the conventional media, the social media, and public opinion polls and surveys in Portugal.

The concept of "public opinion" that prevails in modern social scientific research consists of the aggregation of individual attitudes, preferences and beliefs, as captured by polls and surveys using randomly selected samples. This, however, should not lead us to believe that surveys are the only theoretically and empirically relevant source of data for the study of "public opinion." On the one hand, as a means of capturing mass beliefs and attitudes, surveys have been facing increasing challenges in dealing with sources of bias in inferences, caused by coverage problems in telephone polls (the rise of "cell only" individuals and households) and rising non-response rates. On the other hand, understanding and explaining mass public opinion has always required more than the use of surveys. Media content is not only a relevant object of study on its own for communication scholars, but can also provide crucial insights into the very sources of mass public opinion. Are mass attitudes somehow explained by the media messages to which individuals are exposed? Are purely news-based or editorial contents equally influential, and is that distinction clear-cut? Is "public opinion" driven by the cues and frames provided by "published opinion"? Or instead, are the views conveyed by elites and media agents affected by the preferences of mass publics? Answering such questions requires the collection of data beyond that provided by survey research.

The relationship between political and economic events, how citizens come to apprehend them and how they react to them has become more complex with the rise of social media. Blogs and micro-blogs (such as Twitter) perform several functions in this relationship. First, they can constitute additional sources of politically relevant information and stimuli for citizens. Second, they are themselves sources of relevant information and stimuli for journalists and political elites, raising the possibility that social media messages and conversations indirectly influence public opinion well beyond what the size of their readership might suggest. Finally, they may provide a window into mass public opinion itself: although bloggers, micro-bloggers and those who engage in online communication are certainly not a representative cross-section of the population at large, the small but rapidly growing body of research on the content of social media messages suggests that their frequency and tone provide valid indications of trends and even, in some cases, work as a leading indicator of electoral results.

As Drezner and Farrell note, one of the problems faced by scholars in this regard is that "the proper exploitation of this data requires skills and expert knowledge of a kind that social scientists frequently don't have". This project addresses this problem by assembling a truly multidisciplinary team composed of computer engineers, linguists, political scientists and economists with the technical and theoretical expertise required to meet the project's main goals.

 

Project POPSTAR - Public Opinion and Sentiment Tracking, Analysis and Research - PTDC/CPJ-CPO/116888/2010 - Financed by FCT

 

Status: 
Proponent entity
Financed: 
Yes
Entities: 
Fundação para a Ciência e Tecnologia
Keywords: 
Public Opinion; Social Web Mining; Online Sentiment; Time series

 


Objectives: 
<p>1. The design of an opinion mining system able to harvest texts from web-based conventional media (news items in mainstream media sites) and social media (blogs and Twitter) and to process those texts, recognizing topics and political actors, analyzing relevant linguistic units, and generating indicators of both the frequency and the polarity (positivity/negativity) of mentions of political actors and economic policies across sources, types of sources, and time.</p><p>2. The statistical analysis of the relationships among these indicators, and between them and public opinion data collected by polls and surveys, testing hypotheses about the directionality of the relationship between conventional media and social media contents, as well as their relationship with mass public opinion.</p><p>Final products include a prototype of an opinion mining system specifically designed to treat Portuguese content, whose results will be made available to the public at large and will be used to investigate the importance of social media in Portugal as a source of political information and a means for politically relevant discussion.</p>
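<p>As a rough illustration of the two indicators named in objective 1 (frequency of mention and polarity), the following minimal sketch counts mentions per date, source type and actor, and scores each mentioning text with a toy word list. It is not the project's system: the lexicon, the actor names (PartyA, PartyB), the function names and the whitespace-level tokenization are all assumptions made for the example, and a real pipeline for Portuguese would need proper entity recognition and a Portuguese sentiment lexicon.</p>

```python
import re
from collections import defaultdict

# Hypothetical toy lexicon, for illustration only.
POSITIVE = {"good", "excellent", "support"}
NEGATIVE = {"bad", "terrible", "scandal"}


def mention_indicators(documents, actors):
    """Aggregate mention counts and text-level polarity labels.

    documents: iterable of (date, source_type, text) tuples.
    actors: list of single-token actor names (an assumption of this sketch).
    Returns a dict keyed by (date, source_type, actor).
    """
    stats = defaultdict(lambda: {"mentions": 0, "positive": 0, "negative": 0})
    for date, source_type, text in documents:
        tokens = re.findall(r"\w+", text.lower())
        pos = sum(token in POSITIVE for token in tokens)
        neg = sum(token in NEGATIVE for token in tokens)
        for actor in actors:
            if actor.lower() not in tokens:
                continue
            cell = stats[(date, source_type, actor)]
            cell["mentions"] += 1
            # Label the whole text by its majority polarity; ties stay neutral.
            if pos > neg:
                cell["positive"] += 1
            elif neg > pos:
                cell["negative"] += 1
    return dict(stats)


def polarity(cell):
    """Net polarity in [-1, 1]; 0.0 when no polarized mentions exist."""
    total = cell["positive"] + cell["negative"]
    if total == 0:
        return 0.0
    return (cell["positive"] - cell["negative"]) / total
```

<p>Calling <code>mention_indicators</code> on a batch of harvested texts and then <code>polarity</code> on each cell yields one frequency series and one polarity series per actor and source type, which is the kind of input the statistical analysis in objective 2 would work with.</p>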
State of the art: 
<p>The literature that is directly relevant for this project can be organized around two fundamental questions: 1. How can mainstream and social media contents be automatically extracted and measured in a valid and reliable way?</p><p>Content analysis of mass media has an established tradition in the social sciences, particularly in the study of the effects of media messages, encompassing topics as diverse as those addressed in seminal studies of newspaper editorials (Lasswell et al. 1952), media agenda-setting (McCombs &amp; Shaw 1972), or the uses of political rhetoric (Moen 1990), among many others. Riffe &amp; Freitag (1997) reported an increase in the use of content analysis in communication research and suggested that digital text and computerized means for its extraction and analysis would reinforce that trend. Their expectation has been fulfilled: the use of automated content analysis has by now surpassed the use of hand coding (Neuendorf 2002). The increase in digital sources of text, on the one hand, and current advances in computational power and design, on the other, are making this development both necessary and possible, while also raising awareness about the inferential pitfalls involved (Hopkins &amp; King 2010).</p><p>One particularly promising avenue of research concerns the use of opinion mining (or sentiment analysis), i.e., the automatic extraction and representation of subjective content underlying texts (Pang &amp; Lee 2008). Different computational approaches have been explored to process sentiment in text, namely machine learning and linguistics-based methods (Pang et al. 2002 and Choi &amp; Cardie 2008, respectively). In practice, algorithms often combine both strategies. Recently, O'Connor et al. (2010) showed that a simple sentiment detector can be effective in capturing trends on specific topics from Twitter messages. In Portugal, a preliminary study by Carvalho et al. 
(2010) on a collection of comments posted by the readers of a daily newspaper to a set of news articles covering the 2009 Portuguese parliamentary election debates shows that negative opinions tend to greatly outnumber positive opinions.</p><p>The sentiment classifiers and visualization software to be developed under this project will explore different strategies, taking into account the types and genres of opinionated text we want to process. Our task is made easier by the fact that we can rely on existing technology and resources developed to collect and classify news and social media content by some of the proponents of POPSTAR, namely under the REACTION project (http://xldb.fc.ul.pt/wiki/Reaction).</p><p>2. What can we learn about public opinion and its formation by examining media and social media contents?</p><p>The relationship between media communication and mass-level beliefs and attitudes has been a central concern of public opinion studies (for a comprehensive review, see Preiss 2007). The massive expansion of blogs and micro-blogs (such as Twitter) raises questions that seem, at first glance, quite similar. What segments of the public are exposed to blogs and use them as a source of information (Eveland &amp; Dylko 2007)? How do they rate them in terms of credibility as compared to other sources (Banning and Trammell 2006)? What determines exposure to political content contained in blog posts (Johnson et al. 2009)?</p><p>However, there are several obvious differences between social and conventional media that raise new relevant lines of inquiry, which are the ones we will pursue in POPSTAR. The first is related to the additional mechanisms through which social media contents might affect public opinion. 
Although social media contents still reach directly only a small and unrepresentative segment of the population, social media can influence public beliefs and attitudes indirectly, by shaping the agendas and views of journalists, politicians and other actors who communicate with broader audiences. Are the agenda and tone of conventional media outlets influenced by social media contents, or do blogs follow the conventional media agendas (Lloyd et al. 2006; Drezner and Farrell 2008; Wallsten 2010)?</p><p>A second main line of inquiry concerns the extent to which the contents of online communications through social media can serve as a leading indicator of changes in public opinion. Although those who actively engage in broadcasting messages through either blogs or micro-blogs are likely to be an even narrower and more polarized cross-section of the population, social media messages conveyed by these independent agents can provide an &quot;aggregative function&quot; through which important political parameters of interest can be estimated (Munger 2008). Social media can be seen as systems of peer production that can generate high-quality information (Tapscott &amp; Williams 2007; Watts 2009). Tumasjan et al. (2010) show that the mere frequency of mentions of political parties in Twitter messages serves as a good predictor of electoral results. O'Connor et al. (2010) and Gonzalez-Bailon et al. (2010) explore the relationship between sentiment measures extracted from Twitter and consumer confidence and presidential job approval polls, suggesting that automatic sentiment detection on Twitter could be used to monitor public opinion about popular topics.</p><p>POPSTAR proceeds along these two lines of inquiry. 
Regarding the relationship between the contents of mainstream media, blogs and micro-blogs, are the paths of mutual influence that have been detected in terms of their agendas also found when one focuses on other aspects of content, such as the tone, polarity, and intensity of opinions? What are the answers to these questions when we compare blogging with micro-blogging? And how are indicators derived from the analysis of the polarity and intensity of mainstream and social media messages related to indicators derived from conventional survey and poll data on political and economic issues, such as approval of political leaders and parties and consumer confidence?</p>
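<p>One simple, purely descriptive way to probe the "leading indicator" question raised above is to correlate a media sentiment series with a poll series shifted by a number of periods. The sketch below does this with the standard library only. The function names and the choice of a lagged Pearson correlation are assumptions made for illustration; the project text does not specify its statistical method, and a lagged correlation by itself does not establish the direction of influence.</p>

```python
from statistics import mean


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / (var_x * var_y) ** 0.5


def lagged_correlation(leader, follower, lag):
    """Correlate leader[t] with follower[t + lag] for a non-negative lag."""
    if lag < 0:
        raise ValueError("lag must be non-negative")
    n = min(len(leader), len(follower) - lag)
    if n < 2:
        raise ValueError("not enough overlapping observations for this lag")
    return pearson(leader[:n], follower[lag:lag + n])


def best_lag(leader, follower, max_lag):
    """Lag in 0..max_lag with the largest absolute lagged correlation."""
    return max(
        range(max_lag + 1),
        key=lambda k: abs(lagged_correlation(leader, follower, k)),
    )
```

<p>For example, with a weekly sentiment series as <code>leader</code> and a weekly approval series as <code>follower</code>, <code>best_lag(leader, follower, 4)</code> returns the shift, in weeks, at which the two series move together most closely. Swapping the arguments checks the reverse ordering.</p>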
Partnership: 
National network
Luís Aguiar-Conraria
Maria Eduarda Rodrigues
Mário Gaspar Silva
Matko Bosnjak
Paula Cristina Quaresma da Fonseca Carvalho

POPSTAR

ICS Coordinator 
External reference 
PROJ10/2012
Start Date: 
21/03/2012
End Date: 
20/03/2014
Duration: 
24 months
Closed