Carnegie Mellon: Signals from Twitter Mimic Traditional Public Opinion Polls -- THE Journal

By Dian Schaffhauser
05/17/10

Twitter could become a new tool for pollsters gauging public opinion as natural language processing improves, according to research from Carnegie Mellon University. A team from the School of Computer Science applied simple text analysis to a billion microblog messages posted to Twitter during 2008 and 2009 (posts averaging 11 words each) to identify messages about the economy or politics and then to find words within the text that indicated positive or negative sentiment.
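As a rough illustration of that kind of simple text analysis, the Python sketch below keeps only the messages that mention a topic word and counts how many contain a positive or a negative word. The word lists, the function names, and the choice to report a positive-to-negative ratio are assumptions made for this example; the article does not describe the lexicon or scoring formula the researchers used.

```python
# Hypothetical word lists for illustration only; not the lexicon used in the study.
POSITIVE = {"good", "great", "hope", "gain", "better"}
NEGATIVE = {"bad", "worse", "fear", "loss", "awful"}


def tokenize(message):
    """Lowercase a message and split on whitespace, trimming surrounding punctuation."""
    return [word.strip(".,!?;:\"'()") for word in message.lower().split()]


def sentiment_ratio(messages, topic_word):
    """Among messages mentioning topic_word, return the number containing a
    positive word divided by the number containing a negative word.
    Returns None when no on-topic message contains a negative word."""
    positive_count = 0
    negative_count = 0
    for message in messages:
        words = set(tokenize(message))
        if topic_word not in words:
            continue  # off-topic message, ignore it
        if words & POSITIVE:
            positive_count += 1
        if words & NEGATIVE:
            negative_count += 1
    if negative_count == 0:
        return None
    return positive_count / negative_count


# Made-up sample messages, not data from the study.
sample = [
    "The economy looks good today!",
    "Bad news for the economy, jobs loss",
    "I fear the economy is getting worse",
    "Great weather today",
]
print(sentiment_ratio(sample, "economy"))
```

Computing one such score per day would give a daily time series that can be set beside poll results.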

The computed sentiment measures were fairly similar to the results of well-established public opinion polls, such as the Index of Consumer Sentiment (ICS) from the Reuters/University of Michigan Surveys of Consumers, Pollster.com, and the Gallup Organization's Economic Confidence Index.

"The findings suggest that analyzing the text found in streams of tweets could become a cheap, rapid means of gauging public opinion on at least some subjects," the university reported in a statement.

The opinion measurements derived from Twitter were much more volatile from day to day than the polling data. But when the researchers "smoothed" the results by averaging them over a period of days, the results often correlated closely with the polling data, said Brendan O'Connor, a graduate student in Carnegie Mellon's Language Technologies Institute and one of the authors of the study. For example, consumer confidence as measured from Twitter followed the same general slide through 2008 and the same rebound in February and March of 2009 that was seen in the poll data. The researchers said the ICS and Gallup data had a correlation of 86 percent over the period; the Twitter-derived sentiment measures had between 72 percent and 79 percent correlation with the Gallup data, depending on the number of days averaged to smooth the data.
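The smoothing and comparison described above can be sketched in a few lines of Python. This is a minimal illustration, assuming a trailing moving average and a Pearson correlation coefficient (so that "86 percent" would correspond to a coefficient of about 0.86). The article does not specify the exact smoothing window or correlation measure, and the numbers below are invented for the example.

```python
from math import sqrt


def moving_average(values, window):
    """Smooth a series by averaging each value with up to window - 1 preceding values."""
    smoothed = []
    for i in range(len(values)):
        chunk = values[max(0, i - window + 1): i + 1]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return covariance / (spread_x * spread_y)


# Invented daily values for illustration; not data from the study.
daily_sentiment = [1.2, 0.6, 1.5, 0.7, 1.1, 0.5, 0.9, 0.4]
poll_index = [60, 58, 57, 55, 54, 52, 50, 49]

for window in (1, 3, 5):
    smoothed = moving_average(daily_sentiment, window)
    print(window, round(pearson(smoothed, poll_index), 2))
```

Trying several window sizes, as in the loop, mirrors the article's point that the reported correlation depended on how many days were averaged.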

"With 7 million or more messages being tweeted each day, this data stream potentially allows us to take the temperature of the population very quickly," said Noah Smith, assistant professor of language technologies and machine learning in the School of Computer Science. "The results are noisy, as are the results of polls. Opinion pollsters have learned to compensate for these distortions, while we're still trying to identify and understand the noise in our data. Given that, I'm excited that we get any signal at all from social media that correlates with the polls."

"The Web is so mainstream now that there's no question that the Web is representative somehow of the population," O'Connor said. But pinning down Web demographics is still difficult, he noted, pointing out that Twitter traffic alone increased by a factor of 50 during the two-year span of the study.

Improved natural language processing tools, as well as query-driven analysis and use of demographic and time stamp data available on some social media sites, could increase the sophistication and reliability of microblog analysis, the researchers reported.

A paper on the topic, "From Tweets to Polls: Linking Text Sentiment to Public Opinion Time Series," will be presented at the International Conference on Weblogs and Social Media in Washington, DC, in late May.

About the Author

Dian Schaffhauser is a former senior contributing editor for 1105 Media's education publications THE Journal, Campus Technology and Spaces4Learning.
