Culturomics 2.0: News For The Next Century


by Tim Moran

Culturomics 2.0 is upon us, thanks to the power of supercomputing and the powerful brain of Kalev Leetaru, assistant director for Text and Digital Media Analytics at the Institute for Computing in the Humanities, Arts, and Social Science at the University of Illinois, and Center Affiliate of the National Center for Supercomputing Applications.

The first incarnation of Culturomics, explains Leetaru, explored "broad cultural trends through the computerized analysis of vast digital book archives, offering novel insights into the functioning of human society." All well and good; yet this represented a "digested history," not something living and functional.

Leetaru's Culturomics 2.0 changes that.

He writes: "News is increasingly being produced and consumed online, supplanting print and broadcast to represent nearly half of the news monitored across the world today by Western intelligence agencies. Recent literature has suggested that computational analysis of large text archives can yield novel insights to the functioning of society, including predicting future economic events. Applying tone and geographic analysis to a 30-year worldwide news archive, global news tone is found to have forecast the revolutions in Tunisia, Egypt, and Libya, including the removal of Egyptian President Mubarak, predicted the stability of Saudi Arabia (at least through May 2011), estimated Osama Bin Laden's likely hiding place as a 200-kilometer radius in Northern Pakistan that includes Abbotabad, and offered a new look at the world's cultural affiliations."

Leetaru contends the news is much more than facts and details--it offers a wealth of "cultural and contextual influences" that strongly impact how events are framed for a given outlet's audience. This, he believes, offers a window into national consciousness. Watching and measuring this "tone," as he calls it, in real time can help forecast many different kinds of broad social behaviors.

In its original form, Culturomics treats each word and phrase as a generic object--there's no associated meaning--and it merely measures changes in the frequency of its usage over time.
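To make that frequency-only idea concrete, here is a minimal Python sketch. It is not Leetaru's code or the original Culturomics pipeline; the function name `relative_frequency_by_year`, the whitespace tokenization, and the tiny sample corpus are all assumptions made for illustration.

```python
from collections import Counter


def relative_frequency_by_year(docs, term):
    """Return {year: share of all tokens in that year equal to `term`}.

    `docs` is an iterable of (year, text) pairs. Words are treated as
    opaque strings: nothing about their meaning is used, only counts.
    """
    term = term.lower()
    hits = Counter()    # occurrences of the term per year
    totals = Counter()  # total tokens per year
    for year, text in docs:
        tokens = text.lower().split()  # naive whitespace tokenizer (assumption)
        totals[year] += len(tokens)
        hits[year] += tokens.count(term)
    return {year: hits[year] / totals[year] for year in sorted(totals) if totals[year]}


# Hypothetical three-document corpus, invented for this example.
corpus = [
    (1990, "the war ended and the peace began"),
    (1990, "peace talks continued"),
    (2000, "the market fell"),
]
freq = relative_frequency_by_year(corpus, "peace")
print(freq)
```

Dividing by the yearly token total, rather than reporting raw counts, keeps years with more text from looking artificially "louder."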

The Culturomics 2.0 approach, as introduced by Leetaru, "focuses on extending this model by imbuing the system with higher-level knowledge about each word, specifically focusing on 'news tone' and geographic location, given their importance to the understanding of news coverage." He has even translated the results into mappable geographic references (see image below, a global geocoded tone map of all Summary of World Broadcasts content, January 1979-April 2011, mentioning "bin Laden").

Mapping news.JPG

Studying news in this way effectively "passively crowdsources" the global mood about each country in the world, which can offer highly accurate short-term forecasts of national stability.
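As a rough illustration of what measuring tone per country could look like, the Python sketch below scores each article against two small word lists and averages the scores by country. Everything here is an assumption for the sake of example: the `POSITIVE` and `NEGATIVE` word lists, the function names, the scoring formula, and the premise that each article already arrives tagged with a country. The actual study's tone dictionaries and geocoding are not described in this article.

```python
from collections import defaultdict

# Hypothetical mini-lexicons; real tone dictionaries are far larger.
POSITIVE = {"peace", "growth", "stable", "calm"}
NEGATIVE = {"riot", "crisis", "protest", "violence"}


def tone(text):
    """Score text as (positive words - negative words) / total words."""
    tokens = text.lower().split()
    if not tokens:
        return 0.0
    pos = sum(t in POSITIVE for t in tokens)
    neg = sum(t in NEGATIVE for t in tokens)
    return (pos - neg) / len(tokens)


def mean_tone_by_country(articles):
    """Average article tone per country from (country, text) pairs."""
    buckets = defaultdict(list)
    for country, text in articles:
        buckets[country].append(tone(text))
    return {country: sum(scores) / len(scores) for country, scores in buckets.items()}


# Invented sample articles, already tagged with a placeholder country label.
sample = [
    ("A", "protest and violence spread"),
    ("A", "calm returned"),
    ("B", "growth remains stable"),
]
scores = mean_tone_by_country(sample)
print(scores)
```

Tracking how such an average moves over time, country by country, is the kind of signal the article describes as a proxy for the global mood.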

And how is all of this data processed? In part with support from the National Science Foundation, using TeraGrid resources on the Nautilus SGI UV supercomputer at the National Institute for Computational Sciences.

This synergy of online news (about which we know a little something), supercomputing (about which we know very, very little), and big-data mining (about which we know even less) nevertheless appears to be a truly 21st-century tool that is only beginning to be explored and utilized.