Map-Based Data Visualizations Reveal Patterns in Human Behavior

Billions of tweets. Millions of check-ins. How can we make sense out of such staggering amounts of data? One answer: Maps.

Social media companies and researchers use map-based visualizations to link virtual information with the physical world, surfacing patterns of human behavior that dazzle and educate.

Twitter’s data science team recently visualized all geotagged tweets in two and three dimensions. Both views reveal where tweets concentrate: the two-dimensional maps through color, the three-dimensional topographic graphics through peaks and valleys.

Spikes of tweets tower over New York. Screenshot from Twitter blog.
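Density views like these come down to counting geotagged points per map cell and then drawing each count as a color or a height. Below is a minimal sketch of that counting step, assuming the tweets have already been reduced to (latitude, longitude) pairs. The function name `bin_points`, the one-degree cell size, and the sample coordinates are illustrative assumptions, not a description of how Twitter built its graphics.

```python
from collections import Counter


def bin_points(points, cell_deg=1.0):
    """Count (lat, lon) points per grid cell that is cell_deg degrees wide."""
    counts = Counter()
    for lat, lon in points:
        # Floor division gives the index of the cell containing the point,
        # and it rounds toward negative infinity, so western and southern
        # coordinates land in the correct cell.
        cell = (int(lat // cell_deg), int(lon // cell_deg))
        counts[cell] += 1
    return counts


# Hypothetical sample: three points near New York, one near Tokyo.
sample = [(40.71, -74.00), (40.73, -73.99), (40.75, -73.98), (35.68, 139.69)]
density = bin_points(sample)

# The densest cell is the one a 2D map would color most intensely
# and a 3D rendering would draw as the tallest spike.
peak_cell, peak_count = density.most_common(1)[0]
print(peak_cell, peak_count)
```

A smaller `cell_deg` gives a finer map at the cost of sparser counts per cell.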

Foursquare created a zoomable map of 500 million check-ins and time-lapse videos of check-in data from New York and Tokyo. For the introspective or quantified selfers, a company called Etch will create a personalized map of individual Foursquare users’ check-ins.

Yelp used keywords from reviews on its site to visualize restaurant types in various cities. (For patio dining in Boston, head to Harvard Square or the Back Bay and prepare to run into hipsters.)

This visualization technique extends beyond real-time social media data. The New York Times mapped user suggestions of quiet spaces in the city. The Guardian displayed an interactive map of global protests that occurred this year. Last year, University of Illinois researcher Kalev Leetaru mapped the sentiment of Wikipedia articles to show emotions around history over the past two centuries.

Global protests in 2013. Map by John Beieler and Josh Stevens, screenshot from The Guardian.

In his book Rewire: Digital Cosmopolitans in the Age of Connection, Ethan Zuckerman compares infrastructure maps, which show what’s possible, to flow maps, which show what occurs. Unsurprisingly, most of the data shown on these maps hews closely to physical borders and man-made developments such as roads. But these maps also reveal the contours of behavior: where people like to announce their presence or grab a bite to eat.

Such maps can help us better understand cities, society, and human behavior, said Visualizing.org founder Adam Bly in a 2011 South by Southwest Interactive presentation. Comparisons of where tourists and locals snap photos help urban planners, business owners, and local chambers of commerce interested in economic development. Mapping fast food and healthy food locations offers insight to public health officials, teachers, and policy makers who want to ensure access to nutritious food. Analyses of mobile phone data, which suggest that human movement is 93 percent predictable, can inform where public transportation or energy grids should go.

Visualization of photos taken by locals (blue), tourists (red), and both (yellow) in Boston. Eric Fischer/Flickr, CC BY-SA 2.0

As cheap data storage abounds and visualization tools proliferate, maps offer a window into how humans live, in addition to guidance on how to get around.

Citizen Sensing and Crisis Informatics: Twitter and Disaster Response

In a piece published in May in Smithsonian, “The World According to Twitter, in Maps,” Twitter use in the Western hemisphere was compared to electrification and lighting use. Studies reveal remarkably similar rates, such that a map illuminated by tweets looks very much like a satellite image of artificial light use. It seems ambitious to suggest Twitter will become as ubiquitous as light, but the findings are nonetheless illuminating. Growing global Internet saturation and increasing Twitter usage might tell us a great deal about the relationship of humans and technology, but how might Twitter analysis shed light on our relationships with each other and the environment?

Since it started as a platform designed for cell phone use in 2006, Twitter has had an arguably resounding presence in the world. An estimated 554 million registered users send about 9,000 tweets per second and produce 58 million tweets each day on average. In people’s everyday lives and at the level of national and regional politics in places like Egypt and Turkey, the microblogging service is redefining relationships, invigorating information sharing, and shifting power structures, both online and offline.

The vast number of tweets and other user-generated bits of content online has prompted new approaches to data analysis, including “data philanthropy,” which claims to use big data to mitigate crises and potentially avert social, financial, and environmental disasters. In April, the Skoll World Forum brought together experts on Big Data and its applications, including the president of the non-profit technology company Benetech, who explained:

Massive amounts of data are collected on the pollution in our cities and the changes in our climate. The more we use technology in our education and health systems, the more data we collect about how people learn and what keeps us healthy or makes us sick. These information-centric areas are built for Big Data – data that if better understood could help provide a pathway to maximize our human potential, instead of maximizing profits.

More than just a microblogging service, Twitter is a platform for peer-to-peer education, a tool for real-time technology-mediated learning, and a potential gold mine for citizen sensing, which engages citizens as sensors in generating geo-referenced information. Twitter’s open API means that tweets can be downloaded as raw data. This enables Twitter mapping, a form of research that turns topical tags and tweets into spatiotemporal nuggets that researchers analyze and apply to a myriad of social, political, and environmental situations, including humanitarian responses to natural disasters. Researchers at the Institute of Environment and Sustainability claim that ever-growing access to broadband connections and enthusiastic adoption of social media have created “the potential of up to 6 billion human sensors to monitor the state of the environment, validate global models with local knowledge, contribute to crisis situations awareness and provide information that only humans can capture.” Human-machine relationships mediated through sites like Twitter offer optimal conditions for rapid dissemination of useful information, collective thought, and social action.
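To make the idea of turning raw tweets into space-time points concrete, here is a minimal sketch. It assumes each tweet arrives as a JSON object with a `created_at` timestamp string and an optional GeoJSON-style `coordinates` field in longitude, latitude order, which is how the classic tweet format is commonly described. The field names, the helper `to_spacetime`, and the sample tweet are assumptions for illustration and may not match the API a given project actually uses.

```python
import json
from datetime import datetime

# Timestamp layout assumed for the created_at field,
# e.g. "Mon May 20 20:05:00 +0000 2013".
CREATED_AT_FORMAT = "%a %b %d %H:%M:%S %z %Y"


def to_spacetime(raw_tweet):
    """Return (timestamp, lat, lon) for a geotagged tweet, else None."""
    tweet = json.loads(raw_tweet)
    geo = tweet.get("coordinates")
    if not geo:
        # Most tweets carry no location and cannot be mapped directly.
        return None
    lon, lat = geo["coordinates"]  # GeoJSON order: longitude first
    when = datetime.strptime(tweet["created_at"], CREATED_AT_FORMAT)
    return when, lat, lon


# Hypothetical tweet, invented for this example.
sample = json.dumps({
    "created_at": "Mon May 20 20:05:00 +0000 2013",
    "text": "Shelter open at the community center #okwx",
    "coordinates": {"type": "Point", "coordinates": [-97.49, 35.33]},
})

point = to_spacetime(sample)
print(point)
```

Each resulting tuple can then be filtered by hashtag, time window, or bounding box before it is plotted or counted.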

Crisis informatics is a research field that combines targeted information extraction and information management with coordination efforts and sensemaking processes. It emerged from collaboration among social media, emergency responders, and computer science. In the case of a disaster, such as a flood or tornado, crisis informatics provides knowledge about similar past disasters and response strategies. This clearinghouse of prior knowledge helps first responders predict and manage events as they unfold. Additionally, crisis informatics offers details about the extent of damage and number of fatalities, which enables more focused and efficient emergency medical responses. Mapping projects like GDELT and tools like Twitris enable real-time monitoring and multi-faceted analysis across space, time, populations, networks, emotions, and sentiment. Numbers and locations are important, but data may reveal more than numbers.

Along these lines, recent analysis of more than 2 million “disaster tweets” related to the May 2013 Oklahoma tornado presents an interesting case study. As Patrick Meier of iRevolution details in his blog post “Analyzing 2 Million Disaster Tweets for Oklahoma Tornado,” research conducted by Hemant Purohit and colleagues at the Qatar Computing Research Institute further blurs the lines between computer science, social science, and humanitarian work. Purohit and his colleagues found that 7% of tweets in the first 48 hours after the tornado were related to helping meet people’s immediate survival needs: donations of water, food, and clothing. Certainly, such findings reveal Twitter users’ humanitarian intentions, but they do not reveal whether the people in need actually received the supplies and services or how they fared beyond the initial 48 hours. How the individuals and communities affected by the tornado are doing today is a question for further ethnographic study that would complement the rich statistical analysis we have and dig deeper into the relationship between the Internet and society.

Learn more about Twitter analysis in a study just released by QCRI that looks at the confluence of crisis mapping, citizen sensing, and social media through the lens of citizens’ roles in coordinating crisis response.

How Facebook Can Prompt Real-Life Action

This is a guest post.

Given its role in the Arab Spring, many people have emphasized Facebook as an effective tool for online activism. There’s much debate on the ability of Facebook to effect social change, but a handful of campaigns by the social networking site over the past several years have demonstrated just how powerful social media might be in prompting real-life actions.

On May 1, 2012, Facebook rolled out a new feature allowing users to share their status as an organ donor on their timeline. A study published last month in the American Journal of Transplantation shows that the experiment coincided with an incredible spike in organ donor sign-ups in the United States—13,012 on the first day of the campaign, or 21 times the daily average number of registrations. The organ donor registration rate remained higher than average for nearly two weeks.

Although the registration rate tailed off 12 days later, it was still two times higher than the average baseline rate by the end of the study period. By the end of two weeks, the number of new registrations reached nearly 40,000. In an article in Slate, study author and Johns Hopkins associate professor Andrew Cameron said that “Having [organ donor registration] be on Facebook makes it easier for people.” The next step will be to find ways to sustain the gains in donor sign-ups. As Cameron noted, “we need to find a way to keep the conversation going.”
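The figures quoted above imply a baseline that the text does not state outright. The back-of-the-envelope sketch below recovers it, assuming that "two weeks" means 14 days and treating "nearly 40,000" as an upper bound of 40,000. Both are assumptions made for illustration, and the result is a rough estimate rather than a number from the study.

```python
# Figures as reported in the post.
first_day_signups = 13_012
multiple_of_baseline = 21

# Assumptions for this sketch: "nearly 40,000" rounded up, 14-day window.
two_week_total_upper = 40_000
days = 14

# Implied average daily registrations before the campaign.
baseline_per_day = first_day_signups / multiple_of_baseline

# Registrations expected over the same window at the baseline rate.
expected_at_baseline = baseline_per_day * days

# Rough upper estimate of sign-ups above what the baseline alone predicts.
extra_signups = two_week_total_upper - expected_at_baseline

print(f"Implied baseline per day: {baseline_per_day:.0f}")
print(f"Expected over {days} days at baseline: {expected_at_baseline:.0f}")
print(f"Sign-ups above baseline (upper estimate): {extra_signups:.0f}")
```

Under these assumptions the baseline works out to roughly 620 registrations a day, so about 8,700 of the two-week total would have been expected anyway, leaving roughly 31,000 that coincided with the campaign.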

A second study, published last fall in Nature, showed similar results with respect to voter turnout. The study authors worked with Facebook to randomly display to Facebook users either: 1) a message encouraging them to vote along with a link to polling places, an “I voted” button to click, up to six profile pictures of friends who had clicked the same button, and a total count of all friends who had reported voting; 2) the same message, without the photos or friend counter; or 3) no message. By examining voting records, the authors were able to determine that users who received the first message were 0.39 percent more likely to vote than users who received no message, an effect the authors say directly increased voter turnout by 60,000.

Early last year, Peter Leone, a professor of medicine at the University of North Carolina, began experimenting with Facebook as a tool for predicting and preventing STD transmission. While working with patients with HIV and syphilis, Leone concluded that people’s friend networks could reveal patterns in the spread of STDs. He reasoned that a person’s Facebook friends often have similar risk-taking patterns and are the best channel for spreading information about the risk of infection, and that using Facebook to prompt people to be tested and to share information about testing could help destigmatize the process.

While debates continue about the effect of social media on events in Egypt, Turkey, Syria, and elsewhere, these early studies show that in some cases, what users see online can influence their real-world behavior.

#IMweekly: July 1, 2013

North & South Korea
Hackers brought down several government and news websites in North and South Korea on June 25, the anniversary of the start of the Korean War. Online security company Symantec traced parts of this attack, as well as four years of cyberattacks on South Korea, to the DarkSeoul Gang. Symantec could not determine where the group is based, but a South Korean government investigation points to North Korea. It is unclear who is responsible for the attacks that hit North Korea on Tuesday, but the hacker group Anonymous had said via Twitter it would attack sites in that country, according to the New York Times.

Bahrain
A Bahraini court sentenced 17-year-old high school student Ali Faisal Alshofa to one year in prison after accusing him of posting a tweet that insulted the country’s king on the account @alkawarahnews. Alshofa denied affiliation with the Twitter account, which appeared to keep operating while he was detained and on trial. Over the past year, courts have sentenced twelve people in Bahrain to a total of 106 months in prison for information posted to social network sites.

Taiwan
Taiwanese netizens are protesting several amendments that could make it easier for the government to censor online content. A Copyright Act amendment would allow Taiwan’s IP office to review content reported as infringing copyright and order ISPs to block it. A National Security Law amendment would encourage people to report content they think harms national security. An amendment to the Telecommunications Act would also require ISPs to remove content that “disturbs public order and decent morals.” Bloggers compared these measures to the U.S. SOPA bill introduced in Congress in 2011 as well as the U.S. Department of Justice’s investigation into Aaron Swartz, surveillance of the Associated Press, and prosecution of Bradley Manning.

Ecuador
Several provisions in a communication law that Ecuador’s National Assembly passed in June worry journalists and others concerned with freedom of expression. One article appears to lump together every type of media organization (e.g., public, private, and community organizations that provide any type of mass communication that can be replicated online) under the same regulations. Broad interpretation could hold a tweet to the same standards as a radio broadcast. While the law prohibits censorship, it also tasks a Superintendent’s Office for Information and Communications with overseeing the media. Finally, the law holds third parties accountable for comments posted on their site unless site owners monitor comments or require users to identify themselves.

#IMweekly is a weekly round-up of news about Internet content controls and activity around the world.

Law Enforcement and Mining Social Media: Where’s the Oversight?

A Pennsylvania detective had a suspect’s nickname but no real name. He turned to Facebook, found a picture, and eventually apprehended the person. This anecdote from a Washington Post investigation into law enforcement use of facial recognition software illustrates how social media can be a boon for catching criminals.

As people share more about their thoughts and actions on social media and as algorithms grow more sophisticated, law enforcement’s ability to mine such information for clues into how to prevent crimes raises concerns of profiling and questions of oversight.

Law enforcement profiling predates social media. From 1956 to 1971, FBI counterintelligence program COINTELPRO tracked political organizations and their leaders, including Martin Luther King, Jr.

Recently, the ACLU’s “Mapping the FBI” project uncovered intelligence gathering that used racial and ethnic mapping. The project’s documentation includes a 2009 memo from the bureau’s Detroit office that called Michigan’s Middle East and Muslim community “prime territory for attempted radicalization and recruitment by” terrorist groups. The FBI’s reason: most State Department-labeled terrorist groups originate in the Middle East and South Asia.

In 2010, a 20-year-old Arab-American man in California found a tracking device on his car and learned the FBI had been surveilling him, a US citizen, for months, if not longer. Since 2011, the Associated Press has investigated the NYPD’s spying on Muslim communities, documenting what The Atlantic calls “horrifying effects” both on those surveilled, who have not been accused of any crimes, and on counterterrorism efforts as a whole: in six years, the program did not generate a single lead.

The NSA and British intelligence agency GCHQ collect raw Internet traffic that includes email, social media, and chats. US law enforcement agencies at all levels can obtain information from Internet and communication companies with court orders. But police don’t need permission to monitor what already flows freely on the web. Ars Technica reported on the London Metropolitan police’s extensive efforts to monitor social media:

For the past two years, a secretive unit in the Metropolitan Police has been developing the tools for blanket surveillance of the public’s social media conversations. Operating 24 hours a day, seven days a week, a staff of 17 officers in the National Domestic Extremism Unit (NDEU) has been scanning the public’s tweets, YouTube videos, Facebook profiles, and anything else UK citizens post in the public online sphere.

Several commercial tools exist to monitor social media streams, and companies actively market them to law enforcement. Police departments at the University of Maryland, Hampton University, and the city of Boca Raton, Florida use tools from the Virginia-based technology company ECM Universe to surveil social media users and analyze the text of their messages. A brochure touts that with the tool,

[A] city can monitor activist groups who are using social media to organize their efforts on the ground and receive alerts in a matter of minutes from the time of the postings when dangerous radical elements emerge from the crowd.

Such language underscores the need for oversight on how to use information gathered from social media. Participating in an activist group is not a criminal activity, and “dangerous radical elements” do not emerge at every activist meeting.

US law enforcement generally needs a reason and court permission to investigate someone. Predictive analytics involves mining data to look for undetected or unconsidered patterns. Agencies “don’t necessarily know what they need to monitor on Twitter,” software company SAS wrote in a paper detailing its tools, one of which maps people’s friends and followers on Facebook and Twitter. Users maintain hundreds of connections on these sites, including people they may not have contacted for years or people they don’t even know. To what extent will a person’s connections implicate them?

Predictive policing has helped police departments lower crime. But such efforts used previously reported and anonymized crime data; inclusion of social media data adds another dimension of concern. People don’t know what governments are doing with troves of social media data, and people can’t see the algorithms that police use to fight crime. As Evgeny Morozov wrote, “If no one can examine the algorithms…we won’t know what biases and discriminatory practices are built into them.”