Abstracts

Using the Social Web to Explore the Online Discourse and Memory of the Civil War

July 19, 2013, 10:30 | Short Paper, Embassy Regents F

Overview

Some humanities scholars argue that the social web “poses a grave threat to the humanities because it lacks the depth, nuance and permanence that make genuine, meaningful interactions about the human condition possible.” They fear that “we risk becoming a world without the comprehensive communication tools needed to keep the humanities alive” (Adamek 2010). Even those who embrace social media all too often dismiss it as a tool that is used primarily for community engagement and self-promotion (Terras 2012). Yet the social web — which we broadly define as the array of technologies that allow individuals to post their thoughts, pictures, and comments in a public forum — when coupled with recent advances in cloud computing, data management and statistical/visual analysis offers significant potential to explore new and enduring humanities questions.

Through careful, rigorous analysis, we believe that the social web affords opportunities for the humanities to realize a contemporary, nuanced understanding of how the public believes our past informs modern society. For example, although there is a strong consensus amongst historians, the broader American public remains conflicted, divided, and confused about the causes of the Civil War. This is evident in any number of recent public polls, such as an April 2011 CNN poll which found that 42% of Americans believed that cause was something other than slavery, and the April 18, 2011 issue of Time, whose cover said “Why We’re Still Fighting the Civil War: The Endless Battle over the War’s True Cause Would Make Lincoln Weep” (CNN 2011; von Drehle 2011). Where polls provide a snapshot of Americans’ views, analysis of online discourse and social media activity affords opportunities to explore modern attitudes towards issues surrounding the Civil War.

Hence, against the backdrop of the on-going sesquicentennial commemorations, this paper examines the intersection of humanities, social sciences, social media, and computing to probe enduring questions around the legacy of the Civil War. To do so, we will illustrate how an ongoing study that uses the Civil War serves as a testbed for examining and developing techniques to conduct traditional types of humanistic inquiry in the context of the social web. Our results demonstrate how careful analysis of the online discourse that occurs across the social web enables deeper understanding of how the Lost Cause Ideology continues to recast the origins of the war and minimize the role of slavery. Simultaneously, we explore how stereotypes of Southerners continue to be propagated and used to shape a memory of the Civil War.

To conduct this study, we demonstrate how contemporary tools developed by industry may be used to analyse the social discourse around the Lost Cause and stereotypes of Southerners. Most scholars researching the social web rely primarily on the use of single keyword search and user-specified hashtags to narrow the size of their datasets to only posts that are immediately relevant to the topic under discussion (for example, Graham 2012; Ross, et al. 2010). However, these methods can be inadequate or problematic for understanding widely diffused social phenomena. For example, when attempting to study a topic such as the Civil War or especially a concept such as “the South” this methodology isn’t adequate as the majority of users do not tag their casual online conversations with these types of metadata nor do they restrict their conversations to a single platform (e.g., Twitter, Facebook etc.). To elicit a holistic view of conversations on the social web, the researcher has to employ strategies that reach across social media platforms and go beyond simple hashtag sets. To overcome these limitations, we leverage software originally developed for business analytics to aggregate content from across the social web (Radian6, 2012) — our search includes not just Twitter and Facebook, but also content from sources such as mainstream news sites, blog posts, and forum posts.

Method

Through our analysis of the social web, we highlight issues and opportunities for scholars who work at the intersection of the humanities, social science, social media, and computing. One issue that must be overcome is that, on a global scale, there are many Souths: not only must we find a way of eliminating references to countries or continents such as South Africa, South America, and the South Pole, we must also filter out conversations related to places such as the South Side of Chicago and even the television show South Park, none of which would be relevant for this research. Similarly, while most Americans would refer to the American Civil War as simply “The Civil War,” there are many similarly named historic and ongoing conflicts that we must filter from our results. The solution is to create “topic profiles” — collections of words that fall into one of three categories: words or phrases that must be present in a post to be included in the results; phrases that must be present along with words from the first category to help categorize results into different topics; and, finally, phrases that, if present in a post, will result in that post being excluded from our results. So, when conducting research into online discussions of southern gender ideals, we create a topic profile similar to the following:

  • 1. Posts that CONTAIN any of the following: “the south”, “thesouth”, “southern”
  • 2. AND CONTAINS any of the following: “belle”, “lady”, “gentleman”, …
  • 3. But DOES NOT CONTAIN any of: “south pole”, “south park”, “south side”, “south africa”, …

By using keywords that were selected following a survey of users of the H-South discussion list, a request from leading scholars (including several former presidents of the Southern Historical Association, of the American South, surveys of a graduate and an undergraduate history class on Southern History at Clemson University, as well as a non- random sample of black and white southerners who were not academics, we have constructed topic profiles for five different themes that might be used to segment contemporary discussion of the South online: Southern Culture (which includes concepts such as honor, hospitality, and accents), Southern Food, Discussions of Gender, Religion, and Southern History. This set of keyword groups results in approximately 300,000 unique hits from the social web each month. Figure 1 illustrates the relative volume of each keyword group during September 2012 and shows how the number of hits varies for each topic across the month. Similar keyword groups have also been constructed for specific events that occurred during the Civil War, such as the attack on Fort Sumter, the Emancipation Proclamation, and the Battle of Gettysburg.

Figure 1:
Relative Volume of Five Topic Profiles for September 2012

Analysis

Drilling into this data reveals numerous insights into how users of the social web conceptualize the South and what topics are of interest. Figure 2(a), for example, presents a word cloud of the major terms associated with civil war commemorations, again for the month of September 2012. This word cloud immediately reveals several potential avenues for further investigation, including the centrality of Gettysburg to the online discourse surrounding the Civil War, but also suggests other useful topics to explore. Indeed, although word clouds have frequently been criticized for divorcing words from their context (Harris, 2011), we are able to maintain the connection between the words represented in a cloud and the underlying data. Therefore, if we are interested in further exploring the appearance of the word “history”, we can, as shown in Figure 2(b), drill further into the data to reveal new levels of insight to our topics. At all times, the original posts remain available both to view at an individual level to ensure the relevancy of content and to download for further textual and statistical analysis with dedicated software packages such as R and Gephi.

Figure 2:
(a) Word Cloud of Frequently Used Words in the Civil War Profile for September 2012; (b) The same word cloud, instead focused on the word “history”

We can also analyse demographic information, such as age, gender, and location, from the posts we have collected, allowing for a more nuanced understanding of our results. Figure 3 illustrates the location of users in the United States whose posts and tweets were collected from the Civil War topic profile over a seven-day period in mid- September 2012, while Figure 4 presents word clouds of the conversations that occurred in our Civil War topic profile on the 150th Anniversary of the Battle of Antietam by (a) people located in Maryland; (b) those in North Carolina; and (c) users aged between 36 and 45. These charts reveal not only the geographic distribution and volume of conversations, but also regional and demographic differences in the topic, tone, and scope of this online discourse.

Figure 3:
Geographic Distribution of Unique Hits from the Civil War Topic Profile in September 2012. A darker shade of orange indicates a greater number of posts from that state.

Figure 4:
Word Cloud of conversations that occurred on the 150th Anniversary of the Battle of Antietam by (a) people located in Maryland; (b) those in North Carolina; and (c) users aged between 36 and 45.

References

Adamek, D. (2010). Social Media Flaws and the Humanities. Valley Advocate, 13 December. http://www.valleyadvocate.com/blogs/home.cfm?aid=12863 (accessed 10 October 2012).
CNN Political Unit. (2011.) Civil War Still Divides Americans. CNN, 11 April. http://politicalticker.blogs.cnn.com/2011/04/12/civil-war-still-divides-americans/ (accessed 11 April 2011).
Graham, S. (2012). Mining the Open Web With Looted Heritage — Draft. Electric Archaeology, 8 June 2012. http://electricarchaeology.ca/2012/06/08/mining-the-open-web-with-looted-heritage-draft/ (accessed 15 October 2012).
Harris, J. (2011). Word Clouds Considered Harmful. Nieman Journalism Lab, 13 October. http://www.niemanlab.org/2011/10/word-clouds-considered-harmful/ (10 October 2012).
Radian6 (2012). http://www.radian6.com. accessed 10 October 2012.
Ross, C., M. Terras, C. Warwick, and A. Welsh. (2010). Pointless Babble or Enabled Backchannel: Conference Use of Twitter by Digital Humanists. in2010 Digital Humanities Conference. http://www.ucl.ac.uk/infostudies/claire-ross/Digitally_Enabled_Backchannel.pdf
Terras, M. (2012). The Impact of Social Media on the Dissemination of Research: Results of an Experiment. Journal of Digital Humanities, 1(3) http://journalofdigitalhumanities.org/1-3/the-impact-of-social-media-on-the-dissemination-of-research-by-melissa-terras/ (accessed 10 October 2012).
Von Drehle, D. (2011). 150 Years After Fort Sumter: Why We're Still Fighting the Civil War. Time, 18 April. http://www.time.com/time/magazine/article/0,9171,2063869,00.html (accessed 18 April 2011).