Data Driven Documentation of Digital Humanities Discourse

July 17, 2013, 15:30 | Centennial Room, Nebraska Union

This poster presents a work in progress investigating the use of social media in scholarly communication and the role such technologies play in the formation of scholarly communities. The digital humanities have emerged as a focal point for debates about the impact of information technology in the humanities. [1] While the digital humanities have their roots in the computational processing of text, [2] the landscape today is far richer and more complicated than early practitioners of humanities computing could ever have imagined (except perhaps Father Busa, whose grand visions have yet to be realized). [3] Today, the digital humanities encompass transformative methods of inquiry, radically new kinds of research objects, and potentially destabilizing shifts in scholarly publishing. However, beyond a metamorphosis of method, object, and account, the digital humanities leverage information and communications technology in unique ways to constitute themselves as a community-in-formation.

Social media, especially blogs, have been eagerly adopted by the digital humanities community. [4] Blogs are pregnant with promise and peril as platforms for serious (and silly) scholarly communication. They are quick to publish, support multimedia, and enable rapid interaction, yet the low barrier to entry and lack of peer review put blogs' credibility and quality in doubt. Outside of the digital humanities, blogs are not necessarily seen as modes of serious scholarly communication; instead, they are considered a place for gossip. [5] Such a totalizing perspective ignores the diverse uses and meanings of blogs for scholars in a variety of disciplines. [6] The value of scholars' blogs and the vibrant communities of discourse around them should not be understated or ignored.

The seriousness of blogs as a mode of scholarly communication is evident in the creation of initiatives such as Digital Humanities Now [7] and the Journal of Digital Humanities. [8] These projects treat blogs as legitimate forms of proto-scholarship and provide a filter function for the community, finding high-quality discourse within their curated selection of digital humanities blogs, the Compendium of Digital Humanities. As a model for scholarly publishing, the Journal of Digital Humanities presents a reversal of the traditional dynamics of scholarly discourse. Technology has flipped the flow of scholarly communication from one of scarcity to surfeit. It is impossible to keep up with the flood of blogs and tweets, yet scholars ignore this "cool kids table" at their peril. [9] In the face of such information overload, scholarly communities must change not only their means of knowledge production, but also their information-seeking behavior and the ways in which membership and identity are constituted.

This study presents a data-driven analysis of scholarly discourse focusing on the sociotechnical dynamics of blogs and their role in constituting the digital humanities as a community-in-formation. There have been a few data-driven approaches to understanding the digital humanities. Melissa Terras' beautiful infographic, Quantifying Digital Humanities, was an important first step towards surveying the community writ large. [10] Matthew Jockers and Elijah Meeks have both done some initial work applying topic modeling to digital humanities blogs. Jockers analyzed one year's worth of Day of DH blog posts, [11] and Meeks produced a model and visualizations of a variety of texts discussing the question "What are the digital humanities?" [12] This study builds on that initial work with a broader range of data and a deeper analysis of the results.

This poster presents initial findings and an innovative mixed-methods approach combining topic modeling, [13] a form of computational text mining, with grounded theory, [14] a method from interpretivist social science for developing analytical concepts. This mixture of methods enables both a "distant reading" of a vast textual corpus and a rigorous reading of individual texts to better analyze and articulate the content of scholarly discourse. Leveraging the Compendium of Digital Humanities, [15] a curated list of blogs produced by Digital Humanities Now, I archive, mine, visualize, and interpret these communications, paying special attention to the discursive work of community constitution.
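
To make the topic modeling step concrete, the following is a minimal sketch of Latent Dirichlet Allocation [13] in plain Python. It is illustrative only: the poster does not specify which implementation, inference method, or parameters were used, so the collapsed Gibbs sampler, the function names (tokenize, fit_lda), the hyperparameter values, and the three toy "posts" are all assumptions rather than the study's actual pipeline or data.

```python
import random

def tokenize(text):
    """Lowercase, strip surrounding punctuation, keep alphabetic words longer than 3 letters."""
    words = (w.strip(".,;:!?\"'()") for w in text.lower().split())
    return [w for w in words if w.isalpha() and len(w) > 3]

def fit_lda(docs, num_topics=2, alpha=0.1, beta=0.01, iterations=200, seed=0):
    """Fit LDA by collapsed Gibbs sampling (hypothetical helper, not the author's code).

    docs is a list of token lists. Returns (theta, top_words): theta[d][k] is the
    estimated share of topic k in document d; top_words[k] lists the five words
    most often assigned to topic k.
    """
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    vocab_size = len(vocab)

    doc_topic = [[0] * num_topics for _ in docs]            # topic counts per document
    topic_word = [[0] * vocab_size for _ in range(num_topics)]  # word counts per topic
    topic_total = [0] * num_topics                           # tokens assigned to each topic
    assignments = []                                         # current topic of every token

    # Start from a random topic assignment for every token.
    for d, doc in enumerate(docs):
        z_d = []
        for w in doc:
            k = rng.randrange(num_topics)
            z_d.append(k)
            doc_topic[d][k] += 1
            topic_word[k][word_id[w]] += 1
            topic_total[k] += 1
        assignments.append(z_d)

    # Repeatedly resample each token's topic given all other assignments.
    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                v = word_id[w]
                k = assignments[d][i]
                doc_topic[d][k] -= 1
                topic_word[k][v] -= 1
                topic_total[k] -= 1
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][v] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][v] += 1
                topic_total[k] += 1

    theta = [
        [(c + alpha) / (sum(row) + num_topics * alpha) for c in row]
        for row in doc_topic
    ]
    top_words = [
        [vocab[v] for v in sorted(range(vocab_size), key=lambda v: -topic_word[t][v])[:5]]
        for t in range(num_topics)
    ]
    return theta, top_words

# Toy stand-ins for archived blog posts (invented for illustration, not study data).
posts = [
    "Topic modeling and text mining reveal patterns across large corpora.",
    "Scholarly blogs support rapid publishing and community discussion.",
    "Mining blog archives with topic modeling maps community discourse.",
]
theta, top_words = fit_lda([tokenize(p) for p in posts])
for k, words in enumerate(top_words):
    print(f"topic {k}: {', '.join(words)}")
```

The per-document topic shares in theta give the "distant" view of the corpus, while the word lists in top_words are the kind of output that would then be read against the individual posts and coded interpretively.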

The contribution of this poster is twofold. First, it presents a data-driven landscape of scholarly communication. Using tools and techniques from information visualization, I represent a topic model of discourse on digital humanities blogs using a JavaScript visualization framework, Data-Driven Documents. [16] Second, it presents a rigorous methodological procedure for analyzing topic models rooted in an interpretivist qualitative analysis framework, grounded theory. Leveraging grounded theory, this study informs our understanding of scholarly communities-in-formation with an interpretive, grounded, empirical analysis of a computational model and its concomitant texts.


1. Gold, Matthew K., ed. (2012). Debates in the Digital Humanities. University of Minnesota Press.

2. Hockey, S. "The History of Humanities Computing." In A Companion to Digital Humanities. 2004.

3. Busa, R. "Foreword: Perspectives on the Digital Humanities." In A Companion to Digital Humanities. 2004.

4. Cohen, Dan. "Professors, Start Your Blogs." 2006. http://www.dancohen.org/2006/08/21/professors-start-your-blogs/

5. Saper, C. "Blogademia." Reconstruction 6, no. 4 (2006). http://www.citeulike.org/group/1736/article/1108357.

6. Hank, C. F. "Scholars and their Blogs: Characteristics, Preferences, and Perceptions Impacting Digital Preservation." University of North Carolina, 2011. http://ils.unc.edu/~wildem/ASIST2011/Hank-diss.pdf.

7. http://digitalhumanitiesnow.org

8. http://journalofdigitalhumanities.org/

9. Pannapacker, William. "Pannapacker at MLA: Digital Humanities Triumphant?" 2011. http://chronicle.com/blogs/brainstorm/pannapacker-at-mla-digital-humanities-triumphant/30915

10. Terras, Melissa. "Infographic: Quantifying Digital Humanities." 2012. http://melissaterras.blogspot.com/2012/01/infographic-quanitifying-digital.html

11. Jockers, Matthew. "Who's Your DH Blog Mate: Match-Making the Day of DH Bloggers with Topic Modeling." 2010. http://www.matthewjockers.net/2010/03/19/whos-your-dh-blog-mate-match-making-the-day-of-dh-bloggers-with-topic-modeling/

12. Meeks, Elijah. "Comprehending the Digital Humanities." 2011. https://dhs.stanford.edu/comprehending-the-digital-humanities/

13. Blei, D. M., A. Y. Ng, and M. I. Jordan. "Latent Dirichlet Allocation." The Journal of Machine Learning Research 3 (2003): 993–1022.

14. Glaser, Barney G., and Anselm Strauss. The Discovery of Grounded Theory: Strategies for Qualitative Research. Aldine Transaction, 1967. Charmaz, Kathy. Constructing Grounded Theory: A Practical Guide Through Qualitative Analysis. 1st ed. Sage Publications Ltd, 2006.

15. http://tinyurl.com/kaj2vpl

16. http://d3js.org/