I’ve been trying to keep this blog updated as I move through the PhD program at the UMD iSchool. Sometimes it’s difficult to share things here for fear that the content or ideas are just too rough around the edges. The big assumption, of course, is that anybody even finds it, and then finds the time to read it.

As with most PhD programs the work is leading up to the dissertation. I’m finishing my coursework this semester and so I have put together a prospectus for the research I’d like to do in my dissertation. I’m going to spend the next 8 months or so doing a lot of background reading and writing about it, in order to set up this research. I imagine this prospectus will get revised some more before I share it with my committee, and the trajectory itself will surely change as I work through it. But I thought I’d share the prospectus in this preliminary state to see if anyone has suggestions for things to read or angles to take.

Many thanks to my advisor Ricky Punzalan for his help getting me this far.


Appraisal Practices in Web Archives

It is difficult to imagine today’s scientific, cultural and political systems without the web and the underlying Internet. As the web has become a dominant form of global communications and publishing over the last 25 years we have witnessed the emergence of web archiving as an increasingly important activity. Web archiving is the practice of collecting content from the web for preservation, which is then made accessible at another part of the web known as a web archive. Developing record keeping practices for web content is extremely important for the production of history (Brügger, 2017) and for sustaining the networked public sphere (Benkler, 2006). However, even with widespread practice we still understand very little about the processes by which web content is being selected for an archive.

Part of the reason for this is that the web is an immensely large, decentralized and constantly changing information landscape. Despite efforts to archive the entire web (Kahle, 2007), the idea of a complete archive of the web remains both economically infeasible (Rosenthal, 2012), and theoretically impossible (Masanès, 2006). Features of the web’s Hypertext Transfer Protocol (HTTP), such as code-on-demand (Fielding, 2000), content caching (Fielding, Nottingham, & Reschke, 2014) and personalization (Barth, 2011), have transformed what was originally conceived of as a document oriented web into an information system that delivers information based on who you are, when you ask, and what software you use (Berners-Lee & Fischetti, 2000). The very notion of a singular artifact that can be archived, which has been under strain since the introduction of electronic records (Bearman, 1989), is now being pushed to its conceptual limit.

The web is also a site of constant breakdown (Bowker & Star, 2000) in the form of broken links, failed business models, unsustainable infrastructure, obsolescence and general neglect. Ceglowski (2011) has estimated that about a quarter of all links break every 7 years. Even within highly curated regions of the web, such as scholarly publishing (Sanderson, Phillips, & Van de Sompel, 2011) and jurisprudence (Zittrain, Albert, & Lessig, 2014), rates of link rot can be as high as 50%. Web archiving projects work in varying measures to stem this tide of loss, to save what is deemed worth saving before it becomes 404 Not Found. In many ways web archiving can be seen as a form of repair or maintenance work (Graham & Thrift, 2007; Jackson, 2014) that is conducted by archivists in collaboration with each other, as well as with tools and infrastructures that support their efforts.

Deciding what to keep and what gets to be labeled archival has long been a topic of discussion in archival science. Over the past two centuries archival researchers have developed a body of literature around the concept of appraisal, which is the practice of identifying and retaining records of enduring value. The rapid increase in the amount of records being generated, which began in the mid-20th century, led to the inevitable realization that it is impractical to attempt to preserve the complete documentary record. Appraisal decisions must be made, which necessarily shape the archive over time, and by extension our knowledge of the past (Bearman, 1989; Cook, 2011). It is in the particular contingencies of the historical moment that the archive is created, sustained and used (Booms, 1987; Harris, 2002). The desire for a technology that enables a complete archival record of the web, where everything is preserved and remembered in an archival panopticon, is an idea that has deep philosophical roots, and many social and political ramifications (Brothman, 2001; Mayer-Schönberger, 2011).

Notwithstanding these theoretical and practical complexities, the construction of web archives presents new design opportunities for archivists to work in collaboration with each other, as well as with the systems, services and bespoke software solutions used for performing the work. It is essential for these designs to be informed by a better understanding of the processes by which web content is selected for an archive. What are the approaches and theoretical underpinnings for appraisal in web archiving as a sociotechnical appraisal practice? To lay the foundation for answering this question I will be reviewing and integrating the research literature in three areas: Archives and Memory, Sociotechnical Systems (STS), and Praxiography.

Clearly, a firm grounding in the literature of appraisal practices in archives is an important dimension to this research project. Understanding the various appraisal techniques that have been articulated and deployed will help in assessing how these techniques are being translated to the appraisal of web content (Maemura, Becker, & Milligan, 2016). Particular attention will be paid to emerging practices for the appraisal of electronic records and web content. Because the web is a significantly different medium than archives have traditionally dealt with, it is important to situate archival appraisal within the larger context of social or collective memory practices (Jacobsen, Punzalan, & Hedstrom, 2013). In addition, the emerging practice of participatory archiving will be examined to gain insight into how the web is changing the gatekeeping role of the archivist.

Appraisal practices for web content necessarily involve the use of computer technology as both the means by which archival processing is performed and the source of the content that is being archived. Any analysis of appraisal practices must account for the ways in which the archivist and the technology of the web work together as part of a sociotechnical system. While the specific technical implementations of web archiving systems are of interest, the subject of archival appraisal requires that these systems be studied for their social and cultural effects. The interdisciplinary field of software studies provides a theoretical and methodological approach for analyzing computer technologies as assemblages of software, hardware, standards and social practices. Examining the literature of software studies as it relates to archival appraisal will also selectively include reading in the related areas of infrastructure, platform and algorithm studies.

Finally, since archival appraisal is at its core a practice, it is imperative to theoretically ground an analysis of appraisal using the literature of practice theory or praxiography. Praxiography is a broad interdisciplinary field of research that draws upon branches of anthropology, sociology, history of science and philosophy in order to understand practice as a sociomaterial phenomenon. Ethnographic attention to topics such as rules, strategies, outcomes, training, mentorship, artifacts, work and history also provides an approach to empirical study that I plan on using in my research.

Timeline

2017-11-01 - Prospectus Draft

2017-12-01 - Prospectus Final Draft

2017-12-15 - Committee Review

2018-01-15 - Committee Approval Meeting

2018-09-01 - Proposal Final Draft

2018-10-01 - Proposal Defense

Reading

Archives and Memory

Anderson, K. D. (2011). Appraisal Learning Networks: How University Archivists Learn to Appraise Through Social Interaction. Los Angeles: University of California, Los Angeles.

Bond, L., Craps, S., and Vermeulen, P. (2017). Memory unbound: tracing the dynamics of memory studies. New York: Berghahn.

Bowker, G. C. (2005). Memory practices in the sciences. Cambridge: MIT Press.

Caswell, M. (2014). Archiving the Unspeakable: Silence, Memory, and the Photographic Record in Cambodia. Madison, WI: University of Wisconsin Press.

Daston, L., editor (2017). Science in the archives: pasts, presents, futures. Chicago: University of Chicago Press.

Gilliland, A. J., McKemmish, S., and Lau, A., editors (2016). Research in the Archival Multiverse. Melbourne: Monash University Press.

Halbwachs, M. (1992). On collective memory. Chicago: University of Chicago Press.

Hoskins, A., editor (2018). Digital memory studies: Media pasts in transition. London: Routledge.

Kosnik, A. D. (2016). Rogue Archives: Digital Cultural Memory and Media Fandom. Cambridge: MIT Press.

Van Dijck, J. (2007). Mediated memories in the digital age. Palo Alto: Stanford University Press.

Sociotechnical Theory

Berry, D. (2011). The philosophy of software: Code and mediation in the digital age. New York: Palgrave Macmillan.

Bowker, G. C. and Star, S. L. (2000). Sorting things out: Classification and its consequences. Cambridge: MIT Press.

Brunton, F. (2013). Spam: A shadow history of the Internet. Cambridge: MIT Press.

Chun, W. H. K. (2016). Updating to Remain the Same: Habitual New Media. Cambridge: MIT Press.

Cubitt, S. (2016). Finite Media: Environmental Implications of Digital Technologies. Durham: Duke University Press.

Hu, T. (2015). A Prehistory of the Cloud. Cambridge: MIT Press.

Dourish, P. (2017). The Stuff of Bits: An Essay on the Materialities of Information. Cambridge: MIT Press.

Edwards, P. N. (2010). A vast machine: Computer models, climate data, and the politics of global warming. Cambridge: MIT Press.

Emerson, L. (2014). Reading writing interfaces: From the digital to the bookbound. Minneapolis: University of Minnesota Press.

Kittler, F. A. (1999). Gramophone, film, typewriter. Palo Alto: Stanford University Press.

Galloway, A. R. (2004). Protocol: How control exists after decentralization. Cambridge: MIT Press.

Kelty, C. M. (2008). Two bits: The cultural significance of free software. Durham: Duke University Press.

Kitchin, R. and Dodge, M. (2011). Code/Space: Software and Everyday Life. Cambridge: MIT Press.

Rossiter, N. (2016). Software, Infrastructure, Labor: A Media Theory of Logistical Nightmares. Oxford: Routledge.

Russell, A. L. (2014). Open standards and the digital age. Cambridge: Cambridge University Press.

Practice Theory

Bourdieu, P. (1977). Outline of a Theory of Practice. Cambridge: Cambridge University Press.

Bräuchler, B. and Postill, J. (2010). Theorising media and practice. Bristol: Berghahn Books.

Foucault, M. (2012). Discipline & punish: The birth of the prison. New York: Vintage.

Latour, B. (2005). Reassembling the social: An introduction to actor-network-theory. Oxford: Oxford University Press.

Law, J. (2002). Aircraft stories: Decentering the object in technoscience. Durham: Duke University Press.

Schatzki, T. R., Cetina, K. K., and von Savigny, E. (2001). The practice turn in contemporary theory. Oxford: Routledge.

Wenger, E. (1998). Communities of Practice: Learning, meaning, and identity. Cambridge: Cambridge University Press.


References

Barth, A. (2011). HTTP state management mechanism (No. 6265). Internet Engineering Task Force. Retrieved from https://tools.ietf.org/html/rfc6265

Bearman, D. (1989). Archival methods. Archives and Museum Informatics, 3(1). Retrieved from http://www.archimuse.com/publishing/archival_methods/

Benkler, Y. (2006). The wealth of networks: How social production transforms markets and freedom. Yale University Press.

Berners-Lee, T., & Fischetti, M. (2000). Weaving the web: The original design and ultimate destiny of the world wide web by its inventor. San Francisco: Harper.

Booms, H. (1987). Society and the formation of a documentary heritage: Issues in the appraisal of archival sources. Archivaria, 24(3), 69–107. Retrieved from http://journals.sfu.ca/archivar/index.php/archivaria/article/view/11415/12357

Bowker, G. C., & Star, S. L. (2000). Sorting things out: Classification and its consequences. MIT Press.

Brothman, B. (2001). The past that archives keep: Memory, history, and the preservation of archival records. Archivaria, 51, 48–80.

Brügger, N. (2017). The web as history. (N. Brügger & R. Schroeder, Eds.). UCL Press. Retrieved from http://discovery.ucl.ac.uk/1542998/

Ceglowski, M. (2011, May). Remembrance of links past. Retrieved from https://blog.pinboard.in/2011/05/remembrance_of_links_past/

Cook, T. (2011). We are what we keep; we keep what we are: Archival appraisal past, present and future. Journal of the Society of Archivists, 32(2), 173–189.

Fielding, R. (2000). Representational state transfer (PhD thesis). University of California at Irvine.

Fielding, R., Nottingham, M., & Reschke, J. (2014). Hypertext transfer protocol (http/1.1): Caching (No. 7234). Internet Engineering Task Force. Retrieved from https://tools.ietf.org/html/rfc7234

Graham, S., & Thrift, N. (2007). Out of order: Understanding repair and maintenance. Theory, Culture & Society, 24(3), 1–25.

Harris, V. (2002). The archival sliver: Power, memory, and archives in South Africa. Archival Science, 2(1-2), 63–86.

Jackson, S. J. (2014). Rethinking repair. In T. Gillespie, P. Boczkowski, & K. Foot (Eds.), Media technologies: Essays on communication, materiality and society. MIT Press. Retrieved from http://sjackson.infosci.cornell.edu/RethinkingRepairPROOFS(reduced)Aug2013.pdf

Jacobsen, T., Punzalan, R. L., & Hedstrom, M. L. (2013). Invoking “collective memory”: Mapping the emergence of a concept in archival science. Archival Science, 13(2-3), 217–251.

Kahle, B. (2007). Universal access to all knowledge. The American Archivist, 70(1), 23–31.

Maemura, E., Becker, C., & Milligan, I. (2016). Understanding computational web archives research methods using research objects. In IEEE big data: Computational archival science. IEEE.

Masanès, J. (2006). Web archiving methods and approaches: A comparative study. Library Trends, 54(1), 72–90.

Mayer-Schönberger, V. (2011). Delete: The virtue of forgetting in the digital age. Princeton University Press.

Rosenthal, D. (2012, May). Let’s just keep everything forever in the cloud. Retrieved from http://blog.dshr.org/2012/05/lets-just-keep-everything-forever-in.html

Sanderson, R., Phillips, M., & Van de Sompel, H. (2011). Analyzing the persistence of referenced web resources with Memento. Open Repositories 2011 Conference. Retrieved from http://arxiv.org/abs/1105.3459

Zittrain, J., Albert, K., & Lessig, L. (2014). Perma: Scoping and addressing the problem of link and reference rot in legal citations. Legal Information Management, 14(02), 88–99.