The following is from a conference paper I presented on open access in the humanities.
Good afternoon. Let’s jump right into this and talk about open access. We will define two terms:
Open source means sharing and making all the source files of a project available online. This includes not just code but also images, video, audio, notes on process, and a tracked history of file changes.
Open Access means permitting anyone with an internet connection and a link to view, copy, and/or change those files for their own use.
Now the two terms are often used interchangeably. But I want to focus today on open access because it implies accessibility. Today I am going to share my experience as a public historian doing open access work online, tell you all why humanities scholars are bad at open access, critique the utopic idealism of open access and its proposed global accessibility, and problematize the need for humanities scholars to start “doing” open access (and yes, let’s just say open access is an action phrase that you can “do”).
I originally subtitled this project “My Digital Graveyard” because I wanted to talk to you about the “failure” or at least “dead endedness” of many of my digital projects. I was going to use these projects as a case study. For instance, I have several versions of my master’s research project that for one reason or another were abandoned for different ideas and ways of presenting history online. But my history of failing at open access begins much earlier. In 2011, my supervisor Shawn Graham enlisted me to scrape all of Prime Minister Mackenzie King’s diary entries on the Library and Archives site into text files to perform a “distant reading”: looking at the patterns and structure across the texts. We were going to plug these files into a text analysis tool. When I began to look at the files, I discovered they had all been digitized as images, so we could not extract the text directly. This being 2011, we attempted to run the images through free, online OCR (Optical Character Recognition) generators, which convert images of text into computer-readable text files. The OCR we used was horrible and the images were not exactly clean. The result was a huge mess of text. We abandoned the project.
I am not going to go into any more depth on my graveyard. As with any research, from the time I applied for this conference to when I wrote this presentation, my process, and thus my ideas about open access, changed. Instead, I want to begin with a comment from when I was a young digital humanities student in my undergrad. I had the privilege of being a scholar and blogging about my research in digital history. I received a great comment on my post about the diary failure:
Josh responded to my post with a mantra I remind myself of at least several times a month: there is signal in the noise. On a metaphorical level, so much of open access is noise. But we can draw the signal from the cacophony of noise presented to us online.
Historians are bad at Open Access
I want to get into open access by stating that historians, and humanities scholars generally, are bad at open access. We are usually not too keen to share our work in its development phases. I think this mainly stems from a desire to produce original, individual work. Then, on the practical side, historians are not well trained in basic file naming conventions, data structures, and organization, which are all essential for reproducible work. This is important because open access does not necessarily require much technical skill. You can create a blog and publish your material online.
None of this presentation is groundbreaking or new. In fact a lot of this is intuitive. And maybe I will just state some obvious points about open access throughout the presentation. But my goal is to push our critical thinking as humanities scholars from academia to the online world. Historians need to see themselves as entangled in online communities. Because it is easy to assume that “the digital” is some distant, separate world. It is easy for scholars not interacting online to hear the noise of “the digital” and lazily agree with ideas like “open access proliferates knowledge.” But when you really interact with this stuff, you see how far from the truth those repeated scripts can be, and how close to home these issues actually are. I will argue that “the digital” reflects a lot of our academic practices. Let’s pull some signal from this noise.
Open access: Coulda, Shoulda
I will now put forward two statements.
1. Open access cannot be entirely accessible.
Theoretically I can make my work entirely open source by providing all the information and data that I desire. But we run into several issues (without even confronting copyright; we can discuss this in the question period if you would like). Open access is about reproducibility. Both digital literacy and access to computers and networks affect a project’s accessibility. How can we say the internet is an open bastion of information when global access is limited by infrastructure and socio-economic positioning? Moreover, even if everyone had the potential to see my work, I doubt many people would actually want to. Nor would they see the value in it. Who, then, is my work actually open for? Primarily other historians and researchers. And there is nothing wrong with that fact. We must, however, recognize that any project claiming to be ‘open’ is limited by these complexities. Just because historical work is made available online does not make it automatically popular and transparent. In this regard, open access reproduces the structures of academia: essentially, now you get to allow infinitely more people the opportunity to not read your academic work! It is both ignorant and arrogant to assume open access can be entirely open.
2. Open access should not be entirely accessible.
This one might be a bit more obvious. From a practical standpoint, you are not allowed to make certain information available online. If you interview people, you likely have to transcribe and destroy the recordings. You choose what can be said. As historians, we all understand this. So the main point I want to make, then, actually inherits an important method from “traditional” historical work – that is, we should not make everything available. Obviously, then, I do not believe in radical open access. And this is where I believe humanities scholars can do amazing work in open access, because we understand how to meticulously fashion a narrative without simply stating everything we know. We understand the art of storytelling. As such, we can begin to apply our knowledge towards an art of open access.
What this art entails, then, is mostly making our methods available and accessible. In other words, we want to allow others to view our workflow. For my master’s research, I used GitHub – an online repository for code – to host the version control for my source code. This essentially means that every time I made changes, those changes were logged for everyone to see. I also kept an open research notebook online connected to my project website. I was able to push notes stored locally on my computer to the web using a static website generator written in the Python programming language.
I also wrote my research essay live online through an online service. Anyone could read my work from initial thoughts to its completion. Furthermore, unless a source is available only in print or is under copyright, I can link out to my sources, allowing easier access to them. We thus allow others to hold us accountable to our claims, since our sources, method, and workflow can be tracked and reproduced. I opened myself and my work up to criticism and transparency online.
I have presented a very brief synopsis of the structure of open access in my master’s research. But I am a white male academic working on the web. Now, let’s problematize that perspective by discussing identity in open access.
Transparency Online: Working as a white male academic
In order to understand how we can as historians approach open access, we must trace its history. Open access has its roots in STEM fields (science, technology, engineering, and math). Regarding identity, in my research, I confronted open access’s positivist trajectory of making as much material accessible online as possible. I used a lot of newspaper databases for my source material. Many services like ProQuest follow a particular philosophy of digitization carried over from STEM fields – that the text of a newspaper is just information and can be scraped, run through Optical Character Recognition (OCR) software, and output as text. From a historian’s perspective, however, the newspaper is a graphical artefact and its meaning changes on a computer screen when it is output as simple text or presented in a new context with new styles. Thus we must be aware of how much open access influences our narrative.
To that end, our identity obviously affects how we access material. The closed system I confronted – relying on the university’s newspaper database, and searching microfilm at the Library and Archives in Ottawa – is not the open access we dream of. Open access should be about easy access to information. I had access to resources through the school library. I used databases inaccessible to anyone outside of my institution. Reproducible work should be easy to access, and historians must remember that we work within institutions with particular affordances.
But what if I were working with material available to anyone with an internet connection? Again, open access has a history that values the work and opinions of white, technically skilled men. We must therefore also be aware that our identity and experiences affect how we are perceived online. I can write live online and feel comfortable and safe with what I am doing, so how could I praise open access when the process marginalizes certain people, or at least forces them to hide their identity?
A great question was raised in one of the online communities that I contribute to. Adam Crymble asked in the Programming Historian GitHub issues section how they can make their site more friendly for women to contribute to. This alone is an interesting use of GitHub’s issues section. Users and developers typically use this section of a repository to submit bugs, errors, and suggestions for a project. But the Programming Historian uses it here to discuss social issues surrounding their site.
One of the site’s editors, Miriam Posner, responded that “GitHub is an unfamiliar, opaque platform to many people, and women have well-supported reasons for declining to participate in unfamiliar digital platforms.” My identity affords me the opportunity to contribute to the Programming Historian via GitHub without really considering potential repercussions. Posner continued, “I don’t actually think that we should abandon GitHub; I think it’s a great way to streamline our editorial processes and it’s been working well. It’s just that, realistically, we should understand that a lot of women are not going to be super-stoked about signing up for another online platform, participation in which may or may not expose them to all of the myriad offensive behaviors women encounter online.” Posner raises a great point here. Gender is not just an issue for the Programming Historian but for open access throughout online communities like GitHub. It is obvious that online bigotry and the potential for harassment prevent certain people from contributing to open access projects. Several users commented that women were under-represented on the issue page itself because it was publicly visible. Crymble created an anonymous survey for anyone to contribute responses.
Several solutions resulted from this issue. One editor, Ian Milligan, suggested that editors work with people in the pre-submission stage through email, thus retaining contributors’ privacy. Other users created a mentorship gathering which, although it met in England, is an attempt to make safe digital spaces. Yet again, as several users noted, the contributions to that GitHub issue were mostly male. The editors and contributors of the Programming Historian have at least started a dialogue and offered potential solutions to these issues.
We must be critical of open access because it is so easy for proponents of it to accept it without criticism. It is easy for a white male on the internet to form ideas that the web is an open utopia for the free exchange of knowledge, data, and information. When doing open access, we must abandon any imagination of an entirely free and open internet, made better simply by making everything accessible. I am not arguing that this is an ideal we should not strive for. Rather, we should approach open access with an awareness of the context in which we do historical work. And this is not necessarily a new approach in historical theory. Historians are already taught to examine their research process through their contingency: our situations and identity influence the history we write; how we access sources changes the stories we tell.
Open access is an art. We cannot assume that one’s identity is dissolved in the open embrace of the online world. As such, historians can bring their knowledge of the uneasy history entangled in open access to understand it. It’s easy for me to stand here as a white male academic working with a white male tenured professor and lecture historians on being transparent in their work and opening their lives up to the online world. But that does not mean we should not or cannot be transparent about our work. As the Programming Historian discussion shows, open platforms are the best for doing collaborative projects, and we can do things to mitigate the risks of transparency. Through these networks, we can still be open and transparent without making everything about ourselves accessible. We can thus choose how we present ourselves online.
And this is where I want to end because I honestly do not have solutions to these problems. The issues are largely systemic and my intuition is that we need to create safe online communities while also critically engaging with these very uncomfortable realities surrounding open access. These solutions will have to be collaborative, but also involve personal responsibility in ethical online interactions.