By Jacob Siefring
In a class or elsewhere in the vast field of library and information science, you might have come upon a reference to digital humanities. Digital humanities is an academic discipline concerned with the application of computational tools and methods to traditional objects of scholarship (e.g. texts, artifacts, art works, and historical data). The field is vast, interdisciplinary, and highly collaborative.
Earlier this week, I paid a visit to Stéfan Sinclair, who is Associate Professor of Digital Humanities in McGill’s Department of Languages, Literatures, and Cultures. Since he received his Ph.D. in French literature, Professor Sinclair has worked on numerous projects designing digital humanities text visualization tools, often in collaboration with other scholars. He was most generous and open in responding to my questions as we sat in his windowed office overlooking the intersection of rue Sherbrooke and rue University.
I noticed that it’s kind of common in the digital humanities to start giving a talk by stating how you got into it, because it’s kind of a hybrid field and everybody takes a kind of circuitous route or ends up there in a different way. I know you have a background in French literature. But were you drawn towards programming and computers prior to your work in the humanities?
Yes. What I can recall is that, during the early part of my undergraduate studies, I liked fiddling with computers and doing some programming. At that point, I was especially playing with Visual Basic. I was home-brewing beer, and I had built an application that would allow me to manage time, and various settings, and fermentation and so on. At that time, it was really about building things for my own interest. I had done my first couple of years of undergraduate taking courses from all over the place, everything actually except French literature. Then in my third year I decided that I would do French literature as something that seemed more specific, leveraging my background as a francophone, Franco-Albertan, and so on. I loved literature, but not enough that my first instinct was to know that I was going to go into undergraduate studies and take literature. I was more interested in philosophy and religious studies and all sorts of other stuff. There happened to be a course—I can’t remember if I was in my third or fourth year—on computer applications in French at UBC. That’s sort of the moment where it clicked. This very unusual, very rare course that was in the French department made me realize that I could combine these two interests that I had probably never thought of as merging at any point. There’s something very empowering and magical about getting a computer to do something that you build it to do. That I think was a big motivation for any of the programming that I was doing. I’d almost put programming in quotes, because it wouldn’t be programming by computer science standards, it would just be sort of hacking, scripting—except that it wasn’t really scripts, it was Visual Basic. Anyway, this course made a connection for me between literature and computing. From there I started thinking about and building tools where my work in literature would be supplemented, augmented by some of the tools that I wanted to build. And it sort of kept growing from there. 
Over time, the balance has shifted. Initially, I was doing a bit of programming and computers to complement the literary side. Over time, it’s sort of gone the other way, where I’m mostly doing programming and building stuff and relatively little literature, though I hope to do more. So I don’t know if that really answers your question…
I think that’s very interesting, because a lot of the time those narratives aren’t the ones that get relayed, and yet they’re so important in determining how we end up where we do.
It seems so natural now that this is the direction that I took, and yet I recognize that it’s really an enormous coincidence. I didn’t go to UBC because this prof, Bill Winder, was there. I didn’t know he was there. Part of the experience has also been not only that very serendipitous first experience that made the connection that I probably wouldn’t have made if this course hadn’t been there, but also a willingness to kind of ignore what… if people had given me academic advice at that point it would have been to not do computing stuff because that was just too weird, and too fringe, too marginal. But it’s what I was interested in. So it’s hard to say what lesson to pull out of this except that I happened to have been extremely lucky because I did sort of do what interested me and it ended up working out.
Can I ask what your dissertation work was on?
Sure. There’s a group of primarily French mathematicians and writers called the Oulipo and they’re interested in formal constraints in literature and the idea that for all forms of literature, be it a sonnet, a play or theatre, whatever, what determines that it’s that genre is a set of rules. And so by formulating new rules they think they open new forms of literature. It looks like a constraining act but in some ways it’s an opening-up act. I became interested in whether or not that formalized aspect might be a good foothold for computational methods, and so I became interested in the Oulipo and Georges Perec in particular. Georges Perec wrote a three-hundred-page novel in French without the letter e. So my dissertation was primarily on La Disparition, on this novel; that was half of it, and the other half was using a text-analysis tool that I built called HyperPo that was meant to help me in examining some of the things that I wanted to examine about the text. So it was a very hybrid project, but again, I was fortunate to have support for it where I was so that it worked out.
Do you still follow contemporary French literature?
Umm… that’s a good question. I will say with some degree of shame and regret that not really. Except for the fact that I tend to still read contemporary French literature for pleasure.
Well yeah, that counts, that counts!
But I don’t turn my attention to analysing it and doing literary criticism, which is a big part of my intellectual upbringing. It’s just more for the interest of reading.
Wouldn’t reading be less fun if you had to analyse what you read?
Yeah, I guess in some ways it would be. I actually do find it a lot of fun to analyse texts as well, so it’s a very different kind of experience. But I don’t think they’re mutually exclusive. With La Disparition, I thoroughly loved reading it as a novel, as literature, and I thoroughly enjoyed analysing it as a piece of digital text.
I want to ask about Voyant Tools, which you developed with Geoffrey Rockwell. You’re probably able to see from the site stats what kind of use that tool’s getting, and maybe even where it’s being adopted, where geographic use is coming from. What does that look like?
Yes. I would say it’s gotten, for an academic project, moderate traction. It gets in the thousands of hits per week. The way it’s structured is that there’s a landing page where you can do some things, and it’s also a Web application, so that doesn’t really count the multiple pages that people might be consulting within the site. What it does count, though, which is misleading and also interesting in other ways, is that Voyant Tools has this mechanism where you can embed any of the tools into a blog or Web page, just as you would embed a YouTube clip on a blog or site. An example of this is a German site on Narratology that happens to have embedded Voyant. I know that a good part of the traffic comes from there. My own blog and various other sites that have embedded the tool contribute as well. Part of the traffic is people going to Voyant to use it as a tool, and that tends to spike whenever someone’s doing a workshop or something. That does happen increasingly, which I think is another way of measuring and understanding the traction of the project. Beyond that, it gets regular steady use but not an enormous amount of use, not yet.
As I mentioned to you in a Twitter conversation, I found it very helpful for a project I was doing which was in a class called “Knowledge Taxonomies.” The assignment was to redesign the taxonomy and the navigation tool for a website. So we selected one, 3QuarksDaily. It’s a large aggregator blog with probably over twenty posts per day. So we started with the word frequencies as a way to see what some of the hottest and most frequent topics were.
I don’t know how we would have proceeded without that, but that was definitely the best way to start our content audit.
So what were some of the limitations and frustrations?
We applied the Taporware words, and then we manually went through the frequency list, and then clustered the terms… what’s that process called, topic modelling?
Right, so you did topic modelling separately?
Topic modelling, if I’m not mistaken, would refer to an automated process. We were doing it manually in a group by consensus. So that consisted of accounting for stemming changes as well as ruling out insubstantial terms, like colors. That wasn’t a real challenge, it was just a bit of work that we had to do as part of our process. I think we didn’t go that far down into the word frequencies either. We were fairly satisfied that what we were seeing was comprehensive in terms of the topics at least.
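For readers curious what that manual pass might look like in code, here is a minimal Python sketch, not the procedure we actually followed and not how Voyant Tools works internally. The stop-word set, the variant groupings, and the sample sentence are all hypothetical stand-ins for decisions a group would make by hand.

```python
import re
from collections import Counter

# Hypothetical stop-word list: function words plus "insubstantial" terms
# such as colours. A real content audit would use a much fuller list.
STOP_WORDS = {"the", "a", "of", "and", "in", "is", "to", "red", "blue"}

# Hypothetical hand-made groupings that fold word variants into one term,
# standing in for the manual "stemming" decisions described above.
VARIANTS = {
    "poems": "poem",
    "poetry": "poem",
    "sciences": "science",
    "scientific": "science",
}


def term_frequencies(text):
    """Count terms after dropping stop words and merging known variants."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter()
    for word in words:
        if word in STOP_WORDS:
            continue
        counts[VARIANTS.get(word, word)] += 1
    return counts


sample = "The science of poetry and the poetry of science: poems in a blue notebook."
print(term_frequencies(sample).most_common(3))
```

The grouping into broader topics would still be a matter of judgment and consensus; the code only automates the counting that precedes it.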
Now, a skeptic’s question. Recently, Matt Jockers, who is a leader in the digital humanities in my opinion, tweeted the following message with the hashtag “overly honest methods”: “We deconstructed the text because we didn’t have any good ideas.” I wonder about this. We start playing around with texts with digital humanities tools, maybe without a particular question in mind, and it does end up yielding questions. But I wonder if having that really powerful tool kit at our disposal doesn’t in some cases impair our ability to form strong conceptual questions to begin with. Do you think that is ever the case?
I definitely think it can be. To say that it impairs us is also to say that we’re not willing to sit down at some point and work hard at trying to come up with the concepts and intuitions separately. Partly what it’s saying is, the tool and a certain number of methodologies are there, and it seems simpler just to start banging away at that than to really think about the text or whatever you happen to be looking at. And I don’t know how common that is. In a lot of cases you’re either working with text that you know already, and so you’ve probably gone through the process of thinking about it. What the tool does and can do very effectively then is to stimulate new ways of looking at and new representations of the text that you wouldn’t have thought of. It opens up new channels. In other cases, where you haven’t read the texts, and Matt will admit to this for some of the texts, it’s also a way of including and dealing with those texts that maybe you don’t have the time and the inclination to read. It’s an impairment in some ways except that maybe realistically you never would have sat down to read those texts and so it’s better than nothing. You know, at least you’re including them in some sense. It’s actually strangely, vaguely reminiscent of what Georges Perec says about constraints when he’s starting to write. He said that nothing scares him more than the blank page. In other words, if he sits down and has to start writing something, he finds that terrifying. What the constraint allows him to do, when he sits down and he’s working on an Oulipian constraint like writing a novel without the letter e, that gives him a structure in which things can happen. It sort of removes that paralysis of the blank page. In some ways I wonder if in some cases with text analysis it can’t be similar. Sometimes you’re looking at a text and it can be very intimidating and paralyzing to say, now what am I going to do with this thing? 
In some ways working with text analysis methodologies allows for a breaking up of things and then you go back and you read it. It’s the reverse of how things are sometimes done. But maybe you start doing text analysis and then go back and start reading chunks of text more closely in a way that’s already been informed with some first observations that you’ve made or some intuitions that you’ve had. It’s a way into the text that you wouldn’t have had otherwise. I wouldn’t exaggerate that too much. I present that as one scenario. Another scenario is that you’re too lazy or too unmotivated to read the text and so you start deconstructing it, as Matt says, with the methodologies. In my experience the challenge has been that the more you practice some methodologies the more that can become a habit and a rut, and in some cases the challenge is finding variants of a methodology or new methodologies that would be more appropriate or more fruitful for a given text or corpus.
Yes, there are two really good metaphors that have come to me through you. In your interview with Adam Bluestein that appeared on Fast Company, you said something along the lines of, “When you’re holding a hammer everything starts to look like a nail.” The other is from Ted Underwood. When he was at McGill for a day of digital humanities presentations that you organised, he said, “You have to know when to get out of the jeep.” And he was talking about the limits of formal analysis and the toolkit. So those have stayed with me. I like that.
So do I.
What do you predict for the digital humanities in the next decade?
Hmm… I predict I’m a bad predictor of predictions. [pause] I think most people would agree that digital humanities has been enjoying momentum and administrative support and a willingness by some people to find out what it is, and so on. Some people think that has plateaued. I don’t think it actually has, but it will. I think what’s going to happen between now and then, though, is that some aspects of working with digital texts will just become more normalised, in the same way that most researchers will start googling and start looking in a library database and working with digital text. Maybe they don’t admit to that. I know that there are colleagues who would find quotations that are of interest to them in digital form, and then they’ll go look it up in the print version and cite the print version. What I would hope is that people don’t feel like they need to do that anymore, and that there’s a more gradual, iterative movement towards additional exploitation of the strengths of the digital text. Not doing things like topic modeling necessarily, but being able to search in a PDF for those words that you know are there, but would take forever to find in a print edition. That doesn’t seem like very interesting text analysis, but it can be a fundamental part of a research process. And it is significant in that maybe something that you feel like you wanted to say, and then you say, argh, it would take me forever to find that quotation that I was interested in, so I’m just going to drop it; what the digital text allows you to do is to go and look and to see if it would actually be that useful. So I’m actually not convinced that digital scholarship, the form it takes, will change all that much. But the process that leads there will change. That’s sort of a behind-the-curtains kind of thing, so in some ways it may not seem like things change that much, but I think they really will. And especially the ability to try things out, to experiment.
Malcolm McCullough, for example, writes about how the big breakthrough of spreadsheet programs for businesses in the late seventies was not so much in how quickly things could be computed. By the time you enter them it’s not that different from a calculator. But the speed at which you could try things—you know, what if I change this number, what happens to the worksheet number 18 in the cell 4A for example—is increased, and you can try things out very quickly, you can experiment. It’s an endless palimpsest that you can experiment with very quickly. And I think that maybe one of the most significant contributions from digital texts is the ability to try things out very quickly, that there’s a low cost to some paths that you might want to explore. I think that enriches scholarship in general.
I was recently reading a post that came out of the recent MLA digital humanities sessions about setting up a digital humanities lab, where there’s departments, or people from different departments, who say, let’s do this, it’s a great idea, can we do it? In digital humanities, McGill has you and Andrew Piper, and others. To what extent do you see it being practised, either now or in the future, among graduate students at McGill? I don’t want to ask whether or not it’s in McGill’s future to have what might be called a digital humanities ‘lab,’ because it’s so much of an ad hoc thing sometimes, and it’s not a physical space necessarily.
In this very exact case I can say yes, I really do believe it’s in McGill’s future to have a digital humanities lab, institute, centre, whatever. And that’s for various reasons. But you’re hitting on something more significant which is, how does that interact with existing programs, and how does that affect graduate and undergraduate teaching and that sort of thing, especially when you have at best a handful of people and it’s very difficult to build a program out of that. I think that the momentum in a university and an institution is a difficult thing to really predict, because some things happen very quickly that surprise you, and other things that you think would be quick are very slow. So in the meantime, I think the focus is to ensure that we’re starting to build a set of courses that have a strong digital humanities component, even if we don’t call them that. That will build interest and a need, a desire for more. By filling my graduate and undergraduate courses, that sort of sends an indication that we could do more of these, and they would be well attended, that people are interested. When the English department, the grad society organises a panel or an event on digital humanities and lots of people show up, that’s sort of an indication that there’s an interest for it. There may be an interest in ways that warrant additional examination, you know, maybe there’s a curiosity, a sense of, what is this thing?, it’s not that I necessarily want to do it, I just want to know what it is; some students go because of that. So I think it’s a combination of things, where there are pure or primarily digital humanities courses that are taught, and I think there will be more and more of those, but also where more and more aspects of digital humanities manifest themselves in existing courses. I think that will happen. I think that some of our colleagues here are genuinely interested and would like to incorporate some of the methodologies.
Truth be told, academics tend to be pretty busy and if it’s a prospect of learning a bunch of new technical skills, it may not happen. It’s not necessarily that it’s going to be primarily those people who teach it, but there’s a slow trickle-down effect where more courses mean that there are more students who have come through the system who have taught themselves as I did or who have courses that they take that help with the training. There’s the digital humanities summer institute in Victoria, there’s something similar in Maryland now, and there are things being planned in Europe. Some of those students may go on to academic jobs and those students will incorporate that digital methodology into their teaching. So we recognize that things don’t change as quickly in academia as they do in society. There’s a greater prevalence of digital aspects in society—the prevalence of social media—than you see in typical humanities courses. But there is a constant catching-up process that happens.
My thanks go to Professor Sinclair for taking the time to participate in this interview and for reviewing the interview draft.