Edgar Allan Poe Meets High Performance Computing

On a sticky summer night, two men walk down a dark alley with murder on their minds.

A gruesome crime has just been committed: two women have been killed in their home. There’s a lot of evidence--blood and hair everywhere--and eyewitnesses come out of the cracks to point fingers and spin tales. Even the laundry lady wants her fifteen minutes. The women paid her well, she tells the reporter. They lived alone and loved each other. So surely one didn’t slit the other’s throat and shove her into the chimney.

The police, hardworking but unimaginative, are baffled. The first cop on the scene is blinded by xenophobia. He heard somebody say something, but he can’t say if it was in English or Spanish or Italian or what; it’s all noise to him. There’s so much evidence it’s sending the cops into a tailspin. (And, truthfully, they’re a little nauseous already just looking at the scene. They’re beat cops, not trained for this kind of wet work.) The eyewitnesses contradict each other and sometimes themselves.

But those two men walking down the alley? They have the answers. Or one of them does, but they come as a pair. Their deep bond is made stronger by being total opposites.

One is outgoing, supportive, kind. A little goofy, even.

The other has a dark past. He’s haunted by some kind of family secret; you can see it in his eyes. They never talk about it; it just walks between them in that alley, drawing them closer together even as it pushes them farther apart.

He’s cold, anti-social. Brilliant. His mind is always working, observing even the smallest things.

He knows that when you have excluded the impossible, whatever remains, however improbable, must be the truth.

A woman’s dead body was shoved into a chimney. He's going to find a way to...smoke out the killers.

He’s the hero we deserve.

And he has a thousand faces.

The Shape of the Whole Thing

Holmes and Watson. Batman and Robin. Horatio Caine and sunglasses of justice.

Before them all was Edgar Allan Poe’s Auguste Dupin and narrator, says Ted Underwood, Professor of English and Liberal Arts and Sciences Centennial Scholar.

“I’m studying detective fiction right now, and the origin story [for detective fiction] turns out to be simpler than I thought. Boom — 'The Murders in the Rue Morgue,' 1841 — Edgar Allan Poe sets a pattern that really does endure pretty much to the present day,” Underwood says.

Poe’s Dupin is a moody superhero whose power is acute observation. He can assemble details, like whisper-thin loose hairs and hidden nails and what his friend ate for breakfast yesterday, into a larger pattern that shows him the truth of a crime even while others can only scratch their heads.

“Often, if a person looks at something very closely he can see a few things more clearly, but the shape of the whole thing escapes him,” Dupin declares to his sidekick, the unnamed narrator.

But nothing, not a scrap of fiber or dribble of blood, gets past Dupin’s calculating gaze. He’s a forensics lab and lie detector test and psychological screening all in one broody package.

He’s incredibly nineteenth century. He’s also incredibly modern.

The first of Poe’s tales of Dupin’s detective wizardry was published in 1841. That means Dupin beat a certain infamous pipe-smoking British sociopath to print by at least forty years.

But even if Sherlock won the popularity contest, becoming synonymous with the detective story genre itself, he did so by following the pattern Dupin started.

They all follow that pattern pretty closely, actually. This fascinates Underwood.

“I’m interested in the really big patterns we see when we back up to look at literary history over the course of several centuries. How are poetry and fiction, say, different from nonfiction prose? Many of our assumptions about literary genres emerged only in the last two centuries,” Underwood says.

For example, we’ve all had a high school English teacher who told us that writing should “show, not tell.” But where did this idea come from? And how long has it been the standard of good writing?

“We can see that assumption emerging very, very gradually across the nineteenth century,” Underwood explains.

Some ideas take a while to develop. Others, like the detective story, emerge very quickly, and stick in our cultural consciousness to be replayed over and over again in shows like CSI: Miami, Doctor Who, and Sherlock.

And in Underwood’s own work. Much like Dupin, Underwood is interested in collecting little bits of evidence and assembling them into a pattern that can tell him a larger story about the world.

Instead of blood drops, though, Underwood follows words.

And to find and see, really see, important words scattered across hundreds of stories in thousands of newspaper clippings, books, articles, plays, musicals, television shows, movies, and marginalia produced over literally hundreds of years, he needs a lot more than a magnifying glass.

He needs a supercomputer.

Story Time

High-performance computing (HPC) helps Underwood discover patterns that endure over large swaths of time. These patterns allow him to see relationships between things that might have seemed unrelated before.

Underwood is a big proponent of using technology to better understand old stories and to make new ones. This is in keeping with how humanist scholars have used technology over time.

Scholars have always written essays. They used to write them by hand, then they used typewriters, and now they use laptops. As in all fields, technology is a tool that makes scholarship easier to do.

Given the new tools available to them, scholars can now use technology to ask new questions.

“For instance, literary history would have been hard to model quantitatively in 1950 because you need to be talking about thousands of variables; that’s really a place where it would be impossible to ask certain questions without computers,” Underwood explains.

To track patterns, Underwood does a lot of his own programming. He uses Python, R, and Java. But he also borrows from other developers. When he needs machine learning algorithms, for instance, he uses scikit-learn, a Python library.

Crucially, he takes advantage of free HPC resources on campus.

“I’m working with several terabytes of data; I can’t do this all on my desktop. I-CHASS has helped me obtain an account on the campus cluster, which has really been vital to making everything I do possible,” Underwood says.

I-CHASS is the Institute for Computing in Humanities, Arts, and Social Sciences. I-CHASS connects scholars in the humanities and social sciences with HPC resources (including knowledgeable staff) to create and execute projects.

HPC might bring to mind sterile rooms and hulking machinery. And we might be used to thinking about the kind of work happening here as the search for a cancer vaccine. This is not an inaccurate depiction of HPC.

But what matters more than the particulars of any one HPC project is the general kind of work that HPC allows. HPC is, after all, about scale.

In an interview conducted last year with Technology Services, Dr. Alan B. Craig, former I-CHASS director and XSEDE liaison to Humanities and Social Sciences scholars, noted how the scale of HPC can change not just the scope of work but how humanists in particular formulate the very questions that motivate their work.

Underwood can go big with HPC. With grants from the American Council of Learned Societies (ACLS) and the National Endowment for the Humanities (NEH), Underwood released a dataset that mapped poetry and fiction at the page level in 850,000 (!) English-language volumes.

“This seems like a very simple thing, but in fact we didn’t actually know where literary texts were in the library, and you can’t start to write large-scale literary history until you’ve located a substantial sample of texts. My map isn’t perfect (obviously I didn’t read 850,000 books, so it’s based on algorithmic predictions). But it’s a start,” he explains.
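For readers curious what “algorithmic predictions” at the page level might look like, here is a minimal sketch in Python. It is not Underwood’s actual method or code. It swaps in a toy naive Bayes classifier built only on the standard library, and the class name, function names, and six training “pages” are all invented for illustration. The idea is simply that a program can learn which words tend to appear in hand-labeled pages of each genre, then guess the genre of a page nobody has labeled.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase a page and split it into alphabetic words."""
    cleaned = "".join(ch.lower() if ch.isalpha() else " " for ch in text)
    return cleaned.split()


class PageGenreClassifier:
    """Toy multinomial naive Bayes over word counts, with add-one smoothing.

    Hypothetical illustration only; not the model behind the real dataset.
    """

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # genre -> word -> count
        self.page_counts = Counter()             # genre -> training pages seen
        self.vocab = set()                       # every word seen in training

    def train(self, labeled_pages):
        """Count words per genre from (page_text, genre) pairs."""
        for text, genre in labeled_pages:
            words = tokenize(text)
            self.word_counts[genre].update(words)
            self.page_counts[genre] += 1
            self.vocab.update(words)

    def predict(self, text):
        """Return the genre with the highest log-probability for this page."""
        total_pages = sum(self.page_counts.values())
        words = tokenize(text)
        best_genre, best_score = None, -math.inf
        for genre in self.page_counts:
            counts = self.word_counts[genre]
            denom = sum(counts.values()) + len(self.vocab)
            # Start from the share of training pages in this genre.
            score = math.log(self.page_counts[genre] / total_pages)
            # Add-one smoothing keeps unseen words from zeroing out a genre.
            for word in words:
                score += math.log((counts[word] + 1) / denom)
            if score > best_score:
                best_genre, best_score = genre, score
        return best_genre


# Invented, hand-labeled training pages (assumption: a real project would
# use thousands of labeled pages, not six one-line examples).
TRAINING_PAGES = [
    ("O rose thou art sick the invisible worm that flies in the night", "poetry"),
    ("Shall I compare thee to a summer day thou art more lovely", "poetry"),
    ("She closed the door and said to him that the carriage was waiting", "fiction"),
    ("He said nothing as she walked across the room to the window", "fiction"),
    ("The committee reports that the annual revenue of the province increased", "nonfiction"),
    ("The following table shows the annual population of the province", "nonfiction"),
]

classifier = PageGenreClassifier()
classifier.train(TRAINING_PAGES)
print(classifier.predict("thou art a rose of the night"))
```

A model this small is easily fooled, which is part of the point: predictions like these are a starting map rather than a verdict, and scaling them to hundreds of thousands of volumes is where a campus cluster earns its keep.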

Using the scale provided by HPC, Underwood can tell bigger stories and, because he can get necessary data much more quickly, he can tell more of them. But he’s not interested in using computers to rescue a vague abstraction like “the humanities.”

There is an international conversation happening right now about the place of the humanities in higher education. Wartburg College, a small liberal arts college in northeast Iowa, recently announced it was cutting faculty lines in distinct humanities fields like American Literature and Philosophy.

Common stories circulate about the lack of employment prospects for humanities majors (despite evidence to the contrary).

But Underwood adopts the Dupin approach. He gathers evidence, looks carefully at it, and creates a larger story about the pattern he sees developing.

That pattern is that “the humanities” is only one way to think about writing. It’s only one story we’re telling about stories. And it might be the wrong one to tell.

“I'm going to be blunt about the crisis of the humanities. Here's how bad it is: I'm an English professor, and not even I care about ‘the humanities.’ I think that phrase describes a particular system for organizing subjects inside universities, and I'm not sure it matters enormously how we organize them,” Underwood says.

Instead, stories are what matter, and they don’t seem to be going anywhere soon.

“What I do care about are history, story-telling, discussion of stories, television, and poetry. And I doubt that any of those things are in crisis at all. I think they're doing fine.”