Hermeneuti.ca: The Rhetoric of Text Analysis

Introduction: Correcting Method

How do you think something through with technology?

Descartes in his 1637 Discourse on Method describes a moment of solitude that allowed him to talk to himself about his thoughts and to develop a method for thinking correctly. Here is how he describes the solitude he needed:

… as I was returning to the army from the coronation of the emperor, I was halted by the onset of winter in quarters where, having no diverting company and fortunately also no cares or emotional turmoil to trouble me, I spent the whole day shut up in a small room heated by a stove, in which I could converse with my own thoughts at leisure. Among the first of these was the realization that things made up of different elements and produced by the hands of several master craftsmen are often less perfect than those on which only one person has worked. [1]

The Discourse is important to the practices of the humanities because it introduced an accessible method for anyone to do philosophy without needing to be widely read or part of an intellectual community. His story of method, and its accompanying provisional moral code for behaviour without certainty, is one of the fables that founded modern philosophy and in particular the solitary, doubting and reflective practice that still dominates how we think we should think things through.

But practices are changing, and older forms of communal inquiry are being remixed into modern research. We have come to recognize how intellectual work is participatory even when it includes moments of solitary meditation. Internet conferencing tools allow us to remediate dialogical practices, collaborative communities like Wikipedia depend on contributions by a large group of editors, and the communal research cultures of the arts collective or engineering lab are infecting the humanities. Accessible computing, the amount of data available, and the opportunities of new media have provoked textual disciplines to think again about practices and methods as we try to build digital libraries, process millions of digital books, and imagine research cyberinfrastructure that can support the next generation of scholars.

What is new is that we are imagining projects in the humanities that are big and need a variety of skills for implementation, skills rarely found in one solitary scholar/programmer let alone a Cartesian humanist. Thus we find ourselves working in teams, reflecting on how to best organize the teams, and then reflecting on what it means to reason through with others. This reflection in project teams inevitably turns to method correcting for the new media as we try to balance our traditional Cartesian values with the opportunities of open and communal work.

Hermeneuti.ca, like Descartes’ “histoire”, is a story about the turn to method, this time methods of interpretation. Hermeneuti.ca both tells a story of return to dialogical practices that predate Descartes and introduces computer-assisted methods that are just becoming hermeneutically interesting with the digitization of the human record. Specifically, Hermeneuti.ca returns to method in four ways:

  1. First, Hermeneuti.ca is a hybrid project, both printed book and online reflection. The online Hermeneuti.ca mirrors the convenience of a book by weaving text with interactive components, so as to show one of our conclusions about the opportunities for online interpretation.[2]
  2. Hermeneuti.ca is both a text about computer-assisted methods and an instantiation of tools called Voyeur Tools that implement our interpretation of method. The code is an interpretation of method presented in a particular way online so that you can try it with its companion text, manual and documentation.[3]
  3. Hermeneuti.ca presents three case studies - each with an example essay (Essay) that shows computer-assisted text analysis in application. The example is paired with a reflective chapter (Reflection) on analytics that uses the Essay as an example. The examples, like “Now Analyze That”, are essays that interpret texts using the hermeneutical tool things or "hermeneutica" (the plural of hermeneuticon or interpretative thing). [4] Accompanying these Essays are demonstration Recipes from our Methods Commons that show you how to use computing methods with Voyeur (or similar tools). The Recipes are tutorials on how to do interpretative things with common tools.[5]
  4. Just as Hermeneuti.ca is both book/site, texts/tools, so you will find that our Essays are both text and code: both narrative text and embedded interactive panels. The panels are part of the text quoting results, but they are also interactive so you can recapitulate and experiment with our results. They let you return to a computer-assisted method in the context of an essay. Here, for example, is an interactive panel. (Web frameworks like the TAPoR Portal organize information into panels, sometimes called portlets or coplets. These can be minimized, maximized and closed using the three buttons in the upper left-hand corner of the panel. With Voyant you can export panels of results and place them into other web sites.) The panel shows the Voyeur Collocate Clusters of this Introduction. (Collocate Clusters allows users to visualize words that are interconnected by proximity and high frequency.) Such panels are a difference made possible when publishing online. Try it!

Collocate Cluster of the Introduction[6]
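
For readers without the interactive panel to hand, the idea behind collocate clusters (words linked by proximity and high frequency) can be sketched in a few lines of Python. This is only an illustrative sketch under our own assumptions, not the Voyeur implementation: the function names, the window size, the stopword list and the sample text are all hypothetical.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def collocates(tokens, keyword, window=2):
    """Count the words found within `window` tokens of each occurrence of `keyword`."""
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok != keyword:
            continue
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                counts[tokens[j]] += 1
    return counts


def collocate_clusters(text, top_n=3, window=2, stopwords=frozenset()):
    """Map each of the `top_n` most frequent non-stopwords to its collocate counts."""
    tokens = [t for t in tokenize(text) if t not in stopwords]
    frequent = [w for w, _ in Counter(tokens).most_common(top_n)]
    return {w: collocates(tokens, w, window) for w in frequent}


# Hypothetical sample text and stopword list, for illustration only.
sample = ("We think through method. Descartes thinks alone about method. "
          "We think through tools and method.")
STOP = frozenset({"we", "about", "and"})
clusters = collocate_clusters(sample, top_n=1, window=2, stopwords=STOP)
for word, neighbours in clusters.items():
    print(word, neighbours.most_common(3))
```

A visualization like the panel would then draw each frequent word as a node linked to its strongest collocates; the sketch stops at the counts, which are the evidence such a display is built on.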

In short, Hermeneuti.ca is a weaving together of hermeneutical things whether print and electronic, text and code, essays and reflections, or narrative and interaction; all of which are a thinking through of interpretative method through computing. Hermeneuti.ca tries to correct for the Cartesian solitudes of text and method by showing how code is an instantiation of interpretative method and that it can be woven closely into other hermeneutical things like text.

The Story of Method

To confront the privilege of solitary reflection in academic practice we would do well to pay attention to how Descartes introduces his story of method. Descartes calls his Discourse a personal history or fable (“histoire”) which “you can imitate” or not. It illustrates his practices for readers to imitate, which in turn leads into his method and provisional moral code. While the formal method gets attention, the story of the provisional practices that he adopted to get there typically does not. That story, with all its baggage, gets passed over on the road to method, but it is in those provisional practices that those of us without certainty are stuck. It is a story about doubting oneself and others in order to rid oneself of possible influence. It is a story of isolating oneself from traditions of scholastic disputation in order to think alone, think afresh, think thoroughly, free oneself of error (especially from the errors of others) by assuming nothing, and think about thinking. This story of thinking thoroughly has four aspects that interest us:

  1. First, Descartes doubts all authority in order to think afresh about thinking, rejecting all opportunities for thinking along with others. He turned to solitary work apart from others as the best way to think about thinking, a practice still common in the humanities. We still tend to think that to really get research work done we need time away. Hermeneuti.ca proposes an alternative and more participatory practice of dialogical criticism. You don’t have to do it alone, especially if you haven’t all the skills.
  2. Second, and paradoxically, Descartes writes his thinking through as an interior dialogue. Rejecting dialogue with others gave him “the leisure to talk to myself about my thoughts”. Avoiding conversation with others freed him to discover himself as a conversational partner, which is his way of reflecting on thinking, or at least reflecting on his own thinking, and this reflection generates personal certainty. [7] The Discourse traces the trajectory of the personal; Descartes tells the story of his education in Part 1, his development of a personal method in Part 2, and even why he decided to publish the Discourse in Part 6. The autobiographical narrative made method accessible to others as a personal path, which accounts for the popularity and influence of the work. Descartes let readers voyeuristically listen in on him doing philosophy, which helped them imagine how they could be philosophers if they took the time to think alone about their thoughts. In Hermeneuti.ca we likewise make our practices part of this story, but they are practices of working together, a different type of dialogue and a different type of practice. This story is more about what is needed when one wants to tightly couple interpretation with the development of software tools of interpretation. Ours is a story of collaborative reflection and development across the divide of writing code and text. Readers will find reflections on pair work later in the concluding dialogue on Agile Interpretation.
  3. Third, Descartes thinks through by reflecting on thinking. His method is a thinking that takes itself, thought, as its first companion and subject for interpretation. What is being interpreted is first of all the thinking of reflection. Likewise Hermeneuti.ca is about thinking, but thinking through interpretation rather than reflection, and a thinking through with things of interpretation whether text or tool. One way we do this is by being open and documenting our experiments in the very writing of these documents. Hermeneuti.ca is an open and self-documented work: you can recapitulate our analysis, examine our code, recover earlier versions and recover blog entries.
  4. Fourth, Descartes provides case studies that show the results of employing his method. In Descartes’s Discourse these are provided as appendixes. We too have provided case studies, but in Hermeneuti.ca the case studies are not pushed to the end of the story. The Essays are hermeneutical things that interpret using analytical methods and they are interpreted in the Reflections.

It is not surprising that the Cartesian train of doubting, solitary and reflective thought leads to the Cogito – “I think, therefore I am” – from which, with personal certainty, Descartes methodically rebuilds his ideas. The irony of the Discourse is that if his practice appeals to you, then you should suspect its results as the authority of another, namely Descartes, and start all over by interrogating your thinking practices. The rhetorical power of the Cogito is that Descartes bets you will end up right where he did, all the more convinced since you followed (even if only remotely as a reader) his "correct conduct" of reason, not his conclusions. [8] But that's the point - the Discourse is first presented as a guide towards method which you can imitate and reuse, not an authority to consider true - and that's going to be our point: you really shouldn't imitate our practice without thinking it through too, which is why we have provided you the tools to try it yourself. The image of the solitary Cartesian philosopher has influenced how we think about intellectual work; perhaps it is time to correct our methods and reanimate other images of practice. In Hermeneuti.ca we have stitched together hermeneutical things so that you can try tools-as-methods as you read about them. Our story is one of thinking through together, as we hope yours will be.

Agile Interpretation

Knowing Geoffrey was leaving for Alberta, we decided to try some experiments in text analysis together while we still had access to quarters, in this case a lab away from our offices. Rather than be diverted by our personal projects we thought we would direct our conversations to the intersection of methods, tools and interpretation by taking a small project through from conception to writing in one day - a day set aside away from the distractions of other work. We spent the day closed up together in a lab overheated by all the computers, where we had the leisure to talk while thinking through tools and experimenting with texts. Among the many reflections of the day we noticed how few questions you can ask alone using one tool and how much more richness there was to interpretation in dialogue that weaves evidence together from several tools by different masters, as needed by the questions at hand.

Computer-assisted research in the humanities, by contrast to the Cartesian story and traditional humanities practices, has almost always been collaborative and not solitary. This is due to the variety of skills needed to implement digital humanities projects, and because of the relationship between the practices of interpretation and the development of the tools of interpretation, be they text analysis tools or digital editions. This difference, while acknowledged in various ways, has been a professional hindrance, as anyone who submits a CV for promotion with nothing but co-authored papers knows.[9] More importantly collaboration is not always good. Collaboration separates the interpreter/scholar from the implementer of the scholarly methods (programmer). Willard McCarty notes that the introduction of "software separated the conception of the problems (domain of the scholar) from the computational means of working them out (bailiwick of the programmer) and so came at a significant cost."[10] As computing is introduced into research it separates conception, implementation, and interpretation in ways that can only be overcome through dialogue and collaboration across very different fields. Typically humanities scholars know little about programming and software engineering, and programmers know little about humanities scholarship; going it alone is an option only for the few with the time to master both.[11]

There are obviously all sorts of ways people can collaborate, but for the purpose of correcting method we propose that collaboration is the normal practice of humanities computing and should therefore be imagined as part of any discussion of method. Solitary time, while much desired in the bustle of academic life, is here conceived of as a withdrawal from a background of working together in various structured and unstructured ways. Even Descartes starts with the collaboration of authority as the norm from which he has to retreat to correct his thinking. The very desire for solitary time for reflection proves our point; what is normal is collaborating with students in teaching or meeting with colleagues in committees. Thinking alone is the dream of the humanities not the ground from which to develop our method.

Collaboration, however, can take many forms. Working on Hermeneuti.ca we modelled our collaborative practice loosely on a programming methodology called Extreme Programming (XP) which includes practices of Pair Programming and which belongs in the wider category of Agile Programming, for which reason we call our practice Agile Interpretation (AI).[12] What is “extreme” about such methods is how extremely different they are from what we expect of best practices in coding. Traditional programming wisdom emphasized the need for careful analysis and specification before coding, while XP recommends rapid iterations of coding and reflection to achieve immediate goals without worrying much about the long term or big picture. You don't analyze the situation and then fully specify the final product before coding. You scratch an itch in a short iteration, look at it, and then start adding stuff as needed (not as anticipated). Often that means throwing out your code and starting all over when adding functionality leads to redesigning basic structures. The traditional wisdom held that rewriting your code was a sign of failure; XP makes it part of the process. Likewise in AI we started small, trying to take an interpretation through from conceiving the problem to writing a short essay in one day. (We failed to finish in a day!) This grew to larger iterations around each one of the case study Essays. Each iteration forced us to rewrite the code and to rethink Hermeneuti.ca.

XP also recommends working in pairs right down to the typing of code. You don't meet and then go off to code alone, you code in pairs alternating typing and guiding - one person in the pair coding at any moment to force discussion with the other. Likewise, where traditional practices in the humanities are solitary or forced compromised collaborations, AI is purposefully collaborative – at its heart is pair-work where one person performs the work of interpretation (or using the text analysis tools) while the other looks ahead or reflects on what is needed (actually, both members engage in both activities, but each member has a dominant role to play). The idea is to maximize the dialogue between the scholar function and development function to the point where they are woven into an organic whole.

Where the humanities aim to be theoretically grounded - you are supposed to have it all theorized beforehand (just as traditional programmers should have complete specifications) - AI is pragmatic, starting with small experiments and generating hermeneutical theories as the things of interpretation, like texts and tools. Where the humanities avoid formal methods in favour of loose and largely unexamined practices, AI makes methods (and the instantiation of methods in tools) an issue to be discussed throughout the experiment. It’s hard to avoid talking about what you are doing when only one person has the keyboard and everything has to be negotiated. Try it; it isn’t the waste of time you think it is. Above all, where the Cartesian practices involve reflection and talking with yourself, AI is about talking with another with complementary skills and summarizing those conversations in different ways.

The particular practice we followed involved redeveloping tools as we wanted to pose new questions and then continually testing the tools in the context of concrete experiments. We were fortunate that we actually could hack our tools as we needed. There was no divide between literary scholar and programmer; we were both of us capable of both. It undoubtedly helped our project that we both had some familiarity with literary criticism and programming, but we don’t consider this a pre-condition of Agile Interpretation; AI can happen between two literary scholars or even two programmers.

In summary, Hermeneuti.ca is the record, outcome and essay of our three AI experiments:

1. The first Experiment starts with an Essay, “Now Analyze That”, comparing the important pre-election speeches on race by Barack Obama and his spiritual father Jeremiah A. Wright. This essay uses text analysis for the comparison of shorter cultural texts available online and it is accompanied by a reflective chapter entitled “There's a Toy in my Essay!” which looks at how the results of text analysis have been woven into essays. A Recipe on “Exploring Themes Across a Text” shows you how you can try text analysis with our tools. [13]

2. The second Experiment is about studying a large collection of texts over time. The essay, “Humanist: The Sparrow Flies Swiftly Through”, looks at diachronic patterns in the archives of the Humanist discussion group which have documented the digital humanities community. “The Epidemiology of Ideas” is a reflective chapter that looks at how text analysis can be applied to such archives to track ideas through time in a community. A Recipe shows you how to Explore a Diachronic Collection.

3. The third Experiment on a community of text starts with an essay, “What's In A Day of Digital Humanities?” which analyzes the combined blogs of a social research experiment, The Day in the Life of Digital Humanities, where about 90 people blogged what they do. [14] It is accompanied by a reflective chapter “Animating the Knowledge Radio” that looks at real-time text analysis and animated visualization. A final Recipe helps you Visualize a Collection of Blog Entries.

These three sections, each with their Essay, Reflection and Recipe, are framed by historical and theoretical chapters on computer-assisted text analysis. “From the Concordance to Ubiquitous Analytics” surveys the development of analytical tools in the textual disciplines in order to provide a context for Voyeur Tools. “Theorizing Analytics” returns to issues about interpretation and the place of computer-assisted text analysis. We conclude with a Dialogue on Agile Interpretation, the method followed in Hermeneuti.ca.

A concordance is a gathering of passages that "concord" or agree, usually passages containing a sought-for word. Concordances are a form of reading tool that goes back to the Middle Ages. They are typically lists of words with their appearances. A concordance for the bible, for example, would have entries for all the content words of the bible in alphabetical order. Each entry would include information about where the word appears and some context. Searching for words on a computer now typically returns a concordance called a Key Word in Context (KWIC) with the sought word down the center and a few words of context on either side. Google returns a type of concordance when you search for a word, with an example of the word in context for each page it recommends. See the Wikipedia entry on Concordance (Publishing).
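
The Key Word in Context (KWIC) display mentioned here, with the sought word down the center and a few words of context on either side, is simple enough to sketch in Python. What follows is a minimal illustration under our own assumptions, not the code of Voyeur or any other concording tool: the function names, the context width and the sample passage are hypothetical.

```python
import re


def kwic(text, keyword, width=3):
    """Return (left context, keyword, right context) for each occurrence of `keyword`."""
    tokens = re.findall(r"\w+", text)
    rows = []
    for i, tok in enumerate(tokens):
        if tok.lower() == keyword.lower():
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            rows.append((left, tok, right))
    return rows


def format_kwic(rows):
    """Right-align the left context so the keyword lines up in a central column."""
    if not rows:
        return ""
    pad = max(len(left) for left, _, _ in rows)
    return "\n".join(f"{left:>{pad}}  {kw}  {right}" for left, kw, right in rows)


# Hypothetical sample passage, for illustration only.
passage = ("In the beginning God created the heaven and the earth. "
           "And the earth was without form, and void.")
rows = kwic(passage, "earth", width=3)
print(format_kwic(rows))
```

A printed concordance adds what this sketch leaves out: an alphabetical entry for every content word and a location (book, chapter, page) for every line of context.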

Computing in Humanities Research

Hermeneuti.ca is a work about and for the application of computing to humanities research, specifically to textual studies and interpretation. We have both been part of the field that used to be called Humanities Computing and is now commonly referred to as the Digital Humanities. This field is one of the communities of practice that has been negotiating this application of computing into the humanities. [15] To some extent this book is the result of decades of development and reflection in this field and we will on occasion engage in dialogue with others in the field, especially around tools and collaboration, though this is not a survey of the field. One of the characteristics of the field is that it has focused on and supported the development of applied technology rather than being strongly theoretical. Humanities computing has – through training, conferences and projects – bridged the gap of scholarly practice and technology development rather than a theory/practice gap. Humanities computing was often based in units that supported computing for humanists in universities and therefore brought together faculty, staff, programmers and students to run labs, run servers, and develop tools. In short, computing humanists tended to build digital things, often for research uses by others, rather than theorize.

In Canada there has been a long tradition of building concording tools, from the PRORA concording tools, whose manual was published in 1966, to TACT in 1989, and the TAPoR project which led to and supports Hermeneuti.ca.[16] Hermeneuti.ca is, by virtue of being a hybrid of text and tool, another contribution in this tradition, and one that reflects back on what these code things are, the subject of the second chapter, “From the Concordance to Ubiquitous Analytics”. Willard McCarty, author of a book titled Humanities Computing (which is one of the first attempts to theorize the field) writes, in an essay he gave as a plenary lecture at the 2006 Canadian Symposium on Text Analysis,

But the more important lesson I’ve learned is that although better tools are possible, the humanist’s perspective on tools problematizes them. That is ultimately the point of tool-development in humanities computing, just as problematizing our methods and objects of study is ultimately the point of applying the tools we do have.[17]

Hermeneuti.ca is in this tradition of problematizing methods through developing tools, but we have tried to more tightly couple the development (writing of code) and interpretation (using the code). Hermeneuti.ca tries to argue, through its hybrid structure, that the lines between tool and text are blurred – and that blurring is good. Voyeur Tools isn't a better tool, or the one everyone has been waiting for; it is another contribution in a tradition of developmental and interpretative research.

Further, as mentioned above, we believe that collaborative practices of research and development are at the heart of humanities computing, and therefore Hermeneuti.ca also presents itself in a tradition of reflecting on collaborative practice. Here, however, is where we diverge from the service tradition in Humanities Computing that sees the field as a "methods commons" for research that happens elsewhere. [18] For us Humanities Computing is not just the collaborative development of tools for others (that would be closer to software engineering) or the application of tools by others to humanities problems (that would be digital humanities); it can be a disciplined set of practices that problematizes methodology, tools and interpretation at the same time. There is now a tradition of research development and discussion within the community independent of instrumental concerns. Hermeneuti.ca is a contribution to that tradition, from and for the field and involved in development as a form of research. Our research is simultaneously about "how we might think" while "thinking through" prototyping, coding, documenting and testing with real questions.[19] It is a particular type of research craft where one of the important outcomes is a re-imagination of how research tools should be designed to fit in the cycle of research.

Voyeur Tools, the tool intervention of Hermeneuti.ca, is a new text analysis environment meant to support Agile Interpretation in the following ways:

To support research, Voyeur is designed to be rewritten and to support different interfaces. There has been much handwringing about how we are constantly reinventing our tools, something we think is actually a good thing ... called interpretation. Extreme Programming is built around turning change, refactoring, and iteration into a virtue; likewise we try to make reimplementation a virtue of Voyeur. We have come to the conclusion that if reinterpreting tools (and therefore rebuilding tools) for the humanities is an inescapable part of problematizing method, then we should welcome it as a research practice and design an environment for reimplementation instead of being tempted by the teleology of “getting it right once and for all”. Another way to put this is: beware of Voyeur. It is a research project, not a production tool you can buy shrinkwrapped and stable.

Thinking Through Text Technology

Like Descartes' Discourse, this work is also about thinking, but about a very different type of thinking through that both returns to an earlier model of how to do work and looks forward to how to do it with the hermeneutical tools at hand. This book takes a different path back than Descartes’s Discourse; our story does not reject authority or talking with others and, in fact, it is the story of writing and software development embedded in traditions. Our story does not present solitary thinking as a dialogue; instead we present our communal thinking as possibilities for dialogue, or, as you will see, as interactive essays where the dialogue is a possibility for you to follow through with hermeneutical things. Most importantly, our story doesn’t begin with reflecting about thinking but with interpretation assisted by tools.

What we have in common is that this work is about thinking through, but we will play with another sense of thinking-through than the Cartesian sense of thinking about or thinking thoroughly through method. This book is about thinking-through as thinking with or by means of extensions of the mind that instantiate methods. It is about thinking through with others and with technology where Descartes shunned both.

Thinking through is rooted in one of the paradigmatic styles of doing philosophy, dialogue (as opposed to solitary meditation). The Greek “dia” in the word “dialogos” meaning "conversation" does not, as many assume, mean “two”, but instead can be translated as “through”, “between” or “exchange”. Thus a playful etymology of “dialogue” would explain it as “thinking through” or that which comes "through conversation" whether it is the Cartesian inner dialogue or a conversation with another. [21]

Socrates, in one of Xenophon’s dialogues, played with the connection between dialogos (conversation) and dialego (to classify). Xenophon, writing about Socrates, says “The very word ‘discussion,’ according to him, owes its name to the practice of meeting together for common deliberation, sorting, discussing things after their kind: and therefore one should be ready and prepared for this and be zealous for it…”.[22] In the Greek the joke is obvious because there is only one word, dialegontas, a form of dialego, for both sorting and discussing. Dialogue for the Greeks was clearly connected with thinking through, by way of sorting and classifying, a collaborative practice illustrated in many of the Platonic dialogues as they sort through different definitions of the virtues.

Why text technology now?

In this book, however, we are going to concentrate on ways of thinking collaboratively through technology, specifically text technologies which we believe are of epochal importance. But why text technology? Why is information technology and in particular text technology so important now?

  1. First of all, because we are surrounded by electronic texts that we read mediated through technology. The e-texts we read on our laptops, smartphones, e-readers, off the web, and on screens are all read through technology. Text analysis tools can be considered as simply more powerful versions of the search utilities in the browsing and editing tools available for e-texts from your favourite word-processor to your PDF reader (text analysis tools can also be more nuanced and speculative).
  2. Second, we’re interested in text technology because the market for electronic reading is changing dramatically as we write. With the Kindle and the iPad, we seem to have viable electronic book and media readers that are actually doing well in the market place. Both have succeeded in connecting ease of acquisition to ease of reading so that many are reading electronic representations despite the convenience of paper. Hermeneuti.ca looks at how we can go beyond reading in the sense of flipping virtual pages now that we have texts that can be processed by computers. It does so by asking not about electronic reading, but about interpretation.
  3. Third, and for Hermeneuti.ca most importantly, is the change in scale of available electronic texts. Thanks to Google Books, researchers can read millions of books in digital form. Should the intellectual property issues around Google Books ever be solved we could have access not just to page images, but the raw text files.[23] The question is what can we do with so much data? Our tools for analyzing texts have grown out of concording tools designed to handle one book or a small collection, not millions of books. The types of questions we ask also tend to be about individual works, small collections of a single author, or comparative collections. What sorts of tools, methods and questions can handle millions of texts?[24]

But it is not just researchers who need access to text technologies. According to a 2003 study “How Much Information?” by Peter Lyman, Hal R. Varian and colleagues at Berkeley, there were about 5 exabytes of new print, film, magnetic and optical information produced in 2002, and the total is growing by about 30% a year. Of this only a small amount, a mere 1,634 petabytes, is print, but consider that 2 petabytes is sufficient to represent all the U.S. academic research libraries.[25] Most of this print information is office documents: North Americans in 2003 were consuming 11,916 sheets of paper per person, and the authors estimate that half of that is used in printers and copiers for office documents.

A more recent, and more alarming, study, “The Expanding Digital Universe”, prepared by IDC, a “global provider of market intelligence”, and commissioned by EMC (a storage solutions company), estimates that,

In 2006, the amount of digital information created, captured, and replicated was 1,288 × 10^18 bits. In computer parlance, that’s 161 exabytes or 161 billion gigabytes. This is about 3 million times the information in all the books ever written. [26]

They go on to say that “between 2006 and 2010, the information added annually to this digital universe will increase more than sixfold from 161 exabytes to 988 exabytes”.
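The figures in the IDC passage can be checked with a little arithmetic. The sketch below is only illustrative: it assumes decimal (SI) units, reads the quoted bit count as 1,288 × 10^18 (the reading that matches the quoted 161 exabytes), and uses function names that are our own invention.

```python
# Worked arithmetic for the quoted IDC figures (decimal SI units assumed).
BITS_PER_BYTE = 8
EXABYTE = 10**18  # bytes in one exabyte


def bits_to_exabytes(bits: int) -> float:
    """Convert a count of bits to exabytes."""
    return bits / BITS_PER_BYTE / EXABYTE


def implied_annual_growth(start: float, end: float, years: int) -> float:
    """Compound annual growth rate implied by growing from start to end."""
    return (end / start) ** (1 / years) - 1


digital_universe_2006 = bits_to_exabytes(1_288 * 10**18)
growth_factor = 988 / 161  # 2010 estimate divided by 2006 estimate
annual_rate = implied_annual_growth(161, 988, 2010 - 2006)

print(f"2006: {digital_universe_2006:.0f} exabytes")
print(f"2006 to 2010: {growth_factor:.2f}x overall")
print(f"implied growth: {annual_rate:.0%} per year")
```

On these assumptions the "more than sixfold" increase over four years works out to compound growth of about 57% a year.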

It should not be surprising that, according to “How Much Information,” the Internet is the fastest growing medium, accounting for 532,897 terabytes between the web, e-mail, and instant messaging. Most of this is text, like email. Text is prominent even on the multimedia web, where HTML and PDF account for 17.8% and 9.2% respectively while images and movies account for 23.2% and 4.3% respectively. If you think about how people search and find information on the web through search engines like Google you can see the importance of text. Even if a growing amount of the information on the web is time-based media like video, it is text that we use to search for that information, it is text that is indexed, and it is text that makes up the metadata.

This explosion of information raises ethical and privacy issues connected to hermeneutical issues. One major issue is who controls, and who can text-mine, this universe of text.

IDC predicts that by 2010, while nearly 70% of the digital universe will be created by individuals, organizations (businesses of all sizes, agencies, governments, associations, etc.) will be responsible for the security, privacy, reliability, and compliance of at least 85% of that same digital universe.[27]

Where organizations have access to our words they can use analytical tools to mine them in order to draw inferences individuals wouldn’t want drawn. We are already seeing the fallout from the tensions between individually created information and corporate management of it. A cover story in the CAUT Bulletin, “Email Outsourcing Threatens Privacy & Academic Freedom”, reports on the Lakehead University Faculty Association grievance against the university for outsourcing e-mail to Google Gmail. Gmail's terms of use allow it to store and process information in the United States, which opens the possibility that the email may be mined by the American government if Google is ordered to allow it. [28] As more and more of even our “private” textual correspondence is available for large-scale analysis and interpretation we need to learn more about these methods. Hermeneuti.ca introduces you to how analytics can be used so that individuals might have some control over the tools of analysis.

In short, we are practicing thinking in the humanities in an epoch of change in the way people read, the tools of reading, and the amount, privacy and organization of the information that we care about. And this matters.

Dangers

There are, however, some dangers ahead in Hermeneuti.ca with its doubled practices. The first is the disappearance of the author. To paraphrase what Shaftesbury said about dialogue, “the author is annihilated, and the reader, being in no way addressed, stands for nobody”. [29] This is a danger Heidegger and other philosophers of technology talk about: the danger that tools, when ready-to-hand, are transparent and the creator’s authorial responsibility for the tool is hidden. When using a hammer you don't wonder about the author of the hammer. Yet the tool extends its makers' interpretation of what you might need, and that is something to be careful of. To avoid this danger we have to ask how one might interpret tools. If tool development is research then it should be open to scrutiny as other types of research are, but open source is not openness in the way that a philosophical paper is open. It is hard to interpret things designed to be thought with rather than thought about because they are designed to withdraw, much as it has always been hard to interpret philosophical dialogues, at least as the position of a disappearing author. This is not to give undue value to Romantic ideas about the importance of the author; it is simply to point out that the interpretation of technologies is hard to do, especially while using them. Matt Kirschenbaum’s book Mechanisms shows us one way forward in that he adapts bibliographic practices to classics of electronic literature, from Michael Joyce’s Afternoon to early adventure-style computer games. He reads these as literature. We are taking a different approach and reading tools as hermeneutical things.

The second, and more prosaic, danger is that entanglement can lead to commoditization which can then corrupt research. David Noble in “Digital Diploma Mills: The Automation of Higher Education” warns about “the commoditization of the research function of the university, transforming scientific and engineering knowledge into commercially viable proprietary products that could be owned and bought and sold in the market”.[30] While we doubt there is any risk of being corrupted by commercialization in this project, especially since we are providing free access to the text and releasing the code for Voyeur Tools under an open license, we do worry about the entanglement in the administration of technology. Could we walk away from our tools if we were convinced they were inadequate or inappropriate? Are we willing to build tools that are conceptually interesting and innovative but that have little chance of being used? Does the weaving of development and critique mix practices that should be kept at arm's length for the sake of perspective? An easy answer is to argue that we have always been entangled, but that may just be sophistry as the entanglements in the humanities are usually trivial. No one really wants to buy our souls the way they want to buy pharmaceutical research results. That said, there is in the sort of humanities computing work that develops tools a very real difference and danger when you find you need to get grants, and to get grants you have to get matching funds from industry, and so on. Such engaged work is not by definition corrupt, but it is corruptible and that’s why we consider it a danger.

A third and final danger is a bundle of commitments that we can call the modernist commitments to progress through technique. To think through the development of possible technologies is to agree, at least provisionally, that there could be better designs. Bundled with the practices of design comes a hope of improvement and a belief in progress. While this hope can be moderated by care for unanticipated outcomes, and by a skepticism regarding the hyper-ventilated claims of computing, you don’t do it without any hope and you do such work in the knowledge that you can’t anticipate how it will be used ultimately. We regard this danger as unavoidable – it is the danger of any action – any involvement in the world that is not cynical. We all, in some form or another, try things out in the face of dangers, and that is our hope. For that matter, it is the danger of any other type of intellectual work – you could be misinterpreted. Without the confidence of an intellectual ground or clear ends we are all local-modernists – trying to make a way forward in the local, but in ignorance of the outer grounds or end.

Don’t Imitate, Contribute

How can I read a hybrid book/site like Hermeneuti.ca?

Descartes’ Discourse is important to the practices of the humanities because it marks a shift to the explicit discussion of method. When Descartes introduces his method as a personal history which you can imitate or not, he is suggesting how the reader should engage the work, a work that has influenced how research is done and that has dominated the methodological imagination of the humanities since. This book is about an alternative dialogical method where interpretation is done in conversation, specifically three types of conversation which suggest three ways you could read and engage us:

Voyeur Tools was developed in part as a result of our first experiment, “Now Analyze That” where we found it difficult to swiftly move from the analytical tool environment where textual evidence is explored to the environment of the essay where a new interpretation is crafted. If you will, we had trouble flying through from interpretation (the environment and activity of interpreting evidence) to interpretation (the writing environment of the new essay.) Voyeur Tools was designed, as mentioned above, to allow us to cycle back and forth from interpretation to interpretation. Voyeur Tools is designed to work with the types of Web 2.0 online writing environments that have emerged from blogs, to wikis, to works like hermeneuti.ca, which uses the open-source content management system Drupal. As in any experiment in interactive interface, the goal was to get to a point where the practices of moving between tool and text were swift enough to be experienced as another thread of dialogue rather than the long wait of years for the tool to come along that lets you ask the next question. We hope we have enabled your iterative experimentation with text technologies. We hope with hermeneuti.ca you can move swiftly through from interpretation to interpretation and not be held back as we were. If the interaction works, you will soon find the limits of Voyeur, and when you want one more feature you will have caught the bug that infected us. And if you then want to help write code you should see our How to Contribute Code online at hermeneuti.ca.

Works Cited

Auer, K., and R. Miller. Extreme Programming Applied: Playing to Win. Boston: Addison-Wesley, 2002.

Beck, K. Extreme Programming Explained: Embrace Change. Boston: Addison-Wesley, 2000.

Crane, G. “What Do You Do with a Million Books?” D-Lib Magazine. Vol. 12:3. 2006. Online at http://www.dlib.org/dlib/march06/crane/03crane.html.

Descartes, R. A Discourse on the Method of Correctly Conducting One's Reason and Seeking Truth in the Sciences. Trans. I. Maclean. Oxford: Oxford University Press, 2006.

Galey, A. and S. Ruecker. “Design as a Hermeneutic Process: Thinking Through Making from Book History to Critical Design.” Paper presented at the Digital Humanities 2009 conference, University of Maryland. June 22-25, 2009.

Glickman, R., and G. Staalman. Manual for the Printing of Literary Texts and Concordances by Computer. Toronto: University of Toronto Press, 1966.

IDC, “The Expanding Digital Universe.” Project Director J. F. Gantz. White paper available online. 2007. <http://www.emc.com/leadership/digital-universe/expanding-digital-univers....

Kirschenbaum, M. G. Mechanisms: New Media and the Forensic Imagination. Cambridge, MA: MIT Press, 2008.

Lyman, P., and H. R. Varian. How Much Information. 2003. Report online at <http://www.sims.berkeley.edu/how-much-info-2003>.

Noble, D. F. “Digital Diploma Mills: The Automation of Higher Education.” First Monday. Vol. 3, No. 1. January 1998. <http://firstmonday.org/htbin/cgiwrap/bin/ojs/index.php/fm/article/view/5...

Miles, M. “Descartes’s Method.” A Companion to Descartes. Blackwell Reference Online. Eds. J. Broughton and J. Carriero. Oxford: Blackwell, 2008.

McCarty, W. Humanities Computing. New York: Palgrave Macmillan, 2005.

McCarty, W. “Beyond the word: modeling literary context.” Text Technology. Forthcoming.

Rockwell, G. "Multimedia, Is it a Discipline? The Liberal and Servile Arts in Humanities Computing", Jahrbuch für Computerphilologie. Online and in print. Vol. 4, 2002. See <http://computerphilologie.uni-muenchen.de/jg02/rockwell.html>.

Rockwell, G. Defining Dialogue From Socrates to the Internet. Amherst, New York: Prometheus Books, 2003.

Shaftesbury, A. Earl of. Characteristics of Men, Manners, Opinions, Times, etc. 2 vols. Gloucester, Massachusetts: Peter Smith, 1963.

Xenophon. Xenophon in Seven Volumes. Trans. O. J. Todd and E. C. Marchant. 7 vols. Cambridge, Massachusetts: Harvard University Press, 1968.

Footnotes

[1] Descartes, Discourse on Method, Part 2, p. 12.

[2] To see the online version see <http://hermeneuti.ca>.

[3] To try Voyeur Tools see <http://voyeurtools.org>.

[4] “Now Analyze That” is included in print in this book. To see the interactive online version see <http://hermeneuti.ca/node/15>.

[5] To see all the Recipes of the Methods Commons see <http://methodi.ca>.

[6] Sinclair, S. and G. Rockwell. Collocate Cluster. Voyeur Tools. To try the tool with your text see <http://voyeurtools.org/tool/Links>.

[7] See Miles, "Descartes's Method" about his method of analytical reflexion.

[8] The full title of the Discourse is A Discourse on the Method of Correctly Conducting One's Reason and Seeking Truth in the Sciences. It should be noted, as mentioned above, that his "method" is different, though closely related, to his story of how he went about correcting his thinking until led to the method. It is only in the three appendixes to the Discourse like that on Geometry that you get results from the method. They are the case studies showing how his method could be employed scientifically.

[9] Collaborative research is a new phenomenon in the humanities that is often dealt with by assigning percentages to the final outcomes as if writing a paper collaboratively was simply a division of labour as in, “I wrote that part worth 30%, and she wrote the remaining 70%.” For more see “The Evaluation of Digital Work” wiki maintained by the Modern Language Association, <http://wiki.mla.org/index.php/Evaluation_Wiki>.

[10] McCarty, Humanities Computing, page 81.

[11] There is something comedic about the strange pairs of computer science student programmers and senior scholars that do many humanities computing projects. The mismatch of backgrounds, age, and interests is what humanities computing attempts to bridge.

[12] For an introduction to Extreme Programming see Beck, Extreme Programming Explained or Auer and Miller, Extreme Programming Applied.

[13] The Recipe “Explore Themes Across a Text” can be followed at <http://hermeneuti.ca/node/123>.

[14] For the Day of Digital Humanities project see <http://tapor.ualberta.ca/taporwiki/index.php/Day_in_the_Life_of_the_Digi....

[15] It is by no means the only site where applications of computing to the humanities have taken place. Computational linguistics, quantitative history, cyberculture studies, media studies and recently game studies are other fields with communities of research using computing methods.

[16] For information about PRORA see Glickman and Staalman, Manual for the Printing of Literary Texts and Concordances by Computer. For TACT (Text Analysis Computing Tools) see <http://projects.chass.utoronto.ca/tact/> and for TAPoR see <http://portal.tapor.ca>.

[17] McCarty, “Beyond the word: modelling literary context”, page 1.

[18] Willard McCarty introduced the idea of a Methodological Commons with a chart on page 119 of Humanities Computing. The chart depicts the Commons and its relationships to disciplines. His position in Humanities Computing is too nuanced for a quick summary, but in general he has argued for an interdisciplinary field and against the independence of disciplinarity. We, on the other hand, have argued for our own research agenda and programmes in articles like “Multimedia, Is it a Discipline?”

[19] Development as a form of research is common in the design field as Alan Galey and Stan Ruecker argued in “Design as a Hermeneutic Process: Thinking Through Making from Book History to Critical Design.”

[20] For TACTweb see <http://tactweb.mcmaster.ca/tactweb/doc/tact.htm>. For HyperPo see <http://tapor.mcmaster.ca/~hyperpo> and for TAPoRware see <http://taporware.ualberta.ca>.

[21] For a more in-depth consideration of dialogue see, Rockwell, Defining Dialogue From Socrates to the Internet.

[22] Xenophon, Memorabilia, IV. vi. 1.

[23] For more on this see the American Library Association's site, Google Book Settlement: An Informational Site for the Library Community at <http://wo.ala.org/gbs/> or Google's site on the Google Book Settlement at <http://www.googlebooksettlement.com/>.

[24] One computing humanist who anticipated this issue of scale is Greg Crane in “What Do You Do with a Million Books?”

[25] A petabyte is 1,000,000,000,000,000 bytes.

[26] IDC, “The Expanding Digital Universe”, page 1.

[27] Ibid. page 1.

[28] See < http://cautbulletin.ca/default.asp?SectionID=0&SectionName=&VolID=34&Vol...

[29] See Shaftesbury’s Characteristics of Men, Manners, Opinions, Times, Etc., p. 132. The original quote is, “For here (in dialogue) the author is annihilated, and the reader, being no way applied to, stands for nobody. The self-interesting parties both vanish at once."

[30] Noble, “Digital Diploma Mills.”

[31] Instructions on How to Contribute are at <http://hermeneuti.ca/node/80>.

Now Analyze That: The Rhetoric of Text Analysis (Case Study 1)

Now Analyze That: Comparing the discourse on race

Sounds like he talked a hate speech, doesn't it? Now, analyze that. (Wright, NAACP Speech)

Introduction

Resources

Texts Used in this experiment.

TAPoR Portal for text analysis research.

Experimental Notes, May 1 & 2 are our notes on this experiment.

Recipes for learning text analysis including a comparison recipe.

Text Analysis Developers Analysis wiki with lots of materials like this.

In the lead-up to the 2008 US Presidential election the news media became interested in the conflict between what Barack Obama had to say about race and what his spiritual mentor Jeremiah A. Wright Jr. had to say.[1] The news media presented Obama and his spiritual father as if in an oedipal drama. Obama the son tries to distance himself from his father-pastor to win the presidency while Wright struggles to continually correct the record while getting attention unlike what he is used to getting from the pews. Both, in different ways, are trying to tell the media what should be talked about and how. Both want the attention on more substantive issues and, in trying to redirect us, have given moving and important speeches on race and America (by which we mean the USA). Both have been trying to use the attention to redirect us to what "this time we want to talk about", or, to use Wright's blunt phrase, they challenge us directly: "now, analyze that"![2]

Of course the media know where the engaging human story is and it is in the age-old conflict of the son and his father, as the son comes of age as a leader.

But what if we took them at their word and looked away from the pulpit-and-pews drama? What if we take them seriously and look at what they say? What if we try to "analyze that", looking for the similarities and differences between their speeches? Are they a generation apart in their thinking or are they caught in the headlights of the media?

So we decided to quickly analyze and compare a speech by Obama and one by Wright. There are ironies to this analysis, but those will come out later. This is an experiment, but that too will come out.[3]

The Texts

The two speeches we chose to look at are:

  • Barack Obama's March 18, 2008 speech, "A More Perfect Union", which he gave in response to the controversy and to clarify where he stood on race. This speech has been generally considered one of Obama's finest on race and America.
  • Jeremiah Wright's April 27, 2008 speech to the NAACP, which follows Obama's speech and also deals with race.

Why these two texts?

  1. First, because we weren't interested in the "gotchas" that bloggers and media have been focusing on, like Wright's references to Louis Farrakhan. We looked at Wright's speech to the National Press Club on April 28, 2008, but chose not to use that speech because a large portion of it took the form of question and answer and therefore would not necessarily reflect how Wright wanted to shape the issues.
  2. Second, we were able to find reasonable transcripts (from prominent news media sources) for both with associated video records, though there are typos in both that suggest either problems in transcription or oral infelicities. We have not proofed either against the video records, letting the record stand.
  3. Finally, and most importantly, these seem to be the important documents to which people are returning to understand Obama's and Wright's positions. Why not analyze that?

Much of computer-assisted text analysis is essentially about counting and comparing. One thing the computer can show you is differences in word use, but what the computer shows you is just something to think about - we will need to interpret the something. What then stands out in their words as differences worth thinking more about?
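The counting and comparing described above can be sketched in a few lines. This is a minimal illustration and not the method used by TAPoR or Voyeur Tools: the tokenizer, the function names, and the two toy passages are all assumptions made for the example, and the measure is a simple difference of relative frequencies.

```python
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def relative_frequencies(text: str) -> dict[str, float]:
    """Map each word to its share of all tokens in the text."""
    counts = Counter(tokenize(text))
    total = sum(counts.values())
    return {word: n / total for word, n in counts.items()}


def distinctive_words(text_a: str, text_b: str, top: int = 5) -> list[tuple[str, float]]:
    """Words used relatively more often in text_a than in text_b."""
    freq_a = relative_frequencies(text_a)
    freq_b = relative_frequencies(text_b)
    diffs = {w: f - freq_b.get(w, 0.0) for w, f in freq_a.items()}
    return sorted(diffs.items(), key=lambda item: item[1], reverse=True)[:top]


# Two invented toy passages, not the actual speeches.
sample_a = "this time we want to talk about schools this time we want to talk about jobs"
sample_b = "we are committed to changing the way we see others who are different"

for word, diff in distinctive_words(sample_a, sample_b, top=3):
    print(f"{word}: {diff:+.3f}")
```

Relative rather than raw counts are used so that texts of different lengths can be compared. What the resulting list means is still left to the interpreter.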

This Time We Want To Talk

One of the first things we noticed was that Obama uses the word "time" far more often than Wright.[4] In fact, at the climactic end of Obama's speech, he repeatedly uses the phrase "this time we want to talk". A concordance is a gathering of passages that "concord" or agree, usually passages that contain a sought-for word. Concordances are a form of reading tool that goes back to the Middle Ages. They are typically lists of words with their appearances. A concordance for the Bible, for example, would have entries for all the content words of the Bible in alphabetical order, and each entry would include information about where the word appears and some context. Searching for words on a computer now typically returns a concordance called a Key Word in Context (KWIC) with the sought word down the center and a few words of context on either side. Google returns a type of concordance when you search for a word, with an example of the word in context for each page it recommends. (See also the Wikipedia entry on Concordance (Publishing).) This table shows a concordance of all the instances of "time" in Obama; see for yourself:

Repeated phrases like this are always an indication of something. In this case they come at the climax of Obama's speech and tell us two things.

  • Not this time: Obama is trying to redirect what we, including the electorate and the media, talk about this election. He is making a claim about discourse during an election and calling for it to not degenerate this time as it has other times. He wants to elevate and focus what is talked about on what he believes matters to the electorate and away from the identity politics that tars him with Wright. For Obama, Wright is a distraction, and if that is what the media pays attention to then nothing will change, and change is what Obama promises. "But if we do, I can tell you that in the next election, we'll be talking about some other distraction. And then another one. And then another one. And nothing will change."
  • We want to talk about: this is the phrase that precedes what Obama thinks is important, and it introduces a list of things that he believes are important. The repetition of the phrase is the climax of the speech, both in terms of location and in terms of the rhetorical power of its repetition. If we want to know what Obama thinks is important for us to talk about we should pay attention to what "this time we want to talk about".

And what are the five things Obama wants us to talk about? They are a fairly traditional list for Democrats that includes education, health care, jobs along with the war in Iraq.

  • Crumbling Schools - Education
  • Lines in the Emergency Room - Health Care
  • Shuttered Mills - Loss of Manufacturing Jobs
  • Shipping Your Job Overseas - Business Outsourcing
  • Serving and Fighting Together - The War in Iraq

But there is a difference, and that is that for Obama these are issues that transcend race. "This time we want to talk about the crumbling schools that are stealing the future of black children and white children and Asian children and Hispanic children and Native American children." For Obama an election is about the common issues that affect all races rather than our differences.

The thrust of his speech is that this election time should be about the issues that Americans (both white and black) have in common, not about the issues that hijack elections (for the Republicans).
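For readers who want to see how a Key Word in Context listing like the one discussed in this section is put together, here is a minimal sketch. It is not the code behind TAPoR or Voyeur Tools; the function name `kwic`, the context width, and the invented sample sentence are assumptions made for illustration.

```python
import re


def kwic(text: str, keyword: str, width: int = 4) -> list[str]:
    """Return one line per occurrence of keyword with `width` words of context."""
    words = re.findall(r"\S+", text)
    lines = []
    for i, word in enumerate(words):
        # Compare on a normalized form so "Time," matches "time".
        if word.strip(".,;:!?\"'()").lower() == keyword.lower():
            left = " ".join(words[max(0, i - width):i])
            right = " ".join(words[i + 1:i + 1 + width])
            # Right-align the left context so the keyword lines up in a column.
            lines.append(f"{left:>30} | {word} | {right}")
    return lines


# Invented sample built around the phrase quoted in the essay.
sample = ("Not this time. This time we want to talk about schools. "
          "This time we want to talk about jobs.")

for line in kwic(sample, "time"):
    print(line)
```

Lining the keyword up in a column is what lets the eye pick out repeated patterns on either side of it, such as "this time we want to talk".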

Committed to Repetition

Interestingly, when we looked to see if there was a similar repeated phrase in Wright's speech we found one, "we are committed to changing the way" that is similarly located at the climax of the speech and is similarly used to draw attention to the change important to Wright. The distribution graph for "committed" shows how it is distributed towards the end of the speech similarly to how "time" was distributed in Obama.
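Finding a repeated phrase and seeing where a word falls in a speech are both simple counting operations. The sketch below is illustrative only: the n-gram length, the number of segments, the function names, and the toy text are assumptions, and it is not the distribution tool used for the graph mentioned above.

```python
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def repeated_phrases(tokens: list[str], n: int = 4, min_count: int = 2) -> list[tuple[str, int]]:
    """Find n-word sequences that occur at least min_count times."""
    ngrams = Counter(" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1))
    return [(phrase, c) for phrase, c in ngrams.most_common() if c >= min_count]


def distribution(tokens: list[str], word: str, segments: int = 5) -> list[int]:
    """Count occurrences of word in each of `segments` equal slices of the text."""
    counts = [0] * segments
    for position, token in enumerate(tokens):
        if token == word:
            counts[position * segments // len(tokens)] += 1
    return counts


# Invented toy text in which the repeated phrase clusters toward the end.
sample = ("we gather today to speak about change and about hope "
          "we are committed to changing the way we see others "
          "we are committed to changing the way we treat others")
tokens = tokenize(sample)

print(repeated_phrases(tokens))
print(distribution(tokens, "committed", segments=3))
```

A distribution that is empty at the start and filled at the end is the numerical counterpart of a phrase saved for the climax of a speech.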

A concordance of the word "committed" in Wright shows a pattern of similar phrases that he repeats:

Again, that which Wright and his audience are committed to is at the heart of what Wright has to say and has to do with changing the way we see and treat ourselves and others. The heart of it is two words that show up with text analysis as used more by Wright: different and deficient. Wright wants people to see and treat each other as different, not as deficient. And it is not just about race.

In the past, we were taught to see others who are different as somehow being deficient. Christians saw Jews as being deficient. Catholics saw Protestants as being deficient. Presbyterians saw Pentecostals as being deficient.

Folks who like to holler in worship saw folk who like to be quiet as deficient. And vice versa.

Whites saw black as being deficient. ...

Europeans saw Africans as deficient.

Strangely, Wright also goes on about differences beyond those between people, like differences between African and European music. These differences of rhythm illustrate something important for Wright.

Now, what is true in the field of education, linguistics, ethnomusicology, marching bands, psychology and culture is also true in the field of homiletics, hermeneutics, biblical studies, black sacred music and black worship. We just do it different and some of our haters can't get their heads around that.

Different and Deficient

This is the difference between Obama and Wright. Obama sees challenges common to all and Wright sees differences that need to be recognized in order to be treated.

Obama is running for President and wants us to turn away from difference so we can see the challenges we have in common - what is deficient in the country as a whole. Wright is not running for election (though he is dealing with the media attention from an election), but is a minister and asks us the audience to make a commitment to how we see and treat difference.

Obama is trying to turn electoral discourse to political issues that administrations can solve. Wright is trying to turn away media criticism to focus on individual change - the changes we as individuals can commit to.

Obama talks about Wright, but otherwise is talking to the American public. Wright references academics, as if to say that his position isn't so extreme, but otherwise is talking to the NAACP and not about Obama. Obama needs to distance himself from Wright, and Wright probably doesn't want to cause any more trouble for Obama.

Conclusion

So what do these two have to say about race in America? First we should note that race is still about "black" and "white." Here are the most frequently used words in both speeches.

"Black" is the highest frequency word after "I", and "white" is up there, though it should be noted that Wright only uses "white" 4 times compared to Obama's 27. It is also worth noting that neither of them uses the phrase "White House", preferring the less coloured "Oval Office."

In sum, Obama is talking to all races, and he goes out of his way to talk about his white grandmother. Wright, on the other hand, is addressing the NAACP and talking from the perspective of the black church.

Obama distances himself from Wright's use of "incendiary language to express views that have the potential not only to widen the racial divide, but views that denigrate both the greatness and the goodness of our nation; that rightly offend white and black alike." Obama has some sympathy for his "religious leader's effort to speak out against perceived injustice", but unequivocally condemns Wright as being divisive.

As such, Reverend Wright's comments were not only wrong but divisive, divisive at a time when we need unity; racially charged at a time when we need to come together to solve a set of monumental problems - two wars, a terrorist threat, a falling economy, a chronic health care crisis and potentially devastating climate change; problems that are neither black or white or Latino or Asian, but rather problems that confront us all.

Wright on the other hand is insisting that there are real differences, and by implication divisions that must be acknowledged even if politically charged.

We will close with a view of the collocates of the words "black" and "white" in both speeches. Collocates are words that appear near the words in question. Collocation refers to the occurrence of words adjacently more often than would be expected by chance; it is the relationship between two words or groups of words that often go together and form a common expression. If the expression is heard often, the words become 'glued' together in our minds. 'Crystal clear', 'middle management', 'nuclear family' and 'cosmetic surgery' are examples of collocated pairs of words. Some words are often found together because they make up a compound noun, for example 'riding boots' or 'motor cyclist'. This static visual collocation should provoke you to think about how Obama and Wright talk about black and white; or you can try to analyze that yourself.

[Figure: blackAndwhite.jpg, collocates of "black" and "white" in both speeches]
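If you want to try collocation yourself, the basic operation can be sketched as follows. This is a simplified illustration: it only counts the words that fall within a fixed window of the keyword and does not test whether they co-occur more often than chance would predict. The function name, the window size, and the invented sample sentence are assumptions, and this is not the code that produced the visualization.

```python
import re
from collections import Counter


def collocates(text: str, keyword: str, window: int = 3) -> Counter:
    """Count words appearing within `window` tokens of each keyword occurrence."""
    tokens = re.findall(r"[a-z']+", text.lower())
    nearby = Counter()
    for i, token in enumerate(tokens):
        if token == keyword:
            start = max(0, i - window)
            # Words to the left and right of the keyword, excluding the keyword itself.
            neighbours = tokens[start:i] + tokens[i + 1:i + 1 + window]
            nearby.update(neighbours)
    return nearby


# Invented sample sentence, not a quotation from either speech.
sample = ("black children and white children share the same schools "
          "and black churches and white churches share the same hope")

for keyword in ("black", "white"):
    print(keyword, collocates(sample, keyword, window=2).most_common(3))
```

In practice you would usually filter out function words like "and" before reading the results, since they collocate with almost everything.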

  1. See the Wikipedia article, Jeremiah Wright controversy.
  2. Transcript of Jeremiah Wright's Speech to the NAACP.
  3. We are digital humanists interested in how computing methods can be used to study, among other things, contemporary culture and politics, not political scientists. This essay was written as an experiment in rapid collaborative computer-assisted text analysis, what we call extreme text analysis after the movement in computing called Extreme Programming or Pair Programming. Our goals were:
    • To spend no more than two days taking a small and meaningful text analysis project from discussion through to presentation of results (this page).
    • To test the TAPoR (Text Analysis Portal for Research) environment and record bugs, enhancements and general thoughts.
    • To develop new tools like Voyeur or fix our old ones to better suit real projects like this.
    • To reflect on computer-assisted text analysis as a research practice and the rhetoric of reporting results.

    For more on this see Experiments In Text Analysis. In particular see the May 1, 2008 Experiment Notes, which were written as we were doing this.

  4. When comparing texts using the computer, it makes sense to compare their relative use of vocabulary, that is, to see what words are used more often in one text compared to another.