Monday, February 10, 2014

The Association for Computers and the Humanities
Our project situates topic modeling historically by investigating its paper-based antecedents. Our current focus is a comparison between the eighteenth-century subject index, a genre that reflects the period's emphasis on producing systematic knowledge, and the twenty-first-century topic modeling method known as Latent Dirichlet Allocation. Prefacing a machine-learning term with a historical designation sounds odd (in part because there is no such thing as an eighteenth- or nineteenth-century topic model), but we contend that this defamiliarizing move offers a useful way of locating the conceptual work being done by topic modeling. Our critical method enacts a kind of epistemological anachronism by producing a statistical correlation of eighteenth-century subject headings and topic distributions.
The terms index and topic originated in the same classical disciplines: rhetoric and logic. Index refers to the longer phrases index locorum or index locorum communium, meaning "index of places" or "index of common places" (Walter Jackson Ong 123). Topic comes from the Greek word topoi (also meaning "places"), which referred to lines of argument and themes that orators drew upon in the process of composing their speeches. The index and the topic both belonged to cultural practices of organizing and storing ideas that could be recalled by memory. The development of written and print cultures through the early modern period in Europe transformed these oral rhetorical concepts into our modern notions of indexicality and topicality, which refer to any system of categorical organization.
The shift from the rhetorical to the indexical also entailed a transition from local oral forms to universal print ones. Our turn to the eighteenth-century subject index, a genre that emerged when the former usage still held and the latter usage was developing, indicates an attempt to draw upon the terms and concepts of relationality found in the earlier rhetorical discourse as a way to approach the results of digital methods. This move offers a way of grappling with the oft-cited difficulty of situating and interpreting signals identified by quantitative text analysis.
Our comparative approach has an important precedent in the emergence of media studies in the mid-twentieth century, which responded, in part, to the proliferation of electronic technologies. Scholars of orality and literacy such as Albert Lord, Marshall McLuhan, Elizabeth Eisenstein, and Walter Ong compared new forms of oral transmission that were invented in the late nineteenth and twentieth centuries to historical forms of orality, including oratory, ballads, and epic poetry. Looking back on this boom of electronic orality, Ong observes, "Contrasts between electronic media and print sensitized us to the earlier contrast between writing and orality" (Walter Jackson Ong 2-3). The culture of visualization that has developed on the web and in the academy as an outgrowth of digital media presents a comparable opportunity to scholars. Visualization has emerged as a new kind of research genre. We have attempted to interpret the consequences of visualization for humanistic interpretation by examining genres and concepts that emerged in what scholars of print consider to be the original age of visualization: the early modern period. (See, for instance, Ong on visualist analogies in "Ramus, Method, and the Decay of Dialogue" or John Bender and Michael Marrinan's argument in "The Culture of Diagram".)
In most cases, quantitative methods like topic modeling are used to identify a pattern within a set of textual data, and that pattern is interpreted within an already established set of historical, cultural, or generic expectations. D. Sculley and Bradley Pasanek have usefully explored the challenge this approach presents because of its tendency to produce hermeneutic circularity (Sculley and Pasanek 410). Our approach does not solve the problem of circularity; we accept that it is a given obstacle that faces any interpretive act. Instead, we confront circularity by relating the computational model to another historical referent (the subject index), which turns our interpretation from the model itself to the relationship between the model and the index, at specific sites of correlation and contradiction.
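
One way to picture the correlation between subject headings and topic distributions is the minimal sketch below. It is an illustration under stated assumptions, not the project's actual procedure: it assumes the text is divided into pages, that each index heading is reduced to a 0/1 indicator of the pages it cites, and that the topic model yields one proportion per topic per page. The names `heading_indicator`, `correlate`, and the toy data are hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    if n == 0 or n != len(ys):
        raise ValueError("sequences must be non-empty and of equal length")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant series has nothing to correlate
    return cov / sqrt(var_x * var_y)


def heading_indicator(indexed_pages, num_pages):
    """1.0 for each page the index cites under a heading, else 0.0."""
    return [1.0 if page in indexed_pages else 0.0 for page in range(num_pages)]


def correlate(index, topic_by_page):
    """Map each (heading, topic) pair to its correlation across pages."""
    num_pages = len(next(iter(topic_by_page.values())))
    scores = {}
    for heading, pages in index.items():
        indicator = heading_indicator(pages, num_pages)
        for topic, proportions in topic_by_page.items():
            scores[(heading, topic)] = pearson(indicator, proportions)
    return scores


# Hypothetical toy data: two index headings and two topics over six pages.
index = {"Commerce": {0, 1}, "Virtue": {3, 4}}
topic_by_page = {
    "topic_0": [0.8, 0.7, 0.2, 0.1, 0.1, 0.2],
    "topic_1": [0.1, 0.1, 0.2, 0.8, 0.9, 0.1],
}
scores = correlate(index, topic_by_page)
```

In this toy case, a strongly positive score marks a site where a heading and a topic coincide, and a negative score marks a site where they diverge; both kinds of site are candidates for the interpretive work described above.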
We first imagined our tool, the Networked Corpus, as an algorithmic method for marking passages with topical or discursive similarities. Inspired by the practice of cross-referencing passages in printed books, the tool generates hyperlinks between passages that share topics according to a topic model. From this original conception, the tool echoed the practice of Renaissance commonplacing, which involved collecting literary exemplars under thematic or logical headings. Our tool differed, however, in not giving fixed names to the topical units; instead of enabling navigation from heading to passage, the Ne
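
The linking step described above, hyperlinks between passages that share topics, might be sketched roughly as follows. This is not the Networked Corpus's actual code: the rule that a passage "has" a topic when its proportion meets a fixed `threshold`, the function names, and the passage identifiers are all assumptions made for illustration.

```python
from collections import defaultdict
from itertools import combinations


def salient_topics(distribution, threshold):
    """Indices of topics whose proportion in a passage meets the threshold."""
    return {t for t, weight in enumerate(distribution) if weight >= threshold}


def link_passages(passage_topics, threshold=0.3):
    """Link every pair of passages that share at least one salient topic.

    Returns a dict mapping each linked passage id to a set of
    (other_passage_id, shared_topic_index) pairs.
    """
    # Group passages by the topics they exhibit (assumed cutoff rule).
    by_topic = defaultdict(list)
    for passage_id, distribution in passage_topics.items():
        for topic in salient_topics(distribution, threshold):
            by_topic[topic].append(passage_id)

    # Every pair within a group becomes a two-way cross-reference.
    links = defaultdict(set)
    for topic, passage_ids in by_topic.items():
        for a, b in combinations(passage_ids, 2):
            links[a].add((b, topic))
            links[b].add((a, topic))
    return dict(links)


# Hypothetical passages with three-topic distributions.
passage_topics = {
    "ch1-p3": [0.70, 0.20, 0.10],
    "ch4-p1": [0.60, 0.05, 0.35],
    "ch9-p2": [0.10, 0.10, 0.80],
}
links = link_passages(passage_topics)
```

Note that the links are keyed by topic index rather than by a named heading, which mirrors the contrast with commonplacing drawn above: the connection runs from passage to passage without a fixed label.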
