AM Posters 2009 START Conference Manager    

Textual Data Analysis Using a Nonhierarchical Neural Network Approach

Brenda Battleson and Joseph Woelfel

(Submission #87)


Abstract

Artificial neural networks excel at recognizing patterns in human communication expressed in textual form. When analyzing a textual dataset, a neural network engine uses its pattern recognition capability to assign weights representing the multiple connections among concepts. These weights are used to create dendrograms or otherwise categorize concepts in a hierarchical manner. However, this approach has limitations.

The human brain is the most sophisticated example of a parallel distributed processing machine. The language used to express human ideas, attitudes and emotions is evidence of this sophistication, with words often having different meanings depending on the context in which they occur. A hierarchical clustering method cannot fully describe these multiple relationships because it shows each concept connected in only one way. Each concept is assigned to only one “best” cluster in the output, suggesting that the concept has only one meaning in the data analyzed.

The use of a non-hierarchical approach can address this limitation since it allows the researcher to interact with the neural network to explore all possible meanings of a concept. Thus, in the resulting output a concept may appear in as many clusters as are appropriate.
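The contrast between the two approaches can be illustrated with a minimal sketch. It is not the procedure used by the software named in this abstract, and it does not build a dendrogram; it only compares a single "best" assignment with a threshold-based assignment that lets a concept join several clusters. The concepts, weights, seed terms, threshold, and function names are all hypothetical and chosen purely for illustration.

```python
# Hypothetical symmetric connection weights among concepts.
# The values are invented for illustration, not taken from any dataset.
WEIGHTS = {
    ("bank", "money"): 0.9,
    ("bank", "river"): 0.7,
    ("money", "loan"): 0.8,
    ("river", "water"): 0.85,
}

CONCEPTS = ["bank", "money", "river", "loan", "water"]
SEEDS = ["money", "river"]  # assumed cluster centers chosen by the researcher


def weight(a, b):
    """Return the connection weight between two concepts (0.0 if unlisted)."""
    return WEIGHTS.get((a, b), WEIGHTS.get((b, a), 0.0))


def single_assignment(concepts, seeds):
    """Place each concept in exactly one cluster: the seed it is most strongly tied to."""
    clusters = {seed: [seed] for seed in seeds}
    for concept in concepts:
        if concept in seeds:
            continue
        best = max(seeds, key=lambda seed: weight(concept, seed))
        clusters[best].append(concept)
    return clusters


def overlapping_membership(concepts, seeds, threshold=0.5):
    """Place each concept in every cluster whose seed it is tied to at or above the threshold."""
    clusters = {seed: [seed] for seed in seeds}
    for concept in concepts:
        if concept in seeds:
            continue
        for seed in seeds:
            if weight(concept, seed) >= threshold:
                clusters[seed].append(concept)
    return clusters
```

With these assumed weights, single_assignment(CONCEPTS, SEEDS) places "bank" only with "money", while overlapping_membership(CONCEPTS, SEEDS) places "bank" with both "money" and "river", reflecting its two senses.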

In this study, hierarchical and non-hierarchical procedures are used to analyze a large dataset of newspaper articles related to opinions about the terrorist attacks of September 11, 2001. The full text of editorials, opinion pieces and letters to the editors of all U.S. newspapers indexed in the FACTIVA™ database was retrieved for the month of September 2006 and analyzed using the CATPAC™/ORESME™ software packages. While the clusters that emerged in the hierarchical approach were relatively predictable, the non-hierarchical approach suggests that relationships among terms that may not be statistical “best fits” are nonetheless important in finding “meaning” in the text. Method and results are discussed in detail.


  