To authenticate the meaning of collections and to
preserve their evidentiary value, archivists create documents (finding
aids) that describe the provenance and original order of
the records (MacNeil, 1995). Metadata standards such as Encoded Archival
Description (EAD) enable finding aids to be encoded, searched, and
displayed online. However, recent research has begun to draw attention
to problems with the quality of EAD finding aid data and metadata, and
with the encoding practices by which finding aids are created. Since the
next frontier in archival description involves reusing finding aid
data for advanced information visualization techniques that support
additional ways of engaging with collections, there is a
pressing need for further study of data quality and how it might impact information
visualization. This work analyzes a set of 8729 finding aids
aggregated by the Texas Archival Repository Online (TARO) using
VADA, a visual analytic tool for finding aids. The results show previously
unidentified problems that have a significant impact on the ability to
visualize this data. The paper explains how these problems relate to
both EAD’s design and the actual encoding practices of EAD, and
provides recommendations for improving the quality of
finding aid data.