Searching for Uses and Users in Gene Ontology Research

W. John MacMullen

ASIS&T 2008 Annual Meeting
Columbus, Ohio, October 24-29, 2008


Despite millions of dollars of investment over the past decade in creating and maintaining the Gene Ontology (GO), little is known about how (or even whether) its intended end users, biomedical researchers, actually employ the ontology and its related databases and interfaces in their work. This project is a preliminary investigation of the evidence in the literature of specific uses of GO by researchers, and of use cases proposed for researchers by system designers. This work will help inform future in-depth studies of the specific information needs and research questions that researchers might address using GO and similar knowledge structures. It also gives library and information science researchers and practitioners some insight into the quantity, sources, and breadth of existing publications about GO.
The Gene Ontology is intended as a means of cross-organism integration of knowledge of the molecular functions, biological processes, and subcellular localization of gene products (Gene Ontology Consortium, 2008a). Much basic biomedical research is performed on organisms that serve as surrogates for humans, because of their relative biological simplicity and the inappropriateness of direct experimentation on humans. GO integrates knowledge about gene function, process, and location across such so-called “model” organisms as the fruit fly Drosophila melanogaster, the mouse Mus musculus, and the budding yeast Saccharomyces cerevisiae.
While dozens of ontologies and other controlled vocabularies for biomedical research, clinical medical informatics, and general science are under development (Smith et al., 2007), and many different systems and applications have been developed to employ these vocabularies, there is often a knowledge gap between ontology developers, maintainers, and system designers on the one hand and the target audience of end users on the other (Rubin, Shah & Noy, 2008).
Ontology developers and curators in biomedical informatics are frequently subject matter experts who hold the same academic credentials as the putative end users and often have similar backgrounds and laboratory experience. Although developers may have a user perspective in mind, they are often in the position of trying to promote adoption and use of such systems to practicing researchers (see, e.g., Rhee, Dickerson & Xu, 2006). Rubin, Shah & Noy (2008) classify biomedical ontology use into six functional types: search and query of heterogeneous biomedical data, data exchange among applications, information integration, natural language processing, representation of encyclopedic knowledge, and computer reasoning with data (p. 76). While the authors discuss abstract examples of ontology use in these areas, they do not present specific cases. To better understand actual use, this exploratory literature-based study investigated the existence of four kinds of publications: user studies of biomedical scientists' use of the Gene Ontology; experimental articles that cite the use of GO; GO infrastructure and application articles that describe user needs assessment; and articles that provide GO use cases.

