BULLETIN
Emerging Content Requirements for News Products
Howard Williams is a business and technology analyst in Colorado Springs, Colorado; he can be reached by e-mail at email@example.com.
The current trend toward media convergence has placed new demands on traditional print news providers. Having adapted to the new media environment in several ways, not the least of them being through expansion into the online arena, these same news providers are also under increased economic pressure to identify new sources of revenue. This problem is intensified by the loss of readership that has impacted their traditional print products. Risks inherent in the new media environment, however, also carry with them opportunities for generating new and different types of news products. These products reflect many of the current trends in online content production, presentation and delivery, including, for example, the use of multiple delivery channels, content aggregation and syndication services, personalization based on reader profiles, advanced search-and-retrieval capabilities and new design approaches for Web-based presentation. We can expect to see these capabilities extended and leveraged by news and other content providers as they create increasingly differentiated products, appealing to consumers with a diversity of needs and interests. At the same time, the environments in which these products are created will likely be based on new content production models, characterized by new value chains and the increased exploitation of value through multipurposing of content.
Despite the current and expected use of many of these capabilities and features for online news presentation and delivery, the actual process of reading online news, that is, the perception and comprehension of printed symbols when readers engage the text (and the knowledge acquisition that results from this engagement), is not unlike what happens when readers read a print newspaper. The readability of text is therefore an important variable in determining how readers will take advantage of the printed words before them.
This focus on reading the news is justifiable when one considers the continued importance of text for representation of news content. There is, in addition, substantial evidence that text is still an important medium for large segments of the news consuming population. Given the increasing volumes of text-based news content and competing demands on readers' time and attention, the problem for news readers is, so to speak, how to read at Internet speed. In addition, readers arguably have motivation to learn about events in the world and to stay informed. Given the investment that news providers make in their text-based products, the problem for them is how to make these products accessible and how to exploit the economic value inherent in the content. They also have an investment in the cause of learning, which is consistent with the long-standing mission of journalism to help educate the reader. As we will argue below, however, none of these objectives is served by packaging news products in the traditional way. By understanding the behaviors (and psychology) associated with reading the news, it is possible to envision new ways of creating and packaging news products that will better meet the objectives of both readers and news providers.
Reading the News
People who "read the newspaper" do not actually read it. More often than not, they skim it, grabbing information here and there according to what catches their eye. Only occasionally do they slow down enough to read portions of the paper, and very rarely do they read entire articles. The general observation is that readers' experience with newspapers is quite cursory, and the typical reader often carries away from the newspaper only as much information as is contained in the headlines. Consumers of online news behave much like skimmers of print newspapers: they read in a cursory manner and pursue topics in more depth only on a selective basis.
Among these observations, of particular interest is the suggestion that much of the written text in either paper or online news products (the result, after all, of considerable effort on the part of news journalists and editors) goes unread. By the same token, despite the journalistic mission to educate news consumers, one might also conclude that most readers do not learn much when they read the newspaper, since, after all, what is unread will not be learned. In making these observations, we are not saying that the current news is unreadable, nor are we saying that nothing is learned from it. We are suggesting, however, that it is quite often not being read in the manner in which it is intended, with the consequence that readers quite often learn less than they could. This also has an economic dimension, since underutilized content possesses unexploited value.
The premise in this article is that there are unexplored opportunities for news providers when consideration is given to the full range of variables that impact readability. The fact that typical readers do not take advantage of news content as it is currently packaged and delivered is, in fact, only part of the rationale for this statement. There are, in addition, compelling reasons to identify products that respond to the personalized demands of consumers. These personalized requirements reflect a general trend toward providing customized services and include a number of variables that relate to readability of text. Finally, it is important to emphasize that there are a variety of constraints on the typical reader's ability to perceive and comprehend text. Consideration of these variables provides evidence of new ways to add value to text, the result being not only more accessible content, but products that appeal to a more diverse consumer population.
Reading and Learning
Reading and learning studies highlight important features of text and its presentation that have an impact on reading time and recall (as one indicator of learning). Applying these observations to the consumption of news information adds context to some of the behaviors identified above. From these studies, we note, for instance, that text can be enhanced to improve comprehension, using features such as illustrations and examples, questions, headings, underlining, italics, notations and highlighting. These studies also demonstrate that what readers learn while reading is heavily influenced by their prior knowledge and the goals and strategies that they employ while reading. Cursory reading of text on subjects with which the reader has only surface familiarity is not a strong indicator for learning and helps to explain why even regular news readers retain very little information from day to day. Indeed, very rarely are active strategies employed when reading, although there are some very obvious ones such as repetition that would predictably benefit the reader.
These observations suggest features and capabilities that might enhance the presentation of news. There are other observations from reading and learning studies that provide a cognitive dimension to some of these behaviors. These studies indicate how, in the absence of clear connections between events in a story, readers will attempt to fill in the blanks (make inferences) based on their own knowledge.
We know, for example, that events are connected in the reader's mind by cause-and-effect relationships that provide coherence among parts of a story, and the ease with which a reader draws a causal inference influences reading time and recall. In addition, in the absence of clear connections, readers will infer causal explanations based on their own informal conceptions of causality. These studies also demonstrate that when causal relations in a story are implicit (rather than explicit) people are not very good at making correct inferences and often misrepresent the actual relations.
In addition to this natural tendency to infer causal relationships when they are trying to understand what they are reading, readers also have a tendency to make predictions and track people's goals, all with the intent of identifying the motivations and purposes behind people's behaviors. This attempt to understand motives is an important factor in a reader's search for explanations for events. In the absence of information regarding motivations, readers will use their own knowledge (however deficient) of motives, wants, needs and goals, in order to understand what they read. It is also commonplace for people to use stock explanation patterns (clichés and trite explanations) when making inferences.
These observations suggest a two-way street between the news consumer and the news provider. On the one hand, reader characteristics and behaviors can be viewed as real constraints on the reader's ability to comprehend news content. Insofar as news reading is an inferential process, there are a number of relevant variables that impact reading and learning that the average reader will neither be aware of nor actively seek to compensate for. On the other hand, these same characteristics and behaviors suggest opportunities for news providers to devise appropriate presentation techniques in order to compensate for many of these constraints.
Despite investment in new design approaches to online news presentation, however, techniques that might assist the reader in comprehension are arguably lacking or underutilized in conventional presentation of news. The absence of techniques that might help the reader to make clear connections between events in a story, for example, or that clearly identify important explanatory information, necessarily places the burden of inference on the reader, who, as we have seen, is not typically equipped for the challenge. One consequence of this is that readers don't learn much. Another consequence is that a tremendous amount of useful information remains buried within the text of stories. One starting point for considering new opportunities for enhanced presentation of news, then, begins with the assumption that relevant information must be surfaced from within the news story (or from multiple sources) and presented in such a way that the reader can benefit from its identification.
Enhancing the Presentation Value of Online News
We believe it serves the fundamental mission of journalism to view the observations presented above as opportunities for news providers. Acting on them also responds to a definable consumer requirement. If we know, for example, that the majority of readers do not read beyond the first couple of paragraphs in a story (if they read that far), then some attention to alternative presentation of the remaining text would serve the cause of improved readability. Likewise, if we know that readers tend to gravitate toward pictures or other graphic displays, it would make sense to add more descriptive text nearby, so the reader doesn't need to search through the text for explanatory content.
Making text more readable might be achieved by using techniques that are quite common in other styles of writing (for example, technical communications), but less common in newspaper writing. They include more extensive use of headlines and section headers, with more substantive information contained in them in order to accommodate skimmers, use of highlighting or other visual means of emphasizing content to draw out main ideas or story content or use of "information chunking" or outlines, so that content is organized around main ideas.
In order to support learning, some level of story repetition from day to day might be used, since we know that repetition helps comprehension. Many readers forget the background of stories by the next day's news, and they will not have the patience to search for it within the story text; they will simply skip the story in its entirety. Accommodating readers' characteristics might also include the addition of easy-to-understand (and easy-to-locate) explanatory information. Many readers become lost in story content, and even educated readers need more help with basic ideas than is generally assumed.
Other presentation features could respond to some of the observations made above and, in some cases, reflect a deeper understanding of the content itself. Insofar as they are suggestive of future capabilities, they define some of the emerging content requirements for news products, which include the following:
It is possible, of course, to find examples of some of these features in news presentation today. In most cases, however, either the supporting capabilities are not available or there are other factors that discourage their implementation. From our perspective, they represent examples of products that respond to often unacknowledged reader requirements. While they may not be of interest to all consumers, they can arguably be viewed as options for all consumers to select from. This suggests the overarching importance of personalization in considering new product features that address variables of readability.
There are many ways in which news products might be personalized, including, for example, by subject interest or editorial preferences. By extension, we can envision other personalized features that address variables of readability, including the following:
Implementation of many of these features would obviously be either labor intensive or require the application of sophisticated technology. What they suggest, however, is the importance of identifying preference options based on definable reader characteristics. Insofar as there are personalized variables that support the reader's task of reading and learning, this becomes especially critical. Reader profiling at a fine level of granularity is a prerequisite for creating news products that meet these preference requirements. The importance of meeting these requirements across a diverse reader population will drive a set of requirements for content management that is predicated on the multipurpose nature of content, where the content itself is processed at equivalent levels of granularity. The ability to respond to very fine-grained user preferences necessarily translates into specialized features and capabilities that can improve the reader's ability to read and comprehend news content.
The Importance of Explanation
Explanation answers the questions "why" and "how," two of the six basic questions traditionally addressed by journalists. It is generally accepted that clear and simplified explanations of the "how" and "why" of news stories represent value added to the reader. Good explanations address the common problem, frequently encountered by readers, of reading the news without appropriate background or context, despite the fact that they may have read similar stories countless times before. The learning studies discussed above provide context for understanding reader constraints when seeking explanations. At the same time, explanatory information in news stories is often lacking, inadequate or buried within the text.
Explanations are important because this is what readers ultimately desire. In the absence of explanations provided in the text, readers will either abandon the process or seek their own explanations. Good explanations, by contrast, are those that can anticipate reader questions in advance and provide adequate answers to these questions. In many cases, these questions are not arbitrary or story-dependent, but rather are common and somewhat predictable. Readers will differ among themselves less in their preference for explanations of this type than in the level of explanatory detail that interests them. Some reader questions, for example, are fairly straightforward and refer to basic background information, including history of events, timelines, etc. Other questions relate specifically to cause-and-effect relations that govern events; that is, they explain how something works and address the functional mechanisms behind actions or events. Another set of reader questions addresses the cognitive need to attribute motives to actors. Specifically, these questions relate to actors and their roles, along with their beliefs and goals. One might say that the reader is attempting to flesh out a set of plausible cause-and-effect chains that provide a suitable basis in motives for explaining news events.
From learning studies, we know that it is not sufficient to address explanations in an implicit way, since this places the burden on readers to identify them and/or make inferences on their own. Therefore, good explanations will also make explicit the cause-and-effect relations that are identified as governing an event, or the motives or goals of actors, or the pertinent background or functional information.
We believe there is an opportunity to add value to news content by deliberately presenting the content so that explanation is emphasized. Through use of presentation techniques, the explanatory content can be displayed for rapid comprehension and readability, subject to consumer preferences. A prerequisite for presenting this explanatory information is that it first be identified within the text of the news story (or from multiple sources) and then "surfaced" so that it can be managed as a modifiable feature of the text. What is to be surfaced is what's important for purposes of explanation. We are referring here in many cases to "deep structures" that address the semantics and meaning of the story or event itself. Surfacing explanation features may in some cases require domain-specific knowledge, even if the categories that they represent can be viewed generically across domains. For example, categories might include
From the preceding discussion, we can conclude that there is a need to process content in such a way that a news item can be represented in different ways for different consumers. The content is therefore inherently multipurpose. At the same time, there is a need to process content at a deeper level of granularity than is currently associated with conventional indexing and text analysis.
In many cases, these requirements involve the identification or derivation of text-based features or structures and the application and use of these features or structures for various purposes. For example, categorization is the process of assigning categories to documents. These categories may be pre-defined, or they may be derived from the text of documents. Regardless of the approach, the categories are identified or derived and then applied to new documents, based on features of those documents. This process of categorization can then be used to support a particular purpose, for example, to support matching of documents with reader profiles.
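The categorization process described above can be illustrated with a minimal sketch. It assumes pre-defined categories represented as small sets of indicative terms, and a reader profile represented as a set of category names. The category names, term lists, threshold and function names are all hypothetical and chosen only for illustration; a production system would more likely derive such features statistically from a large body of text.

```python
import re
from collections import Counter

# Hypothetical, hand-made category definitions: category -> indicative terms.
CATEGORY_TERMS = {
    "economy": {"inflation", "market", "rates", "budget"},
    "health": {"hospital", "vaccine", "clinic", "disease"},
    "sports": {"match", "league", "coach", "season"},
}


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def categorize(text, min_hits=2):
    """Assign every category whose indicative terms occur at least min_hits times."""
    counts = Counter(tokenize(text))
    scores = {
        category: sum(counts[term] for term in terms)
        for category, terms in CATEGORY_TERMS.items()
    }
    return {category for category, score in scores.items() if score >= min_hits}


def matches_profile(doc_categories, profile_interests):
    """A document matches a reader profile if they share at least one category."""
    return bool(doc_categories & profile_interests)


story = "The central bank raised rates again as inflation pressured the housing market."
story_categories = categorize(story)
print(story_categories)
print(matches_profile(story_categories, {"economy", "sports"}))
```

The same two steps, deriving features from a document and then applying them for a purpose such as profile matching, recur in the finer-grained requirements discussed in this article; only the features become richer.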
By extension, we can identify a set of content requirements for identification or derivation of features and structures that include the following:
In support of explanation, there is a requirement to provide answers to reader questions. The specific requirement is to be able to extract information from text that provides suitable answers to these questions. Explanatory constructs include structures and features based on schemas that address reader questions about the mechanisms underlying events (how things work) and pertinent background information, including information about actors, their roles, motives, goals and beliefs.
These content requirements are suggestive of possibilities for generation of news products. They provide the basis for identification of specific news products that exploit the underlying content and present it in ways that emphasize its potential uses. We refer to these content requirements as emerging requirements because they are driven by new trends in news production, and also because they require, in many cases, the capabilities of new technologies that are by no means commonplace in their application.
Enabling Technologies for Online News Production
In effect, we are envisioning a sophisticated system that can provide highly refined text analytics as a stage in the process of news product creation. The technology that provides this level of text analytics is grounded in the disciplines of information retrieval, language modeling, computational linguistics, natural language processing, knowledge representation and artificial intelligence.
There is a substantial body of research derived from these disciplines that applies directly to news content. Evidence of this research can be found prominently in the workshops and evaluation projects conducted under the National Institute of Standards and Technology's Text REtrieval Conference (TREC) and related programs. This large and active body of research greatly facilitates the objective of mapping technical capabilities to emerging content requirements. Capabilities evaluated within these efforts represent leading-edge research technologies as well as commercial products and include the following:
The emergence of XML as a foundation for abstract representation of content has led to a variety of approaches to implementing many of the capabilities described above, using XML-based specifications. As such, XML should be viewed as an enabling technology for purposes of addressing emerging content requirements for news products. The use of XML for content representation also provides the basis for considering new models of content production. Instead of viewing news production as the writing of stories and their delivery as packaged units, for example, we can envision an environment in which news content is analyzed and segmented into meaningful components that are then stored in XML databases and utilized in workflows that generate a variety of news products.
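As a rough illustration of content segmented into components, the following sketch uses Python's standard xml.etree.ElementTree module. The element names (story, component), the role attribute and its values are assumptions made for this example and do not correspond to any particular news markup standard. An in-memory tree stands in for an XML database.

```python
import xml.etree.ElementTree as ET


def build_story(story_id, components):
    """Represent a story as a list of components, each tagged with a role."""
    story = ET.Element("story", id=story_id)
    for role, text in components:
        component = ET.SubElement(story, "component", role=role)
        component.text = text
    return story


def select(story, roles):
    """Return the text of the components whose role is in the requested set."""
    return [c.text for c in story.findall("component") if c.get("role") in roles]


# Hypothetical story content and role names, for illustration only.
story = build_story("s1", [
    ("headline", "Council approves transit plan"),
    ("background", "The plan was first proposed two years ago."),
    ("cause", "Rising commuter traffic prompted the vote."),
    ("detail", "Construction is scheduled in three phases."),
])

# The serialized form is what would be stored and exchanged.
xml_text = ET.tostring(story, encoding="unicode")

# Two different products generated from the same stored components.
brief = select(story, {"headline"})
explainer = select(story, {"headline", "background", "cause"})
print(xml_text)
print(brief)
print(explainer)
```

Because each component carries its role explicitly, a workflow can assemble a headline-only brief or an explanation-oriented version from the same source without re-analyzing the story text.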
Content Management for News Production
The news production environment we are envisioning has both high-tech and low-tech components. The technologies presented above serve the purpose of meeting requirements for intelligent information retrieval, text analysis and knowledge-based representation. Low-tech requirements, that is, those that require human intervention, include the application of human factors knowledge, presentation techniques and good authoring and editing.
The general model for content production includes aggregation of content from multiple sources, categorization and summarization of content, analytical processing and representation of content for multiple purposes. One of these purposes is to provide "rough copy" to subject experts, editors and writers, or software agents, who generate content for final consumption (as explanations, summaries, highlighted text and so forth).
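The stages of this general model can be sketched as a simple pipeline. Everything in the sketch is an illustrative assumption: the Item structure, the feed names, the keyword lists, and in particular the stand-ins for categorization (substring matching) and summarization (taking the lead sentence), which a real system would replace with far more capable text analytics.

```python
from dataclasses import dataclass, field


@dataclass
class Item:
    source: str
    text: str
    categories: set = field(default_factory=set)
    summary: str = ""


def aggregate(feeds):
    """Collect items from multiple sources into one list."""
    return [Item(source, text) for source, texts in feeds.items() for text in texts]


def categorize(item, keywords):
    """Stand-in categorizer: assign a category if any of its keywords appears."""
    lowered = item.text.lower()
    item.categories = {
        category for category, words in keywords.items()
        if any(word in lowered for word in words)
    }
    return item


def summarize(item):
    """Stand-in summarizer: keep only the lead sentence."""
    item.summary = item.text.split(". ")[0].rstrip(".") + "."
    return item


def rough_copy(items, category):
    """Produce 'rough copy' lines for an editor working on one category."""
    return [f"[{i.source}] {i.summary}" for i in items if category in i.categories]


# Hypothetical feeds and keyword lists.
feeds = {
    "wire-a": ["The city council approved a new transit budget. Work begins in May."],
    "wire-b": ["The home team won the league final. Fans celebrated downtown."],
}
keywords = {"politics": ["council", "budget"], "sports": ["league", "team"]}

items = [summarize(categorize(item, keywords)) for item in aggregate(feeds)]
print(rough_copy(items, "politics"))
```

The output of the last stage is intended for a human editor or writer, or a software agent, rather than for final consumption.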
This view of production is clearly simplified. Perhaps of most interest is the suggestion that these technologies will be exploited rather routinely in news production environments in the future. Because of the nature of the technologies and the requirement that there be large text databases for use as source content, it is reasonable to expect that these production environments will be hosted primarily by large content aggregators. The implication is that the news production environment in the future will be fundamentally different from what it is today, both in terms of how news content is generated and in the kinds of work and skills needed to generate that content. This change does not mean that journalists and editors will not continue to create news content in the traditional way, but rather that the downstream uses of this content will be leveraged so that more value can be added to news products, and the value inherent in the content can be more fully exploited.
The content production requirements addressed here apply not just to news content, of course. For example, it's possible to envision knowledge management solutions based on these technologies. While the basic model is similar, the motivation for this article is also driven by what might be called the unique mission of journalism. This mission includes the objective of educating consumers about events in the world, and it is in support of that aim that these capabilities will provide assistance. Given the very large potential population of news consumers and the tremendous diversity of their preferences, the application of these capabilities to news content also presents some unique challenges that expand and enrich the basic model. The underlying aim that motivates this effort is to make text more readable and to increase reader learning. In this age when news providers have incentives to experiment with new ideas, it seems appropriate to consider approaches that specifically address known problems in reader comprehension.
Copyright © 2004, American Society for Information Science and Technology