Bulletin
Analytic Myopia, Data Disintegration and Homeland Insecurity
Lee S. Strickland is director, Center for Information Policy, in the College of Information Studies, University of Maryland. He recently retired from the Central Intelligence Agency as a Senior Intelligence Officer.
America remains a dangerous and conflicted environment – dangerous given the increasingly virulent threat of terrorism as it becomes institutionalized in global and independently directed communities and conflicted given the diverse opinions and judgments regarding a response. The President asserts a lack of actionable intelligence, government critics allege that there was a systemic failure of intelligence, others debate the appropriate balance between civil liberties and security, and the 9/11 Commission recommendations may well lead to the creation of a separate department for domestic intelligence. This article examines perhaps the most critical topic in the debate – intelligence performance – and suggests that, although there has been a fundamental misunderstanding of intelligence, there is most significantly a dysfunctional information space in which to conduct effective analysis of the terrorism threat. In this article we identify two general problems: data disintegration and analytical capability. We discuss the problems in detail, including specific remediation efforts that could substantially improve our homeland security posture. We conclude by emphasizing the tight integration between technology and the academic discipline of analysis in the intelligence arena and suggesting that reorganization, without addressing the root cause of failure, will be unavailing.
The recurring demand for "actionable intelligence," a term that first emerged in the media in the context of the Somalia debacle some 10 years ago, is consistently offered as both a criticism and a rationale for inaction. Indeed it has even begun to invade the lexicon of competitive intelligence in the business environment. And while it is critical for intelligence to make its factual findings and conclusions as quantitatively and qualitatively explicit as possible, consumers must understand what the intelligence process can provide. Quite simply, the inputs (relevant factual data) may be limited and must be augmented by assumptions. Furthermore, the risk of denial and deception cannot be eliminated completely, and the question or issue posed must be capable of being answered. All too often, the intelligence question cannot be resolved as a matter of logic because it depends on events yet to happen, such as human intentions that will unfold or factors that will converge by chance. In a cloud of mysteries our intelligence collection and analysis methodologies can inform us – allow us to understand the threat and identify vulnerabilities – but cannot provide an answer where none exists. In sum, the intelligence process, however rigorous, does not remove the obligation for difficult decision-making by the national leadership.
Recognizing the Dysfunctional Information Space
The information collection and analysis environment at the Federal Bureau of Investigation (FBI) clearly presents the conundrum facing certain elements of the intelligence community today – critical demands placed upon a poor information working space characterized by disintegrated data and impaired analytical capabilities. Yet analysis is at the heart of an effective counter-terrorism effort. Such analysis first requires effective electronic record-keeping systems that ensure the availability of information over time. Secondly, there must be experienced professionals trained in the scientific discipline of analysis and supported by productive analytical tools that provide cross-platform access to relevant information and also augment the human endeavor of analysis. With such systems, training and tools, analysts can have far greater ability to search, collaborate, visualize data, perform entity and relationship extraction from full text and ultimately conduct data mining or knowledge discovery work for predictive information that could prevent another 9/11. Let us consider both of these issues in turn.
The Data Disintegration Problem
We begin with the argument that effective information management (IM) and information technology (IT) are key elements of the solution – notwithstanding the suggestions by former Harvard Business Review editor Nicholas Carr that IT matters less today, that it is an undifferentiated commodity, that we should spend less and that we should follow not lead. I believe the issue is not whether IT is a commodity like the electric power industry; rather, the problem is that we have balkanized assets – information – and balkanized tools – technology – that don't work well together. It is as if the electrical system comprised a multitude of generators producing at different voltages and frequencies, trying to move through overlapping networks and powering appliances with widely varying electrical requirements. The single most significant challenge facing IT in the intelligence community is this environment – enterprises with hundreds of isolated, stove-piped applications that manage redundant data and don't communicate effectively, if at all, with each other.
What Does This Mean for Productivity and Homeland Security? This environment is the FBI's problem, and, although it is hardly unique, the consequences are substantial, including high costs since hardware, software and information common to all projects are redundantly procured or developed. And there is adverse impact to mission. A stove-piped environment is slow, ineffective, costly, complex and fragile. But most of all, from a mission perspective, there is the assurance of inconsistent answers because there is no authoritative data source. Stated differently, this environment guarantees that the right people do not have the right information when required.
So Why Does This Environment Continue to Exist? In part it reflects the nature of IT development in government. Generally every department has an IT capability and/or funding that can address critical needs faster than centralized IT units, which are notoriously unresponsive. As a result, it's possible and relatively easy to build your own stove and stovepipe. In part the environment also continues to exist because of the cost of re-engineering. That is, few mission directors can overcome bureaucratic inertia and push for a very significant investment that seemingly provides no new functionality. And lastly, part of the reason lies in the demands for data protection (national security classification and the "need-to-know" principle) that work against integrated solutions, including integrated data repositories.
However, the most significant reason is the complexity of the enterprise and that means generally that there is no comprehensive enterprise architecture (EA) – an organizational blueprint that defines in business, technology and data terms how an organization operates now, how it will operate in the future and what technology and business process changes will be required to make that transition. For example, the data layer of this approach is known as the data architecture and is a model of all data maintained by an organization – itemizing types, definitions, rules and relationships – that support the business processes defined in the EA. Without such an architecture, an organization will not have effective information management, and modernization efforts will yield mixed results at best because requirements – the software-engineering process that defines what the system should do and ensures that business and engineering are working on the same product – will be poorly done by definition. In sum, an EA is critical to business success.
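The idea of a data architecture – a single authoritative model itemizing types, definitions, rules and relationships – can be sketched in a few lines. The entities, attributes and rules below are invented for illustration; a real enterprise catalogue would be far larger and formally governed.

```python
# Toy fragment of a data architecture: each enterprise data entity is
# catalogued once, so every application shares one authoritative
# definition instead of redundantly inventing its own.
# All entity names, attributes and rules here are hypothetical.

data_architecture = {
    "Subject": {
        "definition": "person of investigative interest",
        "attributes": {"subject_id": "int", "name": "str"},
        "rules": ["subject_id is unique across the enterprise"],
        "relationships": ["Subject appears-in Case"],
    },
    "Case": {
        "definition": "an investigative case file",
        "attributes": {"case_id": "int", "office": "str"},
        "rules": ["every Case has exactly one owning office"],
        "relationships": ["Case references Subject"],
    },
}

def validate_relationship(entity, relation):
    # A shared catalogue lets any system check that a proposed link
    # between records conforms to the enterprise model before
    # redundant or inconsistent data is created.
    return relation in data_architecture[entity]["relationships"]

print(validate_relationship("Case", "Case references Subject"))
```

The point of the sketch is the single source of truth: with one catalogue, two applications cannot silently disagree about what a "Subject" is.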
The FBI Revitalization Effort. Although the FBI's Trilogy program is intended to revitalize the IT environment, the effort remains significantly over budget and behind schedule. And the General Accounting Office (GAO) has had much to say on this subject, finding in October 2003 that the lack of an EA had adversely impacted the success of Trilogy and its two principal components – a new infrastructure (a wide area network linking field offices) and a new investigative case management system, the Virtual Case File (VCF).
Specifically the GAO found that Trilogy was a good, albeit initial, step in the effort to bring the FBI's information management into the 21st century and to provide an information platform for analysts, but that it would integrate only a handful of the existing 234 nonintegrated, stove-piped applications. Moreover, it observed that without a detailed EA an agency cannot ensure systems requirements for any other development effort are consistent with the architecture and it further follows that program management will falter because it is impossible to produce an integrated project view (IPV) that details dependencies, linkages to other projects and how changes here as well as changes in resources or requirements will impact schedule, quality and/or costs. (See For Further Reading.)
The Analytical Capability Problem
Understanding intelligence, managing expectations and re-engineering the means by which IT is designed and managed are critical objectives if we are to improve our homeland security profile. But we must also address the discipline of analysis and the ability to provide quality analysis to the national leadership. As we shall see, this requires a foundation of training, sharing and automated tools.
Analytical training. As I suggest in classes at the University of Maryland, "Analysis doesn't just happen, and immersion in data is not an analytical technique." It follows that the human element in analysis is as critical as the information science and technology issues that we have considered. We begin with the recognition that there are critical informational and process differences between analysis for law enforcement and analysis for strategic or intelligence purposes. The former focuses on the collection of specific information in the context of an actual crime and individualized suspected wrongdoing while the latter focuses on the collection of generalized information that may prove relevant to future investigations often without any evidence of specific wrongdoing. Stated differently, the purpose of intelligence is to collect the totality of information relevant to mission and to develop knowledge of actions, events and/or threats that might affect domestic stability or national security. And this dichotomy is important because law enforcement tends to do the first well and the second less well as exemplified by the infamous FBI memo from its Phoenix, Arizona, office that was not acted upon prior to 9/11.
The FBI culture has focused on reacting to crime and has favored agents developing and tightly holding information about individual cases rather than utilizing it as part of a strategic intelligence effort. Indeed it was not until the 1993 World Trade Center (WTC) bombing that prevention, and hence intelligence, were viewed as important. However, personnel resources were not applied, and even a 1998 strategic plan that called for a professional cadre of analysts did not change the status quo – a review in 2000 found that 66% of the analysts were not qualified. Although that review made recommendations for improvements, little changed and the effort was dissolved after the 9/11 attacks.
What is the science and profession of analysis? We begin by remembering the roots of analysis in the federal government – when General William Donovan was made the Coordinator of Information (COI) and later the Director of the Office of Strategic Services (OSS) by President Roosevelt. Donovan understood, based on consultations with British Intelligence, that research and analysis were the heart of any intelligence organization and accordingly enlisted Dr. William Langer, the noted diplomatic historian from Harvard, and others trained in the academic paradigm of research to create the forerunner of CIA's Directorate of Intelligence.
Through their efforts, intelligence analysis and the scientific method became synonymous. This means that judgments are developed through a rigorous, scientifically based human process of analysis of competing hypotheses. It proceeds from the identification of a full range of hypotheses (prospective answers to the research question), the evaluation of the evidence and assumptions for consistency or inconsistency with each hypothesis, and the identification of most likely outcomes. Without this approach, intelligence becomes an exercise in satisficing – the selection of the first identified or most politically favored hypothesis that appears "good enough." Satisficing is little better than guessing, and it carries significant analytical weaknesses of its own.
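The mechanics of analysis of competing hypotheses can be sketched compactly. The hypotheses, evidence items and consistency ratings below are invented for illustration; the method (associated with Richards Heuer's work at CIA) ranks hypotheses by how little evidence contradicts them, rather than by how much appears to confirm them.

```python
# Minimal sketch of analysis of competing hypotheses (ACH).
# All hypotheses, evidence items and ratings are hypothetical examples.
# Ratings: +1 consistent, 0 neutral/ambiguous, -1 inconsistent.

hypotheses = ["H1: attack planned abroad", "H2: attack planned domestically"]
evidence = {
    "intercepted travel bookings":    [+1, -1],
    "domestic flight-school reports": [-1, +1],
    "foreign financial transfers":    [+1, -1],
}

def inconsistency_score(h_index):
    """Count the evidence items inconsistent with hypothesis h_index.
    ACH weighs disconfirming evidence: the hypothesis with the FEWEST
    inconsistencies survives, not the one with the most confirmations."""
    return sum(1 for ratings in evidence.values() if ratings[h_index] == -1)

# Rank hypotheses from least to most contradicted.
ranked = sorted(range(len(hypotheses)), key=inconsistency_score)
for i in ranked:
    print(hypotheses[i], "- inconsistencies:", inconsistency_score(i))
```

Even this toy version shows why the method resists satisficing: a hypothesis that "looks right" on two confirming reports still loses if a single solid report contradicts it less than it contradicts the alternative.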
Of course, even with this systematic approach, one must be cautious as to various cognitive biases – mental errors caused by our simplified information processing strategies. These may include problems arising from vividness, absence of evidence, the desire for consistency, the cumulative effect of probabilities, the persistence of impression, the preference to see patterns, the preference to see centralized direction, the preference to overestimate internal factors and underestimate external factors, the overestimation of our own importance especially with respect to successes but less so with failures, and illusory correlation (perceiving relationships that the data do not in fact support). And of course there are other biases – cultural, emotional, organizational or those of self-interest. But there is little doubt that a scientifically based process of analysis yields the best and most defensible intelligence.
Sharing. Information sharing – between peers as well as among federal, state and local homeland security agencies – is a problem of legendary proportions and more complex than the security clearance and secure connectivity issues that tend to be our focus today. Early in 2004 the Markle Foundation released its second report (see For Further Reading) on national security in the information age urging that the handling of intelligence should be decentralized (networked and not hierarchical), that this network should include not only the federal government but also state and local governments and private industry, that policies should be adopted to empower and constrain the government (balance privacy and security interests) and that we should focus on prevention, thus contemplating the need for strategic homeland security intelligence.
However, several points follow from the Markle report. First, the problem from an individual analytical viewpoint just got much larger as did the problem for the individual organizational node on the network: What information is needed and how can it be integrated into a solution? And second, there is the cultural problem. Sharing is constrained today by the context of classification that developed in the Cold War where the leak of a single bit of information could have catastrophic consequences, where the primary users were the most senior policy makers in the federal government and where dissemination was limited by a rigid "need to know."
The picture today is vastly changed and the mindset must be changed to broad sharing. But the complexity is that we have thousands of nodes of information in this country and each, quite simply, doesn't know what it doesn't know or what it needs to know. Although there are processes in place to allow the dissemination of even sensitive information while protecting sensitive sources and methods, such as sanitizing or tear-line dissemination, they are cumbersome, tend to slow the flow of information and often don't include state and local agencies. One remedy would be to create the less sensitive dissemination versions up front, while another would be to designate responsibility for this function both individually and agency-wide. The bottom line is that our sharing model is dysfunctional and must be re-engineered.
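Tear-line dissemination, mentioned above, lends itself to a simple mechanical sketch: a report is drafted with the releasable substance below a marker, so a lower-sensitivity version can be produced automatically rather than through a slow after-the-fact review. The marker text and report content below are invented for illustration.

```python
# Sketch of tear-line dissemination: everything below the tear line is
# written to be releasable, so a sanitized version for state and local
# agencies can be produced mechanically, up front.
# The marker string and report text are hypothetical.

TEAR_LINE = "-----TEAR LINE-----"

report = """SOURCE: sensitive collection details (not releasable)
-----TEAR LINE-----
THREAT: credible reporting indicates interest in port facilities."""

def releasable_portion(text):
    """Return only the text below the tear line; source and method
    details above the line never leave the originating agency."""
    return text.split(TEAR_LINE, 1)[1].strip()

print(releasable_portion(report))
```

The design point is the article's "up front" remedy: when the releasable portion is authored as such, dissemination to thousands of nodes does not wait on a case-by-case sanitization decision.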
The Tsunami of Information and the Need for Analytical Tools. Another key is addressing the volume of information – a veritable tsunami – and the need for tools. In short the totality of this information far exceeds the ability of any organization to effectively and completely analyze it and render judgments. And there are several aspects to this issue. One is that textual information must be captured and be retrievable in an effective electronic record-keeping system. Another is that the quantity of textual information or structured data quickly outstrips the working capability of the mind to retain and thus analyze discrete factual information. Yet another is the necessity to integrate that unstructured text information with structured data – many organizations tend to ignore free-form text because of the analytical difficulty it presents. The last issue is the importance of tools both to work on the problems of entity and relationship extraction from text and to work on the analysis of the resulting data – the discovery of trends or links that are quite simply not obvious to the human analyst. And the size of the problem? It is estimated that 80-90% of all government data is textual and not structured.
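Entity extraction, the first of the tool problems named above, can be illustrated in miniature. The patterns and report text below are invented; production systems of the kind discussed later in this article use trained linguistic models rather than regular expressions, but the input/output shape of the task is the same.

```python
# Toy illustration of entity extraction from free text. The report text
# and the crude all-caps/place-name patterns are hypothetical stand-ins
# for real person- and location-recognizers.
import re

report = ("Subject ALPHA wired funds to Subject BRAVO in Hamburg; "
          "BRAVO later met CHARLIE in Kuala Lumpur.")

# Persons: runs of four or more capital letters (a deliberate toy rule).
persons = set(re.findall(r"\b[A-Z]{4,}\b", report))

# Places: capitalized word(s) following "in".
places = set(re.findall(r"\bin ([A-Z][a-z]+(?: [A-Z][a-z]+)?)", report))

print(sorted(persons))  # extracted person entities
print(sorted(places))   # extracted location entities
```

Once entities are pulled out of free text like this, they become structured data that can be joined with existing databases – precisely the text-to-structure integration the paragraph above says organizations tend to avoid.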
The complexity is highlighted if we consider the actual analytical process for a moment. The analyst attempts to identify relevant data and relationships in order to answer the intelligence (research) question presented, typically through a process of creating and testing hypotheses. But we quickly reach the limits of human memory, thus necessitating the process of decomposition and externalization – segregating and recording key evidence – supported yesterday by the analyst's "shoe box." But the proliferation of data means that it is simply impossible to identify and record relevant data manually, much less to find the hidden or non-obvious relationships. The question is: Can the analyst take a body of documents of interest to a specific issue, visualize the subset that is of particular relevance, extract specific entities from it and identify their relationships?
The answer is part positive and part negative. There are tools that facilitate matching, such as the Violent Criminal Apprehension Program (VICAP), which is a nationwide data information center designed to collect, collate and analyze crimes of violence, especially murder. This FBI-supplied software is currently used by a number of cities, including New York, Los Angeles, Chicago, Detroit, Dallas and Kansas City, and operates by continually comparing all entries in order to detect signature aspects and thus crimes with common offenders. And there are software products that provide much higher levels of analytical support, such as tools for visualization, identification and extraction of relevant entities, and link analysis from companies including Inxight, Attensity and Systems Research and Development that are useful for identifying non-obvious relationships. The negative is that none of these state-of-the-art products appear to be the "holy grail" of analytical tools and none integrate well with the massive data repositories familiar to the law enforcement and intelligence community. They may even encourage creation of additional stove-piped solutions.
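The link-analysis side of these tools can also be sketched. The reports and entity names below are invented; the sketch builds a co-occurrence graph from reports and then surfaces entities connected only through an intermediary, which is one simple form of the "non-obvious relationships" the products above target.

```python
# Sketch of link analysis over entity co-occurrence.
# Reports and entity names are hypothetical; real tools operate on
# extracted entities at vastly larger scale.
from collections import defaultdict
from itertools import combinations

reports = [
    {"ALPHA", "BRAVO"},    # report 1: ALPHA and BRAVO appear together
    {"BRAVO", "CHARLIE"},  # report 2
    {"CHARLIE", "DELTA"},  # report 3
]

# Build an undirected graph: an edge means two entities co-occur.
links = defaultdict(set)
for entities in reports:
    for a, b in combinations(sorted(entities), 2):
        links[a].add(b)
        links[b].add(a)

def indirect_links(entity):
    """Entities reachable in two hops but never directly co-occurring --
    connections an analyst reading reports one at a time would miss."""
    direct = links[entity]
    two_hop = set().union(*(links[n] for n in direct))
    return two_hop - direct - {entity}

print(indirect_links("ALPHA"))
```

With three toy reports the indirect link (ALPHA to CHARLIE through BRAVO) is visible by inspection; with millions of reports it is not, which is the whole case for automated link analysis.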
Finally, Dr. Richard Restak suggests in his well received book, The New Brain, that our time of information overload and multitasking is resulting in behaviors in the population that were previously considered dysfunctional, such as hyperactivity, impulsiveness and easy distractibility, becoming the norm and bringing significant reductions in articulated thought and overall efficiency. This research could have profound impact on intelligence analysis where the problem of information overload has been addressed solely by the insertion of additional technology.
However the recommendations of the 9/11 Commission are ultimately implemented – by a new domestic intelligence agency or the creation of an intelligence service within the FBI – it is critical that the root causes of our intelligence failures be recognized and addressed with specificity. Reorganization is all too often a feckless response to system failures. If the United States is to be successful in its homeland security mission, it will require that we have trained analysts throughout the intelligence and law enforcement community, that stove-piped collections be broken down and sharing effected, and that these analysts look at the totality of the information space through different prisms, with different perspectives, attempting to solve the universe of threat and issues presented. We must recognize that analysis is a professional exercise, mirrored on the research processes in the university community, and provide the requisite national direction and funding.
For Further Reading
A Note about the Definition and Categories of Intelligence
Intelligence can be defined as a process, a product or an organization; in practice it is best characterized as all three at once.
A Note about Data Security
Traditional information systems have not generally allowed data to be separated into different sensitivities within a single database. However, commercial solutions today have reached a level of maturity that allows different levels of access within a given environment. Often termed data level security or label security, the concept is that a security appliance (computer code) mediates access to given elements of data in a database by comparing a security label attached to each data element with a security authorization assigned to a given user. Moreover, this approach easily supports the need for comprehensive role-based security within many organizations – an approach that moves from general authentication to specific authorization predicated on position and hence need-to-know principles.
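The label-security mediation described above reduces to a dominance check on each row of data. The levels, compartments and records below are invented for illustration; commercial implementations (Oracle's Label Security is one example of the product class) perform the equivalent comparison inside the database engine.

```python
# Minimal sketch of data-level (label) security: a mediation layer
# compares each record's classification label and compartments against
# the requesting user's authorizations.
# The level lattice, compartments and records are hypothetical.

LEVELS = {"UNCLASSIFIED": 0, "SECRET": 1, "TOP SECRET": 2}

records = [
    {"label": "UNCLASSIFIED", "compartments": set(),      "text": "port schedule"},
    {"label": "SECRET",       "compartments": {"HUMINT"}, "text": "source reporting"},
    {"label": "TOP SECRET",   "compartments": {"SIGINT"}, "text": "method details"},
]

def authorized_view(clearance, compartments, rows):
    """Dominance check: the user's level must meet the row's level AND
    the user must hold every compartment on the row -- moving from mere
    authentication to need-to-know authorization."""
    return [r["text"] for r in rows
            if LEVELS[r["label"]] <= LEVELS[clearance]
            and r["compartments"] <= compartments]

print(authorized_view("SECRET", {"HUMINT"}, records))
```

Because the check runs per row rather than per database, differently cleared users can share one repository – the integrated-data alternative to the stove-piped systems criticized earlier in the article.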
A Note About Program Management and IPV
Many organizations manage by the "multiple-release timebox approach," which is characterized by a focus on a single team, a single product and a single resource allocation. This is a recipe for data and systems disintegration as well as cost and schedule slippage on individual projects. The IPV approach achieves two goals: first, it allows project managers to plan for contingencies and their effects and, second, it permits senior management to oversee completely disparate (separate) efforts but ones that must come together if they are to meet the demands of the EA. Also frequently ignored is the critical necessity for detailed requirements – not only to ensure compliance with the EA but also to generate an adequate test plan for functional testing as well as integration and interoperability testing that is dependent on a complete data and systems architecture.
Failure of the Analytical Base: Airplanes as Weapons
A classic result of the absence of an analytical base for homeland security issues is presented by the repeated testimony of government officials as to their lack of knowledge of "aircraft as weapons." In point of fact there were multiple warnings of such use that were never made central to our domestic counter-terrorism efforts. In January 1995 Ramzi Yousef, later convicted in the 1993 World Trade Center bombing, conspired with associates in the Philippines to blow up 11 U.S. airliners in a definitive strike against this country. Only incompetence (mixing water with sensitive chemicals that exploded) led to the disclosure of his plot. Based on this report, the FBI in the following year (1996) included this threat scenario in its counter-terrorism planning for the Atlanta Olympics. Three years later (1999), the CIA's National Intelligence Council widely circulated a report that highlighted the potential for terrorists to crash explosives-laden civilian aircraft into critical national targets including the Pentagon, the CIA and the White House. And two years later, the concern over airliner weapons again surfaced when Italian authorities in Genoa established a no-fly zone for the spring 2001 G8 meeting. Yet six months later in September 2001, there was consternation throughout the American government that such an event could take place.
Copyright © 2004, American Society for Information Science and Technology