Capabilities-Based Evaluation Processes: Moving Beyond the Fear of Failure

Published June 20, 2007
By Katherine C. Gandara
Headquarters Air Force Operational Test and Evaluation Center Chief of Public Affairs

KIRTLAND AIR FORCE BASE, N.M. -- The following is an interview with the Headquarters Air Force Operational Test and Evaluation Center Technical Advisor to the Commander, Mr. Jerry Kitchen, about the value of capabilities-based evaluation processes and how AFOTEC is incorporating this approach into how the Center conducts and reports its test and evaluation results on weapon systems.

Q: What is a capabilities-based rating system, and how does it compare to the effectiveness and suitability-based rating system?

A: A capabilities-based rating system is tied to the transformation that the Department of Defense has been executing for the last several years. We've transitioned from the Cold War mentality of a requirements-based acquisition system to a capabilities-based acquisition system. A capabilities-based acquisition system is tied to the focus of the Department of Defense to execute effects-based warfare or effects-based operations. If we want to generate a specific effect, we look at our capability to do so. If a capability shortfall or capability gap exists, we look at a variety of ways to bridge that gap. If we can't bridge that gap with non-materiel means, we use the acquisition system to bridge that capabilities gap which allows us to generate the desired effect. In the past we listed specific requirements. For example, we used to specify that the radar must detect a one square meter target at 200 miles. Now we say the radar must have the capability to detect an incoming target such that we can engage it before it is able to use its weapons against us. Our capabilities-based rating system confirms that the capability gap has been bridged by showing that the weapon system enables the warfighter to accomplish the mission objective by generating the desired effect. We still rate effectiveness and suitability. We use the lens of the operation to focus our assessment of effectiveness and suitability shortfalls on the impact to the operation.

Q: How does this transformation work in the test and evaluation arena? Past practice was based upon a desire that the system not be delivered to the user until it did everything it was supposed to do to support the mission.

A: This aspect of acquisition transformation infuses the concept of spiral and incremental acquisition. We look to the acquisition community to deliver as much capability as possible as quickly as possible. We understand that there may be a need to make modifications in the future to gain the other capabilities that are needed, but the users get something now that they can use. This is a departure from the way we acquired weapons systems in the past, when they were expected to do everything according to the requirements document. If they could not, then we went back to the drawing board. Consequently, we delayed getting weapon systems into the warfighters' hands. If a system had some capability, even as much as 80 percent of the capability that the warfighter needed, but could not deliver the other 20 percent, it was delayed. The developer was told to "keep working on it until you get the other 20 percent delivered." Incremental acquisition is supported by our transition to a capabilities-based reporting system. The decision-makers can more easily see what capability has been delivered in that increment.

Q: Why is AFOTEC leading this effort toward a capabilities-based rating system?

A: AFOTEC is leading the transition to a capabilities-based rating system because we listened to our customers, the warfighters. They were constantly asking, "So what? What do effectiveness and suitability mean in terms of getting the job done?" Our responsiveness to the transforming acquisition environment is keeping us relevant, and we believe other test organizations and oversight agencies will eventually follow suit with similar transitions.

Q: Since the 1990s, the U.S. military has been involved in a much higher operational tempo. One of the concerns that has been expressed, when it comes to test and evaluation and acquisition, is that the combatant commanders have an immediate need in the field to be able to do "X, Y, and Z" and they can't wait 12 to 24 months. They need at least an 80 percent solution, and sometimes they're even willing to take a 50 percent solution, in order to accomplish the mission they're facing. How does this approach address that?

A: That question allows me to make an important point. The transformations being initiated are not about the approach of "buy it now and fix it later". That has been a terrible mantle that we've worn in the past. The capabilities-based, agile, streamlined, and spiral acquisition initiatives are all about buying the "A" model now, and improving the baseline to the "B" model later as we better understand the technology or the threat.

Q: Would a more simplified explanation be similar to the example of how a new computer software program initially comes out as a "version 1.0" and does what it says it's going to do, but later versions, like version 1.1, may add to the capabilities of that particular software?

A: Exactly. Version 1.0 allows you to work with that software today instead of waiting three more months for a release that would do both what you need today and what you need tomorrow. Even if you need both of those capabilities today, you accept what the software company can deliver today because that's the more important driver. Then they continue to work on developing the rest of what you wanted, and deliver it when it's ready.

Q: So, is this approach about focusing more on the needs of our customers, the warfighters?

A: Yes, and I can't emphasize that enough. I've encountered many people who are detractors of the capabilities-based, agile, streamlined, and spiral acquisition initiatives who say this approach is a cloak to "buy it now, fix it later". This is not true. If the weapon system has significant effectiveness and suitability shortfalls that drive substantial or severe impacts to the operations and is in need of further development, then AFOTEC reports that. But, if the system can do certain things and can't do other things, like some aspects of the mission or some parts of the mission under certain conditions, we report that as well. Using our evaluation findings, the decision-makers and the warfighters can make a determination of operational acceptability.

Q: Is this approach more relevant because you have the combatant commanders who are in Areas of Responsibility who often need a capability now and don't have time to wait? Would you say this is the more responsive approach?

A: Absolutely. It becomes relevant from the perspective of, "Can you get the mission done?" We've always been required to answer the questions, "Is the system effective... is it suitable... is it survivable?" We've always had to answer those questions and we'll continue to answer those questions. But, now we put it in more relevant terms of, "You can get the mission accomplished... or you can get the mission accomplished if you're willing to pay certain operational costs... or you can get the mission accomplished only under certain conditions and/or aspects ... or you can't get the mission accomplished." These mission-oriented statements are more relevant to the end-user who is employing the system, as well as the decision-makers who are either acquiring it or deciding whether or not to field it. The relevancy is really hinged on the operational employment aspect of, "Does the weapon system enable the warfighter to accomplish the mission?"

Q: In addition to the benefits to the warfighter, in the long run doesn't the American taxpayer also benefit with this 'working smarter versus working harder' approach, especially from a cost perspective?

A: Yes, these acquisition initiatives do provide fiscal benefits. We design our tests with the capabilities-based rating system in mind. When we're looking at capabilities, we understand that the battlespace is very complex. It's not just one particular way to employ the weapon system, under one set of environmental conditions, against one set of opposing forces. It's about a whole spectrum of conditions that, when coupled, require comprehensive testing. AFOTEC has adopted a very elegant, robust, and efficient test methodology that allows us to test under a wider spectrum of battlespace conditions, also known as factors and descriptors. While using fewer test resources, these test design techniques allow us to tell the warfighter where the sweet spots are and where the sour spots are. When a warfighter gets into a dogfight, it's better to be a vicious 75-pound junkyard dog than an ankle-biting five-pound Chihuahua. Our capabilities-based evaluation and rating system allows us to address those kinds of things, and to do so efficiently.
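The interview does not name the test design technique AFOTEC uses. As an illustration only, the sketch below assumes a two-level fractional factorial design, one common way to cover many combinations of conditions with fewer test runs. The factor names and levels are hypothetical, not taken from any AFOTEC test plan.

```python
from itertools import product

# Hypothetical two-level battlespace factors (illustrative only).
FACTORS = {
    "time_of_day": ("day", "night"),
    "threat_density": ("low", "high"),
    "weather": ("clear", "adverse"),
}


def full_factorial(factors):
    """Every combination of factor levels: 2**k runs for k two-level factors."""
    names = list(factors)
    return [dict(zip(names, combo))
            for combo in product(*(factors[n] for n in names))]


def half_fraction(factors):
    """Half of the full design, chosen so the product of coded levels is +1.

    Each level is coded -1 (first level) or +1 (second level). Keeping only
    runs whose codes multiply to +1 halves the run count while still pairing
    every level of one factor with every level of any other factor.
    """
    names = list(factors)
    runs = []
    for codes in product((-1, 1), repeat=len(names)):
        sign = 1
        for c in codes:
            sign *= c
        if sign == 1:
            # Map code -1 to index 0 and code +1 to index 1.
            runs.append({n: factors[n][(c + 1) // 2]
                         for n, c in zip(names, codes)})
    return runs


full_runs = full_factorial(FACTORS)
reduced_runs = half_fraction(FACTORS)
```

With three factors the full design needs eight runs and the half fraction needs four. The trade-off is that some interaction effects can no longer be separated from main effects, which is why a real test plan would choose the fraction deliberately.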

Q: What are some pitfalls we should avoid when implementing capabilities-based testing?

A: When we're testing a weapon system, we have to guard against making an incorrect conclusion. When analyzing test data, I face four possible outcomes: (1) I claim the weapon system works, and it really does - I made the right call; (2) I claim the weapon system works, but it really does not - I made the wrong call; (3) I claim the weapon system does not work, and it really does not - I made the right call; or (4) I claim the weapon system does not work, but it really does - I made the wrong call. It's important to guard against making an incorrect conclusion because we don't want to reject good weapon systems. If we reject a good weapon system...a weapon system that really performs well, but improperly conclude it does not get the job done, we delay getting that weapon system into the warfighters' hands. We don't want to delay giving urgently needed tools to the warfighter. On the other hand, when the warfighter squeezes the trigger and it doesn't work, we've put the warfighter at a tremendous disadvantage. So, it's critical that our capabilities-based rating system provides accurate information. Our current terminology of "fully mission capable" or "partially mission capable" or "mission capable" or "not mission capable" is in the warfighters' lexicon. We must give an answer they can understand and trust.
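The four outcomes described above form a two-by-two table of the tester's claim against the system's true performance. A minimal sketch follows; the names `Outcome` and `classify` are illustrative and do not come from any AFOTEC tool.

```python
from enum import Enum


class Outcome(Enum):
    CORRECT_ACCEPT = "claimed it works, and it really does (right call)"
    FALSE_ACCEPT = "claimed it works, but it really does not (wrong call)"
    CORRECT_REJECT = "claimed it does not work, and it really does not (right call)"
    FALSE_REJECT = "claimed it does not work, but it really does (wrong call)"


def classify(claimed_works: bool, really_works: bool) -> Outcome:
    """Map the tester's claim and the ground truth to one of the four outcomes."""
    if claimed_works:
        return Outcome.CORRECT_ACCEPT if really_works else Outcome.FALSE_ACCEPT
    return Outcome.FALSE_REJECT if really_works else Outcome.CORRECT_REJECT


def is_right_call(outcome: Outcome) -> bool:
    """True when the claim matches reality."""
    return outcome in (Outcome.CORRECT_ACCEPT, Outcome.CORRECT_REJECT)
```

The two wrong calls carry different costs. `FALSE_REJECT` delays a good system reaching the warfighter, while `FALSE_ACCEPT` fields a system that fails when the trigger is squeezed.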

Q: How does this fit into the future of test and evaluation? As there seem to be indications of more acceptance of it with the other services' operational test agencies, do you see this as the practice of the future?

A: I do. I think that many of us in the test and evaluation community have been searching for a solution like this for a long time. At AFOTEC we have established what we call our test construct. The test construct becomes the foundation from which everything else is built. The test construct has three fundamental parts. First, the construct focuses on the mission - the actions we want to perform with the weapon system. We call those actions critical operational issues - critical mission elements or mission objectives. Second, the test construct captures the things we want the weapons system to be. Those attributes or characteristics are identified as operational capabilities of the weapon system. Third, the test construct lists measures that link the first two things together. The test construct shows the warfighter the relationship between the desired capabilities and mission accomplishment, based on the results of test and evaluation measures which identify the strengths and weaknesses of the system. I've been working in the test and evaluation business since 1983, and I've yet to come across a test construct as elegant as this. I'm confident we will continue to use this test construct, and I predict the other Services' operational test agencies will implement this or a similar construct.

Q: Would the analogy of this approach be similar to the difference between building a house's foundation on rock versus building it on sand?

A: Exactly. If you build your house on a loose foundation, or even on a pile of rocks that are not bound together with a lot of concrete to form a firm foundation, the house is going to fall apart. This construct forms a firm foundation. It facilitates accurate, balanced, and complete reports, allowing us to make definitive statements. And we can make those statements with a high degree of confidence because of our robust test designs.

Q: In regard to test and evaluation reform, a Feb. 26, 2002, Test and Evaluation Conference speech by the Director of Operational Test and Evaluation, Mr. Thomas Christie, spotlighted the importance of how test reform helped the test and evaluation community avoid the "rush to failure" by doing their homework early. More specifically, the speech highlighted how "testing is for learning" and "that may sound somewhat trite, but how often have we strayed from that dictum and reflected the proverbial 'pass/fail' mentality we're so often accused of." What are your thoughts on that?

A: The "rushing to failure" phrase is one that I have heard more than a few times. When you think about it, what is the purpose of test? As professional testers, we know there are lots of reasons we perform tests. We sometimes test to characterize a capability. We test to optimize a capability. Sometimes we're merely validating some of the requirements or confirming a specification through specification compliance tests. Sometimes it's about doing a functional acceptance test. Other tests focus on fault detection. All of these are important reasons for testing, and each helps us learn about the capabilities of the weapon system. Our capabilities-based rating system allows us to report the things we learned during the test. So, in that regard, it really does go back to the idea that testing is for learning, as Mr. Christie put it five years ago. Beyond that, AFOTEC's capabilities-based rating system ushers out the effective/not effective, suitable/not suitable, 'pass/fail' mentality that served the acquisition system so poorly for many years.

Q: So, in the end, if AFOTEC does its job correctly, does that translate into our operational forces going out and successfully doing their job?

A: It really does. It is very satisfying to learn that the weapon system was employed in the Area of Responsibility, in combat, and performed the way we predicted it would, based on our test results. We never want to hear the warfighter say, "I thought the tester said this thing worked!" AFOTEC must ensure our reports are timely, accurate, balanced, and complete, with a clear, concise, and cogent description of mission capability.