Tutorial on System Test

System test describes the testing of the composition of heterogeneous components. Systems have several features which make their test problem distinct from traditional IC test.
  • Heterogeneity:

    A system is built from components which are heterogeneous in terms of design style. These components may be pre-designed or pre-manufactured, and the system designer may not have access to the detailed component design information. Design heterogeneity appears in several forms:

    • Hardware-Software Covalidation and Test - Complex systems typically contain a software program running on a processor which communicates with other hardware components. Both hardware and software components may contain design errors which must be identified. The hardware and software design communities employ distinct design strategies, and as a result the nature of design errors in the two domains can be substantially different. Manufacturing defects can occur only in hardware components, but hardware must still be tested for them, and the processor itself can be programmed to assist in hardware test (a minimal sketch of such a processor-assisted test follows this list).
    • Mixed-Signal - Computer-based systems are digital at their core, but analog conversion stages are often required to interface with the external environment. The resulting system is termed Mixed-Signal to indicate the use of both analog and digital components.
    • Asynchronous Systems - The physical size of large systems makes the distribution of a single global clock infeasible. Asynchronous interfaces are often required between components. The lack of a global clock violates the assumptions of most automatic test equipment and test generation CAD tools.
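
    As a concrete illustration of processor-assisted hardware test, the sketch below shows a simplified march-style memory test of the kind a processor can run over its own memory. This is a minimal sketch under stated assumptions, not a production algorithm; the Memory class is a hypothetical stand-in for a memory-mapped hardware region.

    ```python
    # Minimal sketch of a processor-assisted march-style memory test.
    # Memory is a hypothetical stand-in for a memory-mapped hardware region;
    # on real hardware these loops would run as embedded code against
    # physical addresses.

    class Memory:
        """Toy model of a word-addressable memory region."""
        def __init__(self, size):
            self.words = [0] * size

        def write(self, addr, value):
            self.words[addr] = value

        def read(self, addr):
            return self.words[addr]

    def march_test(mem, size):
        """Return the addresses whose cells fail the march elements."""
        failures = []
        for a in range(size):                 # ascending: write 0
            mem.write(a, 0)
        for a in range(size):                 # ascending: read 0, write 1
            if mem.read(a) != 0:
                failures.append(a)
            mem.write(a, 1)
        for a in reversed(range(size)):       # descending: read 1, write 0
            if mem.read(a) != 1:
                failures.append(a)
            mem.write(a, 0)
        return failures

    if __name__ == "__main__":
        print("failing addresses:", march_test(Memory(1024), 1024))
    ```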

  • Functional Test:

    System test requires the evaluation of the functionality of the system as a whole. For this reason, functional test, i.e., testing through the functional system interface, is essential to system test. Functional test has the advantage that the system is tested in the same environmental context in which it is to function. This improves the detection of manufacturing faults by performing test at speed, and it eliminates the possibility of sensing defects which do not manifest themselves as functional faults. Functional test is also required for validation because it is the only way to evaluate the interaction between system components. The drawback of functional test is that it precludes the divide-and-conquer techniques commonly applied to manage test complexity. As a result, functional test generation and response evaluation currently rely heavily on manual interaction and are therefore expensive. A minimal sketch of a functional test harness follows.
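
    The harness below drives a system only through its external interface and judges pass/fail from the observed response alone. EchoSystem and its apply/observe methods are hypothetical placeholders for a real system interface, so this is a sketch of the style of test rather than a definitive implementation.

    ```python
    # Minimal sketch of a functional test harness: stimulus in, response out,
    # no probing of internal nodes. EchoSystem is a hypothetical placeholder
    # whose external behavior is simply to echo its last input.

    class EchoSystem:
        def __init__(self):
            self._last = None

        def apply(self, value):      # drive the external inputs
            self._last = value

        def observe(self):           # sample the external outputs
            return self._last

    def run_functional_test(system, stimulus, expected):
        """Apply a stimulus sequence through the functional interface and
        compare the observed response against the expected one."""
        for step in stimulus:
            system.apply(step)
        return system.observe() == expected

    if __name__ == "__main__":
        print(run_functional_test(EchoSystem(), [1, 2, 3], expected=3))  # True
    ```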

  • Abstract Behaviors:

    The sheer volume of output data produced by a large-scale system complicates the task of observing fault effects. To address this problem, it is common to define a set of abstract behaviors which capture important features of the output data in a concise way. Tests are then constructed so that resulting fault effects have some impact on the abstract behaviors, and are therefore easily observed. Abstract behaviors describe gross features of output data. For example, it is common to evaluate a computer system by determining whether or not it correctly boots a given operating system when it is powered up. The abstract behavior of booting an operating system is easy to evaluate compared to an alternative such as examining the detailed cycle-by-cycle activity on the system bus.

    Another motivation for the use of abstract behaviors is to accommodate the perceptual limitations of the humans evaluating the system. The interfaces of many complex systems interact directly with one or more human users, so human factors must be considered during functional test. When applying functional tests, the type of data a human can apply and the rate at which it can be applied restrict the range of feasible tests. When test results must be evaluated manually, the limits of human perception must be considered to ensure fault detection. A human may not be able to detect an error in a single pixel of an image produced by a video system, so that system may be considered functionally correct in spite of a small error.

    To account for output data volume and human factors, we must characterize system behavior with a set of abstract behavioral metrics which describe important properties of the output. The definition of acceptable abstract behaviors is entirely application-specific. Once abstract behavioral metrics have been defined, metric thresholds are used to differentiate faulty behavior from correct behavior. For example, a video system might be characterized by its frame rate, and a frame rate of less than 24 frames per second may be considered indicative of a fault because the quality loss is noticeable to most humans; a minimal sketch of such a threshold check follows this paragraph. Evaluation of higher-order behaviors is referred to as quality of service (QoS) in the multimedia community.
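
    The sketch below computes a frame rate from frame arrival timestamps and flags anything under 24 frames per second as faulty, following the example in the text; the timestamp lists are illustrative stand-ins for measured data.

    ```python
    # Minimal sketch of an abstract behavioral metric with a pass/fail
    # threshold: mean frame rate, judged against 24 frames per second.

    def frame_rate(timestamps):
        """Mean frames per second over a list of arrival times (seconds)."""
        if len(timestamps) < 2:
            raise ValueError("need at least two frames")
        return (len(timestamps) - 1) / (timestamps[-1] - timestamps[0])

    def frame_rate_ok(timestamps, threshold_fps=24.0):
        """Metric threshold: below 24 fps is treated as a fault."""
        return frame_rate(timestamps) >= threshold_fps

    if __name__ == "__main__":
        fast = [i / 30.0 for i in range(60)]   # ~30 fps: passes
        slow = [i / 20.0 for i in range(60)]   # ~20 fps: fails
        print(frame_rate_ok(fast), frame_rate_ok(slow))  # True False
    ```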

    The focus on abstract behaviors raises fundamental questions for the test process. The correctness of system functionality is no longer a property of the system alone, but of how that system's outputs are perceived by users. Since every human is different, this makes the definition of system correctness a moving target. One approach targets the majority of likely users; for example, a frame rate of 24 frames per second characterizes a video system because that rate is satisfactory for most people. Another approach is to evaluate an abstract behavior whose states are clearly discrete, such as whether a computer system boots its operating system or not. To make manual test evaluation efficient, fault effects must impact these abstract behaviors. On one hand, this complicates the test generation problem by raising the standard for fault effect propagation; on the other hand, it simplifies manual test evaluation by ensuring that fault effects are easily noticeable. Because abstract behaviors are abstractions by definition, many detailed effects are ignored and diagnosis becomes difficult.
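
    The sketch below illustrates a discrete abstract behavior of this kind: a boot check that watches console output for a login prompt within a timeout. read_console_line is a hypothetical hook into the system under test, stubbed here so the sketch is self-contained.

    ```python
    # Minimal sketch of a discrete abstract behavior: "does the system boot?"
    # judged by watching a console stream for a login prompt within a timeout.

    import time

    def read_console_line():
        """Hypothetical hook: return the next line of console output."""
        return "buildroot login:"   # stub so the sketch runs standalone

    def boots(prompt="login:", timeout_s=60.0):
        """Discrete pass/fail: True iff the prompt appears before the timeout."""
        deadline = time.monotonic() + timeout_s
        while time.monotonic() < deadline:
            if prompt in read_console_line():
                return True
        return False

    if __name__ == "__main__":
        print("boots:", boots())
    ```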

  • Interface Test:

    System test addresses the whole system, which includes the system components as well as the interfaces between them. Systems are commonly built using pre-designed and pre-manufactured components whose testing is guaranteed by the vendor. For this reason, the relative importance of interface testing increases, because the interfaces depend on the composition of components, which is unique to each system. The heterogeneity of system components contributes to the complexity of interface design, which must translate between different domains. For example, an interface may transmit data between clock domains, between analog and digital components, or between hardware and software components. Interface logic is a common site of design errors because its design requires agreement between the designers of the communicating components. Components are often designed by different groups of people, and a lack of communication between groups often results in contradictory interface assumptions; the sketch below shows one way such a contradiction surfaces.
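
    In the sketch, a hypothetical producer packs a word big-endian while a consumer designed by another group unpacks it little-endian, a classic cross-group interface bug; looping values across the interface exposes the mismatch.

    ```python
    # Minimal sketch of contradictory interface assumptions: the two sides
    # of a hypothetical interface disagree on byte order.

    import struct

    def producer_pack(value):
        return struct.pack(">I", value)      # producer assumes big-endian

    def consumer_unpack(raw):
        return struct.unpack("<I", raw)[0]   # consumer assumes little-endian

    def interface_test(values):
        """Loop test values across the interface; report (sent, received)
        pairs that disagree."""
        mismatches = []
        for v in values:
            received = consumer_unpack(producer_pack(v))
            if received != v:
                mismatches.append((v, received))
        return mismatches

    if __name__ == "__main__":
        print(interface_test([1, 0x12345678]))  # nonempty: the bug is exposed
    ```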

  • Repairability:

    In systems built from separable components, the ratio between system cost and repair cost makes system repair a cost-effective option. In high-cost and life-critical systems, timely repair is essential to extend the lifetime of the system. The need for repair increases the importance of the diagnosis problem. The ability to perform accurate diagnosis is complicated by limited observability through complex system components, and the use of higher-order behaviors for observability provides only imprecise diagnosis information. Manual observation tools, including logic analyzers, oscilloscopes, and software debuggers, are typically applied to achieve sufficient observability. Even with the assistance of such tools, the diagnosis process remains largely manual.
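
    The sketch below illustrates why higher-order behaviors yield only imprecise diagnosis information. It assumes a hypothetical fault dictionary that maps each failed abstract behavior to every component able to cause it; intersecting the suspect sets narrows the search but typically still leaves several candidates, which is where manual observation tools take over.

    ```python
    # Minimal sketch of diagnosis from abstract behaviors via a hypothetical
    # fault dictionary: intersecting suspect sets rarely isolates one component.

    FAULT_DICTIONARY = {
        "no_boot":        {"cpu", "memory", "bus"},
        "low_frame_rate": {"gpu", "memory", "bus"},
        "no_audio":       {"codec", "bus"},
    }

    def diagnose(failed_behaviors):
        """Intersect the suspect sets of all failed behaviors."""
        suspects = None
        for behavior in failed_behaviors:
            candidates = FAULT_DICTIONARY[behavior]
            suspects = candidates if suspects is None else suspects & candidates
        return suspects or set()

    if __name__ == "__main__":
        print(diagnose(["no_boot", "low_frame_rate"]))  # {'memory', 'bus'}
    ```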