A Structured Approach for Software Module Testing
and Its Implications for Integration Testing

Bahram Khalili
Fidelity Investments
400 Las Colinas Blvd. East
Irving, TX 75039-5579

Murat M. Tanik
Department of Computer and Information Science
New Jersey Institute of Technology
GITC Building, Room 4400
NJIT University Heights
Newark, NJ 07102
ABSTRACT
To ensure the quality of modules, software testing is often practiced. This paper presents a methodology for performing module testing and investigates the reuse of module testing data for integration testing (McConnell, 1996). The underlying hypothesis is that a module is best understood by its developer(s) at the time of development, when the developer is most qualified to construct test cases. These test cases are recorded in a script language and are associated with the module as a mark of quality. They can also serve to test modified versions of the module during the lifetime of the software system. When test cases are recorded in the proposed script language, a test driver is automatically generated to execute the module on the test cases. The proposed module testing approach has been applied in three projects, and improvement of overall system quality has been observed. Reusing module testing data for integration testing is studied. Implications of such an approach for achieving better software system reliability at less cost are addressed.
1. INTRODUCTION
Testing is one of the most critical areas of system development and must be considered carefully throughout all stages of the software development cycle. Module testing is the process of testing the internal operation of all the sub-components of a software system, referred to as modules or subroutines (Hoffman, 1989). This paper presents a structured approach to module testing that constructs the necessary test cases during the development of modules. The proposed testing methodology is presented in Section 2. Experience with the proposed methodology is reported in Section 3. Reusing module testing data for integration testing purposes is presented in Section 4. Conclusions and future directions are addressed in Section 5.
2. STRUCTURED MODULE TESTING
In general, most research and development efforts have been placed on complete integration testing, with little emphasis on module testing. Without effective methods, the development and maintenance of reliable and reusable modules is very difficult. The goal of this section is to describe an approach for systematic module testing that improves system quality and reduces maintenance costs. A formal method for generating test cases is introduced in this section. The generation of these test cases is manual and is done during the design stages of the modules (Weyuker, 1988). Since the format of these test cases is pre-defined and straightforward, they can be generated with minimal effort by the designer. However, the program that invokes these test cases, and thus tests the modules, is generated automatically. The following steps constitute this systematic solution:
· Construction of a test-case language, and
· Design of a test program for the purpose of executing the test cases.
All test cases, test programs, and test prototypes are in the C programming language. This method can easily be extended to other high-level languages by careful consideration of their syntax (Pesch et al., 1985).
2.1 TEST CASE LANGUAGE
A dedicated notation is required to develop test cases for the purpose of module testing. Simplicity and completeness of these test cases are crucial since the test designer must deal with them continuously. The following five steps introduce one such language that is both simple and functionally complete:
Step1) A unique method to activate a module with valid or invalid input(s).
Step2) The expected output, given as an integer, string, character, or any other designated type, in its exact form.
Step3) The type of the returned value (e.g. int, char, in 'C'). This is used to compare the results from step 1 (the actual output) with the expected results hard coded in step 2.
Step4) The exception handler function for erroneous input; when no valid output is expected to be generated, an exception handler function must be invoked.
Step5) An optional expected execution time for this module, in a specific unit of time such as milliseconds. This parameter is intended for use in real-time applications where the execution time of certain modules is of critical importance. A NULL value may be substituted if execution time is not of interest.
The above requirements are illustrated in the following simple module implemented in C:
/* This module accepts a string of size 3 and verifies whether the input string is a valid month of the year. The months are abbreviated in 3 letters, in upper case, and are available in a small database (simulated here by an array of size 12). If the input is a correct month, Boolean TRUE is returned; otherwise the appropriate exception handler function is invoked and FALSE is returned. */
#include <string.h>   /* strlen, strcmp */
#include <ctype.h>    /* toupper */

#define MONTH_LEN 3
#define NUM_OF_MONTHS 12

MONTHS_OF_YEAR months[NUM_OF_MONTHS] =
{ "JAN", "FEB", "MAR", "APR", "MAY", "JUN",
  "JUL", "AUG", "SEP", "OCT", "NOV", "DEC" };
Assume that the user-defined structure MONTHS_OF_YEAR and the buffer "months" are declared and initialized elsewhere, and that the following module has access to them:
BOOL validate_month( char *s )
{
int i, length;
length = strlen(s); /* get the length */
if (length != MONTH_LEN) /* length of month */
len_exception(s); /* invoke exception handler */
else
{
for ( i = 0 ; i < length ; ++i )
s[i] = toupper(s[i]); /* convert to upper case */
for ( i = 0 ; i < NUM_OF_MONTHS ; ++i )
{
if ( strcmp(s, months[i]) == 0 )
return(TRUE); /* month is found */
}
/* if execution thread gets here, month is invalid */
month_exception(s);
}
return (FALSE);
}
The five steps defined above can be applied to this example as follows:
Step 1) "INVOKE"
Syntax: INVOKE:
Description: Used to indicate the start of a new test case. It is followed by the actual module call and the input, "JAN".
Step 2) "EXPECTED-OUTPUT"
Syntax: TRUE,
Description: The Boolean TRUE is expected to be returned since "JAN" is a valid month. If no output is expected, NULL should be used instead.
Step 3) "OUTPUT-TYPE"
Syntax: BOOL,
Description: The type of output expected from this module. The actual returned value and EXPECTED-OUTPUT value are both of this type.
Notes:
· OUTPUT-TYPE could be any valid type provided by the language (e.g. int, char in 'C') or user-defined (using typedefs).
· OUTPUT-TYPE must be NULL if no returned value is expected.
Step 4) "EXCEPTION-HANDLER"
Syntax: Exception handler routine, or NULL,
Description: This module should not invoke any exception handler for "JAN" since it is a valid input. Exception Function "len_exception" would have been invoked if the size of input was not exactly equal to MONTH_LEN. Exception Function "month_exception" would have been invoked if the size was equal to MONTH_LEN but it was not a correct month.
Step 5) "EXECUTION-TIME"
Syntax: LONG Integer, or NULL;
Description: Total expected execution time. A NULL is used for this parameter in this example since the module is not real-time in nature. The ';' indicates the end of the test case.
The above five steps constitute the following test case language:
INVOKE:module_name(inputs),
EXPECTED-OUTPUT,
OUTPUT-TYPE,
EXCEPTION-HANDLER,
EXECUTION-TIME;
This 5-tuple is sufficient to generate test cases for any given module. Some possible test cases for the "validate_month()" module are:
INVOKE:validate_month("ABCDEF"), FALSE, BOOL,
len_exception, NULL;
INVOKE:validate_month(""), FALSE, BOOL,
len_exception, NULL;
INVOKE:validate_month(-1), FALSE, BOOL,
len_exception, NULL;
INVOKE:validate_month("SMU"), FALSE, BOOL,
month_exception, NULL;
INVOKE:validate_month("DEC"), TRUE, BOOL,
NULL, NULL;
INVOKE:validate_month("DEC1"), FALSE, BOOL,
len_exception, NULL;
The above test cases traverse all the possible execution paths (Clarke et al., 1985) of the module under discussion. The test cases should be written at the time the module is implemented, since they are relatively simple to generate in comparison with the time it takes to design the module. The test cases are designed manually for the following reasons:
· A well-defined module that follows software engineering principles should not be so long or complicated as to require many test cases.
· A relatively large-scale system may have hundreds of modules, and if there is an excessive number of test cases for each module, the final module testing may take an unacceptable amount of time.
If the above test cases are examined closely, it becomes clear that there is actually no need to test many input strings of different sizes, because they all take the same paths. However, if the test designer chooses to use an extensive number of test cases for each module, an automated test case generator could be used (Ramamoorthy et al., 1976). This process may not be very simple but is certainly possible. Such automated algorithms are beyond the scope of this paper.
2.2 TEST PROGRAM GENERATION
Upon generation of all test cases for each module, either manually or automatically, a test driver is needed to execute the test cases. Even though implementing test drivers manually is straightforward, it is very tedious and error prone and produces code that is costly to maintain. As a result, test driver generation is a good candidate for automated support. The test driver can be implemented as follows:
- open test case storage file
while ( NOT EOF )
{
    - search for "INVOKE:", start of the next test case
    - start a time-stamp if execution time is not NULL
    - call the module following the "INVOKE:"
    - end the time-stamp and report the time if execution time is not NULL
    - compare actual output with expected, use output type for comparison
    - if (any differences)
        - print error message
      else if (any exception handler invoked)
        - print proper message
}
- display test status
The following assumptions are made in designing the above automated driver:
· For each exception, there is a function of that name, serving as an exception handler.
· When an exception occurs, the module must call the appropriate function.
· The module user is expected to implement the exception handlers to take any suitable actions.
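Under these assumptions, the following is a minimal hand-written sketch of the kind of driver the generator would emit for the validate_month() test cases of Section 2.1. It is only an illustration: the helper run_case() and the variable raised_exception are names introduced here for clarity, the test cases are compiled in rather than parsed from the storage file, and no time-stamping code appears because EXECUTION-TIME is NULL in all of the listed cases.

#include <stdio.h>
#include <string.h>

typedef int BOOL;                 /* assumed project-wide definitions */
#define TRUE  1
#define FALSE 0

extern BOOL validate_month(char *s);     /* module under test */

/* Stub exception handlers: they only record which handler the module
   invoked, so the driver can compare it with EXCEPTION-HANDLER. In a
   real project the handlers would take project-specific actions. */
static const char *raised_exception = NULL;
void len_exception(char *s)   { (void)s; raised_exception = "len_exception"; }
void month_exception(char *s) { (void)s; raised_exception = "month_exception"; }

/* One generated check: INVOKE the module, then compare the actual output
   against EXPECTED-OUTPUT (OUTPUT-TYPE is BOOL here) and the invoked
   handler against EXCEPTION-HANDLER. */
static int run_case(const char *input, BOOL expected, const char *expected_exc)
{
    char buf[32];
    BOOL actual;

    strncpy(buf, input, sizeof(buf) - 1);   /* module may modify its argument */
    buf[sizeof(buf) - 1] = '\0';
    raised_exception = NULL;

    actual = validate_month(buf);

    if (actual != expected) {
        printf("FAIL \"%s\": expected %d, got %d\n", input, expected, actual);
        return 1;
    }
    if ((expected_exc == NULL) != (raised_exception == NULL) ||
        (expected_exc != NULL && strcmp(expected_exc, raised_exception) != 0)) {
        printf("FAIL \"%s\": wrong exception handler\n", input);
        return 1;
    }
    printf("PASS \"%s\"\n", input);
    return 0;
}

int main(void)
{
    int failures = 0;

    /* Generated from the test cases of Section 2.1. The validate_month(-1)
       case is omitted here because it is not type-correct C. */
    failures += run_case("ABCDEF", FALSE, "len_exception");
    failures += run_case("",       FALSE, "len_exception");
    failures += run_case("SMU",    FALSE, "month_exception");
    failures += run_case("DEC",    TRUE,  NULL);
    failures += run_case("DEC1",   FALSE, "len_exception");

    printf("%d failure(s)\n", failures);    /* display test status */
    return failures;
}

A generator working from the test case storage file would emit one such call per INVOKE entry, select the comparison according to OUTPUT-TYPE, and add the time-stamping code whenever EXECUTION-TIME is not NULL.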
3. EXPERIENCE AND RESULTS
The approach described has been implemented in a number of real-time avionics software projects. Verification and improvement of software reliability in real-time systems is often very difficult, primarily because real-time systems are much harder to test due to their time-critical nature. It is critical to find and eliminate a high percentage of software errors during the testing phase, since their discovery by end-users could be catastrophic. The real-time projects tested with this approach were signal-processing applications, with most of the input provided by hardware devices such as radar and high-frequency beam sensors. A usual benchmark testing approach (Beizer, 1984) would not be objective enough to satisfy many of the safety constraints. The test case method described earlier helped traverse all existing logic paths, especially the paths that required input signals of a diverse nature. The data gathered from a number of reliability metrics indicated that incident reports on the projects that used the module testing method were approximately twenty-five percent fewer than on comparable projects. This improvement is highly significant considering the safety-critical nature of the software. The main problem encountered by the teams developing test cases was initial difficulty in following the required syntax and standards. This was remedied by creating editor macros that simplified the editing effort. Once code developers became familiar with the process, they found the mechanism easy to use.
A number of code-review committees were established to enforce and control the approach. The main shortfalls discovered by these committees were:
· Creating test cases for subroutines that required complex input parameters was relatively difficult and error prone. Complex inputs were usually in the form of user-defined data structures.
· Synchronization among tasks in multi-tasking environments was done by both message passing and shared common memory. Testing such synchronization was relatively cumbersome.
· Generating new test cases to cover post-release changes was often neglected.
Using the presented approach as a basis, more complex and customized extensions were incrementally developed to deal with these shortfalls.
Table 1 summarizes the results of three real-time projects that implemented a more customized version of the test-case methodology presented here. Total improvements are based on comparison with similar projects that did not use this methodology.
4. REUSING MODULE TESTING DATA FOR INTEGRATION TESTING
There have been experiences showing that module testing alone is not sufficient to guarantee the quality of a software system (Pressman, 1992). After integrating modules into a system, there are potential complications such as incompatible interfaces among modules, side effects from one module to another, accumulation of imprecision, and differences between the expected functionality and that of the combined modules. To overcome such complications, integration testing is often performed to ensure the quality of a software system.
Integration testing is a systematic technique for applying test data to uncover the errors associated with interfacing among modules (Cho, 1987). The objective is to ensure that the integrated system delivers the functionality specified in the requirements. Integration testing can be conducted in a non-incremental or an incremental manner. Non-incremental integration testing (Leung and White, 1989) tests the entire system with all modules in place. This approach, however, generally suffers difficulties in pinpointing errors when problems occur. In contrast, incremental testing starts with only a few modules and gradually increases the number of modules under test. Incremental testing can be further classified into top-down integration and bottom-up integration. Top-down integration conducts testing from modules at higher levels of the control hierarchy to lower levels; bottom-up integration performs testing from lower levels of the hierarchy to higher levels.
In our practical experience, we have learned that test data prepared by the system developers at the time of design and coding are suitable for module testing. We will investigate the potential of reusing such test data for integration testing.
4.1 AN EXTENDED HYPOTHESIS FOR INTEGRATION TESTING
Although the hypothesis proposed in the previous sections is by no means completely validated, our experience in several projects supports its validity. In this section, assuming the truth of the hypothesis, we study its implications for integration testing.
Assuming the truth of the hypothesis, the test data prepared by the developers at development time are suitable for testing the functionality of modules. In other words, the test data characterize the functionality of the modules in an abstract sense. The test data, when represented in a canonical format, contain type and interface information that can be useful in integration. Such information can be utilized to strengthen integration testing.
The extended hypothesis about integration testing is described as follows:
The expected output of test data from module testing characterizes the interface to downstream modules, and such test data, when applied to the integrated system, characterize the expected functionality of the system.
The underlying rationale for proposing this hypothesis is that proper test data for module testing can be proper test data for integration testing. We will study two of the common problems in integration. The first common problem that has to be resolved is the interface compatibility among the connected modules. Proper module testing data, exercising logical paths of individual modules, contain information about the expected outputs of modules. Passing an expected output of a module as input to the next downstream module triggers a testing of interface compatibility between these two modules. If module testing data convey proper interface information, such testing can be robust for verifying interface compatibility in integration.
The second common problem in integration is the difference between joint functionality of two connected modules and their expected functionality (Duran, 1984). When module testing data characterize the functionality of a module, the output in such test data can be useful in verifying the joint functionality with its downstream modules.
4.2 CONVERTING MODULE TESTING DATA FOR INTEGRATION TESTING
One of the merits of reusing module testing data for integration testing is the cost savings in preparing integration testing data (Chen et al., 1993). In this section, we investigate how to convert module testing data for integration testing.
For testing interface compatibility in integration, the module testing data can be reused as they are, since the correctness of the functionality is not the concern. In an integrated system, outputs of upstream modules are simply passed as inputs to downstream modules. Executing the inputs passed from upstream modules constitutes the testing of interface compatibility.
For testing the joint functionality of modules, converting module testing data for integration testing may require some effort (Chen and Tanik, 1992). The best case for reusing module testing data is when all such test data are created in an organized manner so that every output of a module is treated as an input to its downstream modules. This requires complete details of the interconnection of the modules so that such module testing data can be prepared. In fact, when module testing data are created this way, the integration testing on such test data is implicitly done.
The worst case is the other extreme, where no expected module output is listed as an input of any of its downstream modules. To be able to reuse module testing data in such a case, we have to prepare the expected output of the downstream module for the input coming from the output of the upstream module. In practice, the situation lies between the best case and the worst case. Note, however, that the more effort is spent preparing organized module testing data, the less effort is needed to perform integration testing.
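As a small illustration of the best case, suppose validate_month() sits downstream of a hypothetical module read_month_field() that extracts the three-character month abbreviation from an input record; both the module and its record format "25DEC1998" are invented here purely for the example. When the module test data are prepared in the organized manner described above, the expected output of the upstream case reappears verbatim as the input of the downstream case:
INVOKE:read_month_field("25DEC1998"), "DEC", char *,
NULL, NULL;
INVOKE:validate_month("DEC"), TRUE, BOOL,
NULL, NULL;
Executing this pair in the integrated system passes the expected output "DEC" of the upstream module directly into the downstream module, so the interface between the two modules and their joint functionality are exercised without preparing any additional test data.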
We can analyze the number of test cases resulting from the reuse of module testing data. For the sake of simplicity, we assume that the modules of a complete system are connected as an n-ary tree. The resulting numbers of test cases are summarized in Table 2.
As illustrated in Table 2, reusing module testing data generates a number of test cases for integration testing. Assuming that the modules are integrated as a tree of arity n, Table 2 gives the minimum and maximum numbers of test cases for modules at level h in the tree (Chen and Tanik, 1992). Note that the minimum occurs when each output of an upstream module goes to only one immediate downstream module, and the maximum occurs when each output goes to all immediate downstream modules.
4.3 IMPLICATIONS
In practice, most testing of software is done by a separate testing group (Beizer, 1983). One of the reasons for having testing done by a group separate from the designers/programmers is the unwillingness of the developers to find fault with their own system. The developers generally know much of the functionality of the system once coding is done. Ideally, software testing should be performed by the developers, who understand the system best, at the time they know it best. One important barrier to this is the developers' unwillingness to unveil the shortcomings of their own system.
When a programmer claims to have finished his/her module, he/she and the other members of the team expect the module to function as specified, although minor exceptions may be allowed. To gain confidence in the proper functionality of the software modules, most designers/programmers conduct module testing at the time of coding. However, such module testing data are generally used for individual modules only; their potential to help integration is neglected.
Reusing module testing data for integration testing takes advantage of this potential. Module testing data can be used in integration for checking interface compatibility and the joint functionality of integrated modules. The effectiveness of this approach depends largely on the robustness of the module testing data. When appropriate incentives are implemented, such as rewards for finding integration problems with the module testing data, designers/programmers are more determined to construct the most robust test data within their control. Note also that proper distribution of modules at all levels to programmers is important, since downstream modules will have fewer chances to test other modules than upstream modules. Reusing module testing data for integration testing has the potential to achieve better system reliability at less cost than traditional testing approaches.
5. CONCLUSION
In this paper, we have presented a structured approach to facilitate the task of software testing. A module testing approach is proposed based on the hypothesis that the developers are most suitable for preparing module test data at the time of development. Experience from three projects showing improvement of overall system quality is reported. Furthermore, the reuse of module testing data for integration testing is investigated and its implications are studied. Future work includes applying the module testing approach in more projects and experimenting with the reuse of module testing data for integration testing.
REFERENCES
McConnell, S., 1996, "Rapid Development", Microsoft Press, Redmond, Washington.

Beizer, B., 1983, "Software Testing Techniques", Van Nostrand Reinhold, New York.

Duran, J. W., 1984, "An Evaluation of Random Testing", IEEE Trans. Software Eng., Vol. SE-10, No. 4, pp. 438-443.

Chen, Y. T., Tanik, M., 1992, "An Axiomatic Approach of Software Functionality Measure", 3rd International Workshop on Rapid System Prototyping, Research Triangle, North Carolina, pp. 181-187.

Ramamoorthy, C. V., Ho, S. F., Chen, W. T., 1976, "On the Automated Generation of Program Test Data", IEEE Trans. Software Eng., pp. 293-300.

Beizer, B., 1984, "Software System Testing and Quality Assurance", Van Nostrand Reinhold, New York.

Pesch, H., Schaller, H., Schnupp, P., Spirk, A. P., 1985, "Test Case Generation Using Prolog", IEEE Software Eng., pp. 252-258.

Hoffman, D., 1989, "A Case Study in Module Testing", IEEE Conf. Software Maintenance, Session 3B, pp. 100-105.

Leung, H. K. N., White, L., 1989, "Insights into Regression Testing", IEEE Conf. Software Maintenance, Session 2A, pp. 60-69.

Clarke, L. A., Podgurski, A., Richardson, D. J., Zeil, S. J., 1985, "A Comparison of Data Flow Path Selection Criteria", IEEE Software Eng., pp. 244-251.

Cho, Chin-Kuei, 1987, "Quality Programming: Developing and Testing Software with Statistical Quality Control", John Wiley & Sons, New York.

Weyuker, E. J., 1988, "The Evaluation of Program-Based Software Test Data Adequacy Criteria", Commun. ACM, Vol. 31, No. 6, pp. 668-675.

Chen, Y. T., Bayraktar, I., Tanik, M., 1993, "Techniques of Software Reuse in Design and Specification", Advances in Control and Dynamic Systems, Vol. 58 (Ed. C. T. Leondes), Academic Press, San Diego, CA.

Pressman, R. S., 1992, "Software Engineering - A Practitioner's Approach", 3rd Edition, McGraw-Hill, New York.
TABLES
Table 1. Summary of Experiments of the Proposed Methodology
| ACTIVITIES                                                                        | PROJECT I   | PROJECT II | PROJECT III |
|-----------------------------------------------------------------------------------|-------------|------------|-------------|
| Functional Type                                                                   | Transponder | Radar      | Calibrator  |
| Lines of 'C' code                                                                 | 170,000     | 112,000    | 225,000     |
| No. of 'C' Modules                                                                | 1,100       | 850        | 1,600       |
| No. of test cases                                                                 | 15,200      | 10,780     | 19,500      |
| Execution time of all test cases                                                  | 2.6 hours   | 1.7 hours  | 3.2 hours   |
| No. of execution recurrence                                                       | 18          | 15         | 32          |
| Incidents discovered PRIOR to release                                             | 420         | 310        | 550         |
| Incidents reported AFTER release                                                  | A=55        | 35         | 70          |
| Incidents discovered PRIOR to release for a similar project NOT using test cases  | 270         | 175        | 340         |
| Incidents reported AFTER release for a similar project NOT using test cases       | B=72        | 43         | 87          |
| Total Improvement %                                                               | B/A=30%     | 22%        | 24%         |
Table 2. Analysis of Integration Testing Cases from Module Testing
|                                 | Maximum | Minimum |
|---------------------------------|---------|---------|
| Test cases run through level h  |         |         |
| Total Test Cases                |         |         |
| Average test cases for          |         |         |

(The table entries are expressed in terms of the number of modules, the number of test cases for each module, and the arity of the tree.)