Investigating Test Effectiveness on Object-Oriented Programs
Ming-Hung Kao, Mei-Huei Tang and Mei-Hwa Chen
ABSTRACT
We present a case study that investigates the effectiveness of traditional
testing techniques, as well as an existing state-based testing strategy, in
detecting faults in object-oriented programs.
In this study, we applied one black-box approach, functional testing,
and two white-box approaches in which the statement and decision coverage
criteria were used both to select test cases and as guidelines for
determining test adequacy.
To apply the state-based testing strategy to whole object-oriented programs
rather than to each class individually, we introduce two coverage criteria:
all-states and all-transitions. For each execution of a test case, the states
visited and the transitions taken to enter them are recorded for each class.
New test cases are then designed to traverse the uncovered states and
transitions.
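The bookkeeping behind the two criteria can be illustrated with a minimal
sketch. It is not the instrumentation used in the study: the class
StateCoverage, its method names, and the bounded-stack state model are all
hypothetical, and the sketch assumes that each class comes with a known set of
states and transitions against which the recorded ones can be compared.

```python
from collections import defaultdict


class StateCoverage:
    """Accumulates visited states and transitions per class across test runs."""

    def __init__(self, model):
        # model maps class name -> {"states": set, "transitions": set of (src, dst)}
        self.model = model
        self.seen_states = defaultdict(set)
        self.seen_transitions = defaultdict(set)

    def record(self, class_name, src, dst):
        """Log that an instance of class_name entered dst, coming from src.

        src is None when the object has just been created.
        """
        self.seen_states[class_name].add(dst)
        if src is not None:
            self.seen_transitions[class_name].add((src, dst))

    def uncovered_states(self, class_name):
        return self.model[class_name]["states"] - self.seen_states[class_name]

    def uncovered_transitions(self, class_name):
        return self.model[class_name]["transitions"] - self.seen_transitions[class_name]

    def all_states_met(self):
        # all-states holds when no class has an unvisited state
        return all(not self.uncovered_states(c) for c in self.model)

    def all_transitions_met(self):
        # all-transitions holds when no class has an untraversed transition
        return all(not self.uncovered_transitions(c) for c in self.model)


# Hypothetical state model of a bounded stack (an assumption, not from the study).
MODEL = {
    "Stack": {
        "states": {"empty", "partial", "full"},
        "transitions": {
            ("empty", "partial"), ("partial", "partial"), ("partial", "full"),
            ("full", "partial"), ("partial", "empty"),
        },
    }
}

cov = StateCoverage(MODEL)
# One test case: create a stack, push once, pop once.
cov.record("Stack", None, "empty")
cov.record("Stack", "empty", "partial")
cov.record("Stack", "partial", "empty")

print(sorted(cov.uncovered_states("Stack")))
print(sorted(cov.uncovered_transitions("Stack")))
```

In this sketch, whatever remains in the uncovered sets after a test run is what
the next test case would be designed to reach, for example by pushing elements
until the stack enters the full state.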
These testing techniques were applied to three industrial systems with sizes
ranging from 5.6k to 21.3k LOC.
The investigation began by classifying the faults found in these three systems
over the past three years. Based on their relevance to object-oriented
features, the faults were classified into three types: {\em type I} faults
are strongly related to object-oriented features such as inheritance and
polymorphism; {\em type II} faults are related to object management; and
{\em type III} faults are those that can also be found in
non-object-oriented software.
After applying these testing techniques to the faulty systems,
we observed that the majority of {\em type I} and {\em type II} faults
still remained in the systems. This result suggests that traditional testing
techniques are likely inadequate for detecting object-oriented (OO) faults,
and that state-based testing is also insufficient to address them.
Furthermore, we investigated the feasibility of using existing OO metrics
to estimate the percentage of OO faults in a given system.
Our results show that
three of the four Chidamber and Kemerer (CK) metrics we used might not be
sufficient as predictors of fault-prone and OO fault-prone classes.
Our observations suggest that other metrics, ones that
capture the dynamic behavior of the program and the scenarios in which
instances of the classes are referenced, might be better indicators of
fault-prone and OO fault-prone classes.