dc.description.abstract | Testing is one of the most crucial steps in assuring the quality of software products.
Two key and heavily used testing levels are unit and system testing, each having different benefits and drawbacks. A software team needs
to decide where to focus its testing effort, often by creating a testing
strategy. Because resources are limited, a critical question is “what is a good
trade-off between different levels of testing in order to maximize the effect, i.e. quality improvement and assurance, while using as few resources as possible?” There are
many factors to consider for this trade-off, from costs to time and knowledge of
testers, but in this study we focus on a critical one: the fault-finding ability and
behaviour of tests at different levels.
To evaluate the fault-finding behaviour of unit and system tests, a reliable method
for test level categorization is needed. Based on the attributes found in the common
definitions for the testing levels, a framework for categorization was developed and
applied to analyse the usage of unit and system level testing on selected Java open
source projects from the Defects4J framework. Furthermore, using information provided by the Defects4J tool, the fault detection level of different tests and, ultimately,
of the different levels can be determined. The 16 analysed projects contained 25477
tests, of which 78.4% were categorized as low level and 21.56% as high level.
The results indicate that lower, unit-level tests are used more often in the investigated
Java open source projects. Looking at fault-finding ability, of the 25477 tests,
only 998 were able to uncover bugs. Of these 998 tests, 65.73%
were categorized as low level and 34.27% as high level. This
result can be attributed to the higher
number of low-level tests in the sample. Considering the rate of bug discovery
relative to the total number of tests on each level, the high level performed
better, with a discovery rate of 6.23%, while the low level had a rate of 3.28%.
The major contribution of this paper is a framework for test categorization on a low-to-high-level scale, based on concrete and objective metrics that can be practically
applied to projects that follow the object-oriented paradigm. Additionally,
backed by empirical data, we found that high-level tests have a higher rate of
discovering bugs. | sv |