Evaluating the Trade-offs of Diversity-Based Test Prioritization: An Experiment
Abstract
Different test prioritization techniques aim to
detect faults at earlier stages of test execution. To this end,
diversity-based techniques (DBT) have proven cost-effective:
they prioritize the most dissimilar test cases to maintain
effectiveness and coverage with fewer resources at different stages
of the software development life cycle, called levels of testing (LoT).
Diversity is measured on static test specifications to convey how
different test cases are from one another. However, there is little
research on DBT applied to semantic similarities of words within
tests. Moreover, diversity has been extensively studied within
individual LoT (unit, integration and system), but the trade-offs
of such techniques across different levels are not well understood.
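
The idea of prioritizing the most dissimilar test cases can be illustrated with a minimal sketch. It is not necessarily the procedure used in this study: the greedy max-min selection, the function names `token_jaccard_distance` and `prioritize`, and the example test texts are all assumptions made for illustration.

```python
from typing import Callable, Dict, List


def token_jaccard_distance(a: str, b: str) -> float:
    """Jaccard distance on whitespace-separated tokens (illustrative choice)."""
    sa, sb = set(a.split()), set(b.split())
    if not sa and not sb:
        return 0.0
    return 1.0 - len(sa & sb) / len(sa | sb)


def prioritize(tests: Dict[str, str],
               distance: Callable[[str, str], float]) -> List[str]:
    """Order test names so that dissimilar tests come first (greedy max-min)."""
    remaining = sorted(tests)  # sorted names make tie-breaking deterministic
    if len(remaining) <= 1:
        return remaining

    # Seed the order with the most distant pair of tests.
    pairs = ((x, y) for i, x in enumerate(remaining) for y in remaining[i + 1:])
    first, second = max(pairs, key=lambda p: distance(tests[p[0]], tests[p[1]]))
    order = [first, second]
    remaining = [t for t in remaining if t not in order]

    # Repeatedly add the test that is farthest from its nearest selected test.
    while remaining:
        nxt = max(remaining,
                  key=lambda t: min(distance(tests[t], tests[s]) for s in order))
        order.append(nxt)
        remaining.remove(nxt)
    return order


if __name__ == "__main__":
    # Hypothetical test specifications, not taken from the studied projects.
    suite = {
        "t1": "login with valid password",
        "t2": "login with invalid password",
        "t3": "export report as pdf",
        "t4": "login with expired password",
    }
    print(prioritize(suite, token_jaccard_distance))
```

Any pairwise distance can be passed as `distance`, so the same ordering routine can in principle be reused with different diversity measures.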
Objective and Methodology: This paper aims to reveal relationships
between DBT and the LoT, as well as to compare and
evaluate the cost-effectiveness and coverage of different diversity
measures, namely Jaccard’s Index, Levenshtein, Normalized
Compression Distance (NCD), and Semantic Similarity (SS). We
perform an experiment on the test suites of 7 open-source projects
on the unit level, 1 industrial project on the integration level, and
4 industrial projects on the system level (where one project is used
on both system and integration levels).
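
As a rough illustration of the three lexical measures named above, the following sketch computes each one for a pair of strings. The tokenization by whitespace, the use of `zlib` as the compressor for NCD, and the function names are assumptions made here; the study's actual implementations may differ, and Semantic Similarity is omitted because it depends on a language model that the abstract does not specify.

```python
import zlib


def jaccard_distance(a: str, b: str) -> float:
    """1 - |A ∩ B| / |A ∪ B| over whitespace tokens (assumed tokenization)."""
    sa, sb = set(a.split()), set(b.split())
    if not sa and not sb:
        return 0.0
    return 1.0 - len(sa & sb) / len(sa | sb)


def levenshtein(a: str, b: str) -> int:
    """Edit distance via dynamic programming; cost grows as len(a) * len(b)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,               # deletion
                           cur[j - 1] + 1,            # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]


def ncd(a: str, b: str) -> float:
    """Normalized Compression Distance with zlib as the (assumed) compressor."""
    ca = len(zlib.compress(a.encode("utf-8")))
    cb = len(zlib.compress(b.encode("utf-8")))
    cab = len(zlib.compress((a + b).encode("utf-8")))
    return (cab - min(ca, cb)) / max(ca, cb)


if __name__ == "__main__":
    x = "open the login page and enter a valid user name"
    y = "open the login page and enter an invalid user name"
    print(jaccard_distance(x, y), levenshtein(x, y), ncd(x, y))
```

The quadratic cost of the Levenshtein computation is one plausible reason it becomes slow on long test specifications, whereas set-based Jaccard needs only a single pass over the tokens.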
Results: Our results show that SS increases test coverage for
system-level tests, and that the differences in failure detection
rate between the diversity measures increase as more prioritised
tests are executed. In terms of execution time, we report that
Jaccard is the fastest, whereas Levenshtein is the slowest and, in
some cases, simply infeasible to run. In contrast, Levenshtein detects
more failures at the integration level, and Jaccard more at the system level.
Conclusion: Future work could apply SS to code artefacts and
include other DBT in the comparison. Test suite properties that
are suspected of affecting DBT performance can be investigated
in greater detail.
Degree level
Student essay
Date
2020-12-03
Author
Khojah, Ranim
Hong Chao, Chi
Keywords
Diversity-based testing
Test Case Prioritization
Natural Language Processing (NLP)
Level of Testing (LoT)
Language
eng