Benchmarking Deep Learning Testing Techniques: A Methodology and Its Application

Abstract
With the adoption of Deep Learning (DL) systems in security- and safety-critical domains, a variety of traditional testing techniques, novel techniques, and new ideas are increasingly being adopted and implemented within DL testing tools. However, there is currently no benchmarking method that helps practitioners compare the performance of different DL testing tools. The primary objective of this study is to construct a benchmarking method that helps practitioners select a DL testing tool. In this paper, we perform an exploratory study of fifteen DL testing tools and take one of the first steps towards designing a benchmarking method for DL testing tools. We propose a set of seven tasks, derived using a requirement-scenario-task model, to benchmark DL testing tools. We evaluated four DL testing tools using our benchmarking tool. The results show that the current focus within the field of DL testing is on improving the robustness of DL systems; however, common performance metrics for evaluating DL testing tools are difficult to establish. Our study suggests that even though the number of DL testing research papers is increasing, the field is still in an early phase and is not sufficiently developed to run a full benchmarking suite. Nevertheless, the benchmarking tasks defined in the benchmarking method can help DL practitioners select a DL testing tool. For future research, we recommend a collaborative effort among DL testing tool researchers to extend the benchmarking method.
Examination level
Student essay
URL:
http://hdl.handle.net/2077/65507
Collections
  • Master's theses
File(s)
gupea_2077_65507_1.pdf (4.497Mb)
Date
2020-07-06
Authors
Chuphal, Himanshu
Dimitrov, Kristiyan
Keywords
Deep Learning
DL
DL testing tools
testing
software engineering
design
benchmark
model
datasets
tasks
tools
Language
eng

DSpace software copyright © 2002-2016  DuraSpace
gup@ub.gu.se | Technical support