dc.contributor.author | Sileikis, Saimonas | |
dc.date.accessioned | 2016-06-27T12:01:10Z | |
dc.date.available | 2016-06-27T12:01:10Z | |
dc.date.issued | 2016-06-27 | |
dc.identifier.uri | http://hdl.handle.net/2077/44667 | |
dc.description.abstract | A vulnerability database for a large C++ program was
used to label the program's source code files as either clean or
vulnerable. Latent Dirichlet allocation (LDA) was then applied to the
entire source code to extract hidden topics.
Each file was assigned a topic probability distribution, as well as the
status of being either clean or vulnerable. This data was used to
train a machine learning algorithm to detect vulnerable source files
based only on their topic distribution. In total, three different data
sets were prepared from the original source code, varying in the
number of topics, the number of documents, and the number of LDA
iterations performed. None of the data sets showed any ability to predict
software vulnerability based on LDA and machine learning. | sv |
dc.language.iso | eng | sv |
dc.title | Predicting software vulnerabilities using topic modeling | sv |
dc.type | text | |
dc.setspec.uppsok | Technology | |
dc.type.uppsok | M2 | |
dc.contributor.department | Göteborgs universitet/Institutionen för data- och informationsteknik | swe |
dc.contributor.department | University of Gothenburg/Department of Computer Science and Engineering | eng |
dc.type.degree | Student essay | |