AuTopEx: Automated Topic Extraction Techniques Applied in the Software Engineering Domain
| dc.contributor.author | Johansson, Magnus | |
| dc.contributor.author | Klemetz, Jonathan | |
| dc.contributor.department | Göteborgs universitet/Institutionen för data- och informationsteknik | swe |
| dc.contributor.department | University of Gothenburg/Department of Computer Science and Engineering | eng |
| dc.date.accessioned | 2016-06-27T11:47:51Z | |
| dc.date.available | 2016-06-27T11:47:51Z | |
| dc.date.issued | 2016-06-27 | |
| dc.description.abstract | Automatically extracting topics from scientific papers can be very beneficial when a researcher needs to classify a large number of such papers. In this thesis we develop and evaluate an approach for Automatic Topic Extraction, AuTopEx. The approach consists of four parts: 1) pre-processing the text; 2) training a Latent Dirichlet Allocation model on part of a corpus; 3) manually identifying relevant topics from the model; 4) querying the model using the rest of the corpus. We show that it is possible to automatically extract topics by applying AuTopEx to a corpus of scientific papers on autonomous vehicles. According to our evaluation, AuTopEx works better on full-text articles than on texts consisting of just the title, abstract and keywords. Finally, we show that this approach is vastly faster than human annotators, although not as accurate. | sv |
| dc.identifier.uri | http://hdl.handle.net/2077/44662 | |
| dc.language.iso | eng | sv |
| dc.setspec.uppsok | Technology | |
| dc.title | AuTopEx: Automated Topic Extraction Techniques Applied in the Software Engineering Domain | sv |
| dc.type | text | |
| dc.type.degree | Student essay | |
| dc.type.uppsok | M2 |
Files
Original bundle
- Name: gupea_2077_44662_1.pdf
- Size: 3.44 MB
- Format: Adobe Portable Document Format
- Description: Thesis