Automatic Topic Extraction from Research Articles Using N-gram Analysis

Abstract
Identifying the topic of an article can involve a lot of manual work, and the manual process becomes exhausting for a large volume of articles. To tackle this problem, we propose an automated topic extraction approach that can extract topics from a large number of articles efficiently. Our research builds on existing N-gram analysis, which only calculates how frequently words appear in a document. We extend it with customized filtering standards to improve efficiency and to eliminate as many irrelevant or noncritical phrases as possible. This ensures that the keyphrases finally selected for each article are unique labels that represent the core idea of that specific article. We focus on research papers in the autonomous vehicle domain because such papers are in high demand in everyday life. Since most research papers are available only in PDF format, we first convert the PDF files into an editable file type such as TXT. To test the automated approach, we selected a large number of autonomous vehicle-related articles. We then compare the results with those of manual topic extraction to evaluate our approach.
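The general idea of frequency-based N-gram analysis with a filtering step can be illustrated with a minimal sketch. This is not the thesis's implementation: the stop-word list, the rule of discarding N-grams that start or end with a stop word, and the names `tokenize`, `ngrams`, and `top_keyphrases` are all assumptions made here for illustration, since the abstract does not spell out the actual filtering standards.

```python
import re
from collections import Counter

# Hypothetical stop-word list; the thesis's real filtering standards
# are not described in the abstract.
STOP_WORDS = {"the", "a", "an", "of", "in", "to", "and", "is",
              "for", "on", "with", "we", "our"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def ngrams(tokens, n):
    """Return every run of n consecutive tokens as a tuple."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def top_keyphrases(text, n=2, k=3):
    """Count N-grams, drop those that begin or end with a stop word
    (an assumed filter), and return the k most frequent as strings."""
    tokens = tokenize(text)
    counts = Counter(
        gram for gram in ngrams(tokens, n)
        if gram[0] not in STOP_WORDS and gram[-1] not in STOP_WORDS
    )
    return [" ".join(gram) for gram, _ in counts.most_common(k)]


sample = ("Autonomous vehicle research is growing. "
          "An autonomous vehicle relies on sensor fusion. "
          "Sensor fusion improves the autonomous vehicle.")

print(top_keyphrases(sample, n=2, k=2))
```

In this sketch the input is assumed to be plain text already extracted from the PDF; the conversion step itself is outside its scope.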
Examination level
Student essay
URL:
http://hdl.handle.net/2077/44663
Collections
  • Bachelor's theses (Kandidatuppsatser)
File(s)
Thesis (1.058Mb)
Date
2016-06-27
Authors
Chen, Maomao
Huang, Maoyi
Keywords
automatic topic extraction
N-gram
keyphrase
frequency statistic
Language
eng

DSpace software copyright © 2002-2016  DuraSpace
gup@ub.gu.se | Technical support
Theme by 
Atmire NV
 

 
