
Incorporating local word relationships into probabilistic topic models

Abstract:

Probabilistic topic models have been very popular in automatic text analysis since their introduction. As a dimensionality reduction method, they are similar to term clustering methods. These models rely on word co-occurrence but are not very flexible about the context in which co-occurrence is defined. They do not let us take local or spatial data into account, and therefore they perform poorly on short documents and in applications that are bound to local data. Despite their generally better performance compared to term clustering methods, probabilistic topic models lack one of the key features of those methods: flexibility in defining the context in which co-occurrence is calculated. In this paper we introduce a perspective on probabilistic topic models that can lead to more flexible models, and we present a model that, following this perspective, has the mentioned flexibility.

Published in: 2015 7th Conference on Information and Knowledge Technology (IKT)

Date of Conference: 26-28 May 2015

Date Added to IEEE Xplore: 05 October 2015

DOI: 10.1109/IKT.2015.7288758

Conference Location: Urmia, Iran

I. Introduction

Nowadays we are faced with a vast amount of online digitized information. As the amount continues to grow, it becomes more and more difficult to find what we are looking for, but the task would be much easier if we could explore the information by theme instead of working with raw data. Probabilistic topic modeling introduces methods that can extract the thematic structure of documents. The basic idea of these methods is that a document is a mixture of latent topics and each topic is a distribution over words. Suppose we have a collection of documents, each consisting of a sequence of words, together with a fixed number of topics and a vocabulary of unique words, and suppose each word has a topic assigned to it. Based on this view we can approach the problem of extracting the topics of a corpus as follows. Each topic is a distribution over words, where the words are exchangeable, i.e. each document is a bag of words. Documents are also exchangeable. Each word in each document is drawn from the distribution of its assigned topic. For each document there is a distribution over topics, which shows how the topics have been mixed to produce the document. The model therefore has two parameters: the distribution of words in topics and the distribution of topics in documents.
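The generative view described above can be illustrated with a small sketch. It is not the model proposed in the paper. The vocabulary, the two topics, the probability values, and the names `VOCAB`, `TOPIC_WORD`, and `generate_document` are all hypothetical and chosen only for illustration. The sketch draws a topic for each word position from the document's distribution over topics, then draws the word from that topic's distribution over words.

```python
import random

# Hypothetical toy vocabulary; not taken from the paper.
VOCAB = ["gene", "cell", "dna", "ball", "team", "goal"]

# Distribution of words in topics: one row per topic, each row sums to 1.
# The values are assumptions made up for this example.
TOPIC_WORD = [
    [0.4, 0.3, 0.3, 0.0, 0.0, 0.0],  # topic 0, loosely "biology"
    [0.0, 0.0, 0.0, 0.3, 0.4, 0.3],  # topic 1, loosely "sports"
]


def generate_document(doc_topic, n_words, rng):
    """Sample n_words (topic, word) pairs for one document.

    doc_topic is the document's distribution over topics.
    Word order carries no information (bag of words).
    """
    tokens = []
    for _ in range(n_words):
        # Draw the topic assigned to this word position.
        z = rng.choices(range(len(TOPIC_WORD)), weights=doc_topic)[0]
        # Draw the word from the distribution of the assigned topic.
        w = rng.choices(VOCAB, weights=TOPIC_WORD[z])[0]
        tokens.append((z, w))
    return tokens


if __name__ == "__main__":
    rng = random.Random(0)
    # A document that mixes 80% of topic 0 with 20% of topic 1.
    for topic, word in generate_document([0.8, 0.2], 10, rng):
        print(topic, word)
```

Topic extraction runs in the opposite direction: given only the observed words, the two sets of distributions are estimated. The sketch fixes them by hand purely to show how they interact.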
