Cluster Labeling for the Blogosphere | IEEE Conference Publication | IEEE Xplore

Abstract:

Hierarchical cluster labeling helps users quickly understand and analyze hierarchical clusters. It can be used to enhance search engine results or interactive browsing, as in the Blog Intelligence application. Organizing data hierarchically makes it possible to represent different levels of detail. Although hierarchical clustering is common, there are few good solutions for labeling the resulting clusters. This work focuses on labeling binary hierarchical clusters. Current approaches rely either on statistical features of the clustered documents or on external sources such as Wikipedia. We combine these ideas to profit from the advantages of both and present an algorithm that can handle clustered documents as well as terms.
Date of Conference: 03-05 December 2014
Date Added to IEEE Xplore: 09 February 2015
Electronic ISBN: 978-1-4799-6719-3
Conference Location: Sydney, NSW, Australia

I. Introduction

Millions of posts are published every day, so the collection of web documents in the blogosphere keeps growing. Clustering this ever-changing collection is a very time-consuming task. BlogIntelligence (http://www.blog-intelligence.com) provides a smart search engine for the blogosphere, covering harvesting, analysis, and presentation of the results in a meaningful way.

