TUBE (Text-cUBE) for Discovering Entity Associations from a Document Collection
Abstract
User-driven discovery of associations among entities, along with the documents that provide evidence for these associations, is an important search task conducted by researchers and domain information specialists. Entities here refer to physical or abstract objects such as people, organizations, and ideologies. Associations are the inter-relationships among entities. Existing work on query-driven document retrieval and on finding representative subgraphs is ill-suited to the task, as it lacks awareness of entity types and does not provide an intuitive representation of associations. In this seminar, I will give a brief overview of various models for retrieving information from text documents. Following that, I will introduce TUBE, a text-cube model that we developed for managing and discovering entity associations, along with the documentary evidence of these associations. The model consists of a multi-dimensional view of document data, a flexible representation of multi-document summaries, and a set of operations for data manipulation. In addition, I will discuss techniques for extracting entity hierarchies to form the dimensions, as well as system implementation and validation issues.

Biography
PANG Hwee Hwa is an associate professor at the School of Information Systems, Singapore Management University. Before joining SMU, he was a Principal Scientist and Division Director at the A*Star Institute for Infocomm Research. His research interests include database management systems, data security, and information retrieval.
© Copyright 2007 by Singapore Management University. All Rights Reserved.