Managing info overload


UMSI Assistant Professor Qiaozhu Mei has been awarded a grant from the National Science Foundation to refine language models to improve tools for managing web information overload.

It has never been easier for Web users to express themselves in online communities. The volume of text generated by online conversations is a concern for various consumer groups. Many effective tools for managing this information overload, such as text retrieval systems, rely on statistical language models. The quality of these models, however, is limited by the sparseness of data, the mismatch of context, and the inability to model semantic relations.

In this project, cloud computing and novel MapReduce algorithms will be employed to extract heterogeneous language networks from Web-scale text collections. These language networks will be used to smooth and contextualize language models in various domains, making them more accurate and robust. The refined language models will help improve state-of-the-art text retrieval and mining techniques, enhancing information access and knowledge acquisition for real users across community and language boundaries.
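
For readers unfamiliar with the term, the sketch below illustrates what smoothing a language model means and why data sparseness makes it necessary. It is not the project's method: it swaps in plain Jelinek-Mercer interpolation with a collection-wide model in place of language networks. The function names, the toy text, and the mixing weight `lam` are all hypothetical, chosen only for illustration.

```python
from collections import Counter


def unigram_mle(tokens):
    """Maximum-likelihood unigram model: relative frequency of each word."""
    counts = Counter(tokens)
    total = len(tokens)
    return {word: count / total for word, count in counts.items()}


def jelinek_mercer(doc_tokens, collection_tokens, lam=0.5):
    """Return a function giving smoothed word probabilities for one document.

    The document model is interpolated with a collection-wide model, so
    words the short document never used still get non-zero probability.
    `lam` is an illustrative weight, not a value from the project.
    """
    p_doc = unigram_mle(doc_tokens)
    p_col = unigram_mle(collection_tokens)

    def prob(word):
        return (1 - lam) * p_doc.get(word, 0.0) + lam * p_col.get(word, 0.0)

    return prob


# Toy data (assumed): a tiny "collection" and one short document from it.
collection = "the cat sat on the mat the dog sat on the log".split()
doc = "the cat sat".split()

unsmoothed = unigram_mle(doc)
smoothed = jelinek_mercer(doc, collection, lam=0.5)

print(unsmoothed.get("dog", 0.0))  # zero: the document never mentions "dog"
print(smoothed("dog"))             # small but non-zero after smoothing
```

The unsmoothed model assigns zero probability to any word missing from the short document, which is the sparseness problem described above. Interpolation borrows evidence from a larger body of text to fill those gaps.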

The techniques and resources (e.g., language networks and refined language models) will benefit a broad range of users who analyze text content in social media and many other domains. Research results of this project, including tools and resources, will be published later.