Caimei Lu | Drexel University

Caimei Lu

My research interests include knowledge organization, web mining, information retrieval, and digital libraries. I have done research in metadata quality evaluation, automatic metadata generation, and the properties of user-created metadata (social annotations). Currently, I am investigating how to use user-created social metadata to improve web organization and web search.
Supervisors: Tony Hu and Jung-ran Park



Papers by Caimei Lu

Exploiting Wikipedia as external knowledge for document clustering

Abstract: In traditional text clustering methods, documents are represented as "bags of words" without considering the semantic information of each document. For instance, if two documents use different collections of core words to represent the same topic, they may be falsely assigned to different clusters due to the lack of shared core words, although the core words they use are probably synonyms or semantically associated in other forms.
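The problem this abstract describes can be illustrated with a minimal sketch. It is not the paper's method: the three example documents and the helper names `bag_of_words` and `cosine_similarity` are hypothetical, chosen only to show that two documents on the same topic get zero similarity under a plain bag-of-words representation when they share no terms.

```python
import math
from collections import Counter


def bag_of_words(text):
    """Lowercase the text, split on whitespace, and count each term."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine similarity between two term-count vectors."""
    # Only terms present in both documents contribute to the dot product.
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm_a = math.sqrt(sum(count * count for count in a.values()))
    norm_b = math.sqrt(sum(count * count for count in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical documents: doc_a and doc_b describe the same topic with
# synonyms, while doc_c shares two surface terms with doc_a.
doc_a = bag_of_words("physician prescribes medicine")
doc_b = bag_of_words("doctor recommends drug")
doc_c = bag_of_words("physician prescribes drug")

sim_ab = cosine_similarity(doc_a, doc_b)  # no shared terms
sim_ac = cosine_similarity(doc_a, doc_c)  # two shared terms

print(f"synonym pair: {sim_ab:.3f}")
print(f"shared-word pair: {sim_ac:.3f}")
```

Because `doc_a` and `doc_b` have no term in common, a clustering algorithm that relies on this similarity alone has no evidence that they belong together, which is the gap that external semantic knowledge is meant to fill.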

A probabilistic topic-connection model for automatic image annotation

Abstract: The explosive increase of image data on the Internet has made it an important, yet very challenging task to index and automatically annotate image data. To achieve that end, sophisticated algorithms and models have been proposed to study the correlation between image content and corresponding text description. Despite the success of previous works, however, researchers are still facing two major difficulties that may undermine their effort of providing reliable and accurate annotations for images.

Probabilistic models for topic learning from images and captions in online biomedical literatures

Abstract: Biomedical images and captions are one of the major sources of information in online biomedical publications. They often contain the most important results to be reported, and provide rich information about the main themes in published papers.

Exploit the Tripartite Network of Social Tagging for Web Clustering

Metadata Effectiveness: A Comparison between User-Created Social Tags and Author-Provided Metadata
