Representational Issues in Machine Learning of User Profiles
As more information becomes available electronically, tools for finding information of interest to users become increasingly important. Building such tools is often complicated by the difficulty of articulating user interest in a form that can be used for searching. The goal of the research described here is to build a system that generates comprehensible user profiles that accurately capture user interest with minimal user interaction. Machine learning methods offer a promising approach to this problem. This research focuses on the importance of a suitable generalization hierarchy and representation for learning profiles that are both predictively accurate and comprehensible. In our experiments with AQ15c and C4.5, we evaluated both traditional features based on weighted term vectors and subject features corresponding to categories that could be drawn from a thesaurus. Our experiments, conducted in the context of a content-based profiling system for on-line newspapers on the World Wide Web (the IDD News Browser), demonstrate the importance of a generalization hierarchy in obtaining high predictive accuracy, precision and recall, and stability of learning.
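To illustrate the idea of subject features combined with a generalization hierarchy, the following is a minimal sketch, not the representation used in the experiments. The hierarchy, the term-to-subject lookup, and the function names `generalize` and `subject_features` are all hypothetical and chosen only for illustration. It shows how two articles that share no terms can still share a feature once their subjects are generalized one level up.

```python
from collections import Counter

# Hypothetical toy hierarchy (child -> parent); an assumption, not from the paper.
PARENT = {
    "lung cancer": "cancer",
    "leukemia": "cancer",
    "cancer": "medicine",
    "vaccine": "medicine",
    "medicine": "science",
}

# Hypothetical lookup from article terms to thesaurus-style subject categories.
TERM_TO_SUBJECT = {
    "tumor": "lung cancer",
    "chemotherapy": "leukemia",
    "immunization": "vaccine",
}


def generalize(subject, levels):
    """Climb at most `levels` steps toward the root of the hierarchy."""
    for _ in range(levels):
        if subject not in PARENT:
            break  # already at the root
        subject = PARENT[subject]
    return subject


def subject_features(tokens, levels=1):
    """Count subject categories in an article after generalization."""
    counts = Counter()
    for token in tokens:
        subject = TERM_TO_SUBJECT.get(token)
        if subject is not None:  # terms without a known subject are skipped
            counts[generalize(subject, levels)] += 1
    return counts


article_a = ["tumor", "report"]
article_b = ["chemotherapy", "trial"]
features_a = subject_features(article_a, levels=1)
features_b = subject_features(article_b, levels=1)
```

With `levels=0` the two articles have disjoint features ("lung cancer" versus "leukemia"); with `levels=1` both map to "cancer", which is the kind of shared, more general feature a learner could use in a profile.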