KDD-2000 Workshop on Text Mining
August 20, 2000
Held at KDD-2000, Sixth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, August 20-23, 2000, Boston, MA, USA
The call for papers (now expired) on the following topics was open until May 15, 2000. We received 40 submissions that entered the review process. The list of pre-registered workshop attendees contains about 80 names.
The Workshop Schedule, Workshop Proceedings, and the list of accepted papers are available on the Web, as is the Workshop Summary (to appear in SIGKDD Explorations, January 2001).
Topics of interest
The objective of this workshop is to enable the presentation and exchange of ideas on various aspects of Text Mining. Our desire is to facilitate communication among researchers and practitioners from related and complementary research areas who are working on similar problems, but possibly with a different focus and different problem-solving approaches. More precisely, we invite papers from the following four areas:
- Text Mining (or Text Learning) (TM)
- Information Retrieval (IR)
- Natural Language Processing (NLP)
- Information Extraction (IE).
Particular topics of interest for the workshop include, but are not limited to:
- text mining & information retrieval
- text mining & natural language processing
- text mining & web mining
- text representation
- text categorization
- text segmentation
- information extraction
- scalability of developed approaches
- performance evaluation measures
- feature selection
- multilingual approaches to text mining
- influence of domain and domain specific text mining
- innovative applications of text mining.
Workshop schedule
The workshop consists of two invited talks, presentations of refereed papers and posters, and discussions. We hope that the program will stimulate future collaboration among researchers on text mining problems.
- 8:45am - 8:55am Opening
- 8:55am - 9:25am Invited talk by Ronen Feldman: "Text Mining: Opportunities and Challenges"
- 9:25am - 10:15am Papers Ia: Information Extraction and Text Mining (session intro. + 3 papers - 12 mins each)
- 9:30am - 9:45am Data Mining on Symbolic Knowledge Extracted from the Web, R. Ghani, R. Jones, D. Mladenic, K. Nigam, S. Slattery
- 9:45am - 10:00am Using Information Extraction to Aid the Discovery of Prediction Rules from Text, U. Y. Nahm, R. J. Mooney
- 10:00am - 10:15am High Precision Information Extraction, R. Caruana, P. G. Hodor
- 10:15am - 10:35am Coffee Break
- 10:35am - 11:05am Papers Ib: Text categorization methods using machine learning (2 papers - 12 mins each)
- 10:35am - 10:50am Large Margin Winnow Methods for Text Categorization, T. Zhang
- 10:50am - 11:05am A Feature Weight Adjustment Algorithm for Document Categorization, S. Shankar, G. Karypis
- 11:05am - 11:40am Papers IIa: Text Mining applications: Mining time-tagged text (session intro. + 2 papers - 12 mins each)
- 11:10am - 11:25am TimeMines: Constructing Timelines with Statistical Models of Word Usage, R. Swan, D. Jensen
- 11:25am - 11:40am Mining of Concurrent Text and Time-Series, V. Lavrenko, M. Schmill, D. Lawrie, P. Ogilvie, D. Jensen, J. Allan
- 11:40am - 12:00pm Poster Session I (session intro. + 8 posters - 1 min each)
- 12:00pm - 1:00pm Lunch Break
- 1:00pm - 1:30pm Invited talk by David Lewis: "ATTICS: A Toolkit for Text Classification and Text Mining"
- 1:30pm - 2:30pm Papers IIb: Text Mining applications: Finding themes/topics in text, Mining e-mail data (4 papers - 12 mins each)
- 1:30pm - 1:45pm Mining Themes from Bookmarks, S. Chakrabarti, Y. Batterywala
- 1:45pm - 2:00pm Discovering Encyclopedic Structure and Topics in Text, L. A. Mather, J. Note
- 2:00pm - 2:15pm Mining E-mail Authorship, O. De Vel
- 2:15pm - 2:30pm ifile: An Application of Machine Learning to E-Mail Filtering, J. Rennie
- 2:30pm - 2:45pm Poster Session II (7 posters - 1 min each)
- 2:45pm - 3:30pm Posters on the boards
- 3:30pm - 4:00pm Discussion and closing
Organization
Program Chairs
Marko Grobelnik
J.Stefan Institute, Jamova 39, 1000 Ljubljana, Slovenia
Marko.Grobelnik@ijs.si
Dunja Mladenic
J.Stefan Institute, Jamova 39, 1000 Ljubljana, Slovenia, and Carnegie Mellon University, School of Computer Science, 5000 Forbes Ave, Pittsburgh, PA 15213, USA
Dunja.Mladenic@cs.cmu.edu
Natasa Milic-Frayling
Microsoft Research Ltd, St. George House, 1 Guildhall Street, Cambridge, CB2 3NH, United Kingdom
natasamf@microsoft.com
Program Committee
- Helena Ahonen, University of Helsinki, Helsinki, Finland
- Simon Corston-Oliver, Microsoft Research, Redmond, WA
- Mark Craven, University of Wisconsin, Madison, Wisconsin
- Walter Daelemans, University of Antwerp, Antwerpen, Belgium
- Susan Dumais, Microsoft Research, Redmond, WA
- David Elworthy, Microsoft Research Ltd, Cambridge, UK
- Ronen Feldman, Instinct Software, Israel
- Marko Grobelnik, J.Stefan Institute, Ljubljana, Slovenia
- Thorsten Joachims, Universitaet Dortmund, Dortmund, Germany
- Rosie Jones, Carnegie Mellon University, Pittsburgh, PA
- Natasa Milic-Frayling, Microsoft Research Ltd, Cambridge, UK
- Dunja Mladenic, J.Stefan Institute, Ljubljana, Slovenia
- Jason Rennie, Massachusetts Institute of Technology, MA
- Stephen Robertson, Microsoft Research Ltd, Cambridge, UK
- Sean Slattery, Carnegie Mellon University, Pittsburgh, PA
- Ian Witten, University of Waikato, Hamilton, New Zealand
This Workshop is partially supported by the European FP5 project "Data Mining and Decision Support for Business Competitiveness: A European Virtual Enterprise (Sol-Eu-Net)".