Algorithms for Mining Distance-Based Outliers in Large Datasets
1998, Proc. of the 24th Int
Abstract
This paper deals with finding outliers (exceptions) in large, multidimensional datasets. ...
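The abstract snippet above does not define a distance-based outlier. As a minimal sketch, assume the usual DB(p, D) formulation: a point is an outlier if at least a fraction p of the dataset lies farther than distance D from it. A brute-force check could then look like the following. The function name `db_outliers`, the Euclidean metric, and the quadratic scan are illustrative assumptions, not the paper's algorithms.

```python
import math


def db_outliers(points, p, d):
    """Return indices of points that satisfy the assumed DB(p, d) definition.

    points: list of equal-length coordinate tuples
    p:      minimum fraction of the dataset that must lie farther than d
    d:      distance threshold (Euclidean distance is an assumption here)
    """
    n = len(points)
    outliers = []
    for i, a in enumerate(points):
        # Count the other points lying strictly farther than d from a.
        far = sum(
            1
            for j, b in enumerate(points)
            if j != i and math.dist(a, b) > d
        )
        # The point counts as an outlier when at least p * n points are far away.
        if far >= p * n:
            outliers.append(i)
    return outliers


if __name__ == "__main__":
    data = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10)]
    print(db_outliers(data, p=0.75, d=3.0))  # prints [4]
```

This scan compares every pair of points, so it is only practical for small inputs. It is meant to make the definition concrete, not to scale to large datasets.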
Table 2: CPU + I/O times (in seconds) for a variable number of tuples, dimensions, and cells, for p = 0.9999.