A note on the effect of term weighting on selecting intrinsic dimensionality of data (original) (raw)

Abstract: The effect of term weighting on selecting intrinsic dimensionality of data is discussed. Experiments are conducted, using different term weighting and dimensionality selection methods, on four testing document collections (namely Medline, Cranfield, CACM and CISI). The results point that transforming the data matrix using a term weighting scheme plays a vital role in identifying the intrinsic dimensionality. Keywords: Dimensionality selection, Latent semantic indexing, Ssingular value decomposition, Term weighting.