Computer Science and Information Systems 2009 Volume 6, Issue 2, Pages: 217-227
Analysis of unsupervised dimensionality reduction techniques
Kumar Aswani Ch.
Domains such as text and images contain large amounts of redundancy and ambiguity among their attributes, which introduce considerable noise (that is, the data is high-dimensional). Retrieving information from high-dimensional datasets is a major challenge. Dimensionality reduction techniques have proved a successful avenue for automatically extracting latent concepts by removing noise and reducing the complexity of processing high-dimensional data. In this paper we conduct a systematic comparison of unsupervised dimensionality reduction techniques for the text retrieval task. We analyze these techniques in terms of complexity, approximation error, and retrieval quality, with experiments on four test document collections.
Keywords: dimensionality reduction, information retrieval, latent semantic indexing, matrix decompositions
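As an illustration of the kind of technique named in the keywords, the following is a minimal sketch of latent semantic indexing through a truncated singular value decomposition. It is not taken from the paper: the toy term-document matrix, the query, the choice of k = 2, and the function names are all assumptions made for this example. It shows the two quantities the abstract refers to, an approximation error (here the Frobenius norm of the difference between the matrix and its rank-k reconstruction) and a retrieval ranking (here cosine similarity in the reduced space).

```python
import numpy as np

def lsi_truncated_svd(A, k):
    """Return the rank-k SVD factors (U_k, s_k, Vt_k) of a term-document matrix A."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    return U[:, :k], s[:k], Vt[:k, :]

def approximation_error(A, U_k, s_k, Vt_k):
    """Frobenius norm of A minus its rank-k reconstruction."""
    A_k = (U_k * s_k) @ Vt_k
    return np.linalg.norm(A - A_k, ord="fro")

def rank_documents(query, U_k, s_k, Vt_k):
    """Order document indices by cosine similarity to the query in the latent space."""
    q_hat = query @ U_k                  # query projected onto the k latent axes
    docs = (s_k[:, None] * Vt_k).T       # documents projected the same way, shape (n_docs, k)
    denom = np.linalg.norm(docs, axis=1) * np.linalg.norm(q_hat) + 1e-12
    sims = (docs @ q_hat) / denom
    return np.argsort(-sims)

# Hypothetical toy data: 5 terms (rows) by 4 documents (columns).
A = np.array([
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 1, 0],
], dtype=float)

k = 2  # assumed number of latent dimensions
U_k, s_k, Vt_k = lsi_truncated_svd(A, k)
err = approximation_error(A, U_k, s_k, Vt_k)

# Hypothetical query containing the first two terms.
query = np.array([1, 1, 0, 0, 0], dtype=float)
ranking = rank_documents(query, U_k, s_k, Vt_k)

print("approximation error:", err)
print("document ranking:", ranking)
```

By the Eckart-Young theorem, the error printed here equals the square root of the sum of the squared singular values that were discarded, so increasing k trades a larger index for a smaller approximation error.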