kanishka Mane - Academia.edu

Papers by kanishka Mane

Distributed Identification of Top-l Inner Product Elements and its Application in a Peer-to-Peer Network

IEEE Transactions on Knowledge and Data Engineering, 2008

Inner product measures how closely two feature vectors are related. It is an important primitive for many popular data mining tasks, e.g., clustering, classification, correlation computation, and decision tree construction. If the entire data set is available at a single site, then computing the inner product matrix and identifying the top (in terms of magnitude) entries is trivial. However, in many real-world scenarios, data is distributed across many locations, and transmitting the data to a central server would be communication-intensive and would not scale. This paper presents an approximate local algorithm for identifying the top-l inner products among pairs of feature vectors in a large asynchronous distributed environment such as a peer-to-peer (P2P) network. We develop a probabilistic algorithm for this purpose using order statistics and the Hoeffding bound. We present experimental results to show the effectiveness and scalability of the algorithm. Finally, we demonstrate an application of this technique for interest-based community formation in a P2P environment.
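As a rough illustration of two ingredients the abstract mentions, the sketch below shows the centralized baseline (all data at one site, top-l inner products ranked by magnitude) and the standard two-sided Hoeffding sample-size calculation. It is not the paper's distributed algorithm; the function names `top_l_inner_products` and `hoeffding_sample_size` and the toy data are assumptions made only for this example.

```python
import math
from itertools import combinations


def top_l_inner_products(features, l):
    """Centralized baseline: return the l feature pairs with largest |inner product|.

    `features` maps a feature name to a list of values; all lists are
    assumed to have the same length (hypothetical input format).
    """
    scored = []
    for (a, u), (b, v) in combinations(features.items(), 2):
        ip = sum(x * y for x, y in zip(u, v))
        scored.append((a, b, ip))
    # Rank by magnitude, as the abstract describes ("top in terms of magnitude").
    scored.sort(key=lambda t: abs(t[2]), reverse=True)
    return scored[:l]


def hoeffding_sample_size(epsilon, delta, value_range=1.0):
    """Samples needed so a sample mean is within epsilon of the true mean
    with probability at least 1 - delta, for values bounded in an interval
    of width `value_range` (two-sided Hoeffding bound)."""
    return math.ceil(value_range ** 2 * math.log(2.0 / delta) / (2.0 * epsilon ** 2))


features = {"a": [1, 0, 2], "b": [2, 1, 0], "c": [-3, 0, -3]}
top2 = top_l_inner_products(features, 2)
n = hoeffding_sample_size(epsilon=0.1, delta=0.05)
```

In a distributed setting, a bound of this kind indicates how many sampled values suffice to estimate a mean to a chosen accuracy without shipping all data to one site; how the paper combines it with order statistics is not shown here.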

fgdfggdg
