vladimir castro - Academia.edu
Papers by vladimir castro
Recently, privacy issues have become important in clustering analysis, especially when data is horizontally partitioned over several parties. Associative queries are the core retrieval operation for many data mining algorithms, especially clustering and k-NN classification, so algorithms that efficiently support k-NN queries are of special interest. We show how to adapt well-known data structures to the privacy-preserving context and what overhead this adaptation incurs. We present an algorithm for k-NN in secure multiparty computation, based on the private computation of several metrics. As a result, we can offer three approaches to associative queries over horizontally partitioned data, with progressively less security. We show privacy-preserving algorithms for data structures that induce a partition on the space, such as KD-Trees. Our next preference is our Privacy Preserving SASH. However, we demonstrate that the most effective approach to achieving privacy is for each party to keep a separate data structure on which associative queries run locally, followed by a secure combination that produces the overall output. This idea not only enhances security but also reduces communication cost between data holders. Our results and protocols also enable us to improve on previous approaches for k-NN classification.
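The "separate data structures, then combine" idea rests on a simple fact: with horizontally partitioned data, the global k nearest neighbours are always contained in the union of each party's local k nearest neighbours. The following is a minimal Python sketch of that fact only. The function names `local_knn` and `combine` and the sample points are illustrative assumptions, and the merge is done in the clear, whereas the abstract calls for a secure combination protocol that is not reproduced here.

```python
import heapq
import math


def local_knn(points, query, k):
    """Return the k nearest (distance, point) pairs held by ONE party.

    Hypothetical helper: a linear scan stands in for whatever local
    index (KD-Tree, SASH, ...) the party actually maintains.
    """
    return heapq.nsmallest(k, ((math.dist(p, query), p) for p in points))


def combine(candidate_lists, k):
    """Merge per-party candidates into the global k nearest.

    Assumption: this plain merge is NOT secure; it only marks the place
    where a secure combination protocol would run.
    """
    return heapq.nsmallest(k, (c for lst in candidate_lists for c in lst))


# Two parties, each holding different records over the same attributes.
party_a = [(0.0, 0.0), (5.0, 5.0), (1.0, 1.0)]
party_b = [(0.5, 0.0), (9.0, 9.0)]
query = (0.0, 0.0)

result = combine(
    [local_knn(party_a, query, 2), local_knn(party_b, query, 2)], 2
)
nearest_points = [point for _, point in result]
```

Each party sends at most k candidates to the combination step regardless of how many records it holds, which is consistent with the reduced communication cost the abstract mentions.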
Proceedings - IEEE International Conference on Data Mining, ICDM, 2007
It is not surprising that there is strong interest in k-NN queries to enable clustering, classification and outlier-detection tasks. However, previous approaches to privacy-preserving k-NN are costly and can only be realistically applied to small data sets. We provide efficient solutions for k-NN queries for vertically partitioned data. We provide the first solution for the L∞ (or Chessboard) metric as well as detailed privacy-preserving computation of all other Minkowski metrics. We enable privacy-preserving L∞ by providing a solution to Yao's Millionaire Problem with more than two parties. This is based on a new and practical solution to Yao's Millionaire with shares. We also provide privacy-preserving algorithms for combining local metrics into a global metric that handles the large dimensionality and diversity of attributes common in vertically partitioned data.
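To see why Minkowski metrics fit vertically partitioned data, note that each party can compute a partial result over its own attributes: for finite p the p-th powers of the coordinate differences add up across parties, while for L∞ the global distance is the maximum of the per-party maxima. The Python sketch below illustrates only this decomposition, in the clear. The helper names, the attribute split, and the sample values are assumptions for illustration; in a privacy-preserving setting the final sum and maximum would be computed by secure protocols (the maximum being where a multi-party Millionaire-style comparison comes in), which are not shown.

```python
def partial_lp(x_attrs, q_attrs, p):
    """One party's contribution to an L_p distance: sum of |x_i - q_i|^p
    over the attributes that this party holds (hypothetical helper)."""
    return sum(abs(a - b) ** p for a, b in zip(x_attrs, q_attrs))


def partial_linf(x_attrs, q_attrs):
    """One party's contribution to the L-infinity (Chessboard) distance:
    the largest |x_i - q_i| over the attributes that this party holds."""
    return max(abs(a - b) for a, b in zip(x_attrs, q_attrs))


# Record x = (1, 4, 6) and query q = (2, 2, 2), split by attribute:
# party 1 holds attributes 0-1, party 2 holds attribute 2 (assumed split).
x_parts = [(1.0, 4.0), (6.0,)]
q_parts = [(2.0, 2.0), (2.0,)]

p = 2
# Global L_p: add the partial sums, then take the p-th root.
lp_distance = sum(
    partial_lp(x, q, p) for x, q in zip(x_parts, q_parts)
) ** (1 / p)

# Global L-infinity: maximum of the per-party maxima.
linf_distance = max(partial_linf(x, q) for x, q in zip(x_parts, q_parts))
```

Because k-NN only needs the ordering of distances, the final p-th root could be skipped when ranking candidates under a finite-p metric.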