Novel Approaches for Privacy Preserving Data Mining in K-Anonymity Model P
Related papers
Novel Approaches for Privacy Preserving Data Mining in k-Anonymity Model
In privacy preserving data mining, anonymization based approaches have been used to preserve the privacy of an individual. Existing literature addresses various anonymization based approaches for preserving the sensitive private information of an individual. The k-anonymity model is one of the most widely used anonymization based approaches. However, anonymization based approaches suffer from information loss. To minimize the information loss, various state-of-the-art anonymization based clustering approaches, viz. the Greedy k-member algorithm and the Systematic clustering algorithm, have been proposed. Of these, the Systematic clustering algorithm gives lower information loss. In addition, these approaches use all attributes when creating an anonymized database, so the risk of disclosing sensitive private data is higher because all the attributes are published. In this paper, we propose two approaches for minimizing the disclosure risk and preserving privacy by using the Systematic clustering algorithm. The first approach creates an unequal combination of quasi-identifier and sensitive attributes. The second approach creates an equal combination of quasi-identifier and sensitive attributes. We also evaluate our approaches empirically, focusing on information loss and execution time as the key metrics. We illustrate the effectiveness of the proposed approaches by comparing them with the existing clustering algorithms.
An Efficient Data Mining Method for Clustering on Privacy Preserving Concept
Privacy preserving data mining has become increasingly popular because it allows sharing of private sensitive data for analysis purposes. The concept of privacy preserving data mining has been proposed in response to the privacy concerns that such sharing raises. The main goal of this research work is to introduce a new k-anonymity algorithm capable of transforming a non-anonymous data set into a k-anonymous data set. The aim of the k-anonymity model is thus to transform a table so that no one can make high-probability associations between records in the table and the corresponding entities. To achieve this goal, the k-anonymity model requires that any record in a table be indistinguishable from at least (k−1) other records with respect to the predetermined quasi-identifier. Finally, the modified data set is used for clustering.
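The k-anonymity requirement stated above can be checked mechanically. The following is a minimal sketch, not the algorithm from the paper: it assumes records are plain dictionaries, and the function name `is_k_anonymous`, the attribute names, and the sample rows are all hypothetical illustrations.

```python
from collections import Counter


def is_k_anonymous(records, quasi_identifiers, k):
    """Return True if every combination of quasi-identifier values
    that appears in `records` is shared by at least k records."""
    # Count how many records fall into each quasi-identifier group.
    group_sizes = Counter(
        tuple(record[attr] for attr in quasi_identifiers) for record in records
    )
    return all(size >= k for size in group_sizes.values())


# Hypothetical, already generalized table (values invented for illustration).
records = [
    {"age": "30-39", "zip": "130**", "disease": "flu"},
    {"age": "30-39", "zip": "130**", "disease": "cancer"},
    {"age": "40-49", "zip": "148**", "disease": "flu"},
    {"age": "40-49", "zip": "148**", "disease": "flu"},
]

print(is_k_anonymous(records, ["age", "zip"], 2))  # True: each group has 2 rows
print(is_k_anonymous(records, ["age", "zip"], 3))  # False: no group has 3 rows
```

Each record here shares its quasi-identifier values with exactly one other record, so the table satisfies the requirement for k = 2 but not for k = 3.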
Sensitive Attributes based Privacy Preserving in Data Mining using k-anonymity
knowledge from huge amounts of data. In recent years, there has been tremendous growth in the amount of personal data that organizations can collect and analyze. Organizations such as credit card companies, real estate companies and hospitals collect and hold large volumes of data for research purposes, e.g. the National Institutes of Health. These organizations publish data containing a lot of sensitive information. The importance of sharing data for research and knowledge discovery is well recognized. However, sharing data that contains sensitive personal information, such as insurance data or medical records, across organizational boundaries can raise serious privacy concerns. There is a need to preserve the privacy of the individuals in a data set. K-anonymity is one of the easy and efficient techniques for achieving privacy in many data publishing applications. In k-anonymity, all tuples of the database to be released are generalized to anonymize it, which reduces data utility and increases the information loss of the published table. The sensitive attribute based anonymity method is very useful for preserving the privacy of individuals when an organization publishes data. It reduces information loss for researchers by providing sensitivity levels. This method also avoids homogeneity attacks and background knowledge attacks.
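To make the homogeneity attack mentioned above concrete: a table can be k-anonymous and still leak a sensitive value when every record in a quasi-identifier group carries the same value. The sketch below only detects such groups and is not the sensitive attribute based method of the paper. The function name `homogeneous_groups`, the attribute names, and the sample rows are hypothetical.

```python
from collections import defaultdict


def homogeneous_groups(records, quasi_identifiers, sensitive):
    """Return the quasi-identifier groups whose records all share
    one single value of the sensitive attribute."""
    values_per_group = defaultdict(set)
    for record in records:
        key = tuple(record[attr] for attr in quasi_identifiers)
        values_per_group[key].add(record[sensitive])
    # A group with exactly one distinct sensitive value reveals that value
    # for anyone known to be in the group.
    return [key for key, seen in values_per_group.items() if len(seen) == 1]


# Hypothetical 2-anonymous table (values invented for illustration).
records = [
    {"age": "30-39", "zip": "130**", "disease": "flu"},
    {"age": "30-39", "zip": "130**", "disease": "cancer"},
    {"age": "40-49", "zip": "148**", "disease": "flu"},
    {"age": "40-49", "zip": "148**", "disease": "flu"},
]

print(homogeneous_groups(records, ["age", "zip"], "disease"))
# [('40-49', '148**')]
```

In this sample, anyone known to be aged 40-49 in the 148** area can be inferred to have flu, even though no single row identifies them.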
Privacy Preservation of Data in Data mining using K-anonymity and Randomization Method
Data sharing is increasingly important for improving business prospects. However, when sensitive data are shared between two parties, the privacy of the data becomes a major problem. In day-to-day life, sharing, transferring, mining and publishing data are the major factors in privacy preservation. The main aim of privacy preservation is to protect the sensitive information in data while extracting knowledge from large amounts of data. Many techniques are used in privacy preservation, such as k-anonymity, l-diversity, t-closeness, blocking based methods and cryptographic techniques. These privacy preserving techniques are available, but they still have shortcomings. For example, the anonymity technique gives privacy protection and data usability but suffers from homogeneity and background knowledge attacks. The blocking method suffers from information loss, and the random perturbation technique does not provide data usability. The cryptographic technique gives privacy protection but does not provide data usability, and it requires more computational overhead. So in this work we use the k-anonymity method to protect our data, and we can obtain better accuracy compared to previously used methods.
Privacy Preserving Data Mining - Optimization in K-Anonymity using Machine Learning Approach
Nowadays, information sharing is very important. One organization shares user information with another organization for better survey purposes. However, the users' sensitive data must not be disclosed. For that purpose, some of the users' sensitive data must be hidden, which requires the data to be encrypted. The k-anonymity algorithm is one of the ways to encrypt data so that the data cannot be stolen and the information in the data cannot be modified. However, there are ways to attack k-anonymity encrypted data. One of them is the background knowledge attack: if the attacker knows some basic information about the user, he can obtain the user's details from the database. If we add some more data to the original database and then apply the k-anonymity algorithm, the attacker gets more rows of data and is confused, so the data are protected from the attacker.