Privacy-preserving data mining: A feature set partitioning approach
Related papers
Data Restructuring with Feature Set Partitioning for Privacy Preserving using K-anonymity
2018
Most of the data available for knowledge discovery and information retrieval is prone to identity and privacy disclosure. Identity is mainly disclosed by exploring the pattern of attributes involved in data formation. Existing benchmark models anonymize the data by generalizing it, deleting the sensitive attributes, or adding noise. None of these approaches guarantees optimality or accuracy in the results obtained from mining models applied to the dataset. The deviation in results often leads to flawed decision making, which is unacceptable in domains such as health mining and real-time environments. To fill this gap, we propose a hybridization of feature set partitioning and data restructuring to achieve pattern anonymization. The model is particularly aimed at restructuring the data for supervised learning. To the best of our knowledge, pattern anonymization is first of its kind t...
A Hybrid Approach for privacy preserving using randomization for data mining
International Journal of Advance Research and Innovative Ideas in Education, 2016
Many organizations collect large amounts of data. These data are then used by the organizations for analysis, which helps them gain useful knowledge. The data collected may contain private or sensitive information that should be protected. Privacy protection is an important issue when data are released for mining or sharing. Our technique protects the sensitive data with less information loss, which increases data usability and also protects the sensitive data against various types of attack. Data can also be reconstructed using our proposed technique. We present a novel hybrid method to achieve k-support anonymity based on statistical observations on the datasets. Our comprehensive experiments on real as well as synthetic datasets show that our techniques are effective and provide moderate privacy. A hybrid approach is used to improve the security and accuracy of private data. Our novel hybrid approach towards privacy preserving k-anonymity and artificial neural network techn...
Novel Approaches for Privacy Preserving Data Mining in K-Anonymity Model P
2015
In privacy-preserving data mining, anonymization-based approaches have been used to preserve the privacy of an individual. Existing literature addresses various anonymization-based approaches for preserving the sensitive private information of an individual. The k-anonymity model is one of the most widely used anonymization-based approaches. However, anonymization-based approaches suffer from information loss. To minimize the information loss, various state-of-the-art anonymization-based clustering approaches, viz. the Greedy k-member algorithm and the Systematic clustering algorithm, have been proposed. Among them, the Systematic clustering algorithm gives less information loss. In addition, these approaches make use of all attributes during the creation of an anonymized database. Therefore, the risk of disclosure of sensitive private data is higher via publication of all the attributes. In this paper, we propose two approaches for minimizing the disclosure risk and preserving the...
Novel Approaches for Privacy Preserving Data Mining in k-Anonymity Model
In privacy-preserving data mining, anonymization-based approaches have been used to preserve the privacy of an individual. Existing literature addresses various anonymization-based approaches for preserving the sensitive private information of an individual. The k-anonymity model is one of the most widely used anonymization-based approaches. However, anonymization-based approaches suffer from information loss. To minimize the information loss, various state-of-the-art anonymization-based clustering approaches, viz. the Greedy k-member algorithm and the Systematic clustering algorithm, have been proposed. Among them, the Systematic clustering algorithm gives less information loss. In addition, these approaches make use of all attributes during the creation of an anonymized database. Therefore, the risk of disclosure of sensitive private data is higher via publication of all the attributes. In this paper, we propose two approaches for minimizing the disclosure risk and preserving privacy by using the Systematic clustering algorithm. The first approach creates an unequal combination of quasi-identifier and sensitive attribute. The second approach creates an equal combination of quasi-identifier and sensitive attribute. We also evaluate our approaches empirically, focusing on information loss and execution time as the vital metrics. We illustrate the effectiveness of the proposed approaches by comparing them with the existing clustering algorithms.
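The k-anonymity property that these approaches target can be illustrated with a minimal sketch. It is not the paper's clustering algorithm; it only checks whether a table already satisfies k-anonymity, and the function name `is_k_anonymous`, the attribute names, and the sample records are all hypothetical.

```python
from collections import Counter

def is_k_anonymous(records, quasi_identifiers, k):
    """Return True if every quasi-identifier combination occurs at least k times."""
    # Count how many records share each combination of quasi-identifier values.
    groups = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    return all(count >= k for count in groups.values())

# Hypothetical, already generalized sample data.
sample = [
    {"age": "30-39", "zip": "130**", "disease": "flu"},
    {"age": "30-39", "zip": "130**", "disease": "cancer"},
    {"age": "40-49", "zip": "148**", "disease": "flu"},
    {"age": "40-49", "zip": "148**", "disease": "asthma"},
]

two_anonymous = is_k_anonymous(sample, ["age", "zip"], 2)
three_anonymous = is_k_anonymous(sample, ["age", "zip"], 3)
```

Each quasi-identifier combination in the sample appears exactly twice, so the table is 2-anonymous but not 3-anonymous.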
2017
A huge quantity of individual health information has become available in recent decades, and the disclosure of any part of this information poses a serious risk in the field of health care. Existing anonymization methods, such as generalization and bucketization, preserve privacy only for low-dimensional data with a single sensitive attribute. We propose an anonymization technique that combines an improved anatomization with an improved slicing approach, adhering to the principles of k-anonymity and l-diversity, in order to deal with high-dimensional data with multiple sensitive attributes. The anatomization approach breaks the correlation observed between the quasi-identifier attributes and the sensitive attributes (SA) and produces two different tables with non-overlapping attributes. Experimental outcomes indicate that the suggested method can preserve the privacy of data with various sensitive attributes. The anatomization approa...
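The idea of splitting a table into a quasi-identifier table and a sensitive-attribute table can be sketched as follows. This is an illustration under assumptions, not the method of the paper: the grouping is a naive fixed-size bucketing by record order (it does not enforce l-diversity), the two tables are linked by a shared `group` column, and the name `anatomize` and the sample data are hypothetical.

```python
from collections import Counter

def anatomize(records, quasi_identifiers, sensitive, group_size):
    """Split records into a quasi-identifier table and a sensitive table.

    The tables are linked only through a coarse group id, so a sensitive
    value can no longer be tied to one specific row of its group.
    """
    qi_table = []
    counts = Counter()
    for index, record in enumerate(records):
        group_id = index // group_size  # naive bucketing (assumption)
        row = {q: record[q] for q in quasi_identifiers}
        row["group"] = group_id
        qi_table.append(row)
        counts[(group_id, record[sensitive])] += 1
    # Publish sensitive values as per-group counts, not per-record rows.
    sensitive_table = [
        {"group": g, sensitive: value, "count": c}
        for (g, value), c in sorted(counts.items())
    ]
    return qi_table, sensitive_table

patients = [
    {"age": 34, "zip": "13053", "disease": "flu"},
    {"age": 36, "zip": "13068", "disease": "cancer"},
    {"age": 47, "zip": "14850", "disease": "flu"},
    {"age": 49, "zip": "14853", "disease": "flu"},
]
qi_table, sensitive_table = anatomize(patients, ["age", "zip"], "disease", 2)
```

In this sketch the quasi-identifier values are kept exact rather than generalized. The second group in the sample contains only `flu`, which shows why a real scheme would also need a diversity requirement on each group.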
Framework for Privacy Preserving Classification in Data Mining
JETIR
In the present era of developing technology, the data gathered by organizations must be handled so as to protect the privacy of individuals. Privacy needs to be maintained because users' sensitive data is stored online in a centralized repository. Techniques such as anonymization and randomization are used to achieve privacy. However, anonymization leads to a certain level of information loss while preserving privacy. To overcome this drawback, a hybrid approach is used. The proposed framework combines two techniques, i.e. anatomization and perturbation. Quasi-identifiers such as zip code, age, and gender of a person do not appear important to protect, but these fields, when linked with other attributes, can reveal the identity or sensitive information of an individual. The hybrid technique focuses on preserving privacy by anatomizing and perturbing the quasi-identifiers in the sensitive data of users stored in a centralized data repository without losing any information.
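Perturbation of a numeric quasi-identifier can be sketched as below. This is a minimal illustration, assuming bounded uniform integer noise; the abstract does not specify the perturbation scheme, and the name `perturb_numeric`, its parameters, and the sample data are hypothetical.

```python
import random

def perturb_numeric(records, attribute, max_shift, seed=0):
    """Return copies of the records with bounded random noise added to one attribute."""
    rng = random.Random(seed)  # seeded only to make the sketch reproducible
    perturbed = []
    for record in records:
        copy = dict(record)  # leave the original record untouched
        copy[attribute] = record[attribute] + rng.randint(-max_shift, max_shift)
        perturbed.append(copy)
    return perturbed

users = [
    {"age": 34, "zip": "13053", "disease": "flu"},
    {"age": 47, "zip": "14850", "disease": "asthma"},
]
noisy_users = perturb_numeric(users, "age", max_shift=3)
```

Only the chosen quasi-identifier changes; the sensitive attribute is copied as-is. A larger `max_shift` gives more protection but distorts the data more.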