Sonajharia Minz | Jawaharlal Nehru University
Papers by Sonajharia Minz
Communications in computer and information science, 2017
Multiple views of a dataset are constructed using Feature Set Partitioning (FSP) methods for Multi-view Ensemble Learning (MEL). The way the features are partitioned influences the classification performance of MEL. The number of possible feature set partitions of a dataset equals the Bell number, which grows faster than exponentially with the number of features, and searching for the best partition is an NP-hard problem (shown in Fig. 1). It is therefore essential to find a feature set partition that gives optimal MEL classification performance among all possible partitions in high-dimensional scenarios. To this end, an optimal multi-view ensemble learning approach using a constrained particle swarm optimization method (OMEL-C-PSO) is proposed for high-dimensional data classification. Experiments were performed on ten high-dimensional datasets. In experiments with four existing feature set partitioning methods, the results show that the OMEL-C-PSO approach is feasible and effective for high-dimensional data classification.
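To give a feel for how quickly the number of feature set partitions grows, the following minimal sketch computes Bell numbers with the Bell triangle. The function name `bell_numbers` is an illustrative choice and is not taken from the paper.

```python
def bell_numbers(n_max):
    """Return [B(0), ..., B(n_max)], where B(n) counts the partitions of an n-element set."""
    bells = [1]          # B(0) = 1
    row = [1]            # first row of the Bell triangle
    for _ in range(n_max):
        # Each new row starts with the last entry of the previous row.
        new_row = [row[-1]]
        # Every further entry is the sum of its left neighbour and the entry above that neighbour.
        for value in row:
            new_row.append(new_row[-1] + value)
        row = new_row
        bells.append(row[0])
    return bells


if __name__ == "__main__":
    for n, b in enumerate(bell_numbers(10)):
        print(n, b)
```

With only 10 features there are already 115975 possible partitions, which is why an exhaustive search over partitions is impractical for high-dimensional data.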
Lecture notes in networks and systems, Aug 31, 2018
Acquiring labeled data is a difficult and time-consuming process because of various constraints in the area of remote sensing. Supervised classification of remote sensing images requires a large amount of labeled data for good performance. Many semi-supervised techniques that need very little labeled data to train classifiers have been developed and explored. Self-training is a popular semi-supervised technique that trains any supervised classifier over several iterations with the help of a few labeled samples and a large pool of unlabeled samples. The traditional self-training method is not well suited to remote sensing images because it does not use the spatial properties of the image. In this paper, an enhanced version of the traditional self-training method is proposed that uses spatial properties for confident sample selection. The experimental results show that the proposed method performs better than the traditional self-training method.
One of the challenges in remote sensing image classification is defining a suitable training set, since labeling samples is expensive and/or time-consuming. Active learning has been shown in the literature to be an effective solution to this problem: it finds the most informative (uncertain) samples while avoiding redundant ones, thus achieving good classification accuracy with fewer training samples. In this paper, the most informative samples are selected using the concept of multiview learning with a support vector machine as the classifier. A novel approach to generating different views using the Fisher Discriminant Ratio (FDR) technique is proposed. The effectiveness of the proposed method is tested on two hyperspectral data sets, and the results are compared with state-of-the-art active learning methods such as random sampling and marginal sampling with a support vector machine classifier.
Advances in intelligent systems and computing, 2014
Journal of The Indian Society of Remote Sensing, Feb 1, 2022
Classification of remotely sensed images is an arduous task because of the limited availability of labeled samples. Collecting labeled samples in remote sensing is a tedious, time-consuming, and costly process. To address the shortage of labeled samples, self-training semi-supervised techniques have been explored in many ways to classify remotely sensed images. However, the traditional self-training semi-supervised classification technique may produce poor classification accuracy because it selects low-quality pseudo-labeled samples. In this work, a new self-training semi-supervised classification method is proposed that selects pseudo-labeled pixels based on spatial confidence and diversity criteria. The proposed sample selection technique produces correctly classified and diverse pixels for the training set, which may be informative. The experimental results show that the proposed method has the potential to improve the discriminative power of classifiers trained in a self-training framework.
International Journal of Hybrid Intelligent Systems, Aug 23, 2005
Smart innovation, systems and technologies, 2014
A poem is a piece of writing in which the expression of feelings and ideas is given intensity by particular attention to diction, rhythm, and imagery [1]. In the modern age, the collection of poems on the internet is ever increasing. Classifying poems correctly is therefore an important task. The sentiment information of a poem is useful for enhancing the classification task. SentiWordNet is an opinion lexicon that assigns each term two numeric scores indicating positive and negative sentiment information. Multiple views of the poem data may be utilized for learning to enhance the classification task. In this research, the effect of sentiment information on poem data classification is explored using multi-view ensemble learning. The experiments use a Support Vector Machine (SVM) to learn a classifier for each view of the data.
The smart computing review, Oct 31, 2014
In the real world, reconciling a choice between multiple conflicting objectives is a common problem. Solutions to a multi-objective problem are those that offer the best possible trade-off among the objectives. Particle swarm optimization, an evolutionary algorithm, is used to find a solution in the solution space. It is a population-based optimization technique that is effective, efficient, and easy to implement. Changes to the particle swarm optimization technique are required to obtain solutions to a multi-objective optimization problem. This paper therefore presents the concepts of particle swarm optimization and of the multi-objective optimization problem to build the basic background needed for multi-objective particle swarm optimization. We then discuss multi-objective particle swarm optimization concepts. Multi-objective particle swarm optimization techniques and some of the most important future research directions are also included.
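The notion of a best possible trade-off between conflicting objectives is usually formalized as Pareto dominance. The sketch below is a generic illustration, assuming all objectives are minimized; the function names `dominates` and `pareto_front` are hypothetical and do not come from the paper.

```python
def dominates(a, b):
    """True if objective vector a is no worse than b everywhere and strictly better somewhere.

    Assumption: every objective is to be minimized.
    """
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(points):
    """Keep only the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


if __name__ == "__main__":
    candidates = [(1, 5), (2, 2), (3, 3), (5, 1)]
    # (3, 3) is dropped because (2, 2) is better in both objectives.
    print(pareto_front(candidates))
```

A multi-objective particle swarm optimizer typically maintains such a set of non-dominated solutions instead of a single global best position.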
Asian Journal of Geoinformatics, Jan 23, 2014
The effectiveness of three types of unsupervised learning techniques for change detection in the water, vegetation, and built-up land cover classes of a part of the Delhi region in India has been analyzed. A total of eight Landsat TM and ETM+ images from 1998 to 2011 were preprocessed for atmospheric correction. Subsequently, three features, the Soil Adjusted Vegetation Index (SAVI), the Modified Normalized Difference Water Index (MNDWI), and built-up from the Normalized Difference Built-up Index (NDBI), were extracted at the preprocessing stage. The three clustering algorithms k-means, fuzzy c-means, and expectation-maximization were selected to represent the partition-based, fuzzy, and probability-based techniques, respectively. The three algorithms were implemented to cluster the pixels of all eight images using the features SAVI, MNDWI, and NDBI. The Silhouette coefficient, which takes into account both intra-cluster and inter-cluster distances, was used to evaluate cluster quality. The outcome of clustering has been quantified as the percentage of total pixels grouped in each of the three clusters indicating vegetation, urban, and water. Change detection has been performed by comparing the clustering outcomes for each of the eight images.
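The abstract does not give formulas for the three features. As a rough illustration, the sketch below uses the commonly cited per-pixel definitions of SAVI, MNDWI, and NDBI on surface reflectance values. The soil adjustment factor of 0.5 and the band naming (`nir`, `red`, `green`, `swir`) are assumptions, not details from the paper.

```python
def savi(nir, red, soil_factor=0.5):
    """Soil Adjusted Vegetation Index; soil_factor=0.5 is a commonly used default (assumed here)."""
    return (nir - red) * (1.0 + soil_factor) / (nir + red + soil_factor)


def mndwi(green, swir):
    """Modified Normalized Difference Water Index from green and shortwave-infrared reflectance."""
    return (green - swir) / (green + swir)


def ndbi(swir, nir):
    """Normalized Difference Built-up Index from shortwave-infrared and near-infrared reflectance."""
    return (swir - nir) / (swir + nir)


if __name__ == "__main__":
    # Hypothetical reflectance values for a single vegetated pixel.
    pixel = {"green": 0.08, "red": 0.05, "nir": 0.45, "swir": 0.20}
    features = (
        savi(pixel["nir"], pixel["red"]),
        mndwi(pixel["green"], pixel["swir"]),
        ndbi(pixel["swir"], pixel["nir"]),
    )
    print(features)
```

Each pixel then becomes a three-value feature vector that a clustering algorithm such as k-means can group. The ratio indices assume the band sums are non-zero.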
Procedia Computer Science, 2015
Remote Sensing Letters, Sep 24, 2018
Knowledge and Information Systems, Sep 21, 2015
Multi-view ensemble learning has the potential to address issues related to the high dimensionality of data. It attempts to utilize all the relevant features, discarding only the irrelevant ones. A view of a dataset is the sub-table of the training data with respect to a subset of the feature set. Discarding the irrelevant features and obtaining subsets of the relevant features is useful for dimension reduction and for dealing with the problem of having fewer training examples than even the reduced set of relevant features. A feature set partitioning that results in blocks of relevant features may not yield multiple-view-based classifiers with good classification performance. In this work, an optimal feature set partitioning (OFSP) approach is proposed. Further, ensemble learning from the views aims to maximize the performance of the classifier. The experiments study the performance of random feature set partitioning, attribute bagging, view generation using attribute clustering, view construction using a genetic algorithm, and the proposed OFSP method. The blocks of relevant feature subsets are used to construct the multi-view classifier ensemble using the K-nearest neighbor, naïve Bayes, and support vector machine algorithms, applied to sixteen high-dimensional data sets from the UCI machine learning repository. The performance parameters considered for comparison are classification accuracy, disagreement among the classifiers, execution time, and percentage reduction of attributes.
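As a small illustration of what a view is, the sketch below implements the simplest baseline named in the abstract, random feature set partitioning, together with majority voting over per-view predictions. It is not the proposed OFSP method, and all function names are hypothetical.

```python
import random
from collections import Counter


def random_feature_partition(n_features, n_views, seed=0):
    """Split feature indices 0..n_features-1 into n_views disjoint blocks at random."""
    rng = random.Random(seed)
    indices = list(range(n_features))
    rng.shuffle(indices)
    # Dealing the shuffled indices round-robin keeps the blocks disjoint and exhaustive.
    return [sorted(indices[i::n_views]) for i in range(n_views)]


def make_view(rows, block):
    """Return the sub-table of rows restricted to the feature indices in block."""
    return [[row[j] for j in block] for row in rows]


def majority_vote(predictions):
    """Combine the labels predicted by the per-view classifiers for one example."""
    return Counter(predictions).most_common(1)[0][0]


if __name__ == "__main__":
    data = [[0.1, 0.2, 0.3, 0.4, 0.5, 0.6],
            [1.1, 1.2, 1.3, 1.4, 1.5, 1.6]]
    blocks = random_feature_partition(n_features=6, n_views=3)
    views = [make_view(data, block) for block in blocks]
    print(blocks)
    print(views[0])
    print(majority_vote(["spam", "ham", "spam"]))
```

In a full pipeline, one classifier would be trained on each view and their predictions combined per example, for instance with `majority_vote`.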
Advances in intelligent systems and computing, Sep 30, 2022
Journal of intelligent systems, Nov 20, 2018
Recommender systems have focused on algorithms that make recommendations for individuals. However, in many domains an item, for example a movie or a restaurant, may be recommended to a group of persons, and some remarkable group recommender systems (GRSs) have been developed for this purpose. GRSs satisfy a group of people optimally by weighting the individual preferences equally. We propose a multi-expert scheme (MES) for group recommendation using a genetic algorithm (GA), MES-GRS-GA, that depends on consensus techniques to further improve group recommendations. To deal with this GRS problem, we also propose a consensus scheme for GRSs in which the consensus of multiple experts is brought together to make a single recommended list of items, with each expert representing an individual inside the group. The proposed GA-based consensus scheme is modeled as many consensus schemes within two phases. In the consensus phase, we apply the GA to obtain the maximum utility offer for each expert and generate the most appropriate rating for each item in the group. In the recommendation generation phase, the GA is employed again to produce the resulting group profile, i.e., the list of ratings with the minimum sum of distances from the group members. Finally, the results of computational experiments that bear close resemblance to real-world scenarios are presented and compared with baseline GRS techniques, illustrating the superiority of the proposed model.
Communications in computer and information science, 2023
Procedia Computer Science
Journal of Information Technology and Digital World
The farmer suicide hotspot detection proposed in this paper aims to reduce farmer deaths. Using a geographical information system is vital in predicting potential hotspots for farmer suicide. This study collected and analyzed data on farmer suicide in India, using state-wise information from the National Crime Records Bureau, and determined the recent higher rate of farmer suicide. Spatial statistics analysis tools, including average nearest neighbor analysis, were used. Global analysis using Moran's Index showed that farmer suicides have a clustered pattern, and a farmer suicide hotspot map was plotted using Getis-Ord (Gi*) analysis. The results show that the highest farmer suicide index is in Maharashtra, and hence farmer suicide hotspots were identified district-wise. Four farmer suicide factors are considered: the number of farmer suicides, the population density of farmers, climate, and income. This hotspot geographical region helps to identify future suicid...
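For readers unfamiliar with the global statistic mentioned in this abstract, the following minimal sketch computes Moran's Index for values observed over regions with a spatial weight matrix. The binary adjacency weights and the toy values are assumptions chosen for illustration, not data from the study.

```python
def morans_i(values, weights):
    """Global Moran's I for region values and a square spatial weight matrix.

    weights[i][j] > 0 means regions i and j are treated as neighbours.
    Values near +1 suggest clustering, values near -1 suggest dispersion.
    """
    n = len(values)
    mean = sum(values) / n
    deviations = [v - mean for v in values]
    weight_total = sum(sum(row) for row in weights)
    # Cross-products of deviations, summed over all weighted pairs of regions.
    cross = sum(weights[i][j] * deviations[i] * deviations[j]
                for i in range(n) for j in range(n))
    variance_sum = sum(d * d for d in deviations)
    return (n / weight_total) * cross / variance_sum


if __name__ == "__main__":
    # Four hypothetical regions in a row; each is a neighbour of the adjacent ones.
    chain = [[0, 1, 0, 0],
             [1, 0, 1, 0],
             [0, 1, 0, 1],
             [0, 0, 1, 0]]
    print(morans_i([1, 2, 3, 4], chain))   # similar values sit side by side
    print(morans_i([1, 0, 1, 0], chain))   # values alternate between neighbours
```

The function assumes the values are not all identical and that at least one weight is non-zero, since otherwise the ratio is undefined.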