Dr. Deepti Sisodia | Alliance University
Uploads
Papers by Dr. Deepti Sisodia
International Journal of Recent Surgical and Medical Sciences
Background: Epilepsy's psychological effects are variable: some children experience only minor mental health issues, while others experience serious problems such as anxiety, depression, attention deficit hyperactivity disorder (ADHD), and mood disorders. There is therefore a need to screen for these problems at an early age so that timely intervention is possible, and this study was conducted to determine the prevalence of emotional and behavioral problems in children with epilepsy. Methods: This was a prospective observational study of 111 children, 6 to 14 years of age. The overall prevalence of emotional and behavioral problems in childhood was determined by calculating the percentage of children with a Child Behavior Checklist score indicative of specific emotional and behavioral problems. The prevalence of specific morbidities was also calculated and reported separately for each condition. Results were presented in the form of tables, charts, graphs, and narratives. Results: The overall prevalence of emotional...
IFIP Advances in Information and Communication Technology, 2023
Lecture Notes in Electrical Engineering
Applied Intelligence in Human-Computer Interaction
Expert Systems with Applications
2023 2nd International Conference for Innovation in Technology (INOCON)
Springer International Publishing eBooks, Nov 29, 2022
Engineering Science and Technology, an International Journal
In online advertising, classification models for fraudulent publishers built on user-click datasets exhibit poor performance due to high skewness in the publishers' class distribution. Nearest-neighbor classification techniques are popularly used to reduce the impact of class skewness on performance. These techniques use prototype selection (PS) methods to select promising samples before classification, reducing the size of the training data. Although nearest-neighbor techniques are simple to use and reduce the negative impact of the loss of potential information, they suffer from higher storage requirements and slower classification speed when applied to datasets with skewed class distributions. In this paper, we propose a Quad Division Prototype Selection-based k-Nearest Neighbor classifier (QDPSKNN), which introduces a quad-division method for handling uneven class distribution. Quad-division splits the data into four quartiles (groups) and performs controlled under-sampling to balance the class distribution, reducing the size of the training dataset by selecting only the relevant prototypes in the form of nearest neighbors. The performance of QDPSKNN is evaluated on the Fraud Detection in Mobile Advertising (FDMA) user-click dataset and on fifteen other benchmark imbalanced datasets to test its generalizing behaviour. Performance is also compared with one baseline model (k-NN) and four other prototype selection methods: NearMiss-1, NearMiss-2, NearMiss-3, and Condensed Nearest-Neighbor. The results show improved classification performance with QDPSKNN in terms of precision, recall, f-measure, g-mean, reduction rate, and execution time compared to existing prototype selection methods, both in the classification of fraudulent publishers and on the other benchmark imbalanced datasets. A Wilcoxon signed-rank test is conducted to demonstrate significant differences between QDPSKNN and state-of-the-art methods.
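A minimal sketch of the quad-division idea as described above: majority-class samples are split into four quartile groups and under-sampled from each group before fitting a plain k-NN. The quartile criterion (distance to the minority-class centroid), the per_quartile budget, and all names here are illustrative assumptions, not the paper's exact procedure.

```python
# Illustrative sketch of quartile-based prototype selection for k-NN.
# The grouping criterion and sampling budget are assumptions.
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

def quad_division_prototypes(X, y, minority_label, per_quartile):
    """Split majority samples into four quartile groups by distance to
    the minority-class centroid, then under-sample each group."""
    X_min = X[y == minority_label]
    X_maj = X[y != minority_label]
    y_maj = y[y != minority_label]

    centroid = X_min.mean(axis=0)
    d = np.linalg.norm(X_maj - centroid, axis=1)
    quartiles = np.quantile(d, [0.25, 0.5, 0.75])
    groups = np.digitize(d, quartiles)  # values 0..3, one per quartile

    keep = []
    rng = np.random.default_rng(0)
    for g in range(4):
        idx = np.flatnonzero(groups == g)
        keep.extend(rng.choice(idx, size=min(per_quartile, len(idx)),
                               replace=False))
    X_bal = np.vstack([X_min, X_maj[keep]])
    y_bal = np.concatenate([y[y == minority_label], y_maj[keep]])
    return X_bal, y_bal

# Usage: fit a plain k-NN on the reduced, more balanced prototype set.
# X_bal, y_bal = quad_division_prototypes(X_train, y_train,
#                                         minority_label=1, per_quartile=50)
# clf = KNeighborsClassifier(n_neighbors=5).fit(X_bal, y_bal)
```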
ABSTRACT: The Internet is a much more accountable and measurable medium than traditional media. The unique property of the Internet as a medium with bidirectional information flows has enabled performance-based pricing models that tie online advertising payments directly to campaign measurement data such as click-throughs and purchases. These pricing models have become increasingly popular in the online advertising industry, and they are implemented through websites. A website is the backbone of any Internet marketing plan: whether a company utilizes search engine marketing, e-mail marketing, affiliate marketing, or contextual advertising, the website is the element that the campaigns are built upon. Companies have been redefining their marketing and branding strategies due to the unique characteristics of the Internet and its capacity to change old rules.
Data Technologies and Applications, 2022
Purpose: The problem of choosing the most useful features from hundreds of features in time-series user-click data arises in online advertising when classifying fraudulent publishers. Selecting feature subsets is a key issue in such classification tasks. In practice, filter approaches are common; however, they neglect the correlations between features. Conversely, wrapper approaches are often impractical because of their computational complexity. Moreover, existing feature selection methods cannot handle such data well, which is one of the major causes of instability in feature selection. Design/methodology/approach: To overcome these issues, a majority voting-based hybrid feature selection method, namely feature distillation and accumulated selection (FDAS), is proposed to investigate the optimal subset of relevant features for analyzing publishers' fraudulent conduct. FDAS works in two phases: (1) feature distillation, where significant features from standard fi...
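A minimal sketch of the majority-voting idea behind a distillation phase like the one described above: several standard filter rankers each nominate their top-k features, and only features nominated by a majority are kept. The choice of rankers, k, and the voting threshold are assumptions for illustration, not the paper's exact recipe.

```python
# Illustrative majority-voting filter-based feature selection.
import numpy as np
from sklearn.feature_selection import f_classif, mutual_info_classif

def majority_vote_features(X, y, k, rankers=(mutual_info_classif, f_classif)):
    """Keep features placed in the top-k by a majority of filter rankers.
    With two rankers, the majority rule reduces to their intersection."""
    votes = np.zeros(X.shape[1], dtype=int)
    for ranker in rankers:
        scores = ranker(X, y)
        if isinstance(scores, tuple):   # f_classif returns (F-scores, p-values)
            scores = scores[0]
        votes[np.argsort(scores)[-k:]] += 1
    return np.flatnonzero(votes > len(rankers) / 2)

# Usage (X_train, y_train assumed to be the aggregated click features):
# selected = majority_vote_features(X_train, y_train, k=20)
# X_reduced = X_train[:, selected]
```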
IETE Technical Review, 2021
In the pay-per-click online advertising model, fraudulent publishers are far rarer than genuine publishers. This severe class imbalance between fraudulent and genuine publishers poses a...
Data Technologies and Applications, 2020
Purpose: Analysis of the publisher's behavior plays a vital role in identifying fraudulent publishers in the pay-per-click model of online advertising. However, the vast amount of raw user-click data with missing values poses a challenge in analyzing the conduct of publishers. The presence of high-cardinality categorical attributes with many possible values further aggravates the issue. Design/methodology/approach: In this paper, gradient tree boosting (GTB) learning is used to address the challenges encountered in learning publishers' behavior from raw user-click data and effectively classifying fraudulent publishers. Findings: The results demonstrate that GTB effectively classified fraudulent publishers and exhibited significantly improved performance compared to other learning methods in terms of average precision (60.5%), recall (57.8%), and f-measure (59.1%). Originality/value: The experiments were conducted using publicly available multiclass raw user click d...
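A minimal sketch of gradient tree boosting on an imbalanced multiclass task like the one described above. The synthetic data is a stand-in for aggregated per-publisher click features, and all parameters are illustrative assumptions, not the paper's configuration.

```python
# Gradient tree boosting on a skewed three-class problem, reporting
# macro-averaged precision/recall/f-measure as in the paper's metrics.
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import precision_recall_fscore_support
from sklearn.model_selection import train_test_split

# Synthetic stand-in: three classes with a rare "fraud"-like minority.
X, y = make_classification(n_samples=2000, n_features=20, n_classes=3,
                           n_informative=8, weights=[0.8, 0.15, 0.05],
                           random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

clf = GradientBoostingClassifier(n_estimators=300, learning_rate=0.05,
                                 max_depth=4, random_state=0)
clf.fit(X_tr, y_tr)

prec, rec, f1, _ = precision_recall_fscore_support(
    y_te, clf.predict(X_te), average="macro")
print(f"precision={prec:.3f} recall={rec:.3f} f1={f1:.3f}")
```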
Partitioning a set of objects into homogeneous clusters is a fundamental operation in data mining, needed in a number of data mining tasks. Clustering, or data grouping, is a key technique of data mining: an unsupervised learning task in which one seeks to identify a finite set of categories, termed clusters, to describe the data. The grouping of data into clusters is based on the principle of maximizing intra-class similarity and minimizing inter-class similarity, and the goal of clustering is to determine the intrinsic grouping in a set of unlabeled data. But how does one decide what constitutes a good clustering? This paper deals with the study of various clustering algorithms in data mining and focuses on clustering basics, requirements, classification, problems, and application areas of the clustering algorithms.
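A minimal k-means example of the intra-/inter-cluster similarity principle described above; the data and the choice of k are illustrative.

```python
# k-means on synthetic data: the algorithm minimizes the within-cluster
# sum of squares (inertia), a direct measure of intra-cluster similarity.
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

X, _ = make_blobs(n_samples=300, centers=3, random_state=0)
km = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)

print("cluster labels:", km.labels_[:10])
print("within-cluster sum of squares:", km.inertia_)
```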
Forecasting is an art that combines scientific methods and past experience with the aim of extracting the maximum possible information needed for an extrapolation. Forecasting is a very interesting research topic and has been attracting many researchers for the last few decades. Before forecasting, weather observations are collected. Weather is a continuous, data-intensive, multidimensional, dynamic, and chaotic process, and these properties make weather forecasting a big challenge. This manuscript discusses some weather forecasting and temperature forecasting methods. For weather forecasting, methods such as case-based reasoning (CBR) and fuzzy set theory (FST), which deal with forecasting problems, are studied. For temperature forecasting, the Mamdani recurrent neuro-fuzzy system (MRNFS), which is trained with the help of two robust population-based algorithms, has been studied.
Advances in Intelligent Systems and Computing, 2018
The increased consumption of alcohol among secondary school students has been a matter of concern in recent years. Alcoholism not only affects an individual's decision-making ability but also has a negative effect on academic performance. Early prediction of whether a student consumes alcohol can help prevent such risks and failures. This paper evaluates classification algorithms for predicting certain risks to secondary school students due to alcohol consumption. The classification algorithms considered here are three individual classifiers (Naive Bayes, Random Tree, and Simple Logistic) and three ensemble classifiers (Random Forest, Bagging, and AdaBoost). The dataset is taken from the UCI repository. The performance of these algorithms is evaluated using standard evaluation metrics such as accuracy, precision, recall, and F-measure. The results suggest that Simple Logistic and Random Forest performed better than the other classifiers.
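An illustrative comparison loop in the spirit of this evaluation. The paper used Weka, so the scikit-learn models below are rough analogues (e.g., LogisticRegression standing in for Simple Logistic), and the synthetic data is a placeholder for the UCI student dataset, not the paper's setup.

```python
# Compare individual and ensemble classifiers with cross-validation.
from sklearn.datasets import make_classification
from sklearn.ensemble import (AdaBoostClassifier, BaggingClassifier,
                              RandomForestClassifier)
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB

X, y = make_classification(n_samples=600, n_features=30, random_state=0)

models = {
    "Naive Bayes": GaussianNB(),
    "Simple Logistic (approx.)": LogisticRegression(max_iter=1000),
    "Random Forest": RandomForestClassifier(random_state=0),
    "Bagging": BaggingClassifier(random_state=0),
    "AdaBoost": AdaBoostClassifier(random_state=0),
}
for name, model in models.items():
    scores = cross_val_score(model, X, y, cv=10, scoring="f1_macro")
    print(f"{name}: mean F1 = {scores.mean():.3f}")
```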
Procedia Computer Science, 2018
The main objective of this research is to classify diabetic and normal patients based on test results or test reports using classification algorithms. In data mining, different techniques can be used for solving problems; classification, prediction, and clustering, for example, are data mining techniques. Classification is the process of assigning data to a predefined set of classes according to its features, while prediction is used to predict the class label for new data. The Weka tool is used to develop a classifier for distinguishing diabetic patients from normal patients, with a diabetes dataset used for the prediction process. The dataset is divided into two subsets: a training set, which contains attributes with class labels, and a test set, which contains attributes without class labels; the test set's labels are predicted by the classifier. The research considers three algorithms: Naive Bayes, Multilayer Perceptron, and IBk. Each algorithm provides high accuracy for the prediction process; the accuracy of the Naive Bayes algorithm is 100%.
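A rough Python analogue of the Weka workflow described above: train Naive Bayes, a multilayer perceptron, and k-NN (the idea behind Weka's IBk) on a labelled training split and predict labels for the test split. The synthetic data is a stand-in for the diabetes dataset, not the paper's actual setup.

```python
# Train/test split, three classifiers, accuracy on the held-out set.
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier

X, y = make_classification(n_samples=768, n_features=8, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

for name, clf in [("Naive Bayes", GaussianNB()),
                  ("Multilayer Perceptron",
                   MLPClassifier(max_iter=1000, random_state=0)),
                  ("IBk / k-NN", KNeighborsClassifier())]:
    clf.fit(X_tr, y_tr)
    print(name, accuracy_score(y_te, clf.predict(X_te)))
```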