Dr. Deepti Sisodia | Alliance University

Papers by Dr. Deepti Sisodia

Behavior Problems in Children With Epilepsy (Age 6–14 years): A Prospective Observational Study

International Journal of Recent Surgical and Medical Sciences

Background: Epilepsy's psychological effects are variable: some patients experience a few mental health issues, while others experience serious problems such as anxiety, depression, attention deficit hyperkinetic disorder (ADHD), and mood disorders. Hence, there is a need to screen for these problems at an early age for timely intervention. Our study was therefore conducted to determine the prevalence of emotional and behavioral problems in children with epilepsy. Methods: This was a prospective observational study of 111 children, 6 to 14 years of age. The overall prevalence of emotional and behavioral problems in childhood was determined by calculating the percentage of children with a Child Behavior Checklist score indicative of specific emotional and behavioral problems. The prevalence of specific morbidities was also calculated and reported separately for each condition. Results were presented in the form of tables, charts, graphs, and narratives. Results: The overall prevalence of emotional...

Ontological Representation of Medical Decision Support System using Machine Learning Classifiers

Gradient Boosting-Based Predictive Click Fraud Detection Using Manifold Criterion Variable Elimination

IFIP Advances in Information and Communication Technology, 2023

Evaluating Feature Importance to Investigate Publishers Conduct for Detecting Click Fraud

Lecture Notes in Electrical Engineering

A reliable click-fraud detection system for the investigation of fraudulent publishers in online advertising

Applied Intelligence in Human-Computer Interaction

A transfer learning framework towards identifying behavioral changes of fraudulent publishers in pay-per-click model of online advertising for click fraud detection

Expert Systems with Applications

Stacked Generalization Architecture for Predicting Publisher Behaviour from Highly Imbalanced User-Click Data Set for Click Fraud Detection

Bone Cancer Identification and Separation Using K-Means and KNN Classifiers

2023 2nd International Conference for Innovation in Technology (INOCON)

Data Sampling Methods for Analyzing Publishers Conduct from Highly Imbalanced Dataset in Web Advertising

Springer International Publishing eBooks, Nov 29, 2022

A hybrid data‐level sampling approach in learning from skewed user‐click data for click fraud detection in online advertising

Feature space transformation of user-clicks and deep transfer learning framework for fraudulent publisher detection in online advertising

Quad division prototype selection-based k-nearest neighbor classifier for click fraud detection from highly skewed user click dataset

Engineering Science and Technology, an International Journal

In online advertising, models that classify fraudulent publishers from user-click datasets perform poorly because the class distribution of publishers is highly skewed. Nearest-neighbor classification techniques are popularly used to reduce the impact of class skewness on performance. These techniques use Prototype Selection (PS) methods to select promising samples before classification, which reduces the size of the training data. Although nearest-neighbor techniques are simple to use and reduce the negative impact of the loss of potential information, they suffer from higher storage requirements and slower classification speed when applied to datasets with skewed class distributions. In this paper, we propose a Quad Division Prototype Selection-based k-Nearest Neighbor classifier (QDPSKNN), which introduces a quad division method for handling uneven class distribution. The quad division splits the data into four quartiles (groups) and performs controlled under-sampling to balance the class distribution. It reduces the size of the training dataset by selecting only the relevant prototypes in the form of nearest neighbors. The performance of QDPSKNN is evaluated on the Fraud Detection in Mobile Advertising (FDMA) user-click dataset and fifteen other benchmark imbalanced datasets to test its generalizing behaviour. The performance is also compared with one baseline model (k-NN) and four other prototype selection methods: NearMiss-1, NearMiss-2, NearMiss-3, and Condensed Nearest-Neighbor. The results show improved classification performance with QDPSKNN in terms of precision, recall, f-measure, g-mean, reduction rate and execution time, compared to existing prototype selection methods, in the classification of fraudulent publishers as well as on other benchmark imbalanced datasets. A Wilcoxon signed-rank test is conducted to demonstrate significant differences between QDPSKNN and state-of-the-art methods.
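
The abstract above does not spell out how the quad division forms its four groups, so the following is only a rough illustrative sketch and not the QDPSKNN algorithm itself. It assumes, purely for illustration, that majority-class samples are ranked by Euclidean distance to the minority-class centroid, split into four equal-sized groups, and under-sampled evenly from each group before a plain k-NN vote. The function names `quartile_undersample` and `knn_predict`, the grouping key, and the toy data are all hypothetical.

```python
import math
import random
import statistics


def quartile_undersample(majority, minority, seed=0):
    """Keep roughly len(minority) majority points, drawn evenly from four groups.

    Assumption of this sketch (not taken from the paper): the groups are the
    four quartiles of the majority points ranked by distance to the
    minority-class centroid.
    """
    rng = random.Random(seed)
    dims = len(minority[0])
    centroid = [statistics.fmean(p[d] for p in minority) for d in range(dims)]

    # Rank majority points by distance to the minority centroid.
    ranked = sorted(majority, key=lambda p: math.dist(p, centroid))
    n = len(ranked)
    bounds = [0, n // 4, n // 2, (3 * n) // 4, n]
    groups = [ranked[bounds[i]:bounds[i + 1]] for i in range(4)]

    # Controlled under-sampling: the same quota from every group.
    per_group = max(1, len(minority) // 4)
    kept = []
    for group in groups:
        kept.extend(rng.sample(group, min(per_group, len(group))))
    return kept


def knn_predict(train, query, k=3):
    """Majority vote among the k training points nearest to `query`.

    `train` is a list of (point, label) pairs.
    """
    nearest = sorted(train, key=lambda pl: math.dist(pl[0], query))[:k]
    labels = [label for _, label in nearest]
    return max(set(labels), key=labels.count)


# Toy, synthetic data: 40 "genuine" points and 8 "fraud" points.
data_rng = random.Random(1)
majority = [(data_rng.uniform(0, 10), data_rng.uniform(0, 10)) for _ in range(40)]
minority = [(data_rng.uniform(8, 10), data_rng.uniform(8, 10)) for _ in range(8)]

kept = quartile_undersample(majority, minority)
balanced = [(p, "genuine") for p in kept] + [(p, "fraud") for p in minority]
print(len(kept), len(minority))
print(knn_predict(balanced, minority[0], k=1))
```

Drawing the same quota from each group keeps majority points from every distance band instead of only the nearest ones, which is one plausible reading of "controlled" under-sampling; the paper's actual selection rule may differ.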

Importance of internet marketing

ABSTRACT: The Internet is a much more accountable and measurable medium than traditional media. The unique property of the Internet as a medium with bidirectional information flows has enabled performance-based pricing models that tie online advertising payments directly to campaign measurement data such as click-throughs and purchases. These pricing models have become increasingly popular in the online advertising industry, and they are delivered through websites. A website is the backbone of any Internet marketing plan. Whether a company utilizes search engine marketing, e-mail marketing, affiliate marketing or contextual advertising, a website is the element that the campaigns are built upon. Companies have been redefining their marketing and branding strategies due to the unique characteristics of the Internet and its capacity to change old rules.

Feature distillation and accumulated selection for automated fraudulent publisher classification from user click data of online advertising

Data Technologies and Applications, 2022

Purpose: The problem of choosing the most useful features from hundreds of features in time-series user click data arises in online advertising when classifying fraudulent publishers. Selecting feature subsets is a key issue in such classification tasks. In practice, filter approaches are common; however, they neglect the correlations among features. Conversely, wrapper approaches could not be applied due to their complexity. Moreover, existing feature selection methods could not handle such data, which is one of the major causes of instability of feature selection. Design/methodology/approach: To overcome such issues, a majority voting-based hybrid feature selection method, namely feature distillation and accumulated selection (FDAS), is proposed to investigate the optimal subset of relevant features for analyzing the publisher's fraudulent conduct. FDAS works in two phases: (1) feature distillation, where significant features from standard fi...

Data Sampling Strategies for Click Fraud Detection Using Imbalanced User Click Data of Online Advertising: An Empirical Review

IETE Technical Review, 2021

In the pay-per-click online advertisement model, fraudulent publishers are rarer than genuine publishers. This high class imbalance between fraudulent and genuine publishers poses a...

Gradient boosting learning for fraudulent publisher detection in online advertising

Data Technologies and Applications, 2020

Purpose: Analysis of the publisher's behavior plays a vital role in identifying fraudulent publishers in the pay-per-click model of online advertising. However, the vast amount of raw user click data with missing values poses a challenge in analyzing the conduct of publishers. The presence of high cardinality in categorical attributes with multiple possible values has further aggravated the issue. Design/methodology/approach: In this paper, gradient tree boosting (GTB) learning is used to address the challenges encountered in learning the publishers' behavior from raw user click data and effectively classifying fraudulent publishers. Findings: The results demonstrate that the GTB effectively classified fraudulent publishers and exhibited significantly improved performance as compared to other learning methods in terms of average precision (60.5%), recall (57.8%) and f-measure (59.1%). Originality/value: The experiments were conducted using publicly available multiclass raw user click d...
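
As a quick consistency check on the figures quoted above, the f-measure is conventionally the harmonic mean of precision and recall. The sketch below assumes that standard definition; the abstract reports averaged values, and the paper may instead average per-class f-measures, so this is illustrative arithmetic, not the authors' evaluation code. The function name `f_measure` is hypothetical.

```python
def f_measure(precision, recall):
    """Harmonic mean of precision and recall (the standard F1 definition)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values as quoted in the abstract, in percent.
f = f_measure(60.5, 57.8)
print(f"{f:.1f}")  # prints 59.1
```

Under this definition, the reported precision and recall reproduce the reported f-measure to one decimal place.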

Clustering Techniques: A Brief Survey of Different Clustering Algorithms

Partitioning a set of objects into homogeneous clusters is a fundamental operation in data mining. The operation is needed in a number of data mining tasks. Clustering, or data grouping, is a key technique of data mining. It is an unsupervised learning task in which one seeks to identify a finite set of categories, termed clusters, to describe the data. The grouping of data into clusters is based on the principle of maximizing the intra-class similarity and minimizing the inter-class similarity. The goal of clustering is to determine the intrinsic grouping in a set of unlabeled data. But how does one decide what constitutes a good clustering? This paper studies various clustering algorithms of data mining and focuses on the basics, requirements, classification, problems and application areas of clustering algorithms.

Comprehensive Analysis on Weather Prediction Methods

Forecasting is an art that combines scientific methods and past experience with the aim of extracting the maximum possible information needed for an extrapolation. Forecasting is a very interesting research topic and has attracted many researchers over the last few decades. Before forecasting, the weather observations are collected. Weather is a continuous, data-intensive, multidimensional, dynamic and chaotic process, and these properties make weather forecasting a big challenge. This manuscript discusses some weather forecasting and temperature forecasting methods. For weather forecasting, methods such as CBR (Case Based Reasoning) and FST (Fuzzy Set Theory), which deal with forecasting problems, are studied. For temperature forecasting, MRNFS (Mamdani Recurrent Neuro-Fuzzy System), which is trained with the help of two robust population-based algorithms, has been studied.

A Comparative Performance of Classification Algorithms in Predicting Alcohol Consumption Among Secondary School Students

Advances in Intelligent Systems and Computing, 2018

The increased consumption of alcohol among secondary school students has been a matter of concern these days. Alcoholism not only affects an individual's decision-making ability but also has a negative effect on academic performance. The early prediction of a student consuming alcohol can be helpful in protecting them from such risks and failures. This paper evaluates classification algorithms for predicting certain risks to secondary school students due to alcohol consumption. The classification algorithms considered here are three individual classifiers (Naive Bayes Classifier, Random Tree, and Simple Logistic) and three ensemble classifiers (Random Forest, Bagging, and AdaBoost). The dataset is taken from the UCI repository. The performance of these algorithms is evaluated using standard evaluation metrics such as Accuracy, Precision, Recall and F-Measure. The results suggested that Simple Logistic and Random Forest performed better than the other classifiers.

Prediction of Diabetes using Classification Algorithms

Procedia Computer Science, 2018

The main objective of the research is to distinguish diabetic patients from normal patients based on test results or test reports using classification algorithms. In data mining, different techniques can be used for solving problems; classification, prediction, and clustering are examples. Classification is the process of classifying data according to its features into a predefined set of classes. Prediction is used for predicting the class label of new data. The Weka tool is used to develop a classifier for predicting diabetic and normal patients. The Diabetes dataset is used for the prediction process. The dataset is divided into two subsets: a training set and a test set. The training set contains attributes with class labels. The test set contains attributes without class labels; these labels are predicted by the classifier or model. The research considers three algorithms: Naive Bayes, Multilayer Perceptron and IBK. Each algorithm provides its best accuracy for the prediction process. The accuracy of the Naive Bayes algorithm is 100%.
