Aditya Dubey CSEIT | Itm University

Papers by Aditya Dubey CSEIT

COVID-19 Detection Using Raw Chest X-Ray Images

2022 IEEE World Conference on Applied Intelligence and Computing (AIC)

Usage of Clustering and Weighted Nearest Neighbors for Efficient Missing Data Imputation of Microarray Gene Expression Dataset

Advanced Theory and Simulations

Covid-19 Detection based on Transfer Learning & LSTM Network using X-ray Images

2022 IEEE World Conference on Applied Intelligence and Computing (AIC)

Detection of Liver Cancer using Image Processing Techniques

2019 International Conference on Communication and Signal Processing (ICCSP), 2019

Outlier Detection Techniques: A Comparative Study

Lecture Notes in Electrical Engineering

Efficient technique of microarray missing data imputation using clustering and weighted nearest neighbour

Scientific Reports, 2021

For most bioinformatics statistical methods, particularly for gene expression data classification, prognosis, and prediction, a complete dataset is required. A gene sample value can be missing due to hardware failure, software failure, or manual mistakes. Missing data in gene expression research dramatically affects the analysis of the collected data. Consequently, this has become a critical problem that requires an efficient imputation algorithm. This paper proposes a technique that considers the local similarity structure and predicts the missing data using clustering and top K nearest neighbor approaches. A similarity-based spectral clustering approach is used in combination with K-means. The spectral clustering parameters, cluster size, and weighting factors are optimized, and the missing values are then predicted. For imputing each cluster’s missing value, the top K nearest neighbor approach utilizes the concept o...

Classification of Actual and Fake News in Pandemic

2021 Fifth International Conference on I-SMAC (IoT in Social, Mobile, Analytics and Cloud) (I-SMAC), 2021

Supervised Multimodal Emotion Analysis of Violence on Doctors Tweets

2021 Fifth International Conference on I-SMAC (IoT in Social, Mobile, Analytics and Cloud) (I-SMAC), 2021

With the advent of the novel coronavirus pandemic, doctors, health workers, and the government are doing their best to deal with the situation. It is understandable that people react vociferously when they lose someone close, but accusing and harming doctors and health workers is morally wrong, since the people saving so many lives are putting their own lives in danger. With the boom in technology and the way social media has brought the world closer together, many users are expressing their views in support of or in opposition to the saviors of this pandemic, the doctors and health care workers. These views are enough to create a good or bad impression of a doctor in people’s minds and can even provoke hostile behavior toward that doctor. The sole purpose of this paper is to analyze a person’s stance on the ongoing violence against health workers through a multimodal emotion analysis, using a Multimodal Transformer model that combines visual and textual data. Apart from this main aim, the paper also examines whether, on social media, more information is carried by text or by posted images. The paper also explains the use of an appropriate loss function for imbalanced data.

Improve the Performance of Frequent Itemsets Using Apriori and FP Tree Algorithm

Today’s era is based on IT technologies, so data storage is increasing day by day. As a result, a large amount of data is stored in databases and warehouses. Data mining has therefore become popular for exploring and analyzing databases to find interesting and previously unknown patterns and rules, a task known as association rule mining. Association rule mining is one of the essential descriptive tasks and can find meaningful patterns in large collections of data. Mining frequent itemsets is the basic principle of association rule mining. Many algorithms have been proposed over the years, including Efficient Mining of Frequent Item Sets on Large Uncertain Databases. An efficient approach for the implementation of FP Tree computes the minimum support for mining frequent patterns. Nowadays, various techniques face the problems of data redundancy, candidate generation, memory consumption (FP-tree algorithms), and other frequent-pattern problems. Because of retailer indus...
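As an illustration of the frequent itemset mining and candidate generation mentioned in this abstract, the following is a minimal sketch of the textbook Apriori procedure. It is not the paper's method; the function name `apriori`, the `min_support` parameter (an absolute transaction count), and the toy baskets are assumptions made for the example.

```python
from itertools import combinations

def apriori(transactions, min_support):
    """Return {itemset: count} for itemsets found in at least min_support transactions."""
    transactions = [frozenset(t) for t in transactions]
    items = {item for t in transactions for item in t}
    candidates = {frozenset([item]) for item in items}
    frequent = {}
    size = 1
    while candidates:
        # Count how many transactions contain each candidate itemset.
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        level = {c: n for c, n in counts.items() if n >= min_support}
        frequent.update(level)
        size += 1
        # Join step: merge frequent itemsets into candidates one item larger.
        joined = {a | b for a in level for b in level if len(a | b) == size}
        # Prune step: every (size - 1)-subset of a candidate must itself be frequent.
        candidates = {c for c in joined
                      if all(frozenset(s) in level for s in combinations(c, size - 1))}
    return frequent

# Hypothetical toy data, not taken from the paper.
baskets = [{"bread", "milk"}, {"bread", "butter"},
           {"bread", "milk", "butter"}, {"milk"}]
frequent = apriori(baskets, min_support=2)
for itemset, count in sorted(frequent.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), count)
```

The prune step is what keeps candidate generation manageable: an itemset is only counted if all of its smaller subsets were already frequent.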

Data Mining based Dimensionality Reduction Techniques

2022 International Conference for Advancement in Technology (ICONAT), 2022

Stock Closing Price Forecasting using Machine Learning Models

2022 International Conference for Advancement in Technology (ICONAT), 2022

Clustering-Based Hybrid Approach for Multivariate Missing Data Imputation

In the era of big data, a significant amount of data is produced in many application areas. However, due to various reasons including sensor failures, communication failures, environmental disruptions, and human errors, missing values are found frequently. These missing data pose a challenge for other data mining approaches and must be handled at the preprocessing stage of data mining. Several approaches for handling missing data have been proposed in the past. These approaches consider the whole dataset when making a prediction, which makes the imputation cumbersome. This paper proposes a procedure that makes use of the local similarity structure of the dataset for imputation. The K-means clustering technique, together with weighted KNN, imputes the missing values efficiently. The results are compared against imputations by mean substitution and Fuzzy C Means (FCM). The proposed imputation technique sho...
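The idea of combining K-means with weighted KNN can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the paper's algorithm: clusters are built from complete rows only, the first `k` rows seed the centroids, neighbours are weighted by inverse distance, and the names `kmeans`, `impute`, and `n_neighbors` are hypothetical.

```python
import math

def dist(a, b, idx):
    """Euclidean distance between a and b over the feature indices in idx."""
    return math.sqrt(sum((a[i] - b[i]) ** 2 for i in idx))

def kmeans(rows, k, iters=20):
    """Plain K-means on complete rows; the first k rows seed the centroids."""
    dims = range(len(rows[0]))
    centroids = [list(r) for r in rows[:k]]
    groups = [[] for _ in range(k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for r in rows:
            nearest = min(range(k), key=lambda c: dist(r, centroids[c], dims))
            groups[nearest].append(r)
        for c, members in enumerate(groups):
            if members:  # keep the old centroid if a cluster empties out
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, groups

def impute(row, centroids, groups, n_neighbors=2):
    """Fill None entries of row from its nearest neighbours inside one cluster."""
    observed = [i for i, v in enumerate(row) if v is not None]
    # Pick the cluster using only the features that are present in the row.
    cluster = min(range(len(centroids)),
                  key=lambda c: dist(row, centroids[c], observed))
    neighbours = sorted(groups[cluster],
                        key=lambda r: dist(row, r, observed))[:n_neighbors]
    # Inverse-distance weights: closer neighbours contribute more.
    weights = [1.0 / (dist(row, r, observed) + 1e-9) for r in neighbours]
    filled = list(row)
    for i, v in enumerate(row):
        if v is None:
            filled[i] = sum(w * r[i] for w, r in zip(weights, neighbours)) / sum(weights)
    return filled

# Hypothetical toy data: two well-separated groups of complete rows.
complete = [[1.0, 1.0, 1.0], [8.0, 8.0, 8.0], [1.2, 0.9, 1.1], [8.2, 7.9, 8.1]]
centroids, groups = kmeans(complete, k=2)
filled = impute([1.1, 1.0, None], centroids, groups)
print(filled)
```

Restricting the neighbour search to one cluster is what makes the approach local: the incomplete row is compared only with rows that already resemble it, rather than with the whole dataset.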

Analyzing the Performance of Anomaly Detection Algorithms

International Journal of Advanced Computer Science and Applications, 2021

An outlier is a data observation that deviates considerably from the rest of the dataset. Outliers present in a dataset may compromise its integrity. Implementing machine learning techniques in various real-world applications and applying those techniques to healthcare-related datasets will completely change the present state of that field. These applications can highlight physiological data with anomalous behavior, which can ultimately lead to a fast and necessary response and help gather more critical knowledge about the particular area. Although a broad body of work is available on the performance of anomaly detection techniques applied to popular public datasets, only a minimal amount of analytical work covers supervised and unsupervised methods on physiological datasets. The breast cancer dataset is both a universal and numeric dataset. This paper utilized and analyzed four machine learning techniques and their capacity to distinguish anomalies in the breast cancer dataset.

A Survey of Machine Learning-Based Approaches for Missing Value Imputation

2021 Third International Conference on Inventive Research in Computing Applications (ICIRCA), 2021

Missing values create issues during the analysis of a dataset. Learning algorithms applied to an asymmetrical dataset can produce an overrated classification accuracy due to a bias towards the majority class at the expense of the minority class. Missing values in the dataset have a negative impact on accuracy and could therefore lead to a different output. Some algorithms cannot handle missing values properly, while some techniques estimate the missing values efficiently. It is very important to handle missing data because the performance of many machine learning algorithms degrades when values are missing. The original datasets may have missing data for many reasons, for example because data were not kept in a file or had been corrupted. In this paper, some techniques employed to impute missing data are discussed and compared in terms of their merits and drawbacks.

Optimized Weighted Samples Based Semi-supervised Learning

2021 Second International Conference on Electronics and Sustainable Communication Systems (ICESC), 2021

Through semi-supervised learning with graphs, the machine learning community has achieved many advantages in extracting information from a large volume of data under inadequate initial label information. Recent research has shown that weighting the labelled samples can improve accuracy. Instead of treating all labelled samples equally, sample weighting assigns higher weights to samples lying at the border of multiple classes than to labelled samples lying far from the boundary. This article proposes a faster way to calculate the sample weights by reducing the multiple clustering methods to a single clustering. The new method of sample weighting is verified using a 2D feature set so that the sample weighting can be easily visualized. The proposed method does not reduce the time complexity, but it can reduce the number of steps required for weighting the samples. The obtained results show that this method can improve the speed with acceptable accuracy.

Data Mining based Handling Missing Data

2019 Third International conference on I-SMAC (IoT in Social, Mobile, Analytics and Cloud) (I-SMAC), 2019

Time Series Missing Value Prediction: Algorithms and Applications

Communications in Computer and Information Science, 2020

Data Mining Based Imputation Techniques to Handle Missing Values in Gene Expressed Dataset

International Journal of Engineering Trends and Technology, 2021

Performance Measurement of K-means & Spectral Clustering Graph clustering Algorithms

International Journal of Advanced Computer Science, Jan 29, 2015

Clustering algorithms are one way of extracting valuable information from a large database by partitioning it. The main goal of all these clustering algorithms is to find clusters by maximizing the similarity within clusters and reducing the similarity between different clusters. Besides this main goal, these algorithms work on different problem domains. In this paper, two algorithms, K-means and spectral clustering, are described. Both algorithms are tested and evaluated on different application-driven datasets. The silhouette index is used to calculate the efficiency of the clustering algorithms. The performance and accuracy of both clustering algorithms are presented and compared using the validity index.
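For readers unfamiliar with the silhouette index named in this abstract, here is a minimal sketch of the standard definition: for each point, `a` is the mean distance to the other members of its own cluster, `b` is the smallest mean distance to any other cluster, and the score is `(b - a) / max(a, b)`. The function name, the toy points, and the two labelings are assumptions for illustration, and the sketch assumes at least two clusters.

```python
import math

def silhouette(points, labels):
    """Mean silhouette coefficient; labels[i] is the cluster of points[i]."""
    scores = []
    for i, p in enumerate(points):
        # Mean distance from p to the members of each cluster, excluding p itself.
        mean_dist = {}
        for label in set(labels):
            others = [q for j, q in enumerate(points) if labels[j] == label and j != i]
            if others:
                mean_dist[label] = sum(math.dist(p, q) for q in others) / len(others)
        if labels[i] not in mean_dist:  # singleton cluster scores 0 by convention
            scores.append(0.0)
            continue
        a = mean_dist[labels[i]]
        b = min(d for label, d in mean_dist.items() if label != labels[i])
        denom = max(a, b)
        scores.append((b - a) / denom if denom else 0.0)
    return sum(scores) / len(scores)

# Hypothetical toy data: two tight pairs of points far apart from each other.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
good = silhouette(points, [0, 0, 1, 1])  # labels follow the visible grouping
bad = silhouette(points, [0, 1, 0, 1])   # labels cut across the grouping
print(round(good, 3), round(bad, 3))
```

Values near 1 indicate compact, well-separated clusters, while negative values suggest points assigned to the wrong cluster, which is why the index can be used to compare the output of two clustering algorithms on the same data.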

A Survey on IOT enabled cloud platforms

2020 IEEE 9th International Conference on Communication Systems and Network Technologies (CSNT), 2020

IOT is a mainstream technology in today’s world and has recently transformed, with significant potential for modernizing the lifestyle of modern societies. Since the term IOT was coined in 1999 by Kevin Ashton, there has been a tremendous increase in the number of devices connected to the internet over the last few decades. This paper provides a literature survey of various cloud services used in integration with IOT and a study of fog computing as used in various application areas. Due to seamless data exchange between IOT devices and sensors, platforms are needed for data management, storage, and analysis. Thus cloud computing and fog computing are often used interchangeably to provide high storage capacity and processing capability. Different types of IOT-based cloud platforms are discussed in this paper according to their applicability, along with their pros and cons.
