Jhimli Adhikari - Academia.edu

Papers by Jhimli Adhikari

Synthesizing Global Exceptional Patterns in Different Data Sources

Intelligent Systems Reference Library, Dec 7, 2013

Many large companies transact from multiple branches. This generates multiple databases, since local transactions are stored locally. The number of multi-branch companies, as well as the number of branches of a multi-branch company, is increasing over time. Thus, it is important to study data mining on multiple databases. Global exceptional patterns describe the interesting individuality of a few branches. Therefore, it is interesting to identify such patterns. In this paper, we propose type I and type II global exceptional frequent itemsets in multiple databases by extending the notion of global exceptional frequent itemset. Also, we propose the notion of exceptional sources for a type II global exceptional frequent itemset. We propose type I and type II global exceptional association rules in multiple databases by extending the notion of global exceptional association rule. We propose an algorithm for synthesizing type II global exceptional frequent itemsets. Experimental results are presented on both real and synthetic databases. We compare the proposed algorithm with the existing algorithm theoretically as well as experimentally. The experimental results show that the proposed algorithm is effective and promising.

Mining Icebergs in Different Time-Stamped Data Sources

Springer eBooks, Dec 7, 2013

Measuring Influence of an Item in Time-Stamped Databases

Springer eBooks, Dec 7, 2013

Occupancy Based Pattern Mining: Current Status and Future Directions

International Journal of Next-Generation Computing, Jun 7, 2020

The main purpose of data mining and analytics is to find novel, potentially useful patterns that can be utilized in real-world applications to derive beneficial knowledge. In recent years, a new measure of pattern interestingness called the occupancy of a pattern was introduced to ensure that each pattern found represents a large part of the transactions in which it appears. The main objective of this measure is to enhance the quality of a pattern. This article surveys recent studies on pattern mining and its applications based on occupancy. The goal of the paper is to provide both an introduction to occupancy-based pattern mining (OPM) and a survey of recent advances and research opportunities. Moreover, the main approaches and strategies for solving occupancy-based pattern mining problems are also presented. The paper also presents challenges and research opportunities in using the occupancy measure in other popular pattern mining problems.

Theophano Mitsa - Temporal Data Mining 1st Edition (2010) Chapman & Hall, CRC 373 ISBN: 9781420089769

Clustering Items in Different Data Sources Induced by Stability

Abstract: Many multi-branch companies transact from different branches. Each branch of such a company maintains a separate database over time. The variation of sales of an item over time is an important issue. Thus, we introduce the notion of stability of an item. Stable items are useful in making many strategic decisions for a company. Based on the degree of stability of an item, we design an algorithm for clustering items in different data sources. We have proposed the notion of the best cluster by considering the average degree of variation of a class. Also, we have designed an alternative algorithm to find the best cluster among items in different data sources. Experimental results are provided on three transactional databases.

Occupancy Based Pattern Mining: Current Status and Future Directions

Int. J. Next Gener. Comput., 2020

The main purpose of data mining and analytics is to find novel, potentially useful patterns that can be utilized in real-world applications to derive beneficial knowledge. In recent years, a new measure of pattern interestingness called the occupancy of a pattern was introduced to ensure that each pattern found represents a large part of the transactions in which it appears. The main objective of this measure is to enhance the quality of a pattern. This article surveys recent studies on pattern mining and its applications based on occupancy. The goal of the paper is to provide both an introduction to occupancy-based pattern mining (OPM) and a survey of recent advances and research opportunities. Moreover, the main approaches and strategies for solving occupancy-based pattern mining problems are also presented. The paper also presents challenges and research opportunities in using the occupancy measure in other popular pattern mining problems.

Machine Learning Classifier Model for Prediction of COVID-19

The COVID-19 pandemic has become a major threat to the world. In this study, a model is designed which can predict the likelihood of COVID-19 in patients with maximum accuracy. Three machine learning classification algorithms, namely Decision Tree, Naive Bayes and Logistic Regression, are used in this experiment to detect COVID-19 at an early stage. The models are trained with 75% of the samples and tested with the remaining 25%. Since the dataset is imbalanced, the performance of all three algorithms is evaluated on various measures such as F-Measure, Accuracy and Matthews Correlation Coefficient. Accuracy is measured over correctly and incorrectly classified instances. All the analyses were performed with Python, version 3.8.2. Receiver Operating Characteristic (ROC) curves are used to verify the results in a proper and systematic manner.
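
The three measures named in this abstract can all be computed from a binary confusion matrix. The following is a minimal sketch in plain Python, not the authors' code: the function name `classification_metrics` and the example counts are hypothetical and chosen only to illustrate why accuracy alone can mislead on an imbalanced dataset.

```python
from math import sqrt


def classification_metrics(tp, fp, fn, tn):
    """Accuracy, F-measure (F1) and Matthews Correlation Coefficient
    from binary confusion-matrix counts (hypothetical helper)."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)

    # MCC is conventionally reported as 0 when any marginal sum is zero.
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0

    return {"accuracy": accuracy, "f1": f1, "mcc": mcc}


# Illustrative counts only (not taken from the paper):
# 10 positive and 90 negative test samples.
weak_model = classification_metrics(tp=5, fp=5, fn=5, tn=85)
always_negative = classification_metrics(tp=0, fp=0, fn=10, tn=90)
print(weak_model)
print(always_negative)
```

With these made-up counts, both classifiers reach the same accuracy, yet the one that never predicts the positive class has an F-measure and MCC of zero, which is the usual motivation for reporting all three on imbalanced data.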

Clustering items in different data sources induced by stability

Many multi-branch companies transact from different branches. Each branch of such a company maintains a separate database over time. The variation of sales of an item over time is an important issue. Thus, we introduce the notion of stability of an item. Stable items are useful in making many strategic decisions for a company. Based on the degree of stability of an item, we design an algorithm for clustering items in different data sources. We have proposed the notion of the best cluster by considering the average degree of variation of a class. Also, we have designed an alternative algorithm to find the best cluster among items in different data sources. Experimental results are provided on three transactional databases.

Book review

Computer Science Review, 2012

Mining icebergs in time-stamped databases

Many organizations possess large databases collected over a long period of time. Analysis of such databases might be strategically important for further growth of the organizations. It might be interesting as well as useful to learn about interesting changes in sales over time. In this paper, we have introduced a new pattern, called a notch, of an item in time-stamped databases. Based on this pattern, we have proposed two special kinds of notch, called generalized notch and iceberg notch, in time-stamped databases. We have also identified an application of the generalized notch. We have designed an algorithm for mining interesting icebergs in time-stamped databases. We have presented experimental results on both synthetic and real-world databases.

Mining and Analysis of Time-stamped Databases

Synthesizing Conditional Patterns in a Database

Though frequent itemsets and association rules express interesting associations among items of frequently occurring itemsets in a database, there may exist other types of interesting associations among the items. A critical analysis of frequent itemsets would provide more insight into a database. In this paper, we introduce the notion of a conditional pattern in a database. Conditional patterns are interesting and useful for solving many problems. We propose an algorithm for mining conditional patterns in a database. Experiments are conducted on three real datasets. The results of the experiments show that conditional patterns store significant nuggets of knowledge about a database.

Mining Periodic Patterns from Non-binary Transactions

Journal of Intelligent Computing

A pattern with a time period is more valuable because it can better describe objective knowledge. Previous studies on periodic patterns in market basket data focus on patterns without considering the purchased quantities of items. But in real-life transactions, an item could be purchased multiple times in a transaction, and different items may have different quantities in the transactions. To solve this problem, we incorporate the concepts of transaction frequency (TF) and database frequency (DF) of an item in a time interval. Our algorithm works in two phases. The first phase mines locally frequent itemsets along with the set of intervals and their database frequency range, and the second phase mines the two types of periodic patterns (cyclic and acyclic) from the list of intervals. Experimental results are provided to validate the study.

Measuring influence of an item in a database over time

Pattern Recognition Letters, Feb 1, 2010

Influence of items on some other items might not be the same as the association between these sets of items. Many tasks of data analysis are based on expressing influence of items on other items. In this paper, we introduce the notion of an overall influence of a set of items on another set of items. We also propose an extension to the notion of overall association between two items in a database. Using the notion of overall influence, we have designed two algorithms for influence analysis involving specific items in a database. As the number of databases increases on a yearly basis, we have adopted an incremental approach in these algorithms. Experimental results are reported for both synthetic and real-world databases.

Clustering Items in Different Data Sources Induced by Stability

International Arab Journal of Information Technology

Many multi-branch companies transact from different branches. Each branch of such a company maintains a separate database over time. The variation of sales of an item over time is an important issue. Thus, we introduce the notion of stability of an item. Stable items are useful in making many strategic decisions for a company. Based on the degree of stability of an item, we design an algorithm for clustering items in different data sources. We have proposed the notion of the best cluster by considering the average degree of variation of a class. Also, we have designed an alternative algorithm to find the best cluster among items in different data sources. Experimental results are provided on three transactional databases.

Mining Multiple Large Data Sources

Abstract: Effective data analysis using multiple databases requires highly accurate patterns. Local pattern analysis might extract low-quality patterns from multiple large databases. Thus, it is necessary to improve mining multiple databases using local pattern analysis. We present existing ...

Clustering Items in Time-Stamped Databases Induced by Stability

Intelligent Systems Reference Library, 2013

Synthesizing Different Extreme Association Rules from Multiple Databases

Intelligent Systems Reference Library, 2013

Synthesizing Global Patterns in Multiple Large Data Sources

Intelligent Systems Reference Library, 2013
