Jhimli Adhikari - Academia.edu
Papers by Jhimli Adhikari
Intelligent systems reference library, Dec 7, 2013
Many large companies transact from multiple branches. This generates multiple databases, since local transactions are stored locally. The number of multi-branch companies, as well as the number of branches of a multi-branch company, is increasing over time. Thus, it is important to study data mining on multiple databases. Global exceptional patterns describe the interesting individuality of a few branches. Therefore, it is interesting to identify such patterns. In this paper, we propose type I and type II global exceptional frequent itemsets in multiple databases by extending the notion of global exceptional frequent itemset. Also, we propose the notion of exceptional sources for a type II global exceptional frequent itemset. We propose type I and type II global exceptional association rules in multiple databases by extending the notion of global exceptional association rule. We propose an algorithm for synthesizing type II global exceptional frequent itemsets. Experimental results are presented on both real and synthetic databases. We compare the proposed algorithm with the existing algorithm theoretically as well as experimentally. The experimental results show that the proposed algorithm is effective and promising.
Springer eBooks, Dec 7, 2013
Springer eBooks, Dec 7, 2013
International journal of next-generation computing, Jun 7, 2020
The main purpose of data mining and analytics is to find novel, potentially useful patterns that can be utilized in real-world applications to derive beneficial knowledge. In recent years, a new measure of pattern interestingness called the occupancy of a pattern was introduced to ensure that each pattern found represents a large part of the transactions where it appears. The main objective of this measure is to enhance the quality of a pattern. This article surveys recent studies on pattern mining and its applications based on occupancy. The goal of the paper is to provide both an introduction to occupancy-based pattern mining (OPM) and a survey of recent advances and research opportunities. Moreover, the main approaches and strategies for solving occupancy-based pattern mining problems are also presented. The paper also presents challenges and research opportunities of using the occupancy measure in other popular pattern mining problems.
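As an illustration of the occupancy idea described in this abstract, the following is a minimal sketch only. It assumes one common formulation, in which the occupancy of a pattern is the average, over the transactions that contain it, of the share of each transaction's items that the pattern covers. The function name `occupancy` and the sample data are hypothetical and are not taken from the paper.

```python
def occupancy(pattern, transactions):
    """Average share of each supporting transaction covered by the pattern.

    Assumed formulation (not taken from the paper): for every transaction t
    that contains the pattern X, compute |X| / |t|, then take the mean.
    """
    pattern = frozenset(pattern)
    # Keep only the transactions that contain every item of the pattern.
    supporting = [set(t) for t in transactions if pattern <= set(t)]
    if not supporting:
        return 0.0
    return sum(len(pattern) / len(t) for t in supporting) / len(supporting)


# Hypothetical transactions: {a, b} fills the first one and half of the second.
sample = [{"a", "b"}, {"a", "b", "c", "d"}, {"c"}]
print(occupancy({"a", "b"}, sample))
```

A pattern with high support can still have low occupancy when it appears only as a small part of large transactions, which is the quality concern the abstract raises.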
Abstract: Many multi-branch companies transact from different branches. Each branch of such a company maintains a separate database over time. The variation of sales of an item over time is an important issue. Thus, we introduce the notion of stability of an item. Stable items are useful in making many strategic decisions for a company. Based on the degree of stability of an item, we design an algorithm for clustering items in different data sources. We have proposed the notion of best cluster by considering the average degree of variation of a class. Also, we have designed an alternative algorithm to find the best cluster among items in different data sources. Experimental results are provided on three transactional databases.
Int. J. Next Gener. Comput., 2020
The main purpose of data mining and analytics is to find novel, potentially useful patterns that can be utilized in real-world applications to derive beneficial knowledge. In recent years, a new measure of pattern interestingness called the occupancy of a pattern was introduced to ensure that each pattern found represents a large part of the transactions where it appears. The main objective of this measure is to enhance the quality of a pattern. This article surveys recent studies on pattern mining and its applications based on occupancy. The goal of the paper is to provide both an introduction to occupancy-based pattern mining (OPM) and a survey of recent advances and research opportunities. Moreover, the main approaches and strategies for solving occupancy-based pattern mining problems are also presented. The paper also presents challenges and research opportunities of using the occupancy measure in other popular pattern mining problems.
The COVID-19 pandemic has become a major threat to the world. In this study, a model is designed to predict the likelihood of COVID-19 in patients with maximum accuracy. To this end, three machine learning classification algorithms, namely Decision Tree, Naive Bayes, and Logistic Regression, are used in this experiment to detect COVID-19 at an early stage. The models are trained with 75% of the samples and tested with the remaining 25%. Since the dataset is imbalanced, the performance of all three algorithms is evaluated on various measures such as F-Measure, Accuracy, and Matthews Correlation Coefficient. Accuracy is measured over correctly and incorrectly classified instances. All the analyses were performed with Python, version 3.8.2. Receiver Operating Characteristic (ROC) curves are used to verify the results in a proper and systematic manner.
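The three evaluation measures named in this abstract can be computed from the counts of a binary confusion matrix. The sketch below uses the standard textbook formulas. The function name `binary_metrics` and the example counts are hypothetical and are not results from the study.

```python
import math


def binary_metrics(tp, fp, fn, tn):
    """Return (accuracy, f_measure, mcc) from binary confusion-matrix counts.

    tp, fp, fn, tn are the numbers of true positives, false positives,
    false negatives, and true negatives.
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    # F-measure (F1): harmonic mean of precision and recall.
    f_denominator = 2 * tp + fp + fn
    f_measure = 2 * tp / f_denominator if f_denominator else 0.0
    # Matthews Correlation Coefficient: uses all four cells, so it stays
    # informative when one class dominates the dataset.
    mcc_denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / mcc_denominator if mcc_denominator else 0.0
    return accuracy, f_measure, mcc


# Hypothetical counts for illustration only.
accuracy, f_measure, mcc = binary_metrics(tp=6, fp=2, fn=1, tn=3)
print(accuracy, f_measure, mcc)
```

On imbalanced data, accuracy alone can look strong for a classifier that mostly predicts the majority class, which is why the abstract reports F-Measure and the Matthews Correlation Coefficient alongside it.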
Many multi-branch companies transact from different branches. Each branch of such a company maintains a separate database over time. The variation of sales of an item over time is an important issue. Thus, we introduce the notion of stability of an item. Stable items are useful in making many strategic decisions for a company. Based on the degree of stability of an item, we design an algorithm for clustering items in different data sources. We have proposed the notion of best cluster by considering the average degree of variation of a class. Also, we have designed an alternative algorithm to find the best cluster among items in different data sources. Experimental results are provided on three transactional databases.
Computer Science Review, 2012
Many organizations possess large databases collected over a long period of time. Analysis of such databases might be strategically important for further growth of the organizations. It might be interesting as well as useful to learn about interesting changes in sales over time. In this paper, we have introduced a new pattern, called notch, of an item in time-stamped databases. Based on this pattern, we have proposed two special kinds of notch, called generalized notch and iceberg notch, in time-stamped databases. Also, we have identified an application of generalized notch. We have designed an algorithm for mining interesting icebergs in time-stamped databases. We have presented experimental results on both synthetic and real-world databases.
Though frequent itemsets and association rules express interesting association among items of frequently occurring itemsets in a database, there may exist other types of interesting associations among the items. A critical analysis of frequent itemsets would provide more insight about a database. In this paper, we introduce the notion of conditional pattern in a database. Conditional patterns are interesting and useful for solving many problems. We propose an algorithm for mining conditional patterns in a database. Experiments are conducted on three real datasets. The results of the experiments show that conditional patterns store significant nuggets of knowledge about a database.
Journal of Intelligent Computing
A pattern with a time period is more valuable because it can better describe objective knowledge. Previous studies on periodic patterns from market basket data focus on patterns without considering the items with their purchased quantities. But in real-life transactions, an item could be purchased multiple times in a transaction, and different items may have different quantities in the transactions. To solve this problem, we incorporate the concept of transaction frequency (TF) and database frequency (DF) of an item in a time interval. Our algorithm works in two phases. The first phase mines locally frequent itemsets along with the set of intervals and their database frequency range, and the second phase mines the two types of periodic patterns (cyclic and acyclic) from the list of intervals. Experimental results are provided to validate the study.
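The distinction between transaction frequency and database frequency can be sketched as follows. This is a rough illustration under stated assumptions, not the paper's algorithm: TF is assumed to count the transactions in the interval that contain the item, and DF is assumed to sum the purchased quantities of the item over those transactions. The function name `tf_df`, the `(timestamp, basket)` layout, and the sample data are all hypothetical.

```python
from collections import defaultdict


def tf_df(transactions, start, end):
    """Compute per-item TF and DF over the closed interval [start, end].

    transactions: iterable of (timestamp, basket) pairs, where basket maps
    an item to the quantity purchased in that transaction (assumed layout).
    TF counts transactions containing the item; DF sums its quantities.
    """
    tf = defaultdict(int)
    df = defaultdict(int)
    for timestamp, basket in transactions:
        if start <= timestamp <= end:
            for item, quantity in basket.items():
                if quantity > 0:
                    tf[item] += 1
                    df[item] += quantity
    return dict(tf), dict(df)


# Hypothetical time-stamped baskets; the third falls outside the interval.
baskets = [
    (1, {"milk": 2, "bread": 1}),
    (2, {"milk": 1}),
    (5, {"milk": 4}),
]
tf, df = tf_df(baskets, start=1, end=2)
print(tf, df)
```

Under these assumptions DF is never smaller than TF for an item, and the gap between them reflects how often the item is bought in quantities greater than one.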
Pattern Recognition Letters, Feb 1, 2010
The influence of items on some other items might not be the same as the association between these sets of items. Many tasks of data analysis are based on expressing the influence of items on other items. In this paper, we introduce the notion of an overall influence of a set of items on another set of items. We also propose an extension to the notion of overall association between two items in a database. Using the notion of overall influence, we have designed two algorithms for influence analysis involving specific items in a database. As the number of databases increases on a yearly basis, we have adopted an incremental approach in these algorithms. Experimental results are reported for both synthetic and real-world databases.
International Arab Journal of Information Technology
Many multi-branch companies transact from different branches. Each branch of such a company maintains a separate database over time. The variation of sales of an item over time is an important issue. Thus, we introduce the notion of stability of an item. Stable items are useful in making many strategic decisions for a company. Based on the degree of stability of an item, we design an algorithm for clustering items in different data sources. We have proposed the notion of best cluster by considering the average degree of variation of a class. Also, we have designed an alternative algorithm to find the best cluster among items in different data sources. Experimental results are provided on three transactional databases.
Abstract: Effective data analysis using multiple databases requires highly accurate patterns. Local pattern analysis might extract low quality patterns from multiple large databases. Thus, it is necessary to improve mining multiple databases using local pattern analysis. We present existing ...
Intelligent Systems Reference Library, 2013
Intelligent Systems Reference Library, 2013
Intelligent Systems Reference Library, 2013