Mete Celik | Erciyes University

Papers by Mete Celik

A hybrid CNN-LSTM model for high resolution melting curve classification

Biomedical Signal Processing and Control

Hybrid models based on genetic algorithm and deep learning algorithms for nutritional Anemia disease classification

Biomedical Signal Processing and Control

Structural profile matrices for predicting structural properties of proteins

Journal of Bioinformatics and Computational Biology

Predicting structural properties of proteins plays a key role in predicting the 3D structure of proteins. In this study, new structural profile matrices (SPMs) are developed for protein secondary structure, solvent accessibility, and torsion angle class predictions, which could be used as input to 3D prediction algorithms. The structural templates employed in computing SPMs are detected by eight alignment methods in the LOMETS server, a gap affine alignment method, ScanProsite, PfamScan, and HHblits. The contribution of each template is weighted by its similarity to the target, which is assessed by several sequence alignment scores. For comparison, the SPMs are also computed using Homolpro, which uses BLAST for target-template alignments and does not assign weights to templates. Incorporating the SPMs into the DSPRED classifier improves prediction accuracy significantly, as demonstrated by cross-validation experiments on two difficult benchmarks. The most accurate predictions are obtained using...

Cloud Computing Based Socially Important Locations Discovery on Social Media Big Datasets

International Journal of Information Technology & Decision Making

Socially important locations are places that are frequently visited by social media users in their social media lifetime. Discovering socially important locations provides valuable information, such as which locations are frequently visited by a social media user, which locations are common for a social media user group, and which locations are socially important for a group of urban area residents. However, discovering socially important locations is challenging due to the huge volume, velocity, and variety of social media datasets, the inefficiency of current interest measures and algorithms on social media big datasets, and the need for massive spatial and temporal calculations in spatial social media analyses. Cloud computing, in contrast, provides infrastructure and platforms to scale compute-intensive jobs. In the literature, a limited number of studies related to socially important locations discovery take into account cloud computing systems to scale increasing dataset size and to ha...

Mining High-Average Utility Itemsets with Positive and Negative External Utilities

Modelling surface water-groundwater interactions at the Palas Basin (Turkey) using FREEWAT

Acque Sotterranee - Italian Journal of Groundwater

Palas Basin is a semi-arid closed basin located in the Central Anatolia region of Turkey. The major economic activity in the basin is agriculture; therefore, both surface water and groundwater are used for irrigation. However, intensive use of water resources threatens the hydrologic sustainability of a lake ecosystem (Tuzla Lake) located in the basin. In this study, we analyzed the relationships between agricultural water uses in the Palas Basin and water flows to Tuzla Lake using a groundwater flow model developed with the FREEWAT platform. The model grid, with 250 m × 250 m resolution, was created based on the entire watershed. Two hydrostratigraphic units were identified. The source terms defined in the model were rainfall recharge, and the sink terms were evapotranspiration and wells. The model was run for one year under steady-state conditions. Three scenarios were simulated to understand the effect of groundwater use on the lake hydrology. The first scenario assumed that there wa...

Discovering socially similar users in social media datasets based on their socially important locations

Information Processing & Management

Assessment of Multi Fragment Melting Analysis System (MFMAS) for the Identification of Food-Borne Yeasts

Current Microbiology, 2018

Multi Fragment Melting Analysis System (MFMAS) is a novel approach that was developed for the species-level identification of microorganisms. It is a software-assisted system that performs concurrent melting analysis of 8 different DNA fragments to obtain a fingerprint of each strain analyzed. Identification is performed by comparing these fingerprints with the fingerprints of known yeast species recorded in a database to obtain the best possible match. In this study, the applicability of the yeast version of the MFMAS (MFMAS-yeast) was evaluated for the identification of food-associated yeast species. For this purpose, a total of 145 yeast strains originating from foods and beverages and 19 standard yeast strains were tested. The DNAs isolated from these yeast strains were analyzed by the MFMAS, and their species were successfully identified with a similarity rate of 95% or higher. It was shown that the strains belonged to 43 different yeast species ...

Discovering socially important locations of social media users

Expert Systems with Applications

Veri Madenciliği Yöntemleri Kullanılarak Meme Kanseri Hücrelerinin Tahmin ve Teşhisi [Prediction and Diagnosis of Breast Cancer Cells Using Data Mining Methods]

Spatial and Spatiotemporal Data Mining: Recent Advances

Explosive growth in geospatial data and the emergence of new spatial technologies emphasize the need for automated discovery of spatial knowledge. Spatial data mining is the process of discovering interesting and previously unknown, but potentially useful, patterns from large spatial databases. The complexity of spatial data and intrinsic spatial relationships limit the usefulness of conventional data mining techniques for extracting spatial patterns. In this chapter, we explore the emerging field of spatial data mining, focusing on four major topics: prediction and classification, outlier detection, co-location mining, and clustering. Spatiotemporal data mining is also briefly discussed.

Yağış ve İklim İndeksleri Arasındaki Sıralı Birliktelik Örüntülerinin Tespit Edilmesi [Detection of Sequential Association Patterns Between Precipitation and Climate Indices]

Discovering Patterns of Insurgency via Spatio-Temporal Data Mining

The need to discover patterns in spatio-temporal (ST) data has driven much recent research in ST co-occurrence patterns. Early work focused on discovering spatial patterns such as co-location without examining the development of patterns over time or the ...

Daily and hourly mood pattern discovery of Turkish Twitter users

Global Journal of Computer Science, 2015

The massive number of data-related applications and the widespread usage of web technologies have started the big data era. Social media data is one of the big data sources. Mining social media data provides useful insights for companies and organizations for developing their services, products, or organizations. This study aims to analyze Turkish Twitter users based on their daily and hourly social media posts. In this way, the daily and hourly mood patterns of Turkish social media users can be revealed as positive or negative. For this purpose, the Support Vector Machines (SVM) classification algorithm and the Term Frequency-Inverse Document Frequency (TF-IDF) feature selection technique were used. To the best of our knowledge, this is the first attempt to analyze all of people's posts on social media and generate results for temporal-based indicators at macro and micro levels.

Spatial AutoRegression (SAR) Model

The spatial autoregression (SAR) model is a generalization of the linear regression model to account for spatial autocorrelation. Although this model allows users to make reliable classification procedures, it is computationally expensive to estimate the corresponding parameters. SAR model parameters have been estimated using maximum likelihood (ML) theory or Bayesian statistics. This book focuses on ML-based SAR model estimation. The author shows that the SAR model can be efficiently implemented without loss of accuracy, so that large geospatial autocorrelated datasets can be analysed in a reasonable amount of time. The book is organized into 8 chapters. Chapter 1 highlights the importance of having models and software for analyzing large geospatial datasets. It contains a brief history of the subject and summarizes the major contributions of the book. Chapter 2 describes briefly but rigorously the theory behind the SAR model. In particular, the results that help to optimize the estimation of SAR model parameters are proven in detail. In Chapter 3, a parallel formulation for a general exact estimation procedure for SAR model parameters is developed. An exact SAR model solution is both memory- and compute-intensive, so approximate solutions are needed that do not sacrifice accuracy and can handle very large datasets. In Chapter 4, two different approximations for solving the SAR model are proposed. Then experimental comparisons of the proposed solutions on real satellite remote sensing imagery are explained. In Chapter 5, parallel approximate SAR models are developed using hybrid programming and sparse matrix algebra in order to reach very large problem sizes. The focus of Chapter 6 is the development of the Gauss-Lanczos approximated SAR model solution. The key idea of this algorithm is to find only some of the eigenvalues of a large matrix, instead of finding all the eigenvalues, by reducing the size of that matrix. The performance of the proposed method is compared with other approximate solution methods, which were studied in the previous chapter. The new algorithm provides better approximation when the data is strongly correlated. In Chapter 7, conclusions and future work are provided. Chapter 8 includes the supplementary materials that are mentioned in the book. This book is a good text for students as well as for image analysis software developers.
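
For orientation, the SAR model and the ML objective that this abstract refers to are usually written as below. The notation is the conventional one from the spatial statistics literature and is an assumption here, not quoted from the book: y is the n-vector of observations, X the covariate matrix, W the spatial neighborhood matrix, ρ the spatial autoregression parameter, β the regression coefficients, and λ_i the eigenvalues of W. The last identity assumes real eigenvalues with ρλ_i < 1.

```latex
\begin{align}
  y &= \rho W y + X\beta + \varepsilon,
      \qquad \varepsilon \sim \mathcal{N}(0, \sigma^2 I), \\
  \ell(\rho, \beta, \sigma^2) &= \ln\lvert I - \rho W\rvert
      - \frac{n}{2}\ln(2\pi\sigma^2)
      - \frac{1}{2\sigma^2}\,(y - \rho W y - X\beta)^{\top}(y - \rho W y - X\beta), \\
  \ln\lvert I - \rho W\rvert &= \sum_{i=1}^{n} \ln(1 - \rho\lambda_i).
\end{align}
```

Setting ρ = 0 recovers ordinary linear regression. The log-determinant term is typically the costly part of exact ML estimation because it involves all n eigenvalues of W, which is consistent with the abstract's emphasis on approximations that compute only some of them.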

Associations Between Stream Flow And Climatic Parameters At Kızılırmak River Basin In Turkey

Global Nest Journal, Jul 27, 2012

This study aims to demonstrate the use of association analysis for discovering the relationships between stream flow and climatic variables in the Kızılırmak River Basin in Turkey. Association analysis is a data mining technique that aims to discover rules of the form A → B that occur in large datasets with a frequency above a given threshold. A and B can be defined as events of a certain type, with the rule reading "if A occurs, then B occurs". In this study, A refers to climatic variable(s) (i.e., precipitation, temperature, wind speed, relative humidity) of a certain magnitude, and B refers to the magnitude of stream flow. The interesting rules were quantified using support and confidence measures. Stream-flow data from three gauging stations in the Kızılırmak River Basin and climate data from three weather stations in the same basin were included in the analyses. All data were first segregated into three groups named low, medium, and high. The low and high ranges of stream-flow data were further divided into three to increase the focus on extreme events. The analyses were conducted at the annual and seasonal timescales. The analyses indicated that the relationships between precipitation, temperature, and stream flow are most prevalent, but relative humidity and wind speed are also important determinants of stream flow in the Kızılırmak River Basin.
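
The support and confidence measures mentioned in this abstract can be illustrated with a minimal Python sketch. It uses the standard textbook definitions; the function names, the item labels such as `precip=high`, and the four sample records are hypothetical and are not taken from the study.

```python
from typing import Iterable, List, Set


def support(records: List[Set[str]], items: Iterable[str]) -> float:
    """Fraction of records that contain every item in `items`."""
    wanted = set(items)
    if not records:
        return 0.0
    return sum(1 for r in records if wanted <= r) / len(records)


def confidence(records: List[Set[str]],
               antecedent: Iterable[str],
               consequent: Iterable[str]) -> float:
    """Estimate of P(consequent | antecedent) over the records."""
    base = support(records, antecedent)
    if base == 0.0:
        return 0.0  # the rule never fires, so report zero confidence
    both = set(antecedent) | set(consequent)
    return support(records, both) / base


# Hypothetical discretized observations (illustrative only, not study data).
records = [
    {"precip=high", "flow=high"},
    {"precip=high", "flow=high"},
    {"precip=high", "flow=medium"},
    {"precip=low", "flow=low"},
]

# Rule A -> B with A = {precip=high} and B = {flow=high}.
rule_support = support(records, {"precip=high", "flow=high"})
rule_confidence = confidence(records, {"precip=high"}, {"flow=high"})
print(rule_support, round(rule_confidence, 3))
```

A rule would be kept only if both values exceed user-chosen thresholds; the thresholds used in the study are not stated in the abstract.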

Discovery of hydrometeorological patterns

Turkish Journal of Electrical Engineering and Computer Science, 2014

Hydrometeorological patterns can be defined as meaningful and nontrivial associations between hydrological and meteorological parameters over a region. Discovering hydrometeorological patterns is important for many applications, including forecasting hydrometeorological hazards (floods and droughts), predicting the hydrological responses of ungauged basins, and filling in missing hydrological or meteorological records. However, discovering these patterns is challenging due to the special characteristics of hydrological and meteorological data, and is computationally complex due to the archival history of the datasets. Moreover, defining monotonic interest measures to quantify these patterns is difficult. In this study, we propose a new monotonic interest measure, called the hydrometeorological prevalence index, and a novel algorithm for mining hydrometeorological patterns (HMP-Miner) out of large hydrological and meteorological datasets. Experimental evaluations using real datasets show that our proposed algorithm outperforms the naïve alternative in discovering hydrometeorological patterns efficiently.

Partial spatio-temporal co-occurrence pattern mining

Knowledge and Information Systems, 2014

Spatio-temporal co-occurrence patterns represent subsets of object-types that are often located together in space and time. The aim of the discovery of partial spatio-temporal co-occurrence patterns (PACOPs) is to find co-occurrences of the object-types that are partially present in the database. Discovering PACOPs is an important problem with many applications such as discovering interactions between animals in ecology, identifying tactics in battlefields and games, and identifying crime patterns in criminal databases. However, mining PACOPs is computationally very expensive because the interest measures are computationally complex, databases are larger due to the archival history, and the set of candidate patterns is exponential in the number of object-types. Previous studies on discovering spatio-temporal co-occurrence patterns do not take into account the presence period (i.e., lifetime) of the objects in the database. This paper defines the problem of mining PACOPs, proposes a new monotonic composite interest measure, and proposes novel PACOP mining algorithms. The experimental results show that the proposed algorithms are computationally more efficient than the naïve alternatives.

Modeling Spatial and Spatio-temporal Co-occurrence Patterns

I wish to thank several people who, in some way, have helped me during the development of this thesis. First, I would like to express my gratitude to my supervisor, Prof. Shashi Shekhar, for his continued encouragement and invaluable suggestions during this study. It has been a great pleasure to do research with him. I would like to thank the members of my advisory committee, Professors Jaideep Srivastava, Arindam Banerjee, and Sudipto Banerjee, for their guidance and their constructive and useful comments. This research also benefited from discussions with Prof. Daniel Boley. I would sincerely like to thank US Army Corps of Engineers, Topographic Engineering Center researchers James P. Rogers and Dr. James A. Shine, with whom I have collaborated during my PhD research. Their conceptual and technical insights into my thesis work have been invaluable. I am very grateful to all members of the Spatial Databases and Data Mining group at the Department of Computer Science for their discussions, critiques, and contributions. There are three people I need to mention especially: Betsy George, who helped me unselfishly whenever I needed it, and Jin Soung Yoo and Baris Kazar, who provided assistance during the early stages of this research. I apologize to all I have forgotten, among whom are the anonymous referees of my papers. I assure them that their help was valued as much as that of the people mentioned above. During the course of this work, at the University of Minnesota, I was supported in part by the Turkish Council of Higher Education and with grants from the US Army Corps of Engineers, Topographic Engineering Center. I am grateful for their support. Many people on the staff of the Computer Science Department assisted and encouraged me in various ways during my course of studies. I am grateful to Georganne Tolaas, Bonnie Klein, and Jennifer Barrett. I would also like to thank Kim Koffolt for her comments, which improved the readability of my papers and this dissertation. Lastly, and most importantly, I wish to thank my parents, Fatma Celik and Metin Celik, for everything they gave me. I am very grateful to my wife, Filiz Dadaser Celik, for her love, patience, and support during the PhD period. One of the best experiences that we lived through in this period was the birth of our children, Cagri Fatih and Elif Huma.

Theory behind the SAR Model

SpringerBriefs in Computer Science, 2012

Research paper thumbnail of A hybrid CNN-LSTM model for high resolution melting curve classification

Biomedical Signal Processing and Control

Research paper thumbnail of Hybrid models based on genetic algorithm and deep learning algorithms for nutritional Anemia disease classification

Biomedical Signal Processing and Control

Research paper thumbnail of Structural profile matrices for predicting structural properties of proteins

Journal of Bioinformatics and Computational Biology

Predicting structural properties of proteins plays a key role in predicting the 3D structure of p... more Predicting structural properties of proteins plays a key role in predicting the 3D structure of proteins. In this study, new structural profile matrices (SPM) are developed for protein secondary structure, solvent accessibility and torsion angle class predictions, which could be used as input to 3D prediction algorithms. The structural templates employed in computing SPMs are detected by eight alignment methods in LOMETS server, gap affine alignment method, ScanProsite, PfamScan, and HHblits. The contribution of each template is weighted by its similarity to target, which is assessed by several sequence alignment scores. For comparison, the SPMs are also computed using Homolpro, which uses BLAST for target template alignments and does not assign weights to templates. Incorporating the SPMs into DSPRED classifier, the prediction accuracy improves significantly as demonstrated by cross-validation experiments on two difficult benchmarks. The most accurate predictions are obtained using...

Research paper thumbnail of Cloud Computing Based Socially Important Locations Discovery on Social Media Big Datasets

International Journal of Information Technology & Decision Making

Socially important locations are places which are frequently visited by social media users in the... more Socially important locations are places which are frequently visited by social media users in their social media lifetime. Discovering socially important locations provides valuable information, such as which locations are frequently visited by a social media user, which locations are common for a social media user group, and which locations are socially important for a group of urban area residents. However, discovering socially important locations is challenging due to huge volume, velocity, and variety of social media datasets, inefficiency of current interest measures and algorithms on social media big datasets, and the need of massive spatial and temporal calculations for spatial social media analyses. In contrast, cloud computing provides infrastructure and platforms to scale compute-intensive jobs. In the literature, limited number of studies related to socially important locations discovery takes into account cloud computing systems to scale increasing dataset size and to ha...

Research paper thumbnail of Mining High-Average Utility Itemsets with Positive and Negative External Utilities

Research paper thumbnail of Modelling surface water-groundwater interactions at the Palas Basin (Turkey) using FREEWAT

Acque Sotterranee - Italian Journal of Groundwater

Palas Basin is a semi-arid closed basin located in the Central Anatolia region of Turkey. The maj... more Palas Basin is a semi-arid closed basin located in the Central Anatolia region of Turkey. The major economic activity in the basin is agriculture; therefore, both surface water and groundwater are used for irrigation. However, intensive use of water resources threatens the hydrologic sustainability of a lake ecosystem (Tuzla Lake) located in the basin. In this study, we analyzed the relationships between agricultural water uses in the Palas Basin and water flows to the Tuzla Lake using groundwater flow model developed with the FREEWAT platform. The model grid with 250 m x 250 m resolution was created based on the entire watershed. Two hydrostratigraphic units were identified. The source terms defined in the model were rainfall recharge and the sink terms were evapotranspiration and wells. The model was run for one year at steady state conditions. Three scenarios were simulated to understand the effect of groundwater use on the lake hydrology. The first scenario assumed that there wa...

Research paper thumbnail of Discovering socially similar users in social media datasets based on their socially important locations

Information Processing & Management

Research paper thumbnail of Assessment of Multi Fragment Melting Analysis System (MFMAS) for the Identification of Food-Borne Yeasts

Current microbiology, 2018

Multi Fragment Melting Analysis System (MFMAS) is a novel approach that was developed for the spe... more Multi Fragment Melting Analysis System (MFMAS) is a novel approach that was developed for the species-level identification of microorganisms. It is a software-assisted system that performs concurrent melting analysis of 8 different DNA fragments to obtain a fingerprint of each strain analyzed. The identification is performed according to the comparison of these fingerprints with the fingerprints of known yeast species recorded in a database to obtain the best possible match. In this study, applicability of the yeast version of the MFMAS (MFMAS-yeast) was evaluated for the identification of food-associated yeast species. For this purpose, in this study, a total of 145 yeast strains originated from foods and beverages and 19 standard yeast strains were tested. The DNAs isolated from these yeast strains were analyzed by the MFMAS, and their species were successfully identified with a similarity rate of 95% or higher. It was shown that the strains belonged to 43 different yeast species ...

Research paper thumbnail of Discovering socially important locations of social media users

Expert Systems with Applications

Research paper thumbnail of Veri Madenciliği Yöntemleri Kullanılarak Meme Kanseri Hücrelerinin Tahmin ve Teşhisi

Research paper thumbnail of Spatial and Spatiotemporal Data Mining: Recent Advances

Explosive growth in geospatial data and the emergence of new spatial technologies emphasize the n... more Explosive growth in geospatial data and the emergence of new spatial technologies emphasize the need for automated discovery of spatial knowledge. Spatial data mining is the process of discovering interesting and previously unknown, but potentially useful patterns from large spatial databases. The complexity of spatial data and intrinsic spatial relationships limits the usefulness of conventional data mining techniques for extracting spatial patterns. In this chapter we explore the emerging field of spatial data mining, focusing on four major topics: prediction and classification, outlier detection, co-location mining, and clustering. Spatiotemporal data mining is also briefly discussed.

Research paper thumbnail of Yağış ve İklim İndeksleri Arasındaki Sıralı Birliktelik Örüntülerinin Tespit Edilmesi

Research paper thumbnail of Discovering Patterns of Insurgency via Spatio-Temporal Data Mining

Abstract: The need to discover patterns in spatio-temporal (ST) data has driven much recent resea... more Abstract: The need to discover patterns in spatio-temporal (ST) data has driven much recent research in ST cooccurrence patterns. Early work focused on discovering spatial patterns such as co-location without examining the development of patterns over time or the ...

Research paper thumbnail of Daily and hourly mood pattern discovery of Turkish twitter users

Global Journal of Computer Science, 2015

Massive amount of data-related applications and widespread usage of web technologies has started ... more Massive amount of data-related applications and widespread usage of web technologies has started big data era. Social media data is one of the big data sources. Mining social media data provides useful insights for companies and organizations for developing their services, products or organizations. This study aims to analyze Turkish Twitter users based on daily and hourly social media sharings. By this way, daily and hourly mood patterns of Turkish social media users could be revealed in positive or negative manner. For this purpose, Support Vector Machines (SVM) classification algorithm and Term Frequency-Inverse Document Frequency (TF-IDF) feature selection technique was used. As far as our knowledge, this is the first attempt to analyze people's all sharings on social media and generate results for temporalbased indicators like macro and micro levels.

Research paper thumbnail of Spatial AutoRegression (SAR) Model

ABSTRACT The spatial autoregression model (SAR) is a generalization of the linear regression mode... more ABSTRACT The spatial autoregression model (SAR) is a generalization of the linear regression model to account for spatial autocorrelation. Although this model allows users to make reliable classification procedures, it is computationally expensive to estimate the corresponding parameters. SAR model parameters have been estimated using maximum likelihood theory (ML) or Bayesian statistics. This book focuses on ML-based SAR model estimation. The author shows that the SAR model can be efficiently implemented without loss of accuracy, so that large geospatial autocorrelated datasets can be analysed in a reasonable amount of time. The book is organized into 8 chapters. Chapter 1 highlights the importance of having models and software for analyzing large geospatial datasets. It contains a brief history on the subject and summarizes the major contributions of the book. Chapter 2 describes briefly but rigorously the theory behind the model SAR. In particular, the results that help to optimize the estimation of SAR model parameters are proven in detail. In Chapter 3 a parallel formulation for a general exact estimation procedure for SAR model parameters is developed. An exact SAR model solution is both memory and computer intensive. It is needed to develop approximate solutions that do not sacrifice accuracy and can handle very large datasets. In Chapter 4 two different approximations for solving the SAR model are proposed. Then experimental comparisons of the proposed solutions on real satellite remote sensing imagery are explained. In Chapter 5 parallel approximate SAR models are developed using hybrid programming and sparse matrix algebra in order to reach very large problem sizes. The focus of Chapter 6 is the development of the Gauss-Lanczos approximated SAR model solution. 
The key idea of this algorithm is to find only some of the eigenvalues of a large matrix instead of finding all the eigenvalues by reducing the size of that matrix. The performance of the proposed method is compared with other approximate solution methods, which were studied in the previous chapter. The new algorithm provides better approximation when the data is strongly correlated. In Chapter 7, conclusions and future work are provided. Chapter 8 includes the supplementary materials that are mentioned in the book. This book is a good text for students as well as for image analysis software developers.

Research paper thumbnail of Associations Between Stream Flow And Climatic Parameters At Kızılırmak River Basin In Turkey

Global Nest Journal, Jul 27, 2012

This study aims to demonstrate the use of association analysis for discovering the relationships ... more This study aims to demonstrate the use of association analysis for discovering the relationships between stream flow and climatic variables in the Kızılırmak River Basin in Turkey. Association analysis is a data mining technique that aims to discover rules in the form of A B that may occur in large datasets with frequency above a given threshold. A and B can be defined as events of a certain type, with the rule if A occurs then B occurs. In this study, A refers to climatic variable(s) (i.e., precipitation, temperature, wind speed, relative humidity) of certain magnitude, and B refers to the magnitude of stream flow. The interesting rules were quantified using support and confidence measures. Stream-flow data from three gauging stations in the Kızılırmak River Basin and climate data from three weather stations in the same basin were included in the analyses. All data were first segregated into three groups that were named as low, medium, and high. Low and high ranges of stream-flow data were further divided into three to increase our focus on extreme events. The analyses were conducted at the annual and seasonal timescales. The analyses indicated that the relationships between precipitation and temperature and stream flow are most prevalent but, relative humidity and wind speed are also important determinants of stream flow in the Kızılırmak River Basin.

Research paper thumbnail of Discovery of hydrometeorological patterns

Turkish Journal of Electrical Engineering and Computer Science, 2014

Hydrometeorological patterns can be defined as meaningful and nontrivial associations between hyd... more Hydrometeorological patterns can be defined as meaningful and nontrivial associations between hydrological and meteorological parameters over a region. Discovering hydrometeorological patterns is important for many applications, including forecasting hydrometeorological hazards (floods and droughts), predicting the hydrological responses of ungauged basins, and filling in missing hydrological or meteorological records. However, discovering these patterns is challenging due to the special characteristics of hydrological and meteorological data, and is computationally complex due to the archival history of the datasets. Moreover, defining monotonic interest measures to quantify these patterns is difficult. In this study, we propose a new monotonic interest measure, called the hydrometeorological prevalence index, and a novel algorithm for mining hydrometeorological patterns (HMP-Miner) out of large hydrological and meteorological datasets. Experimental evaluations using real datasets show that our proposed algorithm outperforms the naïve alternative in discovering hydrometeorological patterns efficiently.

Partial spatio-temporal co-occurrence pattern mining

Knowledge and Information Systems, 2014

Spatio-temporal co-occurrence patterns represent subsets of object-types that are often located together in space and time. The aim of the discovery of partial spatio-temporal co-occurrence patterns (PACOPs) is to find co-occurrences of the object-types that are partially present in the database. Discovering PACOPs is an important problem with many applications such as discovering interactions between animals in ecology, identifying tactics in battlefields and games, and identifying crime patterns in criminal databases. However, mining PACOPs is computationally very expensive because the interest measures are computationally complex, databases are larger due to the archival history, and the set of candidate patterns is exponential in the number of object-types. Previous studies on discovering spatio-temporal co-occurrence patterns do not take into account the presence period (i.e., lifetime) of the objects in the database. This paper defines the problem of mining PACOPs, proposes a new monotonic composite interest measure, and proposes novel PACOP mining algorithms. The experimental results show that the proposed algorithms are computationally more efficient than the naïve alternatives.
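To make the remark about an exponential candidate set concrete, the short Python sketch below enumerates every subset of at least two object-types, which is what a naïve miner would have to examine. The object-type names and the `candidate_patterns` helper are hypothetical; this is not the paper's algorithm.

```python
from itertools import combinations
from typing import Iterator, Sequence, Tuple


def candidate_patterns(object_types: Sequence[str]) -> Iterator[Tuple[str, ...]]:
    """Yield every subset of object-types with at least two members."""
    for size in range(2, len(object_types) + 1):
        yield from combinations(object_types, size)


# Hypothetical object-types, chosen only for illustration.
types = ["A", "B", "C", "D"]
candidates = list(candidate_patterns(types))

# For n object-types there are 2**n - n - 1 such subsets.
print(len(candidates), 2 ** len(types) - len(types) - 1)
```

With a monotonic interest measure, a level-wise search can discard every superset of a pattern that already fails the threshold, which is one reason such measures matter for efficiency.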

Modeling Spatial and Spatio-temporal Co-occurrence Patterns

I wish to thank several people who, in some way, have helped me during the development of this thesis. First, I would like to express my gratitude to my supervisor, Prof. Shashi Shekhar, for his continued encouragement and invaluable suggestions during this study. It has been a great pleasure to do research with him. I would like to thank the members of my advisory committee, Professors Jaideep Srivastava, Arindam Banerjee, and Sudipto Banerjee, for their guidance and their constructive and useful comments. This research also benefited from discussions with Prof. Daniel Boley. I would sincerely like to thank US Army Corps of Engineers, Topographic Engineering Center researchers James P. Rogers and Dr. James A. Shine, with whom I collaborated during my PhD research. Their conceptual and technical insights into my thesis work have been invaluable. I am very grateful to all members of the Spatial Databases and Data Mining group at the Department of Computer Science for their discussions, critiques, and contributions. There are three people I need to mention especially: Betsy George, who helped me unselfishly whenever I needed it, and Jin Soung Yoo and Baris Kazar, who provided assistance during the early stages of this research. I apologize to all I have forgotten, among whom are the anonymous referees of my papers. I assure them that their help was valued as much as that of the people mentioned above. During the course of this work at the University of Minnesota, I was supported in part by the Turkish Council of Higher Education and by grants from the US Army Corps of Engineers, Topographic Engineering Center. I am grateful for their support. Many people on the staff of the Computer Science Department assisted and encouraged me in various ways during my course of studies. I am grateful to Georganne Tolaas, Bonnie Klein, and Jennifer Barrett.
I would also like to thank Kim Koffolt for her comments, which improved the readability of my papers and this dissertation. Lastly, and most importantly, I wish to thank my parents, Fatma Celik and Metin Celik, for everything they gave me. I am very grateful to my wife, Filiz Dadaser Celik, for her love, patience, and support during the PhD period. One of the best experiences that we lived through in this period was the birth of our children, Cagri Fatih and Elif Huma.

Theory behind the SAR Model

SpringerBriefs in Computer Science, 2012