Ayhan Demiriz | Gebze Technical University
Papers by Ayhan Demiriz
Nucleation and Atmospheric Aerosols, 2016
In modern economies, companies place a premium on managing their workforce efficiently, especially in the labor-intensive service sector, since services have become a significant portion of these economies. Tour scheduling is an important tool to minimize overall workforce costs while satisfying minimum service level constraints. In this study, we consider the workforce management problem of an inbound call center: satisfying the call demand within short time periods at minimum cost. We propose a mixed-integer programming model to assign workers to daily shifts, to determine weekly off-days, and to determine the timing of lunch and other daily breaks for each worker. The proposed model has been verified on the weekly demand data observed at a specific call center location of a satellite TV operator. The model was run on both 15-minute and 10-minute demand estimation periods (planning time intervals).
A significant part of the overall automotive market is derived from the used car trade. Correctly determining used car market values will certainly help achieve fairer trade in many economies. By using web listings as a proxy data source, we can create models for used car pricing based on the asking prices listed in web adverts. This type of data acquisition requires a thorough data cleaning process to generate dependable statistical models. This paper proposes a survival analysis based approach to study the lifetime of used car listings that can be found at web sites like Craigslist. Pricing models can be easily built to determine the market values of used cars from this type of data. One of the most important assumptions in our approach is to consider the delisting of an advert as a sale event. This is equivalent to death in the survival analysis context. Since the collected data have labels in terms of sale or not, we can utilize predictive models to determine whether a particular car at a certain price will be successfully sold or not.
International Journal of Systems Science: Operations & Logistics, Oct 12, 2015
Forecasting is an essential task conducted regularly by competitive retailers around the world. Most retail decisions, particularly markdowns, are made based on demand forecasts, which may or may not be accurate in the first place. In this study, we propose a framework for forecasting weekly demands of retail items via linear regression models within multi-item groups that incorporate both positive and negative item associations. We then utilise dynamic pricing models to optimise markdown decisions based on the forecasts within multi-item groups. Grouping items can be considered as a form of variable selection to prevent overfitting in prediction models. We report regression results from multi-item groupings besides results from the single-item regression model on a real-world data-set provided by an apparel retailer. We then report markdown optimisation results for the single items and multi-item groupings that multi-item forecasting models are built upon. The results show that the regression mo...
Periodicals of Engineering and Natural Sciences (PEN), Oct 2, 2013
Sahibinden.com is a leading e-commerce site in Turkey where sellers (buyers) may advertise their goods (needs) with or without a fee. Since it generates a large volume of traffic to the classified car listings, the site plays an important role in determining the market value of used cars. In this study, we first randomly selected 200 car classifieds from 950 new classified ads on February 22, 2012. We then observed these listings on a daily basis for a month to determine possible updates and deletions of the ads. We assume that if an ad is taken down, the car has been sold. In addition to the cars' features, we observed the posted price and the number of daily views of the ads throughout the data collection. Therefore one can construct survival models to study the effects of the features and price of a car on the life of the ad. In other words, it is possible to study which features and price levels expedite the sales of used cars.
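The survival setup in this abstract (delisting counts as a sale, and ads still online when the observation window closes are censored) can be illustrated with a minimal sketch. The abstract does not say which survival model was used, so the Kaplan-Meier estimator here is an assumption, and the function name `kaplan_meier` and the toy data are hypothetical.

```python
from collections import Counter


def kaplan_meier(durations, sold):
    """Kaplan-Meier estimate of the probability that an ad is still listed.

    durations: number of days each listing was observed.
    sold: True if the ad was delisted (treated as a sale event),
          False if it was still online when observation stopped (censored).
    Returns a list of (day, survival_probability) at each day with a sale.
    """
    events = Counter(t for t, s in zip(durations, sold) if s)
    exits = Counter(durations)      # sales and censorings leaving the risk set
    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(exits):
        d = events.get(t, 0)
        if d:
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        at_risk -= exits[t]
    return curve


# Hypothetical toy data: seven ads observed for at most 30 days.
durations = [3, 5, 5, 8, 12, 30, 30]
sold = [True, True, False, True, True, False, False]
curve = kaplan_meier(durations, sold)
for day, prob in curve:
    print(f"day {day}: P(still listed) = {prob:.3f}")
```

In this toy data the estimated probability of an ad still being listed falls to 5/14 (about 0.357) by day 12. Studying the effect of price and car features, as the abstract proposes, would need a regression-type survival model on top of this baseline.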
Mathematical Problems in Engineering, Nov 21, 2018
It may be very difficult to achieve the optimal shift schedule in call centers which have highly uncertain and peaked demand during short time periods. Overlapping shift systems are usually designed for such cases. This paper studies shift scheduling and rostering problems for inbound call centers where overlapping shift systems are used. An integer programming model that determines which shifts are to be opened and how many operators are to be assigned to these shifts is proposed for the shift scheduling problem. For the rostering problem, both integer programming and constraint programming models are developed to determine assignments of operators to all shifts, weekly days-off, and meal and relief break times of the operators. The proposed models are tested on real data supplied by an outsource call center and optimal results are found in an acceptable computation time. An improvement of 15% in the objective function compared to the current situation is observed with the proposed model for the shift scheduling problem. The computational performances of the proposed integer and constraint programming models for the rostering problem are compared using real data observed at a call center and simulated test instances. In addition, benchmark instances are used to compare our Constraint Programming (CP) approach with the existing models. The results of the comprehensive computational study indicate that the constraint programming model runs more efficiently than the integer programming model for the rostering problem.
The originality of this research can be attributed to two contributions: (a) a model for the shift scheduling problem and two models for the rostering problem are presented in detail and compared using real data and (b) the rostering problem is considered as a task-resource allocation problem and considerably shorter computation times are obtained by modeling this new problem via CP.
In this work we study several other works in which association and sequence mining techniques were applied to the field of web usage mining. This report is to be submitted for grading to the Data Mining course of the PhD program "Diseno, Analisis y Aplicaciones de Sistemas Inteligentes" at the University of Granada.
Lecture Notes in Computer Science, 2006
Used car trade is one of the major components of the world economies. These days it is not uncommon to sell a car by placing an internet advertisement, irrespective of geography. The content of an advertisement is usually composed of two parts, namely the structured data and the free text data. The structured data may include information about the asking price, make, model, year, and mileage of the car, and the contact info. In most cases, the seller may give important clues about the car's current condition in the free text data, where the title (head) of the advertisement can be included as free text too. This paper reports preliminary results from a text mining study conducted on 75K used car internet listings collected from two major car listing web sites in Turkey. As expected, the words and the phrases related to the description of the car are observed to be frequent. The leading concepts in the free text are found to concern how to describe the current condition of a car, for example "no crash history".
Lecture Notes in Computer Science, 2009
This paper presents a simple method for mining both positive and negative association rules in databases using singular value decomposition (SVD) and similarity measures. In the literature, SVD is used for summarizing matrices. We use the transaction-item price matrix to generate what the literature calls ratio rules. The transaction-item price matrix is formed by using the price data of corresponding items from the sales transactions. Ratio rules are generated by running SVD on the transaction-item price matrix. We then use similarity measures on a subset of rules found by Pareto analysis to determine positive and negative associations. The proposed method can present the positive and negative associations with their strengths. We obtain subsequent results using cosine and correlation similarity measures.
Used-car trade makes up a significant portion of the overall automobile market, and determining the values of the cars is an important problem. This study proposes a new methodology for determining the market value of used cars by observing the classifieds in an e-commerce site. This type of data acquisition plays an important role in building pricing models and conducting further analysis in our approach. In the data acquisition stage, a set of new listings is chosen randomly each day from an e-commerce site (a web site like Craigslist), then these listings are observed until a predetermined period (e.g. thirty days) or delisting time, whichever comes first. The crucial part of our approach is the assumption of a sale event when the listing is no longer available, i.e. delisted from the e-commerce site. The proposed methodology may potentially be used for pricing any used item based on web listings. A web site was developed to help clients/users determine the market values of their cars, as a decision support tool that can assess the likelihood of selling a particular car at a certain price. We also present the applicability of predictive models to determine the likelihood of selling a car within a thirty-day period based on the price set by the owner.
Intelligent Data Analysis, Mar 27, 2020
We propose a hybrid application of Population Based Ant Colony Optimization that uses a data mining procedure to wisely initialize the pheromone entries. Hybridization of metaheuristics with data mining techniques has been studied by several researchers in recent years. In this line of research, frequent patterns in a number of initial high-quality solutions are extracted to guide the subsequent iterations of an algorithm, which results in an improvement in solution quality and computational time. Our proposal possesses certain differences from and contributions to existing literature. Instead of one single run that incorporates both the main metaheuristic and the data mining module inside, we propose to carry out independent runs and collect elite sets over these trials. Another contribution is the way we use the knowledge gained from the application of the data mining module. The extracted knowledge is used to initialize the memory model in the algorithm rather than to construct new initial solutions. One additional contribution is the use of a path mining algorithm (a specific sequence mining algorithm) rather than Apriori-like association mining algorithms. Computational experiments, conducted both on symmetric Travelling Salesman Problem and symmetric/asymmetric Quadratic Assignment Problem instances, showed that our proposal produces significantly better results, and is more robust than pure applications of population-based ant colony optimization.
Computing, Oct 11, 2013
NoC technology is composed of packet-based interconnections, where the communication resources are distributed across the network. Therefore, optimal resource utilization is a crucial consideration for efficient architectural designs. This paper studies the practicality of Constraint Programming (CP) models for NoC architecture designs that effectively use a regular mesh with wormhole switching and XY routing. The complexity of the CP models is compared with the earlier Mixed Integer Programming (MIP) models. Practical CP-based mapping and scheduling models are developed and results are reported on the benchmark datasets. Results indicate that mapping and scheduling problems can be solved at near optimality even under relatively shorter run-time limits as compared to those required by the MIP models.
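As a small illustration of the XY routing named in this abstract: on a regular mesh, XY (dimension-order) routing moves a packet along the X axis until its column matches the destination, then along the Y axis. The sketch below covers only that routing rule, not the paper's CP mapping or scheduling models; the function name `xy_route` and the coordinate convention are assumptions.

```python
def xy_route(src, dst):
    """Return the routers visited from src to dst (inclusive) under XY routing.

    src, dst: (x, y) router coordinates on a regular 2D mesh.
    The packet first travels along X, then along Y.
    """
    x, y = src
    path = [(x, y)]
    step = 1 if dst[0] > x else -1
    while x != dst[0]:          # X dimension first
        x += step
        path.append((x, y))
    step = 1 if dst[1] > y else -1
    while y != dst[1]:          # then Y dimension
        y += step
        path.append((x, y))
    return path


route = xy_route((0, 0), (2, 1))
print(route)
```

The hop count equals the Manhattan distance between the two routers. Because the path is fully determined by the endpoints, the links used by each communication follow directly from where tasks are placed on the mesh.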
Journal of Intelligent Manufacturing, Jan 8, 2009
Decision Support Systems, Dec 1, 2011
Association mining is the conventional data mining technique for analyzing market basket data and it reveals the positive and negative associations between items. While being an integral part of transaction data, pricing and time information have not been integrated into market basket analysis in earlier studies. This paper proposes a new approach to mine price, time and domain related attributes through re-mining of association mining results. The underlying factors behind positive and negative relationships can be characterized and described through this second data mining stage. The applicability of the methodology is demonstrated through the analysis of data coming from a large apparel retail chain, and its algorithmic complexity is analyzed in comparison to the existing techniques.
Data Mining and Knowledge Discovery, Sep 1, 2004
Springer series in fashion business, 2018
Markdown decisions in retailing are made based on demand forecasts, which may or may not be accurate in the first place. In this chapter, we propose a framework for forecasting weekly demands of retail items via linear regression models within multi-item groups that incorporate both positive and negative item associations. We then utilize dynamic pricing models to optimize markdown decisions based on the forecasts within multi-item groups. Grouping items can be considered as a form of variable selection to prevent overfitting in prediction models. We report regression results from multi-item groupings besides results from the single-item regression model on a real-world dataset provided by an apparel retailer. We then report markdown optimization results for the single items and multi-item groupings that multi-item forecasting models are built upon. The results show that the regression models provide better estimates within multi-item groups compared to the single-item model. Moreover, the overall revenues achieved in multi-item markdown optimization across all grouping schemes are higher than the total revenue yielded by the single-item markdown optimization scheme.
arXiv (Cornell University), Apr 8, 2023
We present a new data analysis perspective to determine variable importance regardless of the underlying learning task. Traditionally, variable selection is considered an important step in supervised learning for both classification and regression problems. Variable selection also becomes critical when the costs associated with data collection and storage are considerably high, as in remote sensing. Therefore, we propose a new methodology to select important variables from the data by first creating dependency networks among all variables and then ranking them (i.e. nodes) by graph centrality measures. Selecting the top-n variables according to a preferred centrality measure will yield a strong candidate subset of variables for further learning tasks, e.g. clustering. We present our tool as a Shiny app, which is a user-friendly interface development environment. We also extend the user interface for two well-known unsupervised variable selection methods from the literature for comparison.
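The two-step idea in this abstract (build a dependency network over the variables, then rank the nodes by a centrality measure) can be sketched as follows. The abstract does not specify how dependencies are measured or which centrality is preferred, so this sketch makes two assumptions: an edge joins two variables when the absolute Pearson correlation reaches a threshold, and nodes are ranked by degree centrality. The names `pearson`, `rank_variables`, and the toy columns are hypothetical, and columns are assumed to be non-constant.

```python
from itertools import combinations
from math import sqrt


def pearson(a, b):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / sqrt(va * vb)


def rank_variables(columns, threshold=0.8, top_n=3):
    """Rank variables by degree in a correlation-threshold dependency graph.

    columns: dict mapping variable name -> list of observations.
    Returns (top_n variable names, degree of every variable).
    """
    degree = {name: 0 for name in columns}
    for u, v in combinations(columns, 2):
        if abs(pearson(columns[u], columns[v])) >= threshold:
            degree[u] += 1      # assumed edge rule: strong linear dependence
            degree[v] += 1
    ranked = sorted(degree, key=lambda name: (-degree[name], name))
    return ranked[:top_n], degree


# Hypothetical toy data: a, b, c are strongly dependent; d is unrelated.
columns = {
    "a": [1, 2, 3, 4, 5],
    "b": [2, 4, 6, 8, 10],
    "c": [5, 4, 3, 2, 1],
    "d": [1, -1, 0, -1, 1],
}
top, degree = rank_variables(columns)
print(top, degree)
```

No target variable is involved, so the ranking is independent of the learning task, as the abstract describes. Other dependency measures or centralities (betweenness, eigenvector) could be substituted without changing the overall structure.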