Ross Clement - Academia.edu
Papers by Ross Clement
CERN European Organization for Nuclear Research - Zenodo, Jun 28, 2015
Accurate forecasting of fresh produce demand is one of the challenges faced by Small Medium Enterprise (SME) wholesalers. This paper attempts to understand the causes, such as weather and holidays, of the high variability in the demand seen by SME wholesalers. Understanding the significance of previously unidentified factors may improve forecasting accuracy. The paper reviews the current literature on the factors used to predict demand and on existing forecasting techniques for short shelf-life products. It then investigates a variety of possible internal and external factors, some of which have not been used by other researchers in demand prediction. The results are further analysed using a number of techniques to minimize noise in the data. The analysis uses past sales data (January 2009 to May 2014) from a UK-based SME wholesaler, and the results presented are limited to the product 'Milk' supplied to cafés in Derby. Correlation analysis is used to check the dependence of actual demand on each variability factor. PCA is then used to assess the significance of the factors identified by correlation. The PCA results suggest that cloud cover, weather summary and temperature are the most significant factors for forecasting demand. The correlation of these three factors with demand increases and becomes more stable for monthly demand, compared with weekly and daily demand.
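The correlation step described in the abstract above can be illustrated with a minimal sketch. It is not the authors' analysis: the data are synthetic, and the names `pearson`, `aggregate`, `temperature` and `demand`, as well as the 7-day and 28-day block sizes, are assumptions chosen for illustration. It computes the Pearson correlation between one weather factor and demand at daily, weekly and roughly monthly aggregation.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def aggregate(values, period):
    """Mean of consecutive blocks of `period` values (incomplete tail dropped)."""
    full = len(values) - len(values) % period
    return [sum(values[i:i + period]) / period for i in range(0, full, period)]


# Synthetic data only (assumption): a seasonal temperature curve with noise,
# and a demand series that depends weakly on temperature plus larger noise.
rng = random.Random(0)
days = 364
temperature = [10 + 8 * math.sin(2 * math.pi * d / days) + rng.gauss(0, 2)
               for d in range(days)]
demand = [100 - 1.5 * t + rng.gauss(0, 15) for t in temperature]

# Correlation between the factor and demand at three aggregation levels.
results = {
    label: pearson(aggregate(temperature, period), aggregate(demand, period))
    for label, period in [("daily", 1), ("weekly", 7), ("monthly", 28)]
}

for label, r in results.items():
    print(f"{label:8s} r = {r:+.3f}")
```

In a synthetic setup like this one, averaging over longer periods tends to cancel day-level noise, which is one plausible reason a correlation could look stronger at coarser aggregation. Whether that explains the paper's finding is not something this sketch can show.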
Large, real-world data sets have been investigated in the context of authorship attribution of real-world documents. N-gram measures can be used to accurately assign authorship for long documents such as novels. A number of 5 (authors) × 5 (movies) arrays of movie reviews were acquired from the Internet Movie Database. Both n-gram and naive Bayes classifiers were used to classify along both the authorship and topic (movie) axes. Both approaches yielded similar results, and authorship was detected as accurately as, or more accurately than, topic. Part-of-speech tagging and function-word lists were used to investigate the influence of structure on classification tasks on documents with meaning removed but grammatical structure intact.
The accuracy of demand forecasting is highly important for companies in the food industry, especially those that deal with products that require refrigeration or have a short shelf life, because the freshness and overall quality of the products offered can affect both the profit margins of businesses and the health of consumers (Doganis et al., 2006). Furthermore, Agrawal and Schorling (1996), as cited by Chen and Ou (2008), highlighted that easy access to accurate and up-to-date demand forecasts is vital for any company aiming to remain highly competitive in its market sector. This is even more important for fresh food wholesalers, whose profit is directly affected by wasted or unsold products and by unsatisfied customers (unfulfilled demand), especially when storage facilities are limited.
A web-based final year project management system, ProMS, has been created and deployed to help coordinate undergraduate final year projects, including automating practical tasks such as the submission of documents. ProMS helps introduce students to potential supervisors, through both student access to staff information and staff access to draft copies of students' project proposals. Many students who used the system found that it helped them become aware of potential supervisors whom they had never met, and a sizeable proportion of these students listed staff they were previously unaware of as preferred supervisors. The system helps greatly in expanding students' knowledge of potential project supervisors. Following the deadline for student project proposals, ProMS made it possible to generate a draft allocation of students to supervisors in only a few hours.
Purpose: This paper looks at the characterization of B2B customers of a fresh food wholesale company supplying SME clients, in terms of their weekly orders of a variety of fresh products. Customers whose orders can be predicted (the days of the week on which orders are placed, and the size of each order) can easily be supplied without the risk of waste that arises when the wholesaler orders stock that is not sold before it must be disposed of. Greater understanding of customer order patterns is necessary to improve demand prediction and reduce waste. Research Approach: Extensive real-world data from a fresh food wholesaler has been analysed in bulk. Customers' weekly orders have been classified into one of nine classes depending on how each week's order compares to the previous week's. Equal order amounts on the same day (or days) of the week as the previous week form the most predictable class. Varying order amounts for orders placed on different days of the week form a much less predictable class. Other classes represent customers who either cease ordering after having made previous orders, or who place an order after not ordering in previous weeks. K-means clustering has also been used to extract clusters of customers showing similar ordering patterns from the customer base. These functions have been integrated into a data visualization tool which displays the clusters in terms of the frequency of occurrence of order classes and their standard deviation within the clusters.
A prototype system for sending SMS text messages to students to tell them about announcements has been designed and partially implemented. Experiments have been performed to test whether automatic text classification can be used to decide which announcements posted by tutors are urgent enough that an SMS text message should be sent to inform students. The accuracy of a naive Bayes classifier is not sufficient in itself to decide this, but a flexible classifier combined with the ability of tutors to override its decisions shows promise. How the system would be used would depend on management policies concerning the effects of classification errors.
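The combination described above, a naive Bayes decision that a tutor can override, can be sketched as follows. This is an illustrative toy and not the prototype itself: the `NaiveBayes` class, the `should_send_sms` helper, the `tutor_override` parameter and the six training announcements are all hypothetical, and the classifier is a plain multinomial naive Bayes with add-one smoothing.

```python
import math
from collections import Counter


class NaiveBayes:
    """Multinomial naive Bayes over whitespace tokens, with add-one smoothing."""

    def __init__(self):
        self.word_counts = {}        # label -> Counter of word frequencies
        self.doc_counts = Counter()  # label -> number of training documents
        self.vocab = set()

    def train(self, text, label):
        words = text.lower().split()
        self.word_counts.setdefault(label, Counter()).update(words)
        self.doc_counts[label] += 1
        self.vocab.update(words)

    def log_scores(self, text):
        """Unnormalised log posterior for each label."""
        total_docs = sum(self.doc_counts.values())
        scores = {}
        for label, counts in self.word_counts.items():
            score = math.log(self.doc_counts[label] / total_docs)
            denom = sum(counts.values()) + len(self.vocab)
            for word in text.lower().split():
                score += math.log((counts[word] + 1) / denom)
            scores[label] = score
        return scores

    def classify(self, text):
        scores = self.log_scores(text)
        return max(scores, key=scores.get)


def should_send_sms(classifier, announcement, tutor_override=None):
    """The tutor's explicit decision, if any, wins over the classifier."""
    if tutor_override is not None:
        return tutor_override
    return classifier.classify(announcement) == "urgent"


# Hypothetical training announcements (made up for this sketch).
clf = NaiveBayes()
for text in ["lecture cancelled today",
             "exam room changed tomorrow",
             "deadline moved to today"]:
    clf.train(text, "urgent")
for text in ["slides uploaded to website",
             "reading list for next month",
             "optional seminar next month"]:
    clf.train(text, "routine")

print(should_send_sms(clf, "lecture cancelled tomorrow"))
print(should_send_sms(clf, "slides uploaded for next month", tutor_override=True))
```

Keeping the override outside the classifier means a policy change (for example, always sending when the tutor asks) does not require retraining.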
An agent-based simulation has been built to model speciation in cichlid fishes in the Great Lakes of Africa. A real natural system has been chosen as the target of simulation, rather than a generalised system. This focusses research towards open problems in cichlid biology, and provides a library of field research to drive the design and parametrisation of the simulation. Visualisations of the end results of simulations are presented to confirm that the simulated fish actually do speciate. A further experiment suggests that the potential for organisms to adapt has strong effects on the competitive exclusion principle, but that this cannot be used to explain the patterns of cichlid species found on rocky reefs in African lakes. The simulation has been written in pure Java rather than a general agent-based modelling platform. The reasons for using Java and a number of alternative platforms are described.
Research in bus and driver scheduling at Leeds University has been taking advantage of new technologies since the early 1960s, with the objective of yielding better schedules more quickly. Here recent SERC-supported work is presented. Most bus scheduling systems produce schedules for only one operating scenario. In practice, planning and scheduling are interconnected, and operators have to explore flexibility in timings and route structures. An interactive framework is described which integrates tools designed to minimise trial and error in the process. We also report on two new approaches to driver scheduling. A knowledge-based approach has resulted in an effective set of domain-specific rules that critically analyse a given problem and accurately estimate the schedule composition and total number of duties required. This estimate will strengthen existing and new methods. Genetic algorithms are powerful in deriving refined solutions to problems very quickly. However, driver scheduli...
Lecture Notes in Economics and Mathematical Systems, 1995
Genetic algorithms have been applied to bus driver scheduling and compared to other approaches such as simulated annealing. Bus driver scheduling is a more difficult domain than most genetic algorithm applications. Special-purpose genetic algorithms have been developed that search constrained versions of the initial search space. Greedy algorithms are used for crossovers, though these had to be randomized to give good results. Special-purpose optimizing mutation improves search in domains too large for traditional mutation to be useful. The greedy genetic algorithm produces schedules typically within a few duties of the optimum solution. Further theoretical analyses are expected to result in new methods that will improve results. The technology developed may also have applications in other areas of transport scheduling.
An agent-based model of speciation in cichlid fish has been implemented. When run, this generates large amounts of trace data in which speciation is an implicit, near-unobservable process. Fuzzy C-Means clustering is used to identify the species extant at the end of simulation, and the power set of these species is the potential set of ancestral species. Membership values for all fish in each of these theoretical ancestral species are calculated, and total set membership for each of these species is plotted against time. The resulting graph is to be a clear visualisation of the process of speciation, and of the appearance and disappearance of intermediate species. Our approach allows the visualisation of speciation resulting in larger numbers of final species than was possible using previous techniques based on measuring correlations between explicit properties of modelled organisms, and is also unaffected by changes to the properties used to model fish.
Literary and Linguistic Computing, 2003
Large, real-world data sets have been investigated in the context of authorship attribution of real-world documents. N-gram measures can be used to accurately assign authorship for long documents such as novels. A number of 5 (authors) × 5 (movies) arrays of movie reviews were acquired from the Internet Movie Database. Both n-gram and naive Bayes classifiers were used to classify along both the authorship and topic (movie) axes. Both approaches yielded similar results, and authorship was detected as accurately as, or more accurately than, topic. Part-of-speech tagging and function-word lists were used to investigate the influence of structure on classification tasks on documents with meaning removed but grammatical structure intact.
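As a rough illustration of n-gram-based attribution, the sketch below builds character trigram frequency profiles and assigns a text to the author whose sample profile is closest by cosine similarity. The abstract does not specify which n-gram measure the paper uses, so the choice of character trigrams and cosine similarity, the function names `ngram_profile`, `cosine` and `attribute`, and the two sample texts are all assumptions.

```python
import math
from collections import Counter


def ngram_profile(text, n=3):
    """Relative frequencies of character n-grams in whitespace-normalised text."""
    text = " ".join(text.lower().split())
    counts = Counter(text[i:i + n] for i in range(len(text) - n + 1))
    total = sum(counts.values())
    return {gram: count / total for gram, count in counts.items()}


def cosine(p, q):
    """Cosine similarity between two sparse frequency profiles."""
    dot = sum(value * q.get(gram, 0.0) for gram, value in p.items())
    norm_p = math.sqrt(sum(v * v for v in p.values()))
    norm_q = math.sqrt(sum(v * v for v in q.values()))
    if norm_p == 0 or norm_q == 0:
        return 0.0
    return dot / (norm_p * norm_q)


def attribute(text, author_samples, n=3):
    """Return the author whose sample profile is most similar to `text`."""
    target = ngram_profile(text, n)
    return max(
        author_samples,
        key=lambda author: cosine(target, ngram_profile(author_samples[author], n)),
    )


# Made-up sample texts standing in for each author's known reviews.
samples = {
    "A": "the film was truly wonderful and truly moving",
    "B": "plot bad. acting bad. script bad. avoid.",
}
print(attribute("a truly wonderful film", samples))
```

The same functions could be pointed at the topic axis instead by keying `samples` on movie rather than author, which mirrors the two-axis comparison the abstract describes.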
International Journal of Man-Machine Studies, 1992
... Attempts to ease this bottleneck by automating knowledge acquisition have spawned a great variety of approaches (e.g. Gaines & Boose, 1988; Michie, 1982) and an even greater number of implemented systems (e.g. Boose & Bradshaw, 1987; Quinlan et al., 1986; Clement, 1991). ...
Artificial Life, 2006
The Cichlid Speciation Project (CSP) is an ALife simulation system for investigating open problems in the speciation of African cichlid fish. The CSP can be used to perform a wide range of experiments that show that speciation is a natural consequence of certain biological systems. A visualization system capable of extracting the history of speciation from low-level trace data and creating a phylogenetic tree has been implemented. Unlike previous approaches, this visualization system presents a concrete trace of speciation, rather than a summary of low-level information from which the viewer can make subjective decisions on how speciation progressed. The phylogenetic trees are a more objective visualization of speciation, and enable automated collection and summarization of the results of experiments. The visualization system is used to create a phylogenetic tree from an experiment that models sympatric speciation.
Agent-based simulation has been used to simulate customers of a B2B fresh food supplier, in order to examine why total orders vary considerably on a day-by-day basis. Different types of virtual customers can be included in the simulation, ordering products using different strategies, including their own demand prediction. This simulation suggests that customers changing the day of their order is the largest cause of daily order variance.
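The effect described above can be illustrated with a deliberately small toy model. It is not the simulation from the paper: the `simulate` function, the `switch_prob` parameter, the fixed order size of 10 units and the customer counts are all assumptions. Each virtual customer orders once a week on a usual day, and with some probability places that week's order on a random day instead.

```python
import random
from statistics import pvariance


def simulate(num_customers, weeks, switch_prob, seed=0):
    """Return daily order totals for customers who each order once per week.

    Each customer has a usual weekday; with probability `switch_prob` they
    place that week's order on a uniformly random day instead.
    """
    rng = random.Random(seed)
    usual_day = [i % 7 for i in range(num_customers)]  # spread evenly over the week
    totals = []
    for _ in range(weeks):
        week = [0] * 7
        for customer in range(num_customers):
            day = usual_day[customer]
            if rng.random() < switch_prob:
                day = rng.randrange(7)
            week[day] += 10  # fixed order size (assumption)
        totals.extend(week)
    return totals


steady = simulate(num_customers=70, weeks=52, switch_prob=0.0)
shifting = simulate(num_customers=70, weeks=52, switch_prob=0.3)

print("variance, fixed days:   ", pvariance(steady))
print("variance, shifting days:", pvariance(shifting))
```

Total weekly demand is identical in both runs, so any difference in daily variance comes only from customers moving their order day. That isolates the one mechanism the abstract singles out, at the cost of ignoring order-size variation and the other customer strategies the paper models.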