Martin Riekert | University of Hohenheim, Stuttgart
Papers by Martin Riekert
International Journal of Quality & Reliability Management
Purpose: Machine learning (ML) models are increasingly being used in industrial maintenance to predict system failures. However, less is known about how the time windows for reading data and making predictions affect performance. Therefore, the purpose of this research is to assess the impact of different sliding windows on prediction performance. Design/methodology/approach: The authors conducted a factorial experiment using high-dimensional machine data covering two years of operation, taken from a real industrial case for the production of high-precision milled and turned parts. The impacts of different reading and prediction windows were tested for three ML algorithms (random forest, support vector machines and logistic regression) and four metrics (accuracy, precision, recall and F-score). Findings: The results reveal (1) the critical role of the prediction window contingent upon the application domain, (2) a non-monotonic relationship between the reading window and performance, and (3...
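The reading and prediction windows mentioned in this abstract can be illustrated with a minimal sketch. The function name `make_windows`, the toy series, and the labeling rule (a sample is positive if any failure falls inside the prediction window that directly follows the reading window) are assumptions made for illustration; the abstract does not specify how the authors built their samples.

```python
from typing import List, Tuple


def make_windows(values: List[float], failures: List[int],
                 reading: int, prediction: int) -> List[Tuple[List[float], int]]:
    """Slide over a series and pair each reading window with a failure label.

    Hypothetical helper: `reading` is the number of past time steps used as
    features, `prediction` is the number of following steps checked for a failure.
    """
    samples = []
    last_start = len(values) - reading - prediction
    for start in range(last_start + 1):
        split = start + reading
        features = values[start:split]
        # Assumed labeling rule: 1 if any failure occurs in the prediction window.
        label = int(any(failures[split:split + prediction]))
        samples.append((features, label))
    return samples


# Toy data (made up): one sensor reading per time step, one failure at step 5.
values = [0.1, 0.2, 0.4, 0.3, 0.9, 1.2, 0.2, 0.1]
failures = [0, 0, 0, 0, 0, 1, 0, 0]
samples = make_windows(values, failures, reading=3, prediction=2)
labels = [label for _, label in samples]
```

Varying `reading` and `prediction` in such a helper changes both the number of samples and the share of positive labels, which is one reason window sizes can affect the reported metrics.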
SN Computer Science
Text classification is important to better understand online media. A major problem for creating accurate text classifiers using machine learning is small training sets due to the cost of annotating them. On this basis, we investigated how SVM and NBSVM text classifiers should be designed to achieve high accuracy and how the training sets should be sized to efficiently use annotation labor. We used a four-way repeated-measures full-factorial design of 32 design factor combinations. For each design factor combination 22 training set sizes were examined. These training sets were subsets of seven public text datasets. We study the statistical variance of accuracy estimates by randomly drawing new training sets, resulting in accuracy estimates for 98,560 different experimental runs. Our major contribution is a set of empirically evaluated guidelines for creating online media text classifiers using small training sets. We recommend uni- and bi-gram features as text representation, btc te...
Computers and Electronics in Agriculture, 2021
Continuous monitoring of pig posture is important for better understanding animal behavior. Previous studies focused on day recordings and did not investigate how deep learning models could be applied during longer periods including night recordings under near-infrared light from several pens. Therefore, the objective of this research was to study how a suitable deep learning model for continuous 24/7 pig posture detection could be achieved. We selected a deep learning model from over 150 different model configurations covering experiments concerning 3 detection heads, 4 base networks, 5 transfer datasets and 12 data augmentations. For this purpose, we tested and validated our models using 4690 annotations of randomly drawn images from 24/7 video recordings covering 2 fattening periods from 10 pens. Our results indicate that pig position and posture were detected on the test set with 84% mAP@0.50 (49% mAP@[0.50:0.05:0.95]) for day recordings and 58% mAP@0.50 (29% mAP@[0.50:0.05:0.95]) for night recordings. The main reason for lower mAP during night recordings was degraded near-infrared image quality. Our work reports important findings concerning the applicability of deep learning models on night near-infrared recordings for posture detection. The dataset is publicly available for further research and industrial applications.
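The notation mAP@[0.50:0.05:0.95] refers to averaging over intersection-over-union (IoU) thresholds from 0.50 to 0.95 in steps of 0.05, whereas mAP@0.50 uses the single threshold 0.50. The sketch below only illustrates the IoU measure and the threshold grid; the box coordinates are made up, and it is not the evaluation code used in the paper.

```python
def iou(a, b):
    """Intersection over union of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    # Width and height of the overlap are clamped at zero for disjoint boxes.
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# Threshold grid 0.50, 0.55, ..., 0.95 (ten values).
thresholds = [t / 100 for t in range(50, 100, 5)]

# Hypothetical predicted and annotated boxes.
pred = (0, 0, 10, 10)
truth = (0, 0, 10, 8)
overlap = iou(pred, truth)

# Thresholds at which this prediction would count as a match.
hits = [t for t in thresholds if overlap >= t]
```

A detection that overlaps the annotation only moderately counts as correct at 0.50 but not at the stricter thresholds, which is why the averaged figure is lower than mAP@0.50 in the results above.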
Big data analytics makes it possible to extract information from data automatically, objectively, and cost-effectively. Data on the housing environment (e.g., feeding or temperature data), as well as data from behavioral observations via video cameras or RFID, can thus be analyzed and used to improve animal welfare. Machine learning methods are of particular importance: they learn from existing data and thereby simplify data analysis, enable forecasts of animal welfare risks, and identify factors that influence animal welfare. In the project "Landwirtschaft 4.0: Info-System", new techniques, methods, and procedures for intelligent analysis are being developed in order to enable broad societal acceptance of competitive livestock production.
Cryptocurrencies are novel means for transacting value, promising lower transaction costs and a complete transaction history that cannot be manipulated. Systematic risks to such transaction systems are posed by regulatory actions that put strong restrictions on usage, up to complete bans of cryptocurrencies. Prior research has studied the effect of regulatory news on cryptocurrency pricing and found price effects of news about regulatory actions by authorities. We propose a novel dataset of news from online media that loosely relates to cryptocurrency regulation but also includes opinions and rumors. The proposed dataset makes it possible to study drivers of crashes and risks in cryptocurrency markets.
Online media are an important source for investor sentiment on commodities. Although there is empirical evidence for a relationship between investor sentiment from news and commodity returns, the impact of classifier design on the explanatory power of sentiment for returns has received little attention. We evaluate the explanatory power of nine classifier designs and find that (1) a positive relationship holds between more opinionated online media sentiment and commodity returns, (2) weighting dictionary terms by machine learning increases explanatory power by up to 25%, and (3) the commonly used dictionary of Loughran and McDonald is detrimental for commodity sentiment analysis.
Online media is an important source for sentiments expressed by individuals on goods, services, organizations, and other objects of interest. While firms can benefit from using these sentiments for decision-making, the classification of sentiments is difficult because of volume, velocity, and variety. Machine learning is an effective technique for sentiment classification, which requires formalized knowledge about neither the domain nor the language used. Although the literature provides a rich body of classification methods, system designers and researchers still face the problem of reasonably selecting designs. In this paper, we seek to contribute to the understanding of machine learning for sentiment classification. We report an experimental study that tests the effects of three design factors, i.e., text representation, feature weighting, and machine learning algorithm, on accuracy. The findings can be useful for empirically informed classifier design.
Predicting the duration of surgeries is an important task because of the many dependencies between surgery processes and the hospital processes within other departments. Thus, accurate predictions allow for better coordinating patient processes throughout the hospital. Prior data-driven research provides evidence for accurate predictions of surgery durations enhancing the efficiency of surgery schedules. However, the current prediction models require large sets of features, which make their adoption more intricate. Moreover, prediction models focus on the surgery department and neglect potential effects on other departments. We use a unique dataset of about 17,000 surgeries to study how particular features and machine learning algorithms affect the prediction accuracy of major surgery steps. The prediction models that we study require few features and are easy to apply. The empirical findings can be useful for the design of surgery scheduling systems.
Landtechnik, 2021
Animal welfare is a quality indicator of modern pig farming and increasingly important to society. Animal welfare risks have multiple factors and should be recognized and mitigated early on to prevent economic risks. In this work, we use machine learning models to predict animal welfare risks. Our dataset comprises data for over 57,000 pigs with indications of 10 animal welfare risks and 14 suckling phase features. We contribute a prediction model for suckling phase deaths with an accuracy of 80.4%, a sizeable improvement over a majority vote's accuracy of only 53.1%. The proposed model may help pig farmers to prevent deaths in the suckling phase of pigs at an early stage by taking countermeasure
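The majority-vote comparison in this abstract can be made concrete with a small sketch. A majority-vote baseline predicts the most frequent class for every sample, so its accuracy equals the share of that class in the data. The function name `majority_baseline_accuracy` and the toy labels are hypothetical and are not taken from the paper's dataset.

```python
from collections import Counter


def majority_baseline_accuracy(labels):
    """Accuracy obtained by always predicting the most frequent label."""
    counts = Counter(labels)
    _, majority_count = counts.most_common(1)[0]
    return majority_count / len(labels)


# Made-up labels: 1 = death in the suckling phase, 0 = no death.
labels = [1] * 6 + [0] * 4
baseline = majority_baseline_accuracy(labels)
```

With the figures reported in the abstract, the gap between the model (80.4%) and the majority vote (53.1%) is 27.3 percentage points.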
Journal of Manufacturing Systems, 2021
Failure prediction is the task of forecasting whether a material system of interest will fail at a specific point of time in the future. This task attains significance for strategies of industrial maintenance, such as predictive maintenance. For solving the prediction task, machine learning (ML) technology is increasingly being used, and the literature provides evidence for the effectiveness of ML-based prediction models. However, the state of recent research and the lessons learned are not well documented. Therefore, the objective of this review is to assess the adoption of ML technology for failure prediction in industrial maintenance and synthesize the reported results. We conducted a systematic search for experimental studies in peer-reviewed outlets published from 2012 to 2020. We screened a total of 1,024 articles, of which 34 met the inclusion criteria. We focused on understanding the datasets analyzed, the preprocessing to generate features, and the training and evaluation of prediction models. The results reveal (1) a broad range of systems and domains addressed, (2) the adoption of up-to-date approaches to preprocessing and training, (3) some lack of performance evaluation mitigating the overfitting problem, and (4) considerable heterogeneity in the reporting of experimental designs and results. We identify opportunities for future research and suggest ways to facilitate the comparison and integration of evidence obtained from single studies.
WI2020 Community Tracks, 2020
Smart farming platforms (SFPs) for pig livestock farming are increasingly relevant for sustainable decision making and the enhancement of animal welfare. SFPs involve the whole supply chain and integrate various types of measured data, thus enabling data-driven solutions using artificial intelligence. While there is research on SFPs, work on the data governance of SFPs is still lacking. Against this backdrop, we develop an SFP for sustainable decision making with respect to data privacy and data security. Our SFP integrates 4 sensor data sources (e.g., temperature control system and feeding stations), considers farmer characteristics (e.g., projects with pigs), and provides data-driven solutions (e.g., prediction of animal welfare indicators). We report on the current process situation in pig livestock farming as well as on our concept of SFPs for sustainable decision making. We also report on the evaluation of our SFP by validation against defined requirements during the deployment phase.
Computers and Electronics in Agriculture, 2020
Prior livestock research provides evidence for the importance of accurate detection of pig positions and postures for better understanding animal welfare. Position and posture detection can be accomplished by machine vision systems. However, current machine vision systems require rigid setups of fixed vertical lighting, vertical top-view camera perspectives or complex camera systems, which hinder their adoption in practice. Moreover, existing detection systems focus on specific pen contexts and may be difficult to apply in other livestock facilities. Our main contribution is twofold: First, we design a deep learning system for position and posture detection that only requires standard 2D camera imaging with no adaptations to the application setting. This deep learning system applies the state-of-the-art Faster R-CNN object detection pipeline and the state-of-the-art Neural Architecture Search (NAS) base network for feature extraction. Second, we provide a labelled open access dataset with 7277 human-made annotations from 21 standard 2D cameras, covering 31 different one-hour long video recordings and 18 different pens to train and test the approach under realistic conditions. On unseen pens under similar experimental conditions with sufficiently similar training images of pig fattening, the deep learning system detects pig position with an Average Precision (AP) of 87.4%, and pig position and posture with a mean Average Precision (mAP) of 80.2%. Given different and more difficult experimental conditions of pig rearing with few or no similar images in the training set, an AP of over 67.7% was achieved for position detection. However, detecting the position and posture achieved a mAP of only between 44.8% and 58.8%.
Furthermore, we demonstrate exemplary applications that can aid pen design by visualizing where pigs are lying and how their lying behavior changes through the day. Finally, we contribute open data that can be used for further studies, replication, and pig position detection applications.
Proceedings of the 2019 Federated Conference on Computer Science and Information Systems, 2019
Corporate reputation is an economic asset and its accurate measurement is of increasing interest in practice and science. This measurement task is difficult because reputation depends on numerous factors and stakeholders. Traditional measurement approaches have focused on human ratings and surveys, which are costly, can be conducted only infrequently and emphasize financial aspects of a corporation. Nowadays, online media with comments related to products, services, and corporations provides an abundant source for measuring reputation more comprehensively. Against this backdrop, we propose an information retrieval approach to automatically collect reputation-related text content from online media and analyze this content by machine learning-based sentiment analysis. We contribute an ontology for identifying corporations and a unique dataset of online media texts labelled by corporations' reputation. Our approach achieves an overall accuracy of 84.4%. Our results help corporations to quickly identify their reputation from online media at low cost.