Maya Sappelli | Radboud University Nijmegen
Papers by Maya Sappelli
Proceedings of the 13th International Conference on Semantic Systems
In this short paper, we address the interpretability of hidden layer representations in deep text mining: deep neural networks applied to text mining tasks. Following earlier work predating deep learning methods, we exploit the internal neural network activation (latent) space as a source for performing k-nearest neighbor search, looking for representative, explanatory training data examples with similar neural layer activations as test inputs. We deploy an additional semantic document similarity metric for establishing document similarity between the textual representations of these nearest neighbors and the test inputs. We argue that the statistical analysis of the output of this measure provides insight to engineers training the networks, and that nearest neighbor search in latent space combined with semantic document similarity measures offers a mechanism for presenting explanatory, intelligible examples to users.
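The nearest-neighbor lookup described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the toy activation vectors, and the choice of cosine similarity are all assumptions made for illustration, since the abstract does not state which distance measure was used.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length activation vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat an all-zero activation as dissimilar to everything
    return dot / (norm_a * norm_b)

def nearest_training_examples(train_activations, test_activation, k=3):
    """Return indices of the k training examples whose hidden-layer
    activations are most similar to the test input's activations.

    train_activations: list of vectors, one per training example (hypothetical)
    test_activation: activation vector of the test input (hypothetical)
    """
    scored = [
        (cosine_similarity(vec, test_activation), idx)
        for idx, vec in enumerate(train_activations)
    ]
    # Highest similarity first; ties are broken by the lower index.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [idx for _, idx in scored[:k]]

# Toy activations standing in for a real network's hidden layer.
train = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1], [-1.0, 0.0]]
neighbors = nearest_training_examples(train, [1.0, 0.05], k=2)
```

The returned indices would then be mapped back to the training documents, whose text could be compared with the test input using a separate document similarity measure.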
The Algorithmic Personalization and News (APEN18) Workshop at ICWSM '18, 2018
FD Mediagroep (FDMG) is the leading information provider in the financial economic domain in the Netherlands. FDMG operates “Het Financieele Dagblad” (FD), a daily financial newspaper similar to the Financial Times. In addition, FDMG operates the daily all-news radio station “Business News Radio” (BNR). As we have a wide variety of users with various backgrounds and interests, we believe that digital media (both news articles and radio) should be personalized to match the interests of a particular customer. We are therefore working on personalization of FDMG’s digital media:
• Personalized news: Recommendations and personalized summaries of news articles that match the reading preferences and interests of our readers
• Personalized radio: A non-linear radio experience with radio snippets that match the listener’s interests
In both personalized news and personalized radio we are looking not only at introducing recommender systems but also at personalized ways to present the information using automated summarization (news) and audio segmentation (radio) methods.
2nd International Workshop on Extraction and Processing of Rich Semantics from Medical Texts, 2017
We present a multilingual, open source system for cancer forum thread analysis, equipped with a biomedical entity tagger and a module for textual summarization. This system allows users to investigate textual co-occurrences of biomedical entities in forum posts, and to browse through summaries of long discussions. It is applied to a number of online cancer patient fora, including a gastro-intestinal cancer forum and a breast cancer forum. We propose that the system can serve as an extra source of information for medical hypothesis formulation, and as a facility for boosting patient empowerment.
The 17th Dutch-Belgian Information Retrieval Workshop, 2018
In this demonstration paper we describe the SMART Radio app for BNR Nieuwsradio. The SMART Radio app is an extension to the current BNR app, which offers users a more personalized news radio experience. It does so by automatically fragmenting shows to offer our users more targeted and focused fragments of audio, not full shows. We employ audio segmentation and audio topic-tagging techniques to achieve this, which we describe in this paper. In its present form, users can subscribe to tags to get appropriate suggestions of relevant radio fragments. In the future we would like to improve the app’s personalization by using information about the user’s interaction with the app.
One of the challenges in the field of content-based image retrieval is to bridge the semantic gap that exists between the information extracted from visual data using classifiers, and the interpretation of this data made by the end users. The semantic gap is a cascade of 1) the transformation of image pixels into labelled objects and 2) the semantic distance between the label used to name the classifier and what it refers to for the end user. In this paper, we focus on the second part and specifically on (semantically) scalable solutions that are independent from domain-specific vocabularies. To this end, we propose a generic semantic reasoning approach that applies semiotics in its query interpretation. Semiotics is about how humans interpret signs, and we use its text analysis structures to guide the query expansion that we apply. We evaluated our approach using a general-purpose image search engine. In our experiments, we compared several semiotic structures to determine to ...
In this paper, we present our vision of prescriptive analytics. Prescriptive analytics is a field of study that determines which actions are required in order to achieve a particular goal. This is different from predictive analytics, where we only determine what will happen if we continue the current trend. Consequently, the amount of data that needs to be taken into account is much larger, making it a relevant big data problem. We zoom in on the requirements of prescriptive analytics problems: impact, complexity, objective, constraints and data. We explain some of the challenges, such as the availability of the data, the downside of simulations, the creation of bias in the data and trust of the user. We highlight a number of application areas in which prescriptive analytics could or would not work given our requirements. Based on these application areas, we conclude that domains with a large amount of data and in which the phenomena are restricted by laws of physics or math are...
project, which develops a social robot for children with diabetes. Type 1 diabetes mellitus (T1DM) is one of the most common diseases among children and youngsters in the United States and Europe (Freeborn et al., 2013). Within the PAL project, it is studied whether a robotic companion could help children with the self-management of all the daily tasks, and whether it could increase children’s knowledge of diabetes using ontologies (Neerincx et al., 2016).
With the growth of open sensor networks, multiple applications in different domains make use of a large amount of sensor data, resulting in an emerging need to search semantically over heterogeneous datasets. In semantic search, an important challenge consists of bridging the semantic gap between the high-level natural language query posed by the users and the low-level sensor data. In this paper, we show that state-of-the-art techniques in Semantic Modelling, Computer Vision and Human Media Interaction can be combined to apply semantic reasoning in the field of image retrieval. We propose a system, GOOSE, which is a general-purpose search engine that allows users to pose natural language queries to retrieve corresponding images. User queries are interpreted using the Stanford Parser, semantic rules and the Linked Open Data source ConceptNet. Interpreted queries are presented to the user as an intuitive and insightful graph in order to collect feedback that is used for furt...
Proceedings of the First Workshop on Evaluation and Comparison of NLP Systems, 2020
This paper presents a typology of errors produced by automatic summarization systems. The typology was created by manually analyzing the output of four recent neural summarization systems. Our work is motivated by the growing awareness of the need for better summary evaluation methods that go beyond conventional overlap-based metrics. Our typology is structured into two dimensions. First, the Mapping Dimension describes surface-level errors and provides insight into word-sequence transformation issues. Second, the Meaning Dimension describes issues related to interpretation and provides insight into breakdowns in truth, i.e., factual faithfulness to the original text. Comparative analysis revealed that two neural summarization systems leveraging pretrained models have an advantage in decreasing grammaticality errors, but not necessarily factual errors. We also discuss the importance of ensuring that summary length and abstractiveness do not interfere with evaluating summary quality.
User Modeling and User-Adapted Interaction, 2019
Recent advances in wearable sensor technology and smartphones enable simple and affordable collection of personal analytics. This paper reflects on the lessons learned in the SWELL project that addressed the design of user-centered ICT applications for self-management of vitality in the domain of knowledge workers. These workers often have a sedentary lifestyle and are susceptible to mental health effects due to a high workload. We present the sense-reason-act framework that is the basis of the SWELL approach and we provide an overview of the individual studies carried out in SWELL. In this paper, we revisit our work on reasoning: interpreting raw heterogeneous sensor data, and acting: providing personalized feedback to support behavioural change. We conclude that simple affordable sensors can be used to classify user behaviour and health status in a physically non-intrusive way. The interpreted data can be used to inform personalized feedback strategies. Further longitudinal studies can now be initiated to assess the effectiveness of m-Health interventions using the SWELL methods.
ACM SIGIR Forum, 2017
Stress at work is increasing, which can lead to health issues such as burn-out for employees. In the SWELL project (http://www.swell-project.net) we investigate intelligent ICT solutions that can support knowledge workers to achieve a healthy way of living at work and at home. One of the causes of stress at work is the problem of 'information overload'. The availability of smartphones and tablets with continuous internet access causes individuals to become overwhelmed with information. It becomes more difficult to separate work and home, but also to find the information you need to execute your work properly. A possible solution to this problem is to make applications more 'context-aware'. This means that the application understands what a person is doing, such that it can provide the optimal support at the optimal time. This is the underlying motivation for this thesis. This thesis consists of three parts. In the first part we discuss the question "...
ACM Transactions on Interactive Intelligent Systems, 2016
In this article, we propose and implement a new model for context recognition and identification. Our work is motivated by the importance of “working in context” for knowledge workers to stay focused and productive. A computer application that can identify the current context in which the knowledge worker is working can (among other things) provide the worker with contextual support, for example, by suggesting relevant information sources, or give an overview of how he or she spent his or her time during the day. We present a descriptive model for the context of a knowledge worker. This model describes the contextual elements in the work environment of the knowledge worker and how these elements relate to each other. This model is operationalized in an algorithm, the contextual interactive activation model (CIA), which is based on the interactive activation model by Rumelhart and McClelland. It consists of a layered connected network through which activation flows. We have tested C...
Information Retrieval Journal, 2016
We evaluate five term scoring methods for automatic term extraction on four different types of text collections: personal document collections, news articles, scientific articles and medical discharge summaries. Each collection has its own use case: author profiling, boolean query term suggestion, personalized query suggestion and patient query expansion. The methods for term scoring that have been proposed in the literature were designed with a specific goal in mind. However, it is as yet unclear how these methods perform on collections with characteristics different than what they were designed for, and which method is the most suitable for a given (new) collection. In a series of experiments, we evaluate, compare and analyse the output of six term scoring methods for the collections at hand. We found that the most important factors in the success of a term scoring method are the size of the collection and the importance of multi-word terms in the domain. Larger collections lead to better terms; all methods are hindered by small collection sizes (below 1000 words). The most flexible method for the extraction of single-word and multi-word terms is pointwise Kullback-Leibler divergence for informativeness and phraseness. Overall, we have shown that extracting relevant terms using unsupervised term scoring methods is possible in diverse use cases, and that the methods are applicable in more contexts than their original design purpose.
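As a rough illustration of the informativeness side of pointwise Kullback-Leibler divergence, a term can be scored by its contribution to the KL divergence between a foreground (target) collection and a background collection: P_fg(t) · log(P_fg(t) / P_bg(t)). The sketch below is an assumption-laden example, not the method evaluated in the article: it handles single-word terms only, omits phraseness, and the function name, add-one smoothing, and toy token lists are hypothetical.

```python
import math
from collections import Counter

def pointwise_kl_scores(foreground_tokens, background_tokens, smoothing=1.0):
    """Score each foreground term by P_fg(t) * log(P_fg(t) / P_bg(t)).

    Additive smoothing (an assumption of this sketch) keeps the background
    probability non-zero for terms that only occur in the foreground.
    """
    fg = Counter(foreground_tokens)
    bg = Counter(background_tokens)
    vocab = set(fg) | set(bg)
    fg_total = sum(fg.values()) + smoothing * len(vocab)
    bg_total = sum(bg.values()) + smoothing * len(vocab)
    scores = {}
    for term in fg:
        p_fg = (fg[term] + smoothing) / fg_total
        p_bg = (bg[term] + smoothing) / bg_total
        scores[term] = p_fg * math.log(p_fg / p_bg)
    return scores

# Toy collections: "insulin" is frequent in the foreground only.
foreground = "insulin insulin insulin the the".split()
background = "the the the the cat sat".split()
scores = pointwise_kl_scores(foreground, background)
top_term = max(scores, key=scores.get)
```

Terms that are relatively more frequent in the foreground than in the background receive positive scores, while common words such as "the" receive scores at or below zero.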
Journal of the Association for Information Science and Technology, 2016
In this article we evaluate context-aware recommendation systems for information re-finding by knowledge workers. We identify 4 criteria that are relevant for evaluating the quality of knowledge worker support: context relevance, document relevance, prediction of user action, and diversity of the suggestions. We compare 3 different context-aware recommendation methods for information re-finding in a writing support task. The first method uses contextual prefiltering and content-based recommendation (CBR), the second uses the just-in-time information retrieval paradigm (JITIR), and the third is a novel network-based recommendation system where context is part of the recommendation model (CIA). We found that each method has its own strengths: CBR is strong at context relevance, JITIR captures document relevance well, and CIA achieves the best result at predicting user action. Weaknesses include that CBR depends on a manual source to determine the context and in JITIR the context query can fail when the textual content is not sufficient. We conclude that to truly support a knowledge worker, all 4 evaluation criteria are important. In light of that conclusion, we argue that the network-based approach the CIA offers has the highest robustness and flexibility for context-aware information recommendation.
2015 13th International Workshop on Content-Based Multimedia Indexing (CBMI), 2015
The number of networked cameras is growing exponentially. Multiple applications in different domains result in an increasing need to search semantically over video sensor data. In this paper, we present the GOOSE demonstrator, which is a real-time general-purpose search engine that allows users to pose natural language queries to retrieve corresponding images. Top-down, this demonstrator interprets queries, which are presented as an intuitive graph to collect user feedback. Bottom-up, the system automatically recognizes and localizes concepts in images and it can incrementally learn novel concepts. A smart ranking combines both and allows effective retrieval of relevant images.
Proceedings of the 36th international ACM SIGIR conference on Research and development in information retrieval, 2013
Lecture Notes in Computer Science, 2013
bacterial or fungal infections. The average medical expenses for people with diabetes are about 2.3 times higher than they would be in the absence of diabetes. Diabetes is the seventh leading cause of death in the United States, either as the underlying cause of death or as a contributing cause of death. The World Health Organization reports the following statistics: The 2014 global prevalence of diabetes was about 9 percent for adults. About 90 percent of people with diabetes have type 2 diabetes. Diabetes caused about 1.5 million deaths worldwide in 2012.