Goran Nenadic - Academia.edu
Papers by Goran Nenadic
arXiv (Cornell University), Aug 7, 2023
arXiv (Cornell University), Jan 8, 2023
Interactive Journal of Medical Research
Background: Epidemiological criminology refers to health issues affecting incarcerated and nonincarcerated offender populations, a group recognized as being challenging to conduct research with. Notwithstanding this, an urgent need exists for new knowledge and interventions to improve health, justice, and social outcomes for this marginalized population. Objective: To better understand research outputs in the field of epidemiological criminology, we examined the lead author’s affiliation by analyzing peer-reviewed published outputs to determine countries and organizations (eg, universities, governmental and nongovernmental organizations) responsible for peer-reviewed publications. Methods: We used a semiautomated approach to examine the first-author affiliations of 23,904 PubMed epidemiological studies related to incarcerated and offender populations published in English between 1946 and 2021. We also mapped research outputs to the World Justice Project Rule of Law Index to better unde...
arXiv (Cornell University), Oct 8, 2022
2020 International Conference on Computing and Information Technology (ICCIT-1441), 2020
Complete reporting of Experimental Metadata (EM) is necessary for reproducing and understanding biomedical experiments and results. Experimental Metadata Reporting Checklist Questions (EMR-CLQs) have been designed and used by journals as guidelines to capture EM and evaluate the quality of the reporting. Automatically answering EMR-CLQs is necessary to check the completeness and clarity of EM, which can be useful for the peer-review process. Moreover, automatically extracted EMR-CLQ answers can be used to search the relevant literature efficiently for the metadata analysis process. This paper shows that different types of EMR-CLQs can be answered automatically by exploiting the structure of both the EMR-CLQs and the biomedical article. A rule-based text mining model, built on information extraction techniques and the structure of biomedical articles and EMR-CLQs, is proposed as a first model in the biomedical reproducibility domain to answer EMR-CLQs automatically. The model was used to answer five EMR-CLQs of two different types automatically: Main and Attribute questions. We evaluated the feasibility of the model against gold-standard data of 58 full-text articles annotated by domain experts. The results show that EMR-CLQs can be answered automatically with a mean F-measure of 75% and 73% for the development and testing datasets, respectively.
International Journal of Population Data Science, 2020
Introduction: A significant amount of valuable information in Electronic Health Records (EHR) such as laboratory test results or echocardiogram interpretations is embedded in lengthy free-text fields. Often patients’ personal information is also included in these narratives. Privacy legislation in different jurisdictions requires de-identification of this information prior to making it available for research. This process can be challenging and time-consuming. In particular, rule-based algorithms may lead to over-masking of essential medical terms, conditions, or devices that are named after individuals. Objectives and Approach: We aimed to enhance ICES’ existing rule-based application to make it contextually driven by applying Artificial Intelligence (AI). The ICES team collaborated with computer scientists at the University of Manchester who had already published work in this area and Evenset, a Toronto-based software company. Based on the Manchester University de-identification frame...
ArXiv, 2020
Machine Reading Comprehension (MRC) is the task of answering a question over a paragraph of text. While neural MRC systems gain popularity and achieve noticeable performance, issues are being raised with the methodology used to establish their performance, particularly concerning the data design of gold standards that are used to evaluate them. There is but a limited understanding of the challenges present in this data, which makes it hard to draw comparisons and formulate reliable hypotheses. As a first step towards alleviating the problem, this paper proposes a unifying framework to systematically investigate the present linguistic features, required reasoning and background knowledge and factual correctness on one hand, and the presence of lexical cues as a lower bound for the requirement of understanding on the other hand. We propose a qualitative annotation schema for the first and a set of approximative metrics for the latter. In a first application of the framework, we analys...
Expert Systems with Applications, 2018
Mobile application (app) websites such as Google Play and AppStore allow users to review their downloaded apps. Such reviews can be useful for app users, as they may help users make an informed decision; such reviews can also be potentially useful for app developers, if they contain valuable information concerning user needs and requirements. However, in order to unleash the value of app reviews for mobile app development, intelligent mining tools that can help discern relevant reviews from irrelevant ones must be provided. This paper surveys the state of the art in the development of such tools and the techniques behind them. To gain insight into the maturity of the current mining support tools, the paper also examines what app development information these tools have discovered and what challenges they are facing. The results of this survey can inform the development of more effective and intelligent app review mining techniques and tools.
Journal of Medical Internet Research, Jan 13, 2018
Vast numbers of domestic violence (DV) incidents are attended by the New South Wales Police Force each year in New South Wales and recorded as both structured quantitative data and unstructured free text in the WebCOPS (Web-based interface for the Computerised Operational Policing System) database regarding the details of the incident, the victim, and person of interest (POI). Although the structured data are used for reporting purposes, the free text remains untapped for DV reporting and surveillance purposes. In this paper, we explore whether text mining can automatically identify mental health disorders from this unstructured text. We used a training set of 200 DV recorded events to design a knowledge-driven approach based on lexical patterns in text suggesting mental health disorders for POIs and victims. The precision returned from an evaluation set of 100 DV events was 97.5% and 87.1% for mental health disorders related to POIs and victims, respectively. After applying our app...
Lecture Notes in Computer Science, 2016
A patient’s occupation is an important variable used for disease surveillance and modeling, but such information is often only available in free-text clinical narratives. We have developed a large occupation dictionary that is used as part of both knowledge-driven (dictionary and rules) and data-driven (machine-learning) methods for the identification of occupation mentions. We have evaluated the approaches on both public and non-public clinical datasets. A machine-learning method using linear-chain conditional random fields trained on a minimalistic set of features achieved up to 88% \(F_1\)-measure (token-level), with the occupation feature derived from the knowledge-driven method showing a notable positive impact across the datasets (up to an additional 32% \(F_1\)-measure).
Health Services and Delivery Research, 2020
Background: Collecting NHS patient experience data is critical to ensure the delivery of high-quality services. Data are obtained from multiple sources, including service-specific surveys and widely used generic surveys. There are concerns about the timeliness of feedback, that some groups of patients and carers do not give feedback and that free-text feedback may be useful but is difficult to analyse. Objective: To understand how to improve the collection and usefulness of patient experience data in services for people with long-term conditions using digital data capture and improved analysis of comments. Design: The DEPEND study is a mixed-methods study with four parts: qualitative research to explore the perspectives of patients, carers and staff; use of computer science text-analytics methods to analyse comments; co-design of new tools to improve data collection and usefulness; and implementation and process evaluation to assess use of the tools and any impacts. Setting: Services fo...
Proceedings of the 32nd ACM International Conference on Information and Knowledge Management
arXiv (Cornell University), Jul 31, 2023