Noryanti Muhammad | Universiti Malaysia Pahang (UMP)
Papers by Noryanti Muhammad
Applied mathematics and computational intelligence, Jun 4, 2024
Journal of Advanced Research in Applied Sciences and Engineering Technology, May 17, 2024
Competitive analysis of digital technology is trending in the business field. However, the field of digital technology in the business world is vast and challenging to analyse. The purpose of this research is, first, to identify the success factors that represent company performance and, second, to identify the significant services the company provides to its business users. Based on these two objectives, a predictive model is developed to produce the best solution for those business users. The research implements a case study from a telecommunication company and follows the data science life cycle methodology. The statistical model used to develop the competitor analysis model is the generalised linear model (GLM), integrated with a machine learning approach. Because some of the case-study data are confidential, a synthetic data set was created using the Gamma, Gaussian and Poisson distributions. The synthetic data set, which is based on existing real sentiment analysis data from the telecommunication company, was used to investigate the performance of the proposed model. The machine learning technique is used to assess the accuracy of the significant GLM that was developed, measured by Root Mean Square Error (RMSE), Mean Absolute Error (MAE) and R-squared. This research identified a business solution for the significant service for business users and the best statistical model to use for that solution. The results show that the Gumbel distribution is the best-fit model for the synthetic data set, with an RMSE of 1.0574, an MAE of 0.9168 and an R-squared of 0.3994, and that the significant success factor identified by the GLM is the advertising success factor. The model could be improved with other types of data and different data sizes. Hence, further studies and real-world data are required for better validation.
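As a rough illustration of the three error measures named in this abstract, the sketch below computes RMSE, MAE and R-squared from paired observed and predicted values. The function name `regression_errors` and the toy numbers are assumptions made for illustration; they are not taken from the paper, and the sketch does not reproduce its GLM.

```python
import math


def regression_errors(y_true, y_pred):
    """Return (RMSE, MAE, R-squared) for paired observed/predicted values."""
    n = len(y_true)
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    ss_res = sum(r * r for r in residuals)           # residual sum of squares
    rmse = math.sqrt(ss_res / n)                     # root mean square error
    mae = sum(abs(r) for r in residuals) / n         # mean absolute error
    mean_y = sum(y_true) / n
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)  # total sum of squares
    r2 = 1 - ss_res / ss_tot                         # coefficient of determination
    return rmse, mae, r2


# Hypothetical toy data, not the telecommunication data set.
rmse, mae, r2 = regression_errors([1, 2, 3, 4], [1, 2, 3, 6])
print(rmse, mae, r2)
```

Lower RMSE and MAE and a higher R-squared indicate a closer fit, which is how the reported values of 1.0574, 0.9168 and 0.3994 would be read.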
AIP conference proceedings, 2024
AIP conference proceedings, 2024
AIP conference proceedings, 2024
PENERBIT UNIVERSITI MALAYSIA PAHANG eBooks, Apr 1, 2023
Nucleation and Atmospheric Aerosols, 2023
Cardiovascular diseases (CVDs) are the leading cause of death, accounting for 31% of global mortality. In this paper, we focus on the development of predictive modelling of CVDs using Logistic Regression Analysis (LRA). The model uses self-reported information from individuals on selected variables, also called non-laboratory-based features. We use binary logistic regression analysis because the outcome feature, CVD status, is binary. The methodology for developing risk prediction models is discussed, and a risk prediction model of CVDs using LRA is then developed. The performance of the LRA model was evaluated through 10-fold cross-validation and is presented in terms of accuracy, sensitivity, specificity, kappa statistic, root mean square error (RMSE) and area under the curve (AUC). A forward stepwise LRA model highlighted the importance of individual features. The role of dietary habits differs from that in existing risk prediction models in terms of direction or statistical significance, mainly for dietary features that vary significantly across regions. This finding therefore supports the assumption that each country or region should have its own risk prediction model. The paper concludes that every country must have its own local risk prediction models, but recommends following the established standard methodology and risk assessment framework.
International Journal of Humanities Technology and Civilization
Learning Mathematics is the prerequisite for mastery in learning. There are various reasons for the weakness of primary school students in the mastery of Mathematics. One characteristic of students' ability to master learning is their ability to control the heart's coherence, measured through Heart Rate Variability (HRV). Coherent control of HRV allows individuals to control their mind and emotions and thus stimulates their ability to learn more effectively. This study was carried out to examine the extent to which students with mathematics mastery abnormalities are able to control the coherence of the heart, and whether coherence scores differ between students who are good and poor at Maths mastery. The study involved a sample of 50 students in two groups with different achievement levels, good and poor. Students were given some Math exercises to examine whether there is a difference in heart coherence scores when they answer Math questions. Findings show t...
The 5TH ISM INTERNATIONAL STATISTICAL CONFERENCE 2021 (ISM-V): Statistics in the Spotlight: Navigating the New Norm
Cardiovascular diseases (CVDs) are the leading cause of death, accounting for 31% of global mortality. In this paper, we focus on the development of predictive modelling of CVDs using Logistic Regression Analysis (LRA). The model uses self-reported information from individuals on selected variables, also called non-laboratory-based features. We use binary logistic regression analysis because the outcome feature, CVD status, is binary. The methodology for developing risk prediction models is discussed, and a risk prediction model of CVDs using LRA is then developed. The performance of the LRA model was evaluated through 10-fold cross-validation and is presented in terms of accuracy, sensitivity, specificity, kappa statistic, root mean square error (RMSE) and area under the curve (AUC). A forward stepwise LRA model highlighted the importance of individual features. The role of dietary habits differs from that in existing risk prediction models in terms of direction or statistical significance, mainly for dietary features that vary significantly across regions. This finding therefore supports the assumption that each country or region should have its own risk prediction model. The paper concludes that every country must have its own local risk prediction models, but recommends following the established standard methodology and risk assessment framework.
Journal of Food Process Engineering
MATEC Web of Conferences, 2018
Many real-world problems of statistical inference, including survival analysis, involve dependent bivariate data. This paper presents new nonparametric methods for predictive inference for survival analysis involving a future bivariate observation. The method combines Nonparametric Predictive Inference (NPI) for the marginals with a parametric copula to take the dependence structure into account. The proposed method uses a discretized version of the parametric copula. NPI fits the marginals and keeps the computations straightforward. Generally, NPI is a frequentist approach that infers a future observation based on past data. The imprecision in the marginals that results from the proposed method provides robustness with regard to the assumed parametric copula for prediction, which is practical for small data sets. The suggestion is to use a basic parametric copula for small data sets. We investigate and discuss the performance of these methods by presenting results from simulation studies. The method is further illustrated via an application in survival analysis using data sets from the literature.
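To make the idea of inferring a future observation from past data concrete, here is a minimal univariate sketch of NPI lower and upper probabilities. It assumes Hill's A(n) assumption, under which the next observation falls in each of the n + 1 intervals formed by n ordered real-valued observations with probability 1/(n + 1), and it assumes no ties and a threshold `t` that differs from every observation. The function name `npi_exceedance_bounds` is hypothetical, and the sketch covers only a single marginal with uncensored data, not the bivariate copula method of the paper.

```python
def npi_exceedance_bounds(data, t):
    """Lower and upper probability that the next observation exceeds t.

    Assumes A(n): each of the n + 1 intervals between ordered observations
    carries probability 1 / (n + 1). Assumes t is not equal to any observation.
    """
    n = len(data)
    k = sum(1 for x in data if x > t)  # observations above the threshold
    lower = k / (n + 1)        # intervals lying entirely above t
    upper = (k + 1) / (n + 1)  # plus the one interval that contains t
    return lower, upper


# Hypothetical data: 4 observations, threshold 6.
print(npi_exceedance_bounds([2, 5, 7, 11], 6))
```

The gap between the two bounds is always 1/(n + 1) here, so the imprecision shrinks as more data are observed.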
Reliability Engineering & System Safety, 2014
The nonparametric predictive inference (NPI) approach for competing risks data has recently been presented, in particular addressing the question due to which of the competing risks the next unit will fail, and also considering the effects of unobserved, redefined, unknown or removed competing risks. In this paper, we introduce how the NPI approach can be used to deal with situations where units are not all at risk from all competing risks. This may typically occur if one combines information from multiple samples, which can e.g. be related to further aspects of units that define the samples or groups to which the units belong or to different applications where the circumstances under which the units operate can vary. We study the effect of combining the additional information from these multiple samples, so effectively borrowing information on specific competing risks from other units, on the inferences. Such combination of information can be relevant to competing risks scenarios in a variety of application areas, including engineering and medical studies.
Data Analytics and Applied Mathematics (DAAM), 2020
Previous research has usually applied the Bass diffusion model (BDM) to forecast new products in various areas. This is the first application of the BDM to a new tourism product since the model was developed by Frank M. Bass in 1969. The grey forecasting model, on the other hand, is able to deal with a limited amount of data. Both the BDM and the grey forecasting model have been used in forecasting studies in various areas. Taking advantage of both models, their combination, called the grey Bass forecasting model, is applied to new tourism product forecasting. The objective of this study is to forecast new tourism product demand in Malaysia using the developed model. Yearly visitor numbers from 2014 until 2018 for two ecotourism resorts in Pahang, Tanah Aina Fahad and Tanah Aina Farrah Soraya, are used. The results show that both the BDM and the grey Bass forecasting model are suitable for forecasting the new tourism product. The authors also suggest other factors ...
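For readers unfamiliar with the BDM, the sketch below evaluates the standard closed-form Bass cumulative adoption curve, N(t) = m(1 - e^{-(p+q)t}) / (1 + (q/p)e^{-(p+q)t}), where p is the coefficient of innovation, q the coefficient of imitation and m the market potential. The parameter values are illustrative assumptions rather than estimates from the study, the function name `bass_cumulative` is hypothetical, and the grey Bass combination used in the paper is not reproduced here.

```python
import math


def bass_cumulative(t, p, q, m):
    """Cumulative adopters at time t under the standard Bass diffusion model."""
    decay = math.exp(-(p + q) * t)
    fraction = (1 - decay) / (1 + (q / p) * decay)  # adopted share F(t)
    return m * fraction


# Illustrative parameters only (not fitted to the resort visitor data).
p, q, m = 0.03, 0.38, 1000.0
for year in range(0, 6):
    print(year, round(bass_cumulative(year, p, q, m), 1))
```

The curve starts at zero, rises in an S-shape when q exceeds p, and levels off at m.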
Artificial Intelligence in Medicine
International Journal of Environmental Research and Public Health
Criticism of the implementation of existing risk prediction models (RPMs) for cardiovascular diseases (CVDs) in new populations motivates researchers to develop regional models. The predominant usage of laboratory features in these RPMs is also causing reproducibility issues in low–middle-income countries (LMICs). Further, conventional logistic regression analysis (LRA) does not consider non-linear associations and interaction terms in developing these RPMs, which might oversimplify the phenomenon. This study aims to develop alternative machine learning (ML)-based RPMs that may perform better at predicting CVD status using nonlaboratory features in comparison to conventional RPMs. The data was based on a case–control study conducted at the Punjab Institute of Cardiology, Pakistan. Data from 460 subjects, aged between 30 and 76 years, with (1:1) gender-based matching, was collected. We tested various ML models to identify the best model/models considering LRA as a baseline RPM. An ar...
Computational Intelligence and Neuroscience
Machine learning (ML) often provides applicable high-performance models to facilitate decision-makers in various fields. However, this high performance is achieved at the expense of the interpretability of these models, which has been criticized by practitioners and has become a significant hindrance in their application. Therefore, in highly sensitive decisions, black boxes of ML models are not recommended. We proposed a novel methodology that uses complex supervised ML models and transforms them into simple, interpretable, transparent statistical models. This methodology is like stacking ensemble ML in which the best ML models are used as a base learner to compute relative feature weights. The index of these weights is further used as a single covariate in the simple logistic regression model to estimate the likelihood of an event. We tested this methodology on the primary dataset related to cardiovascular diseases (CVDs), the leading cause of mortalities in recent times. Therefor...
Nonparametric predictive inference (NPI) is a statistical approach with strong frequentist properties, with inferences explicitly in terms of one or more future observations. NPI is based on relatively few modelling assumptions, enabled by the use of lower and upper probabilities to quantify uncertainty. While NPI has been developed for a range of data types, and for a variety of applications, thus far it has not been developed for multivariate data. This thesis presents the first study in this direction. Restricting attention to bivariate data, a novel approach is presented which combines NPI for the marginals with copulas for representing the dependence between the two variables. It turns out that, by using a discretization of the copula, this combined method leads to relatively easy computations. The new method is introduced with use of an assumed parametric copula. The main idea is that NPI on the marginals provides a level of robustness which, for small to medium-sized data se...
Research Journal of Pharmacy and Technology, 2021
Physical inactivity (PI) is an established modifiable risk factor for cardiovascular diseases (CVDs), which are the leading cause of global mortality. Researchers and practitioners have been trying to reduce the surge of PI in the population, but a substantial share of the world population still struggles with PI. This study aims to determine the prevalence and associated background factors of PI among CVD patients. Further, profiles of potentially physically inactive people will also be identified for the future. A cross-sectional study was conducted at the Punjab Institute of Cardiology (PIC), Lahore, Pakistan, from September 2018 to February 2019. A sample of 230 CVD patients was selected using a 95% confidence interval (CI), 80% power of test and 5% margin of error. The data on PI were collected using the standardized international physical activity questionnaire. In addition to descriptive statistics, bivariate analysis, multiple logi...