Eva Boj | Universitat de Barcelona

Papers by Eva Boj

The use of distance-based regression and generalised linear models in the rate making process. An empirical study

We propose the use of distance-based regression models for the selection of tariff variables in the rate making process. These models permit the direct use of mixed risk factors. Distance-based measures and statistical tests are constructed for the design of a stepwise procedure. The method is illustrated using a portfolio from a Spanish automobile insurer. For comparison, the selection was also made using the well-known generalized linear regression models, as suggested for this purpose in the actuarial literature. The two methods should be regarded as alternative tools to assist the actuary, each with its own advantages and disadvantages.
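To illustrate what "direct use of mixed risk factors" can look like, here is a minimal sketch of a dissimilarity for records that mix numeric and categorical variables. It uses Gower's dissimilarity (one minus Gower's similarity), which is one common choice; the abstract does not say which distance the authors used, so this is an assumption. The policy records, the variable names and the `kinds` encoding are hypothetical.

```python
def gower_dissimilarity(a, b, kinds, ranges):
    """Gower dissimilarity between two records of mixed type.

    kinds[k] is "num" or "cat"; ranges[k] is the observed range of
    numeric variable k (unused for categorical variables).
    """
    total = 0.0
    for x, y, kind, rng in zip(a, b, kinds, ranges):
        if kind == "num":
            # Range-scaled absolute difference, in [0, 1].
            total += abs(x - y) / rng if rng > 0 else 0.0
        else:
            # Simple mismatch indicator for categorical variables.
            total += 0.0 if x == y else 1.0
    return total / len(kinds)


def dissimilarity_matrix(records, kinds):
    """Full n x n matrix of pairwise Gower dissimilarities."""
    columns = list(zip(*records))
    ranges = [max(c) - min(c) if k == "num" else None
              for c, k in zip(columns, kinds)]
    return [[gower_dissimilarity(r, s, kinds, ranges) for s in records]
            for r in records]


# Hypothetical policies: (driver age, vehicle type, zone).
policies = [(25, "car", "urban"), (45, "car", "rural"), (65, "van", "rural")]
kinds = ["num", "cat", "cat"]
D = dissimilarity_matrix(policies, kinds)
```

A matrix such as `D` is the only predictor information a distance-based model needs, which is why numeric and categorical risk factors can enter on the same footing.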

Factores explicativos de la deserción de estudiantes de pedagogía [Explanatory factors in student-teacher dropout rates]

Abstract. The aim of this article is to identify the factors that influence dropout among student teachers, taking their individual and academic characteristics into account. The study was conducted with 531 students from the 2009 cohort. The research is quantitative, with an explanatory, longitudinal and non-experimental design. The information was collected from secondary data, which were analysed using survival analysis and modelled with Cox proportional hazards regression. The results showed that the individual variables explaining student dropout are sex and coming from the Bio Bio region. The academic variables explaining university dropout are the secondary-school grade average, the position on the admission list, coming from a scientific-humanistic secondary school, the total number of courses enrolled, the most recent curricular average and the suspension of studies. It is concluded that the abilities associated with the level of academic achievement, and the management of social support for students, are significant factors in maintaining the commitment to remain in the academic programme. To the extent that abilities and support management are positive, students will have favourable interactions that support their participation at the institutional level, which in turn favours their intellectual and academic development.
Finally, it is concluded that, at the level of institutional policy, it is important to manage support for students' abilities and adaptation, since this contributes to a positive balance between academic and social integration by shaping a motivating context that sustains students' commitment to the goal of graduation. Keywords: student dropout; higher education; survival analysis; Cox regression model.

Implementing PLS for distance-based regression: computational issues

Computational Statistics, 2007

Distance-based regression allows for a neat implementation of the partial least squares recurrence. In this paper, we address practical issues arising when dealing with moderately large datasets (n ~ 10^4) such as those typical of automobile insurance premium calculations.

Enseñanza práctica de la matemática actuarial [Practical teaching of actuarial mathematics]

This work has three aims: first, to present how the practical teaching of actuarial mathematics (both life and non-life) is organised within the degree (Licenciatura) in Actuarial and Financial Sciences taught at the Faculty of Economics of the University of Barcelona; second, to share the experience (setting out the problems encountered and the solutions adopted) with other practical courses, in order to improve and optimise the learning process; and, finally, to serve as a starting point for reflection on the practical and theoretical teaching of actuarial mathematics in the new European Higher Education Area.

CRITERIOS DE SELECCIÓN DE MODELO EN CREDIT SCORING. APLICACIÓN DEL ANÁLISIS DISCRIMINANTE BASADO EN DISTANCIAS [Model selection criteria in credit scoring. An application of distance-based discriminant analysis]

The aim of this paper is to study model selection criteria in credit scoring. Such criteria are usually derived from an error cost function which takes into account misclassification probabilities in good and bad credit risk subpopulations plus other parameters encoding context information relevant to the objective portfolio. We present a distance-based classification approach to credit scoring, as an

EL TRABAJO EN GRUPO COMO TÉCNICA DE APRENDIZAJE EN LA ENSEÑANZA SUPERIOR [Group work as a learning technique in higher education]

Global and local distance-based generalized linear models

TEST, 2015

This paper introduces local distance-based generalized linear models. These models extend (weighted) distance-based linear models first to the generalized linear model framework. Then, a nonparametric version of these models is proposed by means of local fitting. Distances between individuals are the only predictor information needed to fit these models. Therefore, they are applicable, among others, to mixed (qualitative and quantitative) explanatory variables or when the regressor is of functional type. An implementation is provided by the R package dbstats, which also implements other distance-based prediction methods. Supplementary material for this article is available online, which reproduces all the results of this article.

Local Distance-Based Generalized Linear Models using the dbstats package for R

This paper introduces local distance-based generalized linear models. These models extend (weighted) distance-based linear models firstly with the generalized linear model concept, then by localizing. Distances between individuals are the only predictor information needed to fit these models. Therefore they are applicable to mixed (qualitative and quantitative) explanatory variables or when the regressor is of functional type. Models can be

Assessing the Importance of Risk Factors in Distance-Based Generalized Linear Models

Methodology and Computing in Applied Probability, 2014

Predictions with distance-based linear and generalized linear models rely upon latent variables derived from the distance function. This key feature has the drawback of adding a non-linearity layer between observed predictors and response which shields one from the other and, in particular, prevents us from interpreting linear predictor coefficients as influence measures. In actuarial applications such as credit scoring or a priori rate-making we cannot forgo this capability, which is crucial for assessing the relative leverage of risk factors. Towards the goal of recovering this functionality we define and study influence coefficients, measuring the relative importance of observed predictors. Unavoidably, due to inherent model non-linearities, these quantities will be local, that is, valid in a neighborhood of a given point in predictor space.

Local Linear Functional Regression Based on Weighted Distance-based Regression

Contributions to Statistics, 2008

We consider the problem of nonparametrically predicting a scalar response variable y from a functional predictor χ. We have n observations (χ_i, y_i) and we assign a

Projection error term in Gower's interpolation

Journal of Statistical Planning and Inference, 2009

New points can be superimposed on a Euclidean configuration obtained as a result of a metric multidimensional scaling at coordinates given by Gower's interpolation formula. The procedure amounts to discarding a possibly non-null coordinate along an additional dimension. We derive an analytical formula for this projection error term and, for real data problems, we describe a statistical method for testing its significance, as a cautionary device prior to further distance-based predictions.
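As a small worked example of the interpolation step described above, the following sketch places a new point on an existing configuration using Gower's interpolation formula, x = (1/2) Λ^(-1) X' (b - d), where b holds the squared norms of the configuration rows and d the squared distances from the new point to them. It then recovers the squared coordinate that the projection discards. This is an illustration under simplifying assumptions, not the paper's derivation or its test: the configuration `X` is hypothetical, and it is assumed to be centred with mutually orthogonal columns (as principal coordinates are), so that Λ = X'X is diagonal.

```python
def gower_interpolate(X, d2):
    """Interpolate a new point onto configuration X.

    X  : list of rows, centred, with mutually orthogonal columns.
    d2 : squared distances from the new point to each row of X.
    Returns (coordinates, discarded squared coordinate).
    """
    n, k = len(X), len(X[0])
    b = [sum(v * v for v in row) for row in X]            # squared norms
    lam = [sum(X[i][j] ** 2 for i in range(n)) for j in range(k)]
    x_new = [0.5 * sum(X[i][j] * (b[i] - d2[i]) for i in range(n)) / lam[j]
             for j in range(k)]
    # What the k retained coordinates fail to explain of each squared
    # distance; for Euclidean d2 this is the same value for every row.
    resid = [d2[i] - sum((X[i][j] - x_new[j]) ** 2 for j in range(k))
             for i in range(n)]
    return x_new, sum(resid) / n


# Hypothetical planar configuration: the corners of a square.
X = [[1.0, 1.0], [1.0, -1.0], [-1.0, 1.0], [-1.0, -1.0]]
# New point that sits 0.3 units off the plane, at (0.5, 0.2, 0.3).
p = (0.5, 0.2, 0.3)
d2 = [(r[0] - p[0]) ** 2 + (r[1] - p[1]) ** 2 + p[2] ** 2 for r in X]

coords, err2 = gower_interpolate(X, d2)
```

Here `coords` recovers the in-plane position of the new point, and `err2` is the squared height above the plane, which is the part of the distance information that interpolation throws away.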

Distance-based local linear regression for functional predictors

Computational Statistics & Data Analysis, 2010

The problem of nonparametrically predicting a scalar response variable from a functional predictor is considered. A sample of pairs (functional predictor and response) is observed. When predicting the response for a new functional predictor value, a semi-metric is used to compute the distances between the new and the previously observed functional predictors. Then each pair in the original sample is weighted according to a decreasing function of these distances. A Weighted (Linear) Distance-Based Regression is fitted, where the weights are as above and the distances are given by a possibly different semi-metric. This approach can be extended to nonparametric predictions from other kinds of explanatory variables (e.g., data of mixed type) in a natural way.
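The weighting step in this abstract can be sketched as follows. Curves are assumed to be sampled on a common, evenly spaced grid; the semi-metric is a Riemann-sum approximation of the L2 distance, and the decreasing function is a Gaussian kernel. Both are assumptions chosen for illustration. The final step is deliberately simplified: instead of the weighted distance-based regression the abstract describes, the sketch takes a kernel-weighted average of the responses (a local constant fit). All function names and data are hypothetical.

```python
import math


def l2_semimetric(f, g, step):
    """Approximate L2 distance between two curves on a common grid."""
    return math.sqrt(step * sum((a - b) ** 2 for a, b in zip(f, g)))


def kernel_weights(dists, bandwidth):
    """Gaussian kernel weights, decreasing in distance, summing to 1."""
    raw = [math.exp(-0.5 * (d / bandwidth) ** 2) for d in dists]
    total = sum(raw)
    return [w / total for w in raw]


def local_constant_predict(curves, responses, new_curve, bandwidth, step):
    """Kernel-weighted average of responses (stand-in for the weighted
    distance-based regression fit)."""
    dists = [l2_semimetric(c, new_curve, step) for c in curves]
    weights = kernel_weights(dists, bandwidth)
    return sum(w * y for w, y in zip(weights, responses))


# Hypothetical sample: three constant curves observed at three grid points.
curves = [[0.0, 0.0, 0.0], [1.0, 1.0, 1.0], [2.0, 2.0, 2.0]]
responses = [0.0, 1.0, 2.0]
new_curve = [1.0, 1.0, 1.0]

weights = kernel_weights(
    [l2_semimetric(c, new_curve, 0.5) for c in curves], bandwidth=0.5)
prediction = local_constant_predict(curves, responses, new_curve,
                                    bandwidth=0.5, step=0.5)
```

The bandwidth controls how local the fit is: a small value concentrates nearly all the weight on the closest curves, while a large value approaches an unweighted fit.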

IMPLEMENTING PLS FOR DISTANCE-BASED REGRESSION: COMPUTATIONAL ISSUES

Distance-based regression allows for a neat implementation of the Partial Least Squares recurrence. In this paper we address practical issues arising when dealing with moderately large datasets (n ~ 10^4) such as those typical of automobile insurance premium calculations.

Interaction Terms in Distance-Based Regression

Communications in Statistics - Theory and Methods, 2009

We propose a method of including polynomial and interaction terms in Distance-Based Regression (Cuadras and Arenas, 1990), relying on properties of a semi-Hadamard or Khatri-Rao product of matrices. We demonstrate its application to real data examples.
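For readers unfamiliar with the matrix product named here, the sketch below computes a row-wise Khatri-Rao product, in which each output row is the Kronecker product of the corresponding rows of the two inputs. Reading the abstract's product as the row-wise variant is an assumption: it is the natural one when rows index individuals, because the result contains every pairwise product of one individual's coordinates from the two matrices, but the abstract does not spell out the definition. The matrices `A` and `B` are hypothetical.

```python
def rowwise_khatri_rao(A, B):
    """Row-wise Khatri-Rao product of two matrices with equal row counts.

    Row i of the result is the Kronecker product of row i of A with
    row i of B, so an (n x p) and an (n x q) input give an (n x p*q) output.
    """
    if len(A) != len(B):
        raise ValueError("A and B must have the same number of rows")
    return [[a * b for a in row_a for b in row_b]
            for row_a, row_b in zip(A, B)]


# Hypothetical coordinate matrices for two individuals.
A = [[1, 2], [3, 4]]
B = [[1, 0, 2], [0, 1, 1]]
C = rowwise_khatri_rao(A, B)
```

Taking the product of a matrix with itself gives all second-order terms of its columns, which is how polynomial terms would arise under the same reading.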

Statistical Aspects of Gower's Interpolation: Error Term and its Influence on Prediction

New points can be superimposed on a Euclidean configuration obtained as a result of a metric Multidimensional Scaling at coordinates given by Gower's interpolation formula. The procedure amounts to discarding a possibly non-null coordinate along an additional dimension. We compute this error term, assessing its influence on distance-based predictions.
