Keith De Souza - Academia.edu
Papers by Keith De Souza
Journal of Solar Energy Engineering, 2025
Accurate predictive daily global horizontal irradiation models are essential for diverse solar energy applications. Their long-term performances can be assessed using average years. This study scrutinized 70 machine learning and 44 empirical models using two disjoint 5-year average daily training and validation datasets, each comprising 365 records and 10 features. The features included day number, minimum and maximum air temperature, air temperature amplitude, theoretical and observed sunshine hours, theoretical extraterrestrial horizontal irradiation, relative sunshine, cloud cover, and relative humidity. Fourteen machine learning algorithms, namely multiple linear regression, ridge regression, Lasso regression, elastic net regression, Huber regression, k-nearest neighbors, decision tree, support vector machine, multilayer perceptron, extreme learning machine, generalized regression neural network, extreme gradient boosting, gradient boosting machine, and light gradient boosting machine, were trained, validated, and instantiated as base learners in four strategically designed homogeneous parallel ensembles (variants of pasting, random subspace, bagging, and random patches), which were also scrutinized, producing 70 models. Specific hyperparameters of the algorithms were optimized. Validation showed that, for each algorithm, at least two of its ensembles outperformed the individual model. Huber-subspace ranked first with a root mean square error of 1.495 MJ/m²/day. The multilayer perceptron was most robust to the random perturbations of the ensembles, which extrapolates to good tolerance to ground-truth data noise. The best empirical model returned a validation root mean square error of 1.595 MJ/m²/day but was outperformed by 93% of the machine learning models, with the homogeneous parallel ensembles producing superior predictive accuracies.
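The four ensemble types named in the abstract differ in what each base learner is shown: a subset of records, a subset of features, or both. The sketch below illustrates the textbook definitions only; the paper describes its ensembles as "variants", so the sampling fraction, the feature-subset size, and the helper `draw_indices` are assumptions made for illustration, not the authors' configuration. It also reproduces the model count: 14 algorithms, each evaluated alone and in 4 ensembles, gives 70 models.

```python
import random

ALGORITHMS = [
    "multiple linear regression", "ridge regression", "Lasso regression",
    "elastic net regression", "Huber regression", "k-nearest neighbors",
    "decision tree", "support vector machine", "multilayer perceptron",
    "extreme learning machine", "generalized regression neural network",
    "extreme gradient boosting", "gradient boosting machine",
    "light gradient boosting machine",
]
ENSEMBLES = ["pasting", "random subspace", "bagging", "random patches"]

# Each algorithm appears once on its own and once per ensemble type.
N_MODELS = len(ALGORITHMS) * (1 + len(ENSEMBLES))


def draw_indices(scheme, n_records, n_features, rng, row_frac=0.8, n_feat_sub=5):
    """Return (record_indices, feature_indices) seen by ONE base learner.

    row_frac and n_feat_sub are hypothetical settings, not values from the paper.
    """
    rows = list(range(n_records))
    feats = list(range(n_features))
    n_rows_sub = int(row_frac * n_records)
    if scheme == "pasting":          # subset of records, without replacement
        return sorted(rng.sample(rows, n_rows_sub)), feats
    if scheme == "bagging":          # bootstrap: records drawn with replacement
        return sorted(rng.choices(rows, k=n_records)), feats
    if scheme == "random subspace":  # all records, subset of features
        return rows, sorted(rng.sample(feats, n_feat_sub))
    if scheme == "random patches":   # subsets of both (here without replacement)
        return (sorted(rng.sample(rows, n_rows_sub)),
                sorted(rng.sample(feats, n_feat_sub)))
    raise ValueError(f"unknown scheme: {scheme}")


rng = random.Random(42)
for scheme in ENSEMBLES:
    r, f = draw_indices(scheme, n_records=365, n_features=10, rng=rng)
    print(f"{scheme:16s} records={len(r):3d} (unique {len(set(r)):3d}) features={len(f)}")
print("models evaluated:", N_MODELS)
```

In a homogeneous parallel ensemble, every base learner uses the same algorithm, is trained independently on its own draw, and the predictions are averaged.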
This thesis reports on the use of spontaneous Brillouin scattering for fibre-optic distributed temperature and strain sensing based on a time-domain Landau-Placzek ratio technique. Detection system specifications are dictated by the spatial resolution, range, measurand resolution and measurement time. Pulsed sources are used in these sensors. The minimum spatial resolution depends on both the pulse width and receiver bandwidth. The range and measurand resolution depend on the peak pulse power launched into the sensing fibre as well as the Brillouin signal-to-noise characteristics at the receiver. The maximum launched pulse power is limited by the onset of nonlinear effects in the sensing fibre. Novel interferometric techniques based on low-cost, low-loss, all-fibre Mach-Zehnder interferometric optical filters needed to separate the backscattered Rayleigh and spontaneous Brillouin signals have been developed with enhanced signal-to-noise capabilities. Used in conjunctio...
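The dependence of spatial resolution on pulse width can be illustrated with the standard time-domain backscatter relation, resolution = c·τ/(2n). This is a minimal sketch, not the thesis's own calculation: the group index of 1.468 is an assumed typical value for silica fibre, and the receiver-bandwidth contribution mentioned in the abstract is not modelled.

```python
C_VACUUM = 299_792_458.0  # speed of light in vacuum, m/s


def spatial_resolution_m(pulse_width_s, group_index=1.468):
    """Pulse-limited spatial resolution of a time-domain backscatter sensor.

    group_index is an assumed typical value for silica fibre.
    The factor 2 accounts for the round trip of the backscattered light.
    """
    return C_VACUUM * pulse_width_s / (2.0 * group_index)


def pulse_width_for_resolution_s(resolution_m, group_index=1.468):
    """Inverse relation: pulse width needed for a target resolution."""
    return 2.0 * group_index * resolution_m / C_VACUUM


for tau_ns in (10, 50, 100):
    print(f"{tau_ns:4d} ns pulse -> {spatial_resolution_m(tau_ns * 1e-9):.2f} m")
```

A 10 ns pulse therefore corresponds to roughly 1 m of fibre under these assumptions; a slower receiver would degrade this further.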
Journal of Solar Energy Engineering
Irradiation time equivalence pioneers the classification of models that predict monthly average daily global solar radiation on a horizontal surface based on their double cross-validation performances. By exploiting indigenous irradiation data, novel irradiation-based models can be created and used to classify prediction models, thereby facilitating a deeper understanding of model performance beyond routine summary statistics. The concept was demonstrated by formulating novel 1-hour and 2-hour irradiation-based models to predict monthly average daily global horizontal irradiation. Double cross-validations of the two irradiation-based models and 70 existing regression models were performed using a pair of 5-year subsets. The 70 models used the measured meteorological predictors of air temperature and sunshine hours, either alone or combined. The irradiation time equivalence of a model evaluated under double cross-validation has been defined as the minimum number of hours of measured ...
Photoplethysmography (PPG) is an electro-optic technique used for non-invasively measuring the physiological dynamics of blood volume pulsations. The inflow and outflow of blood in the vascular bed is pulsatile in nature and is the major physiological characteristic studied in the determination of micro-vascular resistance. The onset of disease generally causes an alteration in the value of micro-vascular resistance, thereby indicating that this technique has great diagnostic capabilities. Micro-vascular resistance was measured from the temporal parameters of the finger PPG waveform using the software Chart. The ratio of the anacrotic time, Tg, to catacrotic time, Ta, was used as the measure of micro-vascular resistance. Simulation of the diseased state was performed by restricting blood flow, using a rubber band on the index finger of twenty-seven (27) test subjects. The subjects exhibited a higher Tg/Ta ratio after the maneuver was performed. This result held true for both gende...
Journal of Renewable and Sustainable Energy
FIG. 4. Graphs of calculated (solid line) and measured (dashed line) monthly average hourly global solar radiation against solar time (h) for the D, WLJ, and CPRG models for six months of the validation period 2008.
Journal of Renewable and Sustainable Energy
Monthly optimum tilt angles of a flat-plate solar collector capable of south or north orientations were modeled for the tropical Caribbean island of Trinidad at 10.6° N latitude, using measured monthly average daily global and diffuse horizontal irradiation data from the period 2005-2010 as input to six transposition models comprising three isotropic (Liu and Jordan, Koronakis, and Badescu) and three anisotropic (Hay and Davies; 'Hay and Davies, Klucher and Reindl'; and Ma and Iqbal) models. The anisotropic models were in good agreement with one another, and an easily implementable technique was devised to determine the best suited decomposition-transposition model matches from six decomposition models due to Liu and Jordan and Klein, Page, Collares-Pereira and Rabl, Iqbal, Erbs et al., and Ibrahim. These matches can be used by territories having a similar climate to Trinidad but lacking measured diffuse horizontal irradiation or plane-of-array measurements, and the technique can be implemented globally by other host territories. The Ma and Iqbal model was chosen to simulate in detail the aforementioned collector as well as one that was south-oriented only. A south/north-oriented collector required twelve monthly, two seasonal [April: 12.5° (north-oriented), October: 32° (south-oriented)], and annual [11.2° (south-oriented)] adjustments, with corresponding gains in the collectable annual global solar irradiation compared to that on a horizontal surface of 9.3%, 7.9%, and 1.5%, respectively. In contrast, a south-oriented collector required eight monthly (September-April) and two seasonal (March: 0° and October: 35.5°) adjustments, with lower corresponding gains of 7.6% and 7.1%, respectively.
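A monthly optimum tilt search of the kind described can be sketched with the simplest of the listed transposition models, the isotropic Liu and Jordan model. This is an illustration under stated assumptions, not the paper's procedure (which selected the anisotropic Ma and Iqbal model): the irradiation inputs are hypothetical, the ground reflectance of 0.2 and the 0.5° grid are assumed, and the tilt is signed so that positive values face south and negative values face north.

```python
import math

LATITUDE_DEG = 10.6  # Trinidad, from the abstract
# Representative day of year for each month (commonly used mid-month days).
MID_MONTH_DAY = [17, 47, 75, 105, 135, 162, 198, 228, 258, 288, 318, 344]


def declination_rad(day_of_year):
    """Solar declination from Cooper's approximation."""
    return math.radians(23.45) * math.sin(2 * math.pi * (284 + day_of_year) / 365)


def _sunset_angle(lat, decl):
    x = -math.tan(lat) * math.tan(decl)
    return math.acos(max(-1.0, min(1.0, x)))  # clamp guards extreme tilts


def beam_ratio(lat, decl, tilt):
    """Monthly mean ratio of beam irradiation on the tilted plane to horizontal.

    A plane tilted by `tilt` toward the equator behaves like a horizontal
    plane at latitude (lat - tilt); a negative tilt faces north.
    """
    ws = _sunset_angle(lat, decl)
    wt = min(ws, _sunset_angle(lat - tilt, decl))
    num = (math.cos(lat - tilt) * math.cos(decl) * math.sin(wt)
           + wt * math.sin(lat - tilt) * math.sin(decl))
    den = (math.cos(lat) * math.cos(decl) * math.sin(ws)
           + ws * math.sin(lat) * math.sin(decl))
    return num / den


def tilted_irradiation(h_global, h_diffuse, lat, decl, tilt, albedo=0.2):
    """Liu and Jordan isotropic model: beam + isotropic sky + ground-reflected."""
    beam = (h_global - h_diffuse) * beam_ratio(lat, decl, tilt)
    sky = h_diffuse * (1 + math.cos(tilt)) / 2
    ground = h_global * albedo * (1 - math.cos(tilt)) / 2
    return beam + sky + ground


def optimum_tilt_deg(h_global, h_diffuse, month_index, lat_deg=LATITUDE_DEG):
    """Grid search over signed tilts from -60 to +60 degrees in 0.5 degree steps."""
    lat = math.radians(lat_deg)
    decl = declination_rad(MID_MONTH_DAY[month_index])
    candidates = [0.5 * i for i in range(-120, 121)]
    return max(candidates, key=lambda b: tilted_irradiation(
        h_global, h_diffuse, lat, decl, math.radians(b)))


# Hypothetical monthly average daily irradiation (MJ/m2/day), NOT measured data.
june = optimum_tilt_deg(h_global=18.0, h_diffuse=8.0, month_index=5)
december = optimum_tilt_deg(h_global=16.0, h_diffuse=6.0, month_index=11)
print("June optimum tilt (deg, negative = north):", june)
print("December optimum tilt (deg, positive = south):", december)
```

Because the inputs are invented, the printed angles are not comparable with the values reported in the abstract; the sketch only shows why a low-latitude site can favour a north-facing tilt around the June solstice and a south-facing tilt in December.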
Journal of Renewable and Sustainable Energy, 2015
Photonics North 2004: Photonic Applications in Telecommunications, Sensors, Software, and Lasers, 2004
SPIE Proceedings, 2004
SPIE Proceedings, 2004
Frontiers in Optics, 2006
Optics Express, 2004
We report on the theory and use of pre-amplification to enhance the measurement range of a spontaneous Brillouin intensity-based distributed fiber-optic sensor. One factor that limits temperature resolution is receiver sensitivity, which degrades for long-range sensors. Using optical preamplification before photodetection in a 23 km sensor improved the signal-to-noise ratio by approximately 17 dB using a 20 MHz detector. The major source of noise was amplified spontaneous emission beat noise.
Measurement Science and Technology, 2006
The temperature resolution of a fibre-optic distributed temperature sensor based on taking the ratio of the temperature-sensitive backscattered spontaneous Brillouin signal to the corresponding Rayleigh signal depends on the optical signal-to-noise ratio of the receiver system and the amplitude fluctuations in the Rayleigh signal. These amplitude fluctuations, or coherent Rayleigh noise, have been investigated experimentally as a function of detection bandwidth, source bandwidth and spatial resolution, and showed good agreement with theory.
Data-splitting is the most widely used method to cross-validate global horizontal irradiation regression models. An available dataset is split into two subsets, one to calibrate models and the other to validate them. This study investigated the sufficiency of this method within the ambit of two other cross-validation techniques: Monte Carlo cross-validation nested with double cross-validation, and leave-one-year-out cross-validation. These techniques facilitated cross-validation in long- and short-term periods, respectively. They were applied to the De Souza and Hargreaves-Samani temperature-based regression models. Unlike data-splitting, the techniques promoted full characterization of the models by the averages and sensitivities (%) of their tuned parameters, the averages and spread of their predictive accuracies via root mean square errors, and their stability (Monte Carlo-determined). On a monthly average daily timescale, their fully characterized (less their average tuned paramet...
A recently developed temperature-based model for predicting monthly average hourly global horizontal irradiation has been used to estimate monthly average daily global horizontal irradiation. In addition, a modified form of the Hargreaves-Samani model was implemented. Both models were evaluated alongside five existing temperature-based models due to Hargreaves and Samani, Allen, Bristow and Campbell, Goodin et al., and Hassan et al. for performance and applicability in Trinidad. Calibration and validation of the models were done using datasets of global horizontal irradiation and temperature from 2001-2005 and 2006-2010, respectively. The newly developed temperature-based model performed better than all other models. For an average year, yearly periods, and average and yearly periods collectively, the newly developed temperature-based model yielded root mean square errors of 0.51, 0.86, and 0.68 MJ/m²/day, respectively, between the calculated and measured monthly average daily glob...
Optics Letters, 2000
Optical preamplification has been used in a fiber-optic distributed temperature sensor based on spontaneous Brillouin scattering and the use of direct detection, resulting in improved signal-to-noise ratios. The fiber-based optical preamplifier system comprises a three-port circulator, an erbium-doped fiber amplifier with a small-signal gain of 27 dB, and a fiber Bragg grating with 47-GHz bandwidth. An improvement of 17 dB in the optical signal-to-noise ratio for the Brillouin signal is demonstrated in a 23-km sensor. The limit to the signal-to-noise ratio is attributed to spontaneous-spontaneous beat noise generated at the photodetector by amplified spontaneous emission from the optical amplifier.
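The decibel figures quoted above can be read as linear power ratios through the usual relation ratio = 10^(dB/10). The short sketch below only performs that conversion; the helper name `db_to_power_ratio` is introduced here for illustration.

```python
def db_to_power_ratio(db):
    """Convert a decibel value to a linear optical power ratio."""
    return 10.0 ** (db / 10.0)


snr_improvement = db_to_power_ratio(17.0)  # reported optical SNR improvement
amplifier_gain = db_to_power_ratio(27.0)   # reported EDFA small-signal gain

print(f"17 dB -> x{snr_improvement:.1f}")  # about 50
print(f"27 dB -> x{amplifier_gain:.1f}")   # about 501
```

The SNR improvement is smaller than the amplifier gain because, as the abstract notes, amplified spontaneous emission adds beat noise at the photodetector.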
Fiber Optic Sensor Technology and Applications III, 2004
Journal of Renewable and Sustainable Energy, 2019
Data-splitting is the most widely used method to cross-validate global horizontal irradiation regression models. An available dataset is split into two subsets, one to calibrate models and the other to validate them. This study investigated the sufficiency of this method within the ambit of two other cross-validation techniques: Monte Carlo cross-validation nested with double cross-validation, and leave-one-year-out cross-validation. These techniques facilitated cross-validation in long- and short-term periods, respectively. They were applied to the De Souza and Hargreaves-Samani temperature-based regression models. Unlike data-splitting, the techniques promoted full characterization of the models by the averages and sensitivities (%) of their tuned parameters, the averages and spread of their predictive accuracies via root mean square errors, and their stability (Monte Carlo-determined). On a monthly average daily timescale, their fully characterized (less their average tuned parameters) Monte Carlo results were < 6%, 0.56 ± 0.12 and 0.032 MJ m⁻² day⁻¹ for the De Souza model, and < 1.5%, 0.94 ± 0.14 and 0.174 MJ m⁻² day⁻¹ for the Hargreaves-Samani model. Similarly, the leave-one-year-out results were < 2% and 0.88 ± 0.28 MJ m⁻² day⁻¹ for the De Souza model, and < 1% and 1.31 ± 0.24 MJ m⁻² day⁻¹ for the Hargreaves-Samani model. The De Souza model performed better. We further demonstrated the erroneous assessments possible with models subjected to traditional data-splitting, which proved inadequate. Consequently, we proposed an algorithm to implement our cross-validation techniques that reduces computational burden for multiple model evaluation. This was achieved by including a novel controlled data-splitting cross-validation subroutine.
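The contrast between a single data split and leave-one-year-out cross-validation can be sketched for the Hargreaves-Samani form H = k·sqrt(Tmax − Tmin)·H0, with k as the one tuned parameter. This is a minimal illustration, not the authors' algorithm: the data are synthetic and generated in the code, the Monte Carlo nested double cross-validation is not implemented, and the parameter sensitivity is taken here as the relative range of k across folds, which is an assumed definition.

```python
import math
import random
import statistics


def fit_k(samples):
    """Least-squares k for H = k * sqrt(dT) * H0 (regression through the origin)."""
    sxy = sum(math.sqrt(dt) * h0 * h for dt, h0, h in samples)
    sxx = sum((math.sqrt(dt) * h0) ** 2 for dt, h0, h in samples)
    return sxy / sxx


def rmse(k, samples):
    sq = [(k * math.sqrt(dt) * h0 - h) ** 2 for dt, h0, h in samples]
    return math.sqrt(sum(sq) / len(sq))


def make_synthetic_years(years, rng, k_true=0.16):
    """SYNTHETIC monthly data per year: (temperature amplitude, H0, H). Not measurements."""
    data = {}
    for year in years:
        rows = []
        for month in range(12):
            h0 = 35.0 + 3.0 * math.cos(2 * math.pi * (month - 3) / 12)
            dt = 8.0 + 1.5 * math.sin(2 * math.pi * month / 12) + rng.uniform(-0.5, 0.5)
            h = k_true * math.sqrt(dt) * h0 + rng.gauss(0.0, 0.8)
            rows.append((dt, h0, h))
        data[year] = rows
    return data


def leave_one_year_out(data):
    """Calibrate on all years but one, validate on the held-out year, repeat."""
    results = []
    for held_out in data:
        train = [row for year, rows in data.items() if year != held_out for row in rows]
        k = fit_k(train)
        results.append((held_out, k, rmse(k, data[held_out])))
    return results


rng = random.Random(0)
data = make_synthetic_years(range(2001, 2011), rng)

loyo = leave_one_year_out(data)
ks = [k for _, k, _ in loyo]
errors = [e for _, _, e in loyo]
k_sensitivity_pct = 100 * (max(ks) - min(ks)) / statistics.mean(ks)

# Traditional data-splitting: one calibration block, one validation block.
calib = [row for y in range(2001, 2006) for row in data[y]]
valid = [row for y in range(2006, 2011) for row in data[y]]
split_rmse = rmse(fit_k(calib), valid)

print(f"LOYO RMSE: {statistics.mean(errors):.2f} +/- {statistics.stdev(errors):.2f}")
print(f"LOYO k sensitivity: {k_sensitivity_pct:.2f}%")
print(f"single-split RMSE: {split_rmse:.2f}")
```

The single split yields one error value, whereas the leave-one-year-out loop yields a distribution of errors and tuned parameters, which is what allows the averages, spreads, and sensitivities described in the abstract to be reported.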