Dr Ekaterina Abramova | London Business School

Papers by Dr Ekaterina Abramova

Optimal Daily Trading of Battery Operations Using Arbitrage Spreads

Energies, 2021

An important revenue stream for electric battery operators is often arbitraging the hourly price spreads in the day-ahead auction. The optimal approach to this is challenging if risk is a consideration, as this requires the estimation of density functions. Since the hourly prices are not normal and not independent, creating spread densities from the difference of separately estimated price densities is generally intractable. Thus, forecasts of all intraday hourly spreads were directly specified as an upper triangular matrix containing densities. The model was a flexible four-parameter distribution used to produce dynamic parameter estimates conditional upon exogenous factors, most importantly wind, solar and the day-ahead demand forecasts. These forecasts supported the optimal daily scheduling of a storage facility, operating on single and multiple cycles per day. The optimization is innovative in its use of spread trades rather than hourly prices, which, this paper argues, is more at...

Hierarchical, Heterogeneous Control of Non-Linear Dynamical Systems using Reinforcement Learning

Non-adaptive methods are currently state of the art in approximating solutions to nonlinear optimal control problems. These carry a large computational cost associated with iterative calculations and have to be solved individually for different start and end points. In addition, they may not scale well for real-world problems and require considerable tuning to converge. As an alternative, we present a novel hierarchical approach to non-Linear Control using Reinforcement Learning to choose between Heterogeneous Controllers, including localised optimal linear controllers and proportional-integral-derivative (PID) controllers, illustrating this with solutions to benchmark problems. We show that our approach (RLHC) competes in terms of computational cost and solution quality with the state-of-the-art control algorithm iLQR, and offers a robust, flexible framework to address large-scale non-linear control problems.

Forecasting the Intra-Day Spread Densities of Electricity Prices

Energies, 2020

Intra-day price spreads are of interest to electricity traders, storage and electric vehicle operators. This paper formulates dynamic density functions, based upon skewed-t and similar representations, to model and forecast the German electricity price spreads between different hours of the day, as revealed in the day-ahead auctions. The four specifications of the density functions are dynamic and conditional upon exogenous drivers, thereby permitting the location, scale and shape parameters of the densities to respond hourly to such factors as weather and demand forecasts. The best fitting and forecasting specifications for each spread are selected based on the Pinball Loss function, following the closed-form analytical solutions of the cumulative distribution functions.

Combining Markov Decision Processes with Linear Optimal Controllers

Estimating Dynamic Conditional Spread Densities to Optimise Daily Storage Trading of Electricity

SSRN Electronic Journal, 2019

RLOC: Neurobiologically Inspired Hierarchical Reinforcement Learning Algorithm for Continuous Control of Nonlinear Dynamical Systems

ArXiv, 2019

Nonlinear optimal control problems are often solved with numerical methods that require knowledge of the system's dynamics, which may be difficult to infer, and that carry a large computational cost associated with iterative calculations. We present a novel neurobiologically inspired hierarchical learning framework, Reinforcement Learning Optimal Control, which operates on two levels of abstraction and utilises a reduced number of controllers to solve nonlinear systems with unknown dynamics in continuous state and action spaces. Our approach is inspired by research at two levels of abstraction: first, at the level of limb coordination, human behaviour is explained by linear optimal feedback control theory. Second, in cognitive tasks involving learning symbolic level action selection, humans learn such problems using model-free and model-based reinforcement learning algorithms. We propose that combining these two levels of abstraction leads to a fast global solution of nonlinear contro...

Optimal Daily Trading of Battery Operations Using Arbitrage Spreads

MDPI Energies (Special Issue Computational Modeling and Design of Energy Systems), 2021

An important revenue stream for electric battery operators is often arbitraging the hourly price spreads in the day-ahead auction. The optimal approach to this is challenging if risk is a consideration, as this requires the estimation of density functions. Since the hourly prices are not normal and not independent, creating spread densities from the difference of separately estimated price densities is generally intractable. Thus, forecasts of all intraday hourly spreads were directly specified as an upper triangular matrix containing densities. The model was a flexible four-parameter distribution used to produce dynamic parameter estimates conditional upon exogenous factors, most importantly wind, solar and the day-ahead demand forecasts. These forecasts supported the optimal daily scheduling of a storage facility, operating on single and multiple cycles per day. The optimization is innovative in its use of spread trades rather than hourly prices, which, this paper argues, is more attractive in reducing risk. In contrast to the conventional approach of trading the daily peak and trough, multiple trades are found to be profitable and opportunistic depending upon the weather forecasts.

Forecasting the Intra-Day Spread Densities of Electricity Prices

MDPI Energies (Special Issue Modeling and Forecasting Intraday Electricity Markets), 2020

Intra-day price spreads are of interest to electricity traders, storage and electric vehicle operators. This paper formulates dynamic density functions, based upon skewed-t and similar representations, to model and forecast the German electricity price spreads between different hours of the day, as revealed in the day-ahead auctions. The four specifications of the density functions are dynamic and conditional upon exogenous drivers, thereby permitting the location, scale and shape parameters of the densities to respond hourly to such factors as weather and demand forecasts. The best fitting and forecasting specifications for each spread are selected based on the Pinball Loss function, following the closed-form analytical solutions of the cumulative distribution functions.

Estimating Dynamic Conditional Spread Densities to Optimise Daily Storage Trading of Electricity

This paper formulates dynamic density functions, based upon skewed-t and similar representations, to model and forecast electricity price spreads between different hours of the day. This supports an optimal day-ahead storage and discharge schedule, and thereby facilitates a bidding strategy for a merchant arbitrage facility into the day-ahead auctions for wholesale electricity. The four latent moments of the density functions are dynamic and conditional upon exogenous drivers, thereby permitting the mean, variance, skewness and kurtosis of the densities to respond hourly to such factors as weather and demand forecasts. The best specification for each spread is selected based on the Pinball Loss function, following the closed-form analytical solutions of the cumulative density functions. Those analytical properties also allow the calculation of risk associated with the spread arbitrages. From these spread densities, the optimal daily operation of a battery storage facility is determined.

RLOC: Neurobiologically Inspired Hierarchical Reinforcement Learning Algorithm for Continuous Control of Nonlinear Dynamical Systems

RLOC, 2019

Nonlinear optimal control problems are often solved with numerical methods that require knowledge of the system's dynamics, which may be difficult to infer, and that carry a large computational cost associated with iterative calculations. We present a novel neurobiologically inspired hierarchical learning framework, Reinforcement Learning Optimal Control, which operates on two levels of abstraction and utilises a reduced number of controllers to solve nonlinear systems with unknown dynamics in continuous state and action spaces. Our approach is inspired by research at two levels of abstraction: first, at the level of limb coordination, human behaviour is explained by linear optimal feedback control theory. Second, in cognitive tasks involving learning symbolic level action selection, humans learn such problems using model-free and model-based reinforcement learning algorithms. We propose that combining these two levels of abstraction leads to a fast global solution of nonlinear control problems using a reduced number of controllers. Our framework learns the local task dynamics from naive experience and forms locally optimal infinite horizon Linear Quadratic Regulators which produce continuous low-level control. A top-level reinforcement learner uses the controllers as actions and learns how to best combine them in state space while maximising a long-term reward. A single optimal control objective function drives high-level symbolic learning by providing training signals on the desirability of each selected controller. We show that a small number of locally optimal linear controllers are able to solve global nonlinear control problems with unknown dynamics when combined with a reinforcement learner in this hierarchical framework. Our algorithm competes in terms of computational cost and solution quality with sophisticated control algorithms, and we illustrate this with solutions to benchmark problems.

Hierarchical, Heterogeneous Control of Non-Linear Dynamical Systems using Reinforcement Learning

Non-adaptive methods are currently state of the art in approximating solutions to nonlinear optimal control problems. These carry a large computational cost associated with iterative calculations and have to be solved individually for different start and end points. In addition, they may not scale well for real-world problems and require considerable tuning to converge. As an alternative, we present a novel hierarchical approach to non-Linear Control using Reinforcement Learning to choose between Heterogeneous Controllers, including localised optimal linear controllers and proportional-integral-derivative (PID) controllers, illustrating this with solutions to benchmark problems. We show that our approach (RLHC) competes in terms of computational cost and solution quality with the state-of-the-art control algorithm iLQR, and offers a robust, flexible framework to address large-scale non-linear control problems.

Combining Markov Decision Processes with Linear Optimal Controllers

Linear Quadratic Gaussian (LQG) control has a known analytical solution [1] but non-linear problems do not. The state-of-the-art method used to find approximate solutions to non-linear control problems (iterative LQG) [3] carries a large computational cost associated with iterative calculations [4]. We propose a novel approach for solving nonlinear Optimal Control (OC) problems which combines Reinforcement Learning (RL) with OC. The new algorithm, RLOC, uses a small set of localized optimal linear controllers and applies a Monte Carlo algorithm that learns the mapping from the state space to controllers. We illustrate our approach by solving a non-linear OC problem of the 2-joint arm operating in a plane with two point masses. We show that controlling the arm with RLOC is less costly than using the Linear Quadratic Regulator (LQR). This finding shows that non-linear optimal control problems can be solved using a novel approach of adaptive RL.

Thesis Chapters by Dr Ekaterina Abramova

Combining Reinforcement Learning And Optimal Control For The Control Of Nonlinear Dynamical Systems

PhD Thesis, 2016

This thesis presents a novel hierarchical learning framework, Reinforcement Learning Optimal Control, for controlling nonlinear dynamical systems with continuous states and actions. The adapted approach mimics the neural computations that allow our brain to bridge across the divide between symbolic action-selection and low-level actuation control by operating at two levels of abstraction. First, current findings demonstrate that at the level of limb coordination, human behaviour is explained by linear optimal feedback control theory, where cost functions match energy and timing constraints of tasks. Second, humans learn cognitive tasks involving learning symbolic level action selection, in terms of both model-free and model-based reinforcement learning algorithms. We postulate that the ease with which humans learn complex nonlinear tasks arises from combining these two levels of abstraction. The Reinforcement Learning Optimal Control framework learns the local task dynamics from naive experience using an expectation maximization algorithm for estimation of linear dynamical systems and forms locally optimal Linear Quadratic Regulators, producing continuous low-level control. A high-level reinforcement learning agent uses these available controllers as actions and learns how to combine them in state space, while maximizing a long-term reward. The optimal control costs form training signals for the high-level symbolic learner. The algorithm demonstrates that a small number of locally optimal linear controllers can be combined in a smart way to solve global nonlinear control problems and forms a proof of principle for how the brain may bridge the divide between low-level continuous control and high-level symbolic action selection. It competes in terms of computational cost and solution quality with state-of-the-art control, which is illustrated with solutions to benchmark problems.
