Lior Horesh | IBM Research
Papers by Lior Horesh
2015 European Control Conference (ECC), 2015
Source estimation is a fundamental ingredient of Full Waveform Inversion (FWI). In such seismic inversion methods, wavelet intensity and phase spectra are usually estimated statistically, although in the formulation of FWI as a nonlinear least-squares optimization problem source estimation can naturally be incorporated into the workflow. Modern approaches for source estimation consider robust misfit functions, leading to the well-known robust FWI method. The present work uses synthetic data generated from a high-order spectral element forward solver to produce observed data, which in turn are used to estimate the intensity and the location of the point seismic source term of the original elastic wave PDE. A min-max filter approach is used to convert the original source estimation problem into a state problem conditioned on the observations and a non-standard uncertainty description. The resulting numerical scheme uses an implicit midpoint method to solve, in parallel, the chosen 2D and 3D numerical examples running on an IBM Blue Gene/Q using a grid defined by approximately sixteen thousand 5th-order elements, resulting in a total of approximately 6.5 million degrees of freedom.
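The implicit midpoint method named in this abstract can be illustrated on a toy problem. The sketch below is not the paper's solver: it assumes a linear ODE u' = M u for a single harmonic oscillator, and the names `implicit_midpoint_step`, `M`, `omega`, and `energy` are hypothetical. It only shows the update rule and the fact that, for this system, the quadratic energy is preserved up to round-off.

```python
import numpy as np

def implicit_midpoint_step(M, u, h):
    """One implicit midpoint step for the linear ODE u' = M u.

    The update u_new = u + h * M @ (u + u_new) / 2 is rearranged into the
    linear system (I - h/2 M) u_new = (I + h/2 M) u.
    """
    identity = np.eye(M.shape[0])
    return np.linalg.solve(identity - 0.5 * h * M, (identity + 0.5 * h * M) @ u)

# Toy stand-in for a semi-discretized wave equation (assumption):
# x' = v, v' = -omega^2 x
omega = 2.0
M = np.array([[0.0, 1.0],
              [-omega**2, 0.0]])

def energy(u):
    """Quadratic energy of the oscillator, u = (x, v)."""
    return 0.5 * u[1] ** 2 + 0.5 * omega**2 * u[0] ** 2

u = np.array([1.0, 0.0])
e0 = energy(u)
for _ in range(1000):
    u = implicit_midpoint_step(M, u, h=0.05)
e_final = energy(u)
```

Comparing `e_final` with `e0` shows no systematic energy drift, which is one common reason for choosing this integrator for wave-type problems.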
Inverse Problems in Science and Engineering, 2014
A broad range of parameter estimation problems involve the collection of an excessively large number of observations N. Typically each such observation involves excitation of the domain through injection of energy at some pre-defined sites and recording of the response of the domain at another set of locations. It has been observed that similar results can often be obtained by considering a far smaller number K of multiple linear superpositions of experiments, with K << N. This allows the construction of the solution to the inverse problem in time O(K) instead of O(N). Given these considerations, it should not be necessary to perform all N experiments, but only a much smaller number K of experiments with simultaneous sources in superpositions with certain weights. Devising such a procedure would result in a drastic reduction in acquisition time.
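The superposition argument in this abstract can be checked numerically in a simplified setting. The sketch below assumes the recorded data depend linearly on the sources through a fixed operator; the matrix `A`, the source matrix `Q`, the random weights `W`, and all sizes are hypothetical stand-ins, and the weights are drawn at random rather than designed as in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

n_receivers, n_sites = 40, 30   # hypothetical problem sizes
N, K = 200, 8                   # N individual experiments, K << N combined ones

A = rng.standard_normal((n_receivers, n_sites))  # stand-in linear forward operator
Q = rng.standard_normal((n_sites, N))            # column j: source of experiment j
W = rng.standard_normal((N, K))                  # superposition weights (random here)

D_individual = A @ Q            # data from N separate experiments
D_simultaneous = A @ (Q @ W)    # data from K simultaneous-source experiments

# Linearity: weighting the recorded data equals recording the weighted sources.
superposition_holds = np.allclose(D_individual @ W, D_simultaneous)
```

Here only K forward applications are needed to obtain `D_simultaneous`, compared with N for `D_individual`; the open question the abstract addresses is how to choose the weights well.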
77th EAGE Conference and Exhibition 2015, 2015
We define a computational domain, Ω ⊂ R^n, with boundary Γ, and consider the acoustic wave equation in Ω:
Given a computational solid domain Ω ⊂ R^3 with boundary Γ, the elastic wave equation is defined in Ω as:
Key seismic workflows, such as migration, imaging, and full waveform inversion, can all be formulated as PDE-constrained optimization problems. In their most abstract form, all of these problems can be written as follows:
SPE Annual Technical Conference and Exhibition, 2014
Hessian-free training has become a popular parallel second-order optimization technique for Deep Neural Network training. This study aims at speeding up Hessian-free training, both by decreasing the amount of data used for training and by reducing the number of Krylov subspace solver iterations used for implicit estimation of the Hessian. In this paper, we develop an L-BFGS based preconditioning scheme that avoids the need to access the Hessian explicitly. Since L-BFGS cannot be regarded as a fixed-point iteration, we further propose the employment of flexible Krylov subspace solvers that retain the desired theoretical convergence guarantees of their conventional counterparts. Second, we propose a new sampling algorithm, which geometrically increases the amount of data utilized for gradient and Krylov subspace iteration calculations. On a 50-hr English Broadcast News task, we find that these methodologies provide roughly a 1.5x speed-up, whereas on a 300-hr Switchboard task these techniques provide over a 2.3x speed-up, with no loss in WER. These results suggest that even further speed-up is expected as problem scale and complexity grow.
It is more and more common to encounter applications where the collected data is most naturally stored or represented in a multi-dimensional array, known as a tensor. The goal is often to approximate this tensor as a sum of some type of combination of basic elements, where the notion of what is a basic element is specific to the type of factorization employed. If the number of terms in the combination is small, the tensor factorization gives (implicitly) a sparse (approximate) representation of the data. The terms (e.g. vectors, matrices, tensors) in the combination themselves may also be sparse. This chapter highlights recent developments in the area of non-negative tensor factorization which admit such sparse representations. Specifically, we consider the approximate factorization of third and fourth order tensors into non-negative sums of types of outer-products of objects with one dimension less using the so-called t-product. A demonstration on an application in facial recognition shows the potential promise of the overall approach. We discuss a number of algorithmic options for solving the resulting optimization problems, and modification of such algorithms for increasing the sparsity.
Compressed sensing is a new emerging field dealing with the reconstruction of a sparse or, more precisely, a compressed representation of a signal from a relatively small number of observations, typically less than the signal dimension. In our previous work we have shown how the Kalman filter can be naturally applied for obtaining an approximate Bayesian solution for the compressed sensing problem. The resulting algorithm, which was termed CSKF, relies on a pseudomeasurement technique for enforcing the sparseness constraint. Our approach raises two concerns which are addressed in this paper. The first one refers to the validity of our approximation technique. In this regard, we provide a rigorous treatment of the CSKF algorithm which is concluded with an upper bound on the discrepancy between the exact (in the Bayesian sense) and the approximate solutions. The second concern refers to the computational overhead associated with the CSKF in large scale settings. This problem is alleviated here using an efficient measurement update scheme based on a Krylov subspace method.
The method of Tikhonov regularization is commonly used to obtain regularized solutions of ill-posed linear inverse problems. We use its natural connection to optimal Bayes estimators to determine optimal experimental designs that can be used with Tikhonov regularization; they are designed to control a measure of total relative efficiency. We present an iterative/semidefinite programming hybrid method to explore the configuration space efficiently. Two examples from geophysics are used to illustrate the type of applications to which the methodology can be applied.
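For readers unfamiliar with the basic technique, the following is a minimal sketch of standard-form Tikhonov regularization, solved through the regularized normal equations. It does not implement the experimental design method of the paper; the function name `tikhonov_solve`, the Hilbert-type test matrix, the noise level, and the values of `lam` are all assumptions chosen for illustration.

```python
import numpy as np

def tikhonov_solve(A, b, lam):
    """Return argmin_x ||A x - b||^2 + lam * ||x||^2.

    Uses the regularized normal equations (A^T A + lam I) x = A^T b.
    """
    n = A.shape[1]
    return np.linalg.solve(A.T @ A + lam * np.eye(n), A.T @ b)

rng = np.random.default_rng(0)

# Ill-conditioned test operator (assumption): an 8 x 8 Hilbert matrix.
n = 8
idx = np.arange(n)
A = 1.0 / (idx[:, None] + idx[None, :] + 1.0)

x_true = np.ones(n)
b = A @ x_true + 1e-6 * rng.standard_normal(n)  # slightly noisy data

x_weak = tikhonov_solve(A, b, lam=1e-8)   # light regularization
x_strong = tikhonov_solve(A, b, lam=1.0)  # heavy regularization
```

Increasing `lam` shrinks the norm of the solution at the cost of a larger data misfit, which is the trade-off that any choice of regularization parameter or design has to balance.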