Felix Herrmann | University of British Columbia

Papers by Felix Herrmann

Devito (v3.1.0): an embedded domain-specific language for finite differences and geophysical exploration

We introduce Devito, a new domain-specific language for implementing high-performance finite difference partial differential equation solvers. The motivating application is exploration seismology, where methods such as Full-Waveform Inversion and Reverse-Time Migration are used to invert terabytes of seismic data to create images of the earth's subsurface. Even using modern supercomputers, it can take weeks to process a single seismic survey and create a useful subsurface image. The computational cost is dominated by the numerical solution of wave equations and their corresponding adjoints. Therefore, a great deal of effort is invested in aggressively optimizing the performance of these wave-equation propagators for different computer architectures. Additionally, the actual set of partial differential equations being solved and their numerical discretization is under constant innovation as increasingly realistic representations of the physics are developed, further ratcheting up the cost of practical solvers. By embedding a domain-specific language within Python and making heavy use of SymPy, a symbolic mathematics library, we make it possible to develop finite difference simulators quickly using a syntax that strongly resembles the mathematics. The Devito compiler reads this code and applies a wide range of analyses to generate highly optimized and parallel code. This approach can reduce the development time of a verified and optimized solver from months to days.

1 Introduction

Large-scale inversion problems in exploration seismology constitute some of the most computationally demanding problems in industrial and academic research. Developing computationally efficient solutions for applications such as seismic inversion requires expertise ranging from theoretical and numerical methods for partial differential equation (PDE) constrained optimization to low-level performance optimization of PDE solvers. Progress in this area is often limited by the complexity and cost of developing bespoke wave propagators (and their discrete adjoints) for each new inversion algorithm or formulation of wave physics. Traditional software engineering approaches often lead developers to make critical choices regarding the numerical discretization before manually optimizing performance for a specific target architecture and preparing the code for production. This workflow of bringing new algorithms into production, or even to a stage where they can be evaluated on realistic datasets, can take many person-months or even person-years. Furthermore, it leads to mathematical software that is not easily ported,
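
As an illustration of the math-like symbolic syntax described above, here is a minimal sketch of a constant-velocity acoustic wave-equation operator written against Devito's documented constructs (Grid, TimeFunction, Eq, Operator, solve); the grid size, velocity, and time stepping are illustrative assumptions, and exact keyword names may differ between Devito versions.

```python
# Minimal sketch: 2D constant-velocity acoustic wave equation in Devito.
# Assumes Devito is installed; the numerical values are illustrative only.
from devito import Grid, TimeFunction, Eq, Operator, solve

grid = Grid(shape=(201, 201), extent=(2000.0, 2000.0))   # 2 km x 2 km domain
u = TimeFunction(name="u", grid=grid, time_order=2, space_order=8)
c = 1500.0                                                # assumed constant velocity (m/s)

# Symbolic PDE: u_tt = c^2 * laplacian(u), rearranged for the forward time step.
pde = u.dt2 - c**2 * u.laplace
stencil = Eq(u.forward, solve(pde, u.forward))

op = Operator([stencil])          # Devito compiles this into optimized, parallel C code
op.apply(time_M=500, dt=0.001)    # 500 steps of 1 ms (keyword names vary by version)
```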

Total Variation Regularization Strategies in Full-Waveform Inversion

SIAM Journal on Imaging Sciences, 2018

We propose an extended full-waveform inversion formulation that includes general convex constraints on the model. Though the full problem is highly nonconvex, the overarching optimization scheme arrives at geologically plausible results by solving a sequence of relaxed and warm-started constrained convex subproblems. The combination of box, total-variation, and successively relaxed asymmetric total-variation constraints allows us to steer clear of parasitic local minima while keeping the estimated physical parameters laterally continuous and in a physically realistic range. For accurate starting models, numerical experiments carried out on the challenging 2004 BP velocity benchmark demonstrate that bound and total-variation constraints improve the inversion result significantly by removing inversion artifacts related to source encoding and by clearly improving the delineation of the top, bottom, and flanks of a high-velocity, high-contrast salt inclusion. The experiments also show that for poor starting models these two constraints by themselves are insufficient to detect the bottom of high-velocity inclusions such as salt. Inclusion of the one-sided asymmetric total-variation constraint overcomes this issue by discouraging velocity lows from building up during the early stages of the inversion. To the authors' knowledge, the presented algorithm is the first to successfully remove the imprint of local minima caused by poor starting models and bandwidth-limited, finite-aperture data. † John "Ernie" Esser passed away on March 8, 2015 while preparing this manuscript. The original is posted here: https://www.slim.eos.ubc.ca/content/total-variation-regularization-strategies-full-waveform-inversion-improving-robustness-noise.
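
For concreteness, the constrained formulation sketched above can be written, in notation assumed here rather than taken from the paper, as minimizing the FWI misfit over an intersection of convex sets:

```latex
% Sketch of the constrained formulation (notation assumed): f is the FWI
% data misfit, B a box (bound) constraint, and T_tau a total-variation ball
% whose size tau is successively relaxed; the asymmetric variant replaces
% the gradient norm in T_tau with a one-sided penalty.
\min_{m}\; f(m)
\quad \text{subject to} \quad m \in B \cap T_{\tau},
\qquad
B = \{\, m : b_l \le m \le b_u \,\}, \qquad
T_{\tau} = \{\, m : \|\nabla m\|_{1} \le \tau \,\}.
```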

Reliable amortized variational inference with physics-based latent distribution correction

Geophysics, Apr 13, 2023

Bayesian inference for high-dimensional inverse problems is computationally costly and requires selecting a suitable prior distribution. Amortized variational inference addresses these challenges by pretraining a neural network that approximates the posterior distribution not only for one instance of observed data, but for a distribution of data pertaining to a specific inverse problem. When fed previously unseen data, the neural network (in our case a conditional normalizing flow) provides posterior samples at virtually no cost. However, the accuracy of amortized variational inference relies on the availability of high-fidelity training data, which seldom exists in geophysical inverse problems because of the Earth's heterogeneous subsurface. In addition, the network is prone to errors if evaluated on data that is not drawn from the training data distribution. As such, we propose to increase the resilience of amortized variational inference in the presence of moderate data distribution shifts. We achieve this via a correction to the conditional normalizing flow's latent distribution that improves the approximation to the posterior distribution for the data at hand. The correction involves relaxing the standard Gaussian assumption on the latent distribution and parameterizing it via a Gaussian distribution with an unknown mean and (diagonal) covariance. These unknowns are then estimated by minimizing the Kullback-Leibler divergence between the corrected and the (physics-based) true posterior distributions. While generic and applicable to other inverse problems, by means of a linearized seismic imaging example we show that our correction step improves the robustness of amortized variational inference with respect to changes in the number of seismic sources, noise variance, and shifts in the prior distribution. This approach, given noisy seismic data simulated via linearized Born modeling, provides a seismic image with limited artifacts and an assessment of its uncertainty at approximately the same cost as five reverse-time migrations.
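
In notation assumed here (not the paper's exact symbols), the correction keeps the pretrained conditional normalizing flow f_θ fixed and fits only the mean and diagonal covariance of its latent distribution to the observed data y via a KL objective:

```latex
% Sketch of the physics-based latent correction (notation assumed):
% f_theta is the pretrained conditional normalizing flow, y the observed data,
% and q_{mu,Sigma} the pushforward of the corrected latent distribution.
z \sim \mathcal{N}(\mu, \Sigma), \quad \Sigma = \mathrm{diag}(\sigma^{2}),
\qquad x = f_{\theta}^{-1}(z;\, y),
\qquad
(\hat{\mu}, \hat{\Sigma}) = \arg\min_{\mu,\, \Sigma}\;
\mathrm{KL}\!\left( q_{\mu,\Sigma} \,\|\, p(x \mid y) \right).
```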

A dual formulation of wavefield reconstruction inversion for large-scale seismic inversion

arXiv (Cornell University), Oct 18, 2020

Many of the seismic inversion techniques currently proposed that focus on robustness with respect to the background model choice are not apt for large-scale 3D applications, and the methods that are computationally feasible for industrial problems, such as full waveform inversion, are notoriously bogged down by local minima and require adequate starting models. We propose a novel solution that is both scalable and less sensitive to the starting model or to modeling inaccuracies (e.g. stemming from inaccurate physical parameters that are kept fixed during inversion) when compared to full waveform inversion. It is based on a dual reformulation of the classical wavefield reconstruction inversion, whose empirical robustness with respect to local minima is well documented in the literature. Alas, the classical version is not suited to 3D, as it leverages expensive frequency-domain solvers for the wave equation. Our proposal, instead, allows the deployment of state-of-the-art time-domain finite-difference methods, and is computationally mature for industrial-scale problems.
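
For reference, a sketch of the classical wavefield reconstruction inversion penalty formulation that the dual reformulation starts from, in notation assumed here (P restricts the wavefield u to the receivers, A(m) is the discretized wave equation for model m, q the source, d the data, and λ the penalty parameter):

```latex
% Classical WRI penalty formulation (notation assumed), whose dual
% reformulation is the subject of the paper above.
\min_{m,\, u} \; \tfrac{1}{2}\, \| P u - d \|_2^2
\; + \; \tfrac{\lambda^{2}}{2}\, \| A(m)\, u - q \|_2^2 .
```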

Derisking geologic carbon storage from high-resolution time-lapse seismic to explainable leakage detection

The Leading Edge

Geologic carbon storage represents one of the few truly scalable technologies capable of reducing the CO2 concentration in the atmosphere. While this technology has the potential to scale, its success hinges on our ability to mitigate its risks. An important aspect of risk mitigation concerns assurances that the injected CO2 remains within the storage complex. Among the different monitoring modalities, seismic imaging stands out due to its ability to attain high-resolution and high-fidelity images. However, these superior features come at prohibitive cost and time-intensive effort that potentially render extensive seismic monitoring undesirable. To overcome this shortcoming, we present a methodology in which time-lapse images are created by inverting non-replicated time-lapse monitoring data jointly. By no longer insisting on replication of the surveys to obtain high-fidelity time-lapse images and differences, extreme costs and time-consuming labor are averted. To demonstrate our a...

Optimization strategies for sparseness- and continuity-enhanced imaging: Theory

Two complementary solution strategies to the least-squares migration problem with sparseness and continuity constraints are proposed. The applied formalism explores the sparseness of curvelets on the reflectivity and their invariance under the demigration-migration operator. Sparseness is enhanced by (approximately) minimizing a (weighted) 1-norm on the curvelet coefficients. Continuity along imaged reflectors is brought out by minimizing the anisotropic diffusion or total-variation norm, which penalizes variations along and in between reflectors. A brief sketch of the theory is provided as well as a number of synthetic examples. Technical details on the implementation of the optimization strategies are deferred to an accompanying paper on the implementation.
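
In notation assumed here, the sparseness-enhancing strategy amounts to a curvelet-domain one-norm regularized least-squares migration of the following form (K the demigration/modeling operator, C^H the curvelet synthesis operator, x the curvelet coefficients of the reflectivity r, and d the data):

```latex
% Sketch of the sparseness-enhancing objective (notation assumed):
% curvelet-domain one-norm regularized least-squares migration.
\min_{x} \; \tfrac{1}{2}\, \| d - K C^{H} x \|_2^2 \; + \; \lambda \, \| x \|_1 ,
\qquad \text{reflectivity } r = C^{H} x .
```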

Non-linear data continuation with redundant frames

We propose an efficient iterative data interpolation method using continuity along reflectors in seismic images via curvelet and discrete cosine transforms. The curvelet transform is a new multiscale transform that provides sparse representations for images that comprise smooth objects separated by piece-wise smooth discontinuities (e.g. seismic images). The advantage of using curvelets is that these frames are sparse for high-frequency caustic-free solutions of the wave equation. Since we are dealing with less than ideal data (e.g. bandwidth-limited), we complement the curvelet frames with the discrete cosine transform. The latter is motivated by the successful data continuation with the discrete Fourier transform. By choosing generic basis functions we circumvent the necessity to make parametric assumptions (e.g. through linear/parabolic Radon or demigration) regarding the shape of events in seismic data. Synthetic and real data examples demonstrate that our algorithm provides interpolated traces that accurately reproduce the wavelet shape as well as the AVO behavior along events in shot gathers.

Linear data interpolation

Data continuation has been an important topic in seismic processing, imaging and inversion. For instance, our ability to image accurately and artifact-free, or to predict multiples as part of surface-related multiple elimination, depends for a large part on the availability of densely and equidistantly sampled data. Unfortunately, constraints on marine and land acquisitions preclude such a dense sampling, and several efforts have been made to solve issues related to aliasing and missing traces.
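
As a rough illustration of this kind of transform-domain iterative interpolation, the sketch below recovers missing traces by soft thresholding in a 2-D DCT domain; the DCT stands in for the curvelet/DCT frame pair of the paper, and the operators, threshold schedule, and toy data are all illustrative assumptions.

```python
# Minimal sketch: iterative soft-thresholding interpolation of missing traces
# in a 2-D DCT domain (a stand-in for the curvelet + DCT frames of the paper).
import numpy as np
from scipy.fft import dctn, idctn

def interpolate(data, mask, n_iter=100, lam=0.05):
    """data: 2-D gather with zeroed missing traces; mask: 1 where observed."""
    x = dctn(data, norm="ortho")                   # initial transform coefficients
    for _ in range(n_iter):
        d_est = idctn(x, norm="ortho")             # current estimate in the data domain
        residual = mask * (data - d_est)           # mismatch at observed traces only
        x = x + dctn(residual, norm="ortho")       # gradient step (unit step size)
        thresh = lam * np.max(np.abs(x))
        x = np.sign(x) * np.maximum(np.abs(x) - thresh, 0.0)   # soft thresholding
    return idctn(x, norm="ortho")

# Illustrative usage on random data with roughly half of the traces removed.
rng = np.random.default_rng(0)
full = rng.standard_normal((128, 64))
mask = (rng.random(64) > 0.5).astype(float)[None, :]   # kill whole traces (columns)
recovered = interpolate(full * mask, mask)
```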

Spectral Gap-Based Seismic Survey Design

arXiv (Cornell University), Feb 9, 2022

Seismic imaging in challenging sedimentary basins and reservoirs requires acquiring, processing, and imaging very large volumes of data (tens of terabytes). To reduce the cost of acquisition and the time from acquiring the data to producing a subsurface image, novel acquisition systems based on compressive sensing, low-rank matrix recovery, and randomized sampling have been developed and implemented. These approaches allow practitioners to achieve dense wavefield reconstruction from a substantially reduced number of field samples. However, designing acquisition surveys suited for this new sampling paradigm remains a critical and challenging task in oil, gas, and geothermal exploration. Typical random designs studied in the low-rank matrix recovery and compressive sensing literature are difficult to achieve with standard industry hardware. For practical purposes, a compromise between stochastic and realizable samples is needed. In this paper, we propose a deterministic and computationally cheap tool to guide randomized acquisition design, prior to survey deployment and large-scale optimization. We consider universal and deterministic matrix completion results in the context of seismology, where a bipartite graph representation of the source-receiver layout allows the respective spectral gap to act as a quality metric for wavefield reconstruction. We provide realistic scenarios to demonstrate the utility of the spectral gap as a flexible tool that can be incorporated into existing survey design workflows for successful seismic data acquisition via low-rank and sparse signal recovery.
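
A rough sketch of the kind of metric described above: treat the 0/1 source-receiver sampling mask as the biadjacency matrix of a bipartite graph and use the ratio of its two largest singular values as a proxy for how well the design supports wavefield reconstruction (here, as an assumption, a smaller ratio is read as a better-connected, more incoherent design); the exact normalization and acceptance thresholds used in the paper may differ.

```python
# Sketch: a spectral-gap-type proxy for a source-receiver sampling mask,
# viewed as the biadjacency matrix of a bipartite source-receiver graph.
import numpy as np

def spectral_gap_ratio(mask):
    """Ratio of the two largest singular values of the 0/1 sampling mask."""
    s = np.linalg.svd(mask.astype(float), compute_uv=False)
    return s[1] / s[0]

rng = np.random.default_rng(1)
n_src, n_rec, keep = 200, 150, 0.3
candidate = (rng.random((n_src, n_rec)) < keep).astype(float)   # a candidate design
print(f"spectral gap ratio: {spectral_gap_ratio(candidate):.3f}")
```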

Learned imaging with constraints and uncertainty quantification

arXiv (Cornell University), Sep 13, 2019

We outline new approaches to incorporate ideas from deep learning into wave-based least-squares imaging. The aim, and main contribution of this work, is the combination of handcrafted constraints with deep convolutional neural networks, as a way to harness their remarkable ability to generate natural images. The mathematical basis underlying our method is the expectation-maximization framework, where data are divided into batches and coupled to additional "latent" unknowns. These unknowns are pairs of elements from the original unknown space (but now coupled to a specific data batch) and network inputs. In this setting, the neural network controls the similarity between these additional parameters, acting as a "center" variable. The resulting problem amounts to a maximum-likelihood estimation of the network parameters when the augmented data model is marginalized over the latent variables.

Transfer learning in large-scale ocean bottom seismic wavefield reconstruction

arXiv (Cornell University), Apr 15, 2020

Achieving desirable receiver sampling in ocean bottom acquisition is often not possible because of cost considerations. Assuming adequate source sampling is available, which is achievable by virtue of reciprocity and the use of modern randomized (simultaneous-source) marine acquisition technology, we are in a position to train convolutional neural networks (CNNs) to bring the receiver sampling to the same spatial grid as the dense source sampling. To accomplish this task, we form training pairs consisting of densely sampled data and artificially subsampled data using a reciprocity argument and the assumption that the source-side sampling is dense. While this approach has successfully been used for the recovery of monochromatic frequency slices, its application in practice calls for wavefield reconstruction of time-domain data. Despite having the option to parallelize, the overall costs of this approach can become prohibitive if we decide to carry out the training and recovery independently for each frequency. Because different frequency slices share information, we propose the use of transfer training to make our approach computationally more efficient by warm starting the training with CNN weights obtained from a neighboring frequency slice. If the two neighboring frequency slices share information, we would expect the training to improve and converge faster. Our aim is to prove this principle by carrying out a series of carefully selected experiments on a relatively large-scale five-dimensional synthetic data volume associated with wide-azimuth 3D ocean bottom node acquisition. From these experiments, we observe that transfer training yields a significant speedup in training, especially at higher frequencies where consecutive frequency slices are more correlated.
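
The warm starting described above is, at its core, ordinary transfer learning between per-frequency networks. The sketch below (PyTorch, with a placeholder CNN, a hypothetical training loop, and random stand-in data) shows the idea of initializing the network for one frequency slice with the weights learned on a neighboring slice.

```python
# Sketch: warm-starting the CNN for frequency slice f+1 with the weights learned
# for slice f (generic transfer learning; the actual network and loop differ).
import copy
import torch
import torch.nn as nn

class ReconCNN(nn.Module):                 # placeholder architecture, not the paper's
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 1, 3, padding=1),
        )
    def forward(self, x):
        return self.net(x)

def train(model, subsampled, dense, n_epochs=10, lr=1e-3):
    """Hypothetical loop: map subsampled slices to densely sampled ones."""
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = nn.MSELoss()
    for _ in range(n_epochs):
        opt.zero_grad()
        loss = loss_fn(model(subsampled), dense)
        loss.backward()
        opt.step()
    return model

# Frequency slice f: train from scratch on stand-in data.
model_f = train(ReconCNN(), torch.randn(8, 1, 64, 64), torch.randn(8, 1, 64, 64))

# Frequency slice f+1: warm start from the neighboring slice's weights.
model_f1 = copy.deepcopy(model_f)          # transfer the learned weights
model_f1 = train(model_f1, torch.randn(8, 1, 64, 64), torch.randn(8, 1, 64, 64),
                 n_epochs=3)               # fewer epochs after transfer (assumed)
```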

New insights into one-norm solvers from the Pareto curve

Geophysics, Jul 1, 2008

Geophysical inverse problems typically involve a trade-off between data misfit and some prior. Pareto curves trace the optimal trade-off between these two competing aims. These curves are commonly used in problems with two-norm priors, where they are plotted on a log-log scale and are known as L-curves. For other priors, such as the sparsity-promoting one-norm, Pareto curves remain relatively unexplored. We show how these curves lead to new insights into one-norm regularization. First, we confirm the theoretical properties of smoothness and convexity of these curves from a stylized and a geophysical example. Second, we exploit these crucial properties to approximate the Pareto curve for a large-scale problem. Third, we show how Pareto curves provide an objective criterion to gauge how different one-norm solvers advance towards the solution.
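
In the notation commonly used for this problem (assumed here), the Pareto curve traces the optimal data misfit as a function of the one-norm budget τ:

```latex
% Pareto curve for one-norm regularized inversion (notation assumed):
% the optimal misfit phi as a function of the one-norm budget tau.
\phi(\tau) \;=\; \min_{x} \; \| A x - b \|_2
\quad \text{subject to} \quad \| x \|_1 \le \tau .
```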

Optimization on the Hierarchical Tucker manifold - applications to tensor completion

arXiv (Cornell University), May 8, 2014

In this work, we develop an optimization framework for problems whose solutions are well approximated by Hierarchical Tucker (HT) tensors, an efficient structured tensor format based on recursive subspace factorizations. By exploiting the smooth manifold structure of these tensors, we construct standard optimization algorithms such as Steepest Descent and Conjugate Gradient for completing tensors from missing entries. Our algorithmic framework is fast and scalable to large problem sizes, as we do not require SVDs on the ambient tensor space, as required by other methods. Moreover, we exploit the structure of the Gramian matrices associated with the HT format to regularize our problem, reducing overfitting for high subsampling ratios. We also find that the organization of the tensor can have a major impact on completion from realistic seismic acquisition geometries. These samplings are far from the idealized randomized samplings that are usually considered in the literature but are realizable in practical scenarios. Using these algorithms, we successfully interpolate large-scale seismic data sets and demonstrate the competitive computational scaling of our algorithms as the problem sizes grow.
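
In notation assumed here, the completion problem addressed by these manifold algorithms takes the form (X the tensor to recover, B the observed data, P_Ω the sampling operator onto the observed entries, and H_k the manifold of HT tensors of fixed rank k):

```latex
% Tensor completion on the fixed-rank Hierarchical Tucker manifold
% (notation assumed).
\min_{X \in \mathcal{H}_k} \; \tfrac{1}{2}\, \| P_{\Omega}(X) - P_{\Omega}(B) \|_F^2 .
```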

High resolution imaging of the Earth with adaptive full-waveform inversion

AGU Fall Meeting Abstracts, Dec 1, 2014

Compressive time-lapse seismic monitoring of carbon storage and sequestration with the joint recovery model

arXiv (Cornell University), Apr 14, 2021

Time-lapse seismic monitoring of carbon storage and sequestration is often challenging because the time-lapse signature of the growth of CO2 plumes is weak in amplitude and therefore difficult to detect seismically. This situation is compounded by the fact that the surveys are often coarsely sampled and not replicated to reduce costs. As a result, images obtained for different vintages (baseline and monitor surveys) often contain artifacts that may be wrongly attributed to time-lapse changes. To address these issues, we propose to invert the baseline and monitor surveys jointly. By using the joint recovery model, we exploit information shared between multiple time-lapse surveys. Contrary to other time-lapse methods, our approach does not rely on replicating the surveys to detect time-lapse changes. To illustrate this advantage, we present a numerical sensitivity study where CO2 is injected in a realistic synthetic model. This model is representative of the geology in the southeast of the North Sea, an area currently considered for carbon sequestration. Our example demonstrates that the joint recovery model improves the quality of time-lapse images, allowing us to monitor the CO2 plume seismically.
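
In notation assumed here (following the general structure of joint recovery models for time-lapse data rather than the paper's exact symbols), the baseline and monitor vintages are parameterized by a shared common component plus vintage-specific innovations and are inverted jointly:

```latex
% Sketch of a joint recovery model for two vintages (notation assumed):
% z_0 is the common component, z_1, z_2 the innovations, A_1, A_2 the
% (non-replicated) observation operators, and b_1, b_2 the observed data.
\begin{bmatrix} b_1 \\ b_2 \end{bmatrix}
=
\begin{bmatrix} A_1 & A_1 & 0 \\ A_2 & 0 & A_2 \end{bmatrix}
\begin{bmatrix} z_0 \\ z_1 \\ z_2 \end{bmatrix},
\qquad
x_1 = z_0 + z_1, \quad x_2 = z_0 + z_2 .
```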

A Large-Scale Time-Domain Seismic Modeling and Inversion Workflow in Julia

We present our initial steps towards the development of a large-scale seismic modeling workflow in Julia that provides a framework for wave-equation-based inversion methods like full-waveform inversion or least-squares migration. Our framework is based on Devito, a finite-difference domain-specific language and compiler that generates highly optimized and parallel code. We develop a flexible workflow that is based on abstract matrix-free linear operators and enables developers to write code that closely resembles the underlying math, while at the same time leveraging highly optimized wave equation solvers, allowing us to solve large-scale three-dimensional inverse problems.
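
The abstract matrix-free linear operators mentioned above can be illustrated, as a Python/SciPy stand-in for the Julia framework described in the paper, by hiding forward and adjoint modeling behind a LinearOperator interface so that inversion code reads like linear algebra; the modeling functions below are hypothetical placeholders, not the actual wave-equation solvers.

```python
# Sketch of the abstract matrix-free operator idea (SciPy stand-in for the
# Julia framework): forward/adjoint modeling wrapped as a linear operator.
import numpy as np
from scipy.sparse.linalg import LinearOperator, lsqr

n_model, n_data = 1000, 5000

def born_forward(dm):            # hypothetical linearized (Born) modeling
    rng = np.random.default_rng(0)
    J = rng.standard_normal((n_data, n_model)) / np.sqrt(n_model)
    return J @ dm

def born_adjoint(residual):      # hypothetical adjoint (reverse-time migration)
    rng = np.random.default_rng(0)
    J = rng.standard_normal((n_data, n_model)) / np.sqrt(n_model)
    return J.T @ residual

J = LinearOperator((n_data, n_model), matvec=born_forward, rmatvec=born_adjoint)

# Least-squares migration then reads like plain linear algebra: min ||J dm - d||_2.
d_obs = born_forward(np.ones(n_model))
dm_est = lsqr(J, d_obs, iter_lim=20)[0]
```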

Low-rank based matrix/tensor completions for the “real” (seismic) world

Non UBC; unreviewed; author affiliation: University of British Columbia, Facult

Enabling Wave-Based Inversion on GPUs with Randomized Trace Estimation

83rd EAGE Annual Conference & Exhibition, 2022

By building on recent advances in the use of randomized trace estimation to drastically reduce the memory footprint of adjoint-state methods, we present and validate an imaging approach that can be executed exclusively on accelerators. Results obtained on field-realistic synthetic datasets, which include salt and anisotropy, show that our method produces high-fidelity images. These findings open the enticing perspective of 3D wave-based inversion technology with a memory footprint that matches the hardware and that runs exclusively on clusters of GPUs, without the undesirable need to offload certain tasks to CPUs. Preprint. Under review.
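
The randomized trace estimation referred to above is Hutchinson-type probing, which approximates a trace from a small number of random probe vectors; the sketch below shows the generic estimator on an explicit matrix and illustrates the principle only, not the paper's wave-equation implementation.

```python
# Sketch: Hutchinson randomized trace estimation with Rademacher probes.
# tr(A) ~ (1/k) * sum_i z_i^T A z_i, where the z_i have +/-1 entries.
import numpy as np

def hutchinson_trace(matvec, n, k=32, rng=None):
    """Estimate tr(A) from k probes, given only the matrix-vector product."""
    if rng is None:
        rng = np.random.default_rng(0)
    total = 0.0
    for _ in range(k):
        z = rng.choice([-1.0, 1.0], size=n)     # Rademacher probe vector
        total += z @ matvec(z)
    return total / k

rng = np.random.default_rng(2)
A = rng.standard_normal((500, 500))
A = A @ A.T                                      # symmetric PSD test matrix
print(np.trace(A), hutchinson_trace(lambda v: A @ v, 500, k=64))
```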

Curvelet imaging and processing: an overview

In this paper, an overview is given of the application of directional basis functions, known under the name Curvelets/Contourlets, to various aspects of seismic processing and imaging. Key concepts in the approach are the use of (i) directional basis functions that localize in both domains (e.g. space and angle); (ii) non-linear estimation, which corresponds to localized muting on the coefficients, possibly supplemented by constrained optimization; and (iii) invariance of the basis functions under the imaging operators. We will discuss applications that include multiple and ground roll removal, sparseness-constrained least-squares migration, and the computation of 4-D difference cubes.

Curvelet applications in surface wave removal

Curvelet Applications in Surface Wave Removal. Carson Yarham, Gilles Hennenfent, and Felix J. Herrmann (Seismic Laboratory for Imaging and Modeling, Department of Earth and Ocean Sciences, The University of British Columbia).

SUMMARY: Ground roll removal of seismic signals can be a challenging prospect, dealing with undersampling that causes aliased waves, amplitudes orders of magnitude higher than reflector signals, and loss of low-frequency information due to band

Preconditioning the Helmholtz Equation via Row-projections
