Publication: A cross-validation framework to extract data features for reducing structural uncertainty in subsurface heterogeneity

in Advances in Water Resources Vol 133 (November 2019)
by Jorge Lopez-Alvis, Thomas Hermans, Frédéric Nguyen


Spatial heterogeneity is a critical issue in the management of water resources. However, most studies do not consider uncertainty at different levels in the conceptualization of subsurface patterns; for example, they use a single geological scenario to generate an ensemble of realizations.

In this paper, we represent the spatial uncertainty using hierarchical models in which higher-level parameters control the structure. The uncertainty in such higher-level structural parameters may be reduced with observation data by updating the complete hierarchical model, but this is, in general, computationally challenging.

To address this, methods have been proposed that update these structural parameters directly: they extract informative lower-dimensional representations of the data, called data features, and apply a statistical estimation technique to these features.

The difficulty of such methods, however, lies in the choice and design of data features, i.e. their extraction function and their dimensionality, which have been shown to be case-dependent. Therefore, we propose a cross-validation framework to properly assess the robustness of each designed feature and make the choice of the best feature more objective. Such a framework also helps in choosing the values for the parameters of the statistical estimation technique, such as the bandwidth for kernel density estimation.
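The selection step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the two candidate features are synthetic stand-ins, the prior over scenarios is taken as uniform, the score is the mean log posterior probability of the true scenario on held-out samples, and the names `cv_score`, `scenario_posterior` and `candidates` are hypothetical.

```python
import math
import random

def gaussian_kde(x, samples, bandwidth):
    """Gaussian kernel density estimate at x from 1-D samples."""
    norm = len(samples) * bandwidth * math.sqrt(2.0 * math.pi)
    return sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples) / norm

def scenario_posterior(x, train, bandwidth):
    """P(scenario | feature value x), assuming a uniform prior over scenarios."""
    scenarios = sorted({s for s, _ in train})
    dens = {k: gaussian_kde(x, [f for s, f in train if s == k], bandwidth)
            for k in scenarios}
    total = sum(dens.values())
    if total == 0.0:  # every kernel underflowed: fall back to the prior
        return {k: 1.0 / len(scenarios) for k in scenarios}
    return {k: v / total for k, v in dens.items()}

def cv_score(samples, bandwidth, n_folds=5):
    """Mean log posterior probability of the true scenario on held-out samples."""
    folds = [samples[i::n_folds] for i in range(n_folds)]
    log_probs = []
    for i, test in enumerate(folds):
        train = [p for j, fold in enumerate(folds) if j != i for p in fold]
        for true_scenario, x in test:
            post = scenario_posterior(x, train, bandwidth)
            log_probs.append(math.log(max(post.get(true_scenario, 0.0), 1e-12)))
    return sum(log_probs) / len(log_probs)

# Synthetic stand-ins for two designed features (assumption, not real data):
# one separates the scenarios "A" and "B", the other carries no information.
rng = random.Random(0)
labels = ["A", "B"] * 100
rng.shuffle(labels)
candidates = {
    "informative": [(s, rng.gauss(0.0 if s == "A" else 3.0, 1.0)) for s in labels],
    "uninformative": [(s, rng.gauss(0.0, 1.0)) for s in labels],
}
bandwidths = [0.1, 0.3, 1.0, 3.0]

# Score every (feature, bandwidth) pair and keep the best one.
scores = {(name, h): cv_score(data, h)
          for name, data in candidates.items() for h in bandwidths}
best_feature, best_bandwidth = max(scores, key=scores.get)
print(best_feature, best_bandwidth)
```

Scoring the feature and the bandwidth jointly keeps the comparison fair, since each feature is judged at the bandwidth that suits it best.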

We demonstrate the approach on a synthetic case with cross-hole ground penetrating radar traveltime data and two higher-level structural parameters: discrete geological scenarios and the continuous preferential orientation of channels.

With the best-performing features selected according to the cross-validation score, we successfully reduce the uncertainty for these structural parameters in a computationally efficient way. While doing so, we also provide guidelines for designing features that account for the level of knowledge of the studied system.


Abstract: Updating structural uncertainty through dimension reduction of geophysical data

Event: AGU Fall Meeting, Washington DC (USA), 2018
Abstract by Frédéric Nguyen, Jorge Lopez-Alvis, Thomas Hermans


When studying uncertainty in geosciences we usually have to deal with: (1) very high dimensionality in both data and parameters, (2) complex models to describe the subsurface, and (3) nonlinear forward models to simulate data.

All these facts hinder the quantification of uncertainty as the estimation of the corresponding joint probability distribution is not straightforward. In this study, we follow a Bayesian approach to obtain the marginal probability distribution of structural parameters given the geophysical data. We approximate this distribution with a combination of Monte Carlo sampling, data dimension reduction and kernel density estimation. We use synthetic GPR cross-borehole traveltime data with added noise and consider two structural parameters: (1) geological scenario, a discrete parameter, and (2) preferential orientation, a continuous parameter.
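For the continuous parameter, the combination of Monte Carlo sampling and kernel density estimation can be illustrated with the rough sketch below. It assumes a one-dimensional feature, Gaussian product kernels with hand-picked bandwidths, and a toy linear relation between orientation and feature that stands in for the geostatistical simulation, the forward model and the dimension reduction; none of these choices come from the abstract.

```python
import math
import random

def gauss(u):
    """Unnormalized Gaussian kernel."""
    return math.exp(-0.5 * u * u)

def posterior_on_grid(theta_grid, samples, feature_obs, h_theta, h_feature):
    """Approximate p(theta | feature_obs) on a regular grid from (theta, feature) samples.

    The joint density is estimated with a product of Gaussian kernels and then
    sliced at the observed feature value and renormalized.
    """
    weights = [gauss((feature_obs - f) / h_feature) for _, f in samples]
    dens = [sum(w * gauss((t - th) / h_theta)
                for w, (th, _) in zip(weights, samples))
            for t in theta_grid]
    step = theta_grid[1] - theta_grid[0]
    area = sum(dens) * step
    return [d / area for d in dens]

# Monte Carlo samples: orientation drawn from a uniform prior (degrees),
# feature produced by a hypothetical toy relation with additive noise.
rng = random.Random(1)
samples = []
for _ in range(2000):
    theta = rng.uniform(0.0, 90.0)
    feature = theta / 10.0 + rng.gauss(0.0, 0.5)
    samples.append((theta, feature))

theta_grid = [float(t) for t in range(0, 91)]
feature_obs = 4.0  # feature of the "observed" data, consistent with theta near 40
posterior = posterior_on_grid(theta_grid, samples, feature_obs,
                              h_theta=3.0, h_feature=0.3)

step = theta_grid[1] - theta_grid[0]
posterior_mean = sum(t * p for t, p in zip(theta_grid, posterior)) * step
print(round(posterior_mean, 1))
```

The discrete scenario can be handled in the same way by estimating one feature density per scenario and normalizing across scenarios.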

The focus of this work is on comparing different dimension reduction techniques to assess which one gives the most accurate estimation of structural uncertainty. We generate the Monte Carlo samples by starting with the structural parameter, then applying multiple-point geostatistics to obtain the subsurface realization, and finally simulating traveltime data with the geophysical forward model. We follow four different strategies: (1) apply multidimensional scaling directly to the traveltime data, (2) use histograms of traveltime data as features and then apply multidimensional scaling to further reduce dimensions, (3) transform data into geophysical images by means of regularized inversion and then apply multidimensional scaling, and (4) obtain connectivity features from these geophysical images and then apply multidimensional scaling.
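Strategy (2) can be illustrated with a small, self-contained sketch: each data set is summarized by a normalized histogram, and classical multidimensional scaling then maps the histograms to a low-dimensional coordinate. The assumptions are loud ones: the traveltimes are random numbers rather than simulated GPR data, only the first MDS coordinate is computed (by power iteration on the double-centered squared-distance matrix), and the function names are hypothetical.

```python
import math
import random

def histogram_feature(values, lo, hi, n_bins):
    """Normalized histogram of values over [lo, hi]; outliers go to the end bins."""
    counts = [0] * n_bins
    width = (hi - lo) / n_bins
    for v in values:
        k = min(max(int((v - lo) / width), 0), n_bins - 1)
        counts[k] += 1
    return [c / len(values) for c in counts]

def classical_mds_1d(points, n_iter=200):
    """First classical-MDS coordinate of points, using Euclidean distances."""
    n = len(points)
    d2 = [[sum((a - b) ** 2 for a, b in zip(p, q)) for q in points] for p in points]
    row = [sum(r) / n for r in d2]
    total = sum(row) / n
    # Double centering: B = -0.5 * J * D^2 * J
    b = [[-0.5 * (d2[i][j] - row[i] - row[j] + total) for j in range(n)]
         for i in range(n)]
    rng = random.Random(0)
    v = [rng.gauss(0.0, 1.0) for _ in range(n)]
    eigval = 0.0
    for _ in range(n_iter):  # power iteration for the leading eigenpair of B
        w = [sum(bij * vj for bij, vj in zip(bi, v)) for bi in b]
        eigval = math.sqrt(sum(x * x for x in w))
        if eigval == 0.0:
            break
        v = [x / eigval for x in w]
    return [math.sqrt(eigval) * x for x in v]

# Toy "traveltime" data sets (assumption): two scenarios with different means.
rng = random.Random(2)
labels = ["A"] * 20 + ["B"] * 20
datasets = [[rng.gauss(10.0 if s == "A" else 13.0, 1.0) for _ in range(50)]
            for s in labels]

features = [histogram_feature(d, lo=5.0, hi=18.0, n_bins=13) for d in datasets]
coords = classical_mds_1d(features)

coords_a = [c for c, s in zip(coords, labels) if s == "A"]
coords_b = [c for c, s in zip(coords, labels) if s == "B"]
print(min(coords_a), max(coords_a), min(coords_b), max(coords_b))
```

In this toy setting the two scenarios end up on opposite sides of the first coordinate, which is the kind of separation a subsequent density estimate relies on.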

Using defined features results in the most accurate estimates of structural uncertainty as measured through cross-validation, for both the histogram and the connectivity features, but working with the former is computationally more efficient since it does not require obtaining the geophysical image. Data dimension reduction is useful when approximating the marginal probability distribution of structural parameters, but the accuracy of this distribution depends on the ability of the dimension reduction technique to retain the most informative part of the data with respect to the parameter of interest.
