Skip to content

RTO-TKO

RTO-TKO is a stochastic framework introduced in the electromagnetic geophysical community by Blatter et al., 2022 (a and b)

Similar to fixed discretization scheme, the grid sizes do not change. RTO-TKO tries to obtain the samples of the posterior distribution by unrolling the optimization for the misfit function, that is, it perturbs the model space obtained after one iteration before optimizing for the next. It does this iteratively for both the values of the model space and the regularization coefficient in alternative steps.

Therefore, if we have

m=[m1,m2,m3,...,mN]

and

d=F(m)

Then for Cd, the inverse of data covariance matrix, and Cm the inverse of model covariance matrix, we want to explore uncertainty for the following misfit function:

J(m)=[F(m)d]TCd[F(m)d]+μmTCmm

where μ is the regularization weight and L is the derivative matrix. Probabilistically, the above equation implies that the a priori distribution of m is N(0,Cm). RTO-TKO explores the uncertainty in m, as well as the μ-space instead of fixing it, and gives a family of models that fit the data.

The algorithm was proposed for Cm constructed with LL, where L is the discrete derivative matrix. The algorithm works as:

Solving forJ(m)=[F(m)d]TCd[F(m)d]+μ(Lm)T(Lm)1)Solve for mi+1Sample d~N(d,Cd) and m~N(0,1μ(LTL))Solvemi+1=\argminmi+1[F(mi+1)d~]TCd[F(mi+1)d~]+μi[L(mi+1m~)]T[L(mi+1m~)]2)Solve for μi+1Sample d~N(d,Cd)Solveμi+1=\argminμi+1[F(mi+1)d~]TCd[F(mi+1)d~]log(p(μi+1))

Note

  • Usually, the prior of μ is a uniform distribution and we do not have to compute the corresponding log pdf term

  • Implementing the above from scratch might not be trivial because of LL being non-invertible, and we do the optimization in the domain defined by ξ=μLm. Such a variable will then have a standard normal distribution N(0,I) when mN(0,1μ(LTL))

In the following section, we demonstrate RTO-TKO for a 6-layered earth, including the half-space being imaged using the MT method. The prior distribution is composed of 26 layers, with the initial model being a half-space.

Demo

Let's create a synthetic dataset first, with 10% error floors:

julia
m_test = MTModel(log10.([100.0, 10.0, 1000.0]), [1e3, 1e3])
f = 10 .^ range(-4; stop=1, length=25)
ω = vec( .* f)

r_obs = forward(m_test, ω)

err_phi = asin(0.02) * 180 / π .* ones(length(ω))
err_appres = 0.1 * r_obs.ρₐ
err_resp = MTResponse(err_appres, err_phi)

We don't need to construct the a priori the same way as before. Instead, we need to define the optimiser we'd use. Here, we use Occam, and an initial model composed of half-space of 100 Ωm.

julia
z = collect(0:100:2.5e3)
h = diff(z)
m_rto = MTModel(2 .* ones(length(z)), vec(h))

n_samples = 100

r_cache = rto_cache(
    m_rto, [1e-2, 1e4], Occam(), n_samples, n_samples, 1.0, [:ρₐ, ], false)

mt_chain,
μ_chain = stochastic_inverse(
    r_obs, err_resp, ω, r_cache; model_trans_utils=(; m=sigmoid_tf))
(MCMC chain (99×26×1 reshape(adjoint(::Matrix{Float64}), 99, 26, 1) with eltype Float64), MCMC chain (99×1×1 Array{Float64, 3}))

Note that the chain contains fewer samples than we had asked for. This is because a few unstable samples were filtered out. The obtained mt_chain contains the a posteriori distributions that can be saved using JLD2.jl.

julia
using JLD2
JLD2.@save "file_path.jld2" mt_chain

Since RTO-TKO is closely related to Occam, we also include occam results in our figures below. Also, note that we did not require an a priori distribution to obtain samples. However, we would need to create the same to plot our posterior samples. This can just be something that encapsulates the whole prior space (or an envelope of the same), e.g., in the case of magnetotelluric imaging, we know that the resistivity values will always be in [-2, 5] on the log-scale.

julia
modelD = MTModelDistribution(Product([Uniform(-1.0, 5.0) for i in eachindex(z)]), vec(h))
Code for this figure
julia
fig = Figure()
ax = Axis(fig[1, 1])
hm = get_kde_image!(ax, mt_chain, modelD; kde_transformation_fn=log10, colormap=:binary,
    colorrange=(-3.0, 0.0), trans_utils=(m=no_tf, h=no_tf))
Colorbar(fig[1, 2], hm; label="log pdf")

mean_kws = (; color=:seagreen3, linewidth=2)
std_kws = (; color=:red, linewidth=1.5)
get_mean_std_image!(
    ax, mt_chain, modelD; confidence_interval=0.99, trans_utils=(m=no_tf, h=no_tf),
    mean_kwargs=mean_kws, std_plus_kwargs=std_kws, std_minus_kwargs=std_kws)
ylims!(ax, [2500, 0])

plot_model!(ax, m_test; color=:black, linestyle=:dash, label="true", linewidth=2)
Legend(fig[2, :], ax; orientation=:horizontal)

The list of models can then be obtained from chains using

julia
model_list = get_model_list(mt_chain, modelD)

We can then easily check the fit of the response curves

julia
fig = Figure()
ax1 = Axis(fig[1, 1])
ax2 = Axis(fig[1, 2])

resp_post = forward(model_list[1], ω);
for i in 1:(length(model_list) > 100 ? 100 : length(model_list))
    forward!(resp_post, model_list[i], ω)
    plot_response!([ax1, ax2], ω, resp_post; alpha=0.4, color=:gray)
end
plot_response!([ax1, ax2], ω, r_obs; errs=err_resp, plt_type=:errors, whiskerwidth=10)
plot_response!([ax1, ax2], ω, r_obs; plt_type=:scatter, label="true")
ylims!(ax1, exp10.([1.2, 3.1]))
ylims!(ax2, [12, 70])

fig

Warning

It is recommended that the samples from RTO-TKO can be filtered out to reject the samples that have a poor fit on the data.