FEM for 1D magnetotelluric modeling
Introduction
Maxwell’s equations
The theory for understanding the behaviour of electromagnetic waves often begin with their behaviour in vacuum, often governed by a set of four equations called Maxwell’s equations:
\begin{equation} \nabla \cdot E= -\rho_e/\epsilon_0 \label{Coulomb} \end{equation} \begin{equation} \nabla \times E= -\mu_0 \partial_t H \label{Faraday} \end{equation} \begin{equation} \nabla \cdot H= 0 \label{Gauss} \end{equation} \begin{equation} \nabla \times H= J+ \epsilon_0\partial_tE \label{Ampere} \end{equation}
where \(E\) is the electric field, \(H\) is the magnetic field, \(M_s\) is the magnetic field source, \(\rho\) is resistivity, \(\rho_e\) is the charge density, \(J\) is current density, \(\mu_0\) is the magnetic permeability of the free space, and \(\epsilon_0\) is the electrical permittivity of the free space. In the above set of equations, eqn. \eqref{Coulomb} is known as Coulomb’s law of Gauss’ law for electric fields, which describes the electric field generated by a charge distribution. However, in a medium, the Maxwell equations are also accompanied by the constitutive relations. In our case, while investigating the behaviour of electromagnetic fields inside the earth, only one relationship makes an innegligible difference: \(\begin{equation} J= \sigma E \label{Ohm} \end{equation}\\\)
The above is referred as the Ohm’s law, which states that the electric current density \(J\) in a medium is directly proportional to the electric field \(E\) across it with the constant of proportiaonality being the electrical conductivity of the material \(\sigma\). Substituting this in eqn \eqref{Ampere}, we get \begin{equation} \nabla \times H= \sigma E+ \epsilon_0\partial_tE \label{Ampere_Ohm} \end{equation}
Beginning our derivation of the 1D MT partial diffferential equation, we take the curl of \eqref{Faraday},
\begin{equation} \nabla \times \nabla \times E= -\mu_0 \partial_t \nabla \times H \end{equation}
Substituting equation \eqref{Ampere} and making use of \eqref{Ohm}, we get
\[\nabla \times \nabla \times E= -\mu_0 \sigma \partial_t E - \epsilon_0 \partial_t^2E\]Taking advantage of the vector identity \(\nabla \times \nabla \times A= \nabla(\nabla \cdot A)- \nabla^2 A\), this simplifies into
\[\nabla(\nabla \cdot E)- \nabla^2 E= -\mu_0 \sigma \partial_t E - \epsilon_0 \partial_t^2E\]Now substituting \eqref{Coulomb} in the above, we get
\[\nabla(\rho/\epsilon_0)- \nabla^2 E= -\mu_0 \sigma \partial_t E - \epsilon_0 \partial_t^2E\]Noting that charge density does not change a lot in time, the above simplifies to \begin{equation} - \nabla^2 E= -\mu_0 \sigma \partial_t E - \epsilon_0 \partial_t^2E
\label{pde_E} \end{equation} Taking the Fourier transform of the above with respect to \(t\), we have
For the order of frequencies used in the MT method, the second term in the above expression is negligible compared to the first one. This is called the quasi-static approximation. Using this relation, we have
\[- \nabla^2 E= -i \omega \mu_0 \sigma E\] \[\nabla^2 E -i \omega \mu_0 \sigma E = 0\]In 1D, this is equivalent to
\[\begin{align} \notag & \frac{\partial^2 E}{\partial^2 z}- i \omega \mu \sigma E =0 \\ \notag & \texttt{For} \quad k^2= i \omega \mu \sigma \\ & \frac{\partial^2 E}{\partial^2 z}- k^2 E=0 \label{E2k} \end{align}\]Methodology
Getting to the linear system
The idea of the finite element method is to approximate a function in some basis space. To implement the scheme, we need to write the eqn \ref{E2k} in its weak form. In order to do that, we multiply both sides of the equation by any function \(W(z)\), called the weight function.
\[\begin{align} \notag & \frac{\partial^2 E}{\partial^2 z} W- k^2 E W=0 \\ \notag & \texttt{Integrating bothe sides wrt} \quad z \\ \notag & \int_{\Omega} \frac{\partial^2 E}{\partial^2 z} W dz- \int_{\Omega} k^2 E W dz=0 \\ \notag & \texttt{where} \; \Omega \; \texttt{is the computational domain, spanning the model space.} \\ \notag & \texttt{Integrating the first term by parts, we have} \\ \notag & - \int_{\Omega} \frac{\partial E}{\partial z} \frac{\partial W}{\partial z} dz+ \bigg|W \frac{\partial E}{\partial z}\bigg|_{d \Omega}- \int_{\Omega} k^2 E W dz=0 \end{align}\]The second term in the above equation \(\bigg|W \frac{\partial E}{\partial z}\bigg|_{d \Omega}\) is equivalent to \(\bigg|W \frac{\partial E}{\partial z}\bigg|_{0}^Z\), that is, an expression evaluated at the end points. These end points are used to apply the boundary conditions. In the above conditions, we set \(W\)= 0 at the boundaries. This means that the boundary condition (Neumann or Dirichlet) is strictly imposed which we will show later in detail and therefore the residue is zero. This allows us to apply the Dirichlet boundary conditions, which is what we require for the MT problems. This also makes the second term equal to 0 and we have \begin{equation} \int_{\Omega} \frac{\partial E}{\partial z} \frac{\partial W}{\partial z} dz+ \int_{\Omega} k^2 E W dz=0 \label{E2} \end{equation} Before we move on, we need to understand and implement the first line of this section, that is, express the \(E\) field using some known basis functions, say \(\phi(z)\). We can then write \begin{equation} E= \sum_i \alpha_i \phi_i + \phi_0 \label{E_hat} \end{equation} where \(\phi_i(z)\) represents the basis function in the \(i^{th}\) element scald by a factor \(\alpha_i\). For our implementation, we use the linear basis functions, more popularly called as the hat functions. This shows the hat functions for a grid-space defined between 0 and 1. We use 6 hat functions in the figure, each centered on its own node, eg. the function \(\phi_1(z)\) is centered at the node 1 at \(z= 0.2\). It starts with a positive slope at \(z=0\), peaks at \(z=0.2\) and then slopes down until \(z=0.4\).
Using eqn \ref{E_hat} in \ref{E2}, we have \begin{equation} \int_{\Omega} \sum_i \alpha_i \phi_i’ \frac{\partial W}{\partial z} dz+ \int_{\Omega} k^2 \sum_i \alpha_i \phi_i W dz=0 \label{E3} \end{equation} The finite element method also tries to approximate the weight function by expressing it in terms of some basis functions, much like what we have for \(E\) in eqn \ref{E2}. In principle, we can choose any basis function to approximate \(W\) but the most common and efficient way is by using Galerkin’s method, that is, by using the same basis used for \(E\) can be used for \(W\). We then have \begin{equation} W= \sum_j \beta_j \phi_j \label{What} \end{equation} Pulling out the \(j^{th}\) function from the above and plugging into eqn \ref{E3}, we have
\[\begin{align} \notag & \int_{\Omega} \sum_i \alpha_i \phi_i' \beta_j \phi_j' dz+ \int_{\Omega} k^2 \sum_i \alpha_i \phi_i \beta_j \phi_j dz=0 \\ \notag & \beta_j \; \texttt{being a scalar constant} \\ & \int_{\Omega} \sum_i \alpha_i \phi_i(z)' \phi_j(z)' dz+ \int_{\Omega} k^2 \sum_i \alpha_i \phi_i(z) \phi_j(z) dz=0 \label{E4} \end{align}\]Plugging different values \(j\), we can obtain the system of equations to be solve for \(\alpha_i\)’s. Let’s say we have the grid defined for 5 elements, like here. The eqn \ref{E4} then looks like \(\begin{align} \notag & (j=0) \\ \notag & \int_0^1 \alpha_0 \phi_0' \phi_0'+ \alpha_1 \phi_1' \phi_0'+ \dots \alpha_5 \phi_5' \phi_0' dz+ \int_0^1 k^2 \alpha_0 \phi_0 \phi_0+ \alpha_1 \phi_1 \phi_0+ \dots \alpha_5 \phi_5 \phi_0 dz=0 \\ \notag & (j=1) \\ \notag & \int_0^1 \alpha_0 \phi_0' \phi_1'+ \alpha_1 \phi_1' \phi_1'+ \dots \alpha_5 \phi_5' \phi_1' dz+ \int_0^1 k^2 \alpha_0 \phi_0 \phi_1+ \alpha_1 \phi_1 \phi_1+ \dots \alpha_5 \phi_5 \phi_1 dz=0 \\ \notag \vdots \\ \notag & (j=5) \\ \notag & \int_0^1 \alpha_0 \phi_0' \phi_5'+ \alpha_1 \phi_1' \phi_5'+ \dots \alpha_5 \phi_5' \phi_5' dz+ \int_0^1 k^2 \alpha_0 \phi_0 \phi_5+ \alpha_1 \phi_1 \phi_5+ \dots \alpha_5 \phi_5 \phi_5 dz=0 \\ \end{align}\)
Before, we attempt to get the linear system of equations, it would be helpful for notation purposes if we define
\[\begin{align*} A_{ji}= \int_0^1 \phi_i'(z) \phi_j'(z) dz \\ B_{ji}= \int_0^1 k^2 \phi_i(z) \phi_j(z) dz \\ \end{align*}\]We then have
\[\begin{equation}\label{E6} \begin{split} & (j=0) \\ & [ A_{00} \alpha_0+ A_{01} \alpha_1+ \dots A_{05} \alpha_5]+ [B_{00} \alpha_0+ B_{01} \alpha_1+ \dots B_{05} \alpha_5]= 0 \\ & (j=1) \\ & [ A_{10} \alpha_0+ A_{11} \alpha_1+ \dots A_{15} \alpha_5]+ [B_{10} \alpha_0+ B_{11} \alpha_1+ \dots B_{15} \alpha_5]= 0 \\ \dots \\ & (j=5) \\ & [ A_{50} \alpha_0+ A_{51} \alpha_1+ \dots A_{55} \alpha_5]+ [B_{50} \alpha_0+ B_{51} \alpha_1+ \dots B_{55} \alpha_5]= 0 \end{split} \end{equation}\]With \(A_{ij}\) as the elements of the matrix \(A\) and \(B_{ij}\) as the elements of the matrix \(B\), and \(\alpha\) as the vector of \(\alpha_i\)’s, we can write the above as:
\[[A+B]\alpha = 0\]Getting elements of the linear system
While the eqn \eqref{E6} seems to involve a lot of calculations, in reality it is very easy. The following figure shows the functions for \(\phi_0, \phi_1, \phi_4 and \phi_5\).
Before evaluating the explicit integrals, it is noteworthy that the integrals will be non-zero only when both the functions in a region are non-zero. If either of them is zero, the product vanishes.
\[\begin{align*} A_{10} &= \int_0^1 \phi_0'(z) \phi_1'(z) dz \\ &= \int_0^{\Delta x} \phi_0'(z) \phi_1'(z) dz \quad \texttt{(The product vanishes everywhere else)}\\ &= \phi_0'(z) \phi_1'(z) \int_0^{\Delta x} dz= \Delta x \times \frac{1}{\Delta x} \times \frac{-1}{\Delta x} = \frac{-1}{\Delta x}\\ A_{11} &= \int_0^1 \phi_0'(z) \phi_0'(z) dz \\ &= \int_0^{2 \Delta x} \phi_1'(z)^2 dz \quad \texttt{(The product vanishes everywhere else)}\\ &= \phi_1'(z)^2 \int_0^{2 \Delta x} dz= \Delta x \times \frac{1}{(\Delta x)^2} = \frac{1}{\Delta x}\\ A_{51} &= \int_0^1 \phi_1'(z) \phi_5'(z) dz \\ &= 0 \quad \texttt{(The product vanishes everywhere)}\\ \end{align*}\]Similarly, we have for \(B\) (note that \(k^2\) is constant in an element) \(\begin{align*} B_{10} &= \int_0^1 k^2 \phi_0(z) \phi_1(z) dz \\ &= \int_0^{\Delta x} k^2 \phi_0(z) \phi_1(z) dz \quad \texttt{(The product vanishes everywhere else)}\\ &= k^2 \int_0^{\Delta x} \left(1- \frac{x}{\Delta x}\right) \left( \frac{x-0}{\Delta x} \right)dz= k^2 \bigg| \frac{x^2}{2\Delta x}- \frac{x^3}{3(\Delta x)^2}\bigg|_0^{\Delta x}= \frac{k^2 \Delta x}{6}\\ B_{11} &= \int_0^1 k^2 \phi_1(z) \phi_1(z) dz \\ &= \int_0^{\Delta x} k^2 \phi_1(z)^2 dz \quad \texttt{(The product vanishes everywhere else)}\\ &= k^2 \int_0^{\Delta x} \left(\frac{x-0}{\Delta x}\right)^2 = k^2 \bigg| \frac{(x-0)^3}{3(\Delta x)^2}\bigg|_0^{\Delta x}= \frac{k^2 \Delta x}{3}\\ B_{51} &= \int_0^1 \phi_1(z) \phi_5(z) dz \\ &= 0 \quad \texttt{(The product vanishes everywhere)}\\ \end{align*}\)
When we obtain the above for all elements, the matrices look like
\[\begin{align*} A= & \left[ \begin{array}{cccccc} \frac{1}{\Delta x} & \frac{-1}{\Delta x} & 0 & 0 & 0 & 0\\ \frac{-1}{\Delta x} & \frac{2}{\Delta x} & \frac{-1}{\Delta x} & 0 & 0 & 0\\ 0 &\frac{-1}{\Delta x} & \frac{2}{\Delta x} & \frac{-1}{\Delta x} & 0 & 0\\ 0 & 0 & \frac{-1}{\Delta x} & \frac{2}{\Delta x} & \frac{-1}{\Delta x} & 0\\ 0 & 0 & 0 & \frac{-1}{\Delta x} & \frac{2}{\Delta x} & \frac{-1}{\Delta x}\\ 0 & 0 & 0 & 0 & \frac{-1}{\Delta x} & \frac{1}{\Delta x}\ \end{array} \right] \\ \\ B= & \left[ \begin{array}{cccccc} \frac{k^2 \Delta x}{3} & \frac{k^2 \Delta x}{6} & 0 & 0 & 0 & 0\\ \frac{k^2 \Delta x}{6} & \frac{2 k^2 \Delta x}{3} & \frac{k^2 \Delta x}{6} & 0 & 0 & 0\\ 0 & \frac{k^2 \Delta x}{6} & \frac{2 k^2 \Delta x}{3} & \frac{k^2 \Delta x}{6} & 0 & 0\\ 0 & 0 & \frac{k^2 \Delta x}{6} & \frac{2 k^2 \Delta x}{3} & \frac{k^2 \Delta x}{6} & 0\\ 0 & 0 & 0 & \frac{k^2 \Delta x}{6} & \frac{2 k^2 \Delta x}{3} & \frac{k^2 \Delta x}{6}\\ 0 & 0 & 0 & 0 & \frac{k^2 \Delta x}{6} & \frac{k^2 \Delta x}{3}\ \end{array} \right] \end{align*}\]For non-uniform grid spacing, the above modifies as
\[\begin{align*} A= & \left[ \begin{array}{cccccc} \frac{1}{\Delta x_1} & \frac{-1}{\Delta x_1} & 0 & 0 & 0 & 0\\ \frac{-1}{\Delta x_1} & \frac{1}{\Delta x_1}+ \frac{1}{\Delta x_2} & \frac{-1}{\Delta x_2} & 0 & 0 & 0\\ 0 & \frac{-1}{\Delta x_2} & \frac{1}{\Delta x_2}+ \frac{1}{\Delta x_3} & \frac{-1}{\Delta x_3} & 0 & 0\\ 0 & 0 & \frac{-1}{\Delta x_3} & \frac{1}{\Delta x_3}+ \frac{1}{\Delta x_4} & \frac{-1}{\Delta x_4} & 0\\ 0 & 0 & 0 & \frac{-1}{\Delta x_4} & \frac{1}{\Delta x_4}+ \frac{1}{\Delta x_5} & \frac{-1}{\Delta x_5}\\ 0 & 0 & 0 & 0 & \frac{-1}{\Delta x_5} & \frac{1}{\Delta x_5}\ \end{array} \right] \\ \\ B= & \left[ \begin{array}{cccccc} \frac{k^2 \Delta x_1}{3} & \frac{k^2 \Delta x_1}{6} & 0 & 0 & 0 & 0\\ \frac{k^2 \Delta x_1}{6} & \frac{k^2 \Delta x_1}{3}+ \frac{k^2 \Delta x_2}{3} & \frac{k^2 \Delta x_2}{6} & 0 & 0 & 0\\ 0 & \frac{k^2 \Delta x_2}{6} & \frac{k^2 \Delta x_2}{3}+ \frac{k^2 \Delta x_3}{3} & \frac{k^2 \Delta x_3}{6} & 0 & 0\\ 0 & 0 & \frac{k^2 \Delta x_3}{6} & \frac{k^2 \Delta x_3}{3}+ \frac{k^2 \Delta x_4}{3} & \frac{k^2 \Delta x_4}{6} & 0\\ 0 & 0 & 0 & \frac{k^2 \Delta x_4}{6} & \frac{k^2 \Delta x_4}{3}+ \frac{k^2 \Delta x_5}{3} & \frac{k^2 \Delta x_2}{5}\\ 0 & 0 & 0 & 0 & \frac{k^2 \Delta x_5}{6} & \frac{k^2 \Delta x_5}{3}\ \end{array} \right] \end{align*}\]Boundary Conditions
For 1D MT forward modeling, one can either put the a form of source field into the Maxwell’s equations. This, however, can be simplified for the 1D case, where the \(E\) field is typically set to 1 (or any constant) on the surface and noting the attenuating nature of the fields, it is safe to assume that \(E=0\) at the deepest point being modeled. To ensure that this does happen, we need to make the computational domain large enough, which is usually set to a few skin depths corresponding to the largest resistivity for a given frequency.
Discretization
One of the advantages of using finite element is that we can have non-uniform grid sizes. This allows us to have larger grid sizes whenever the structure is homogeneous and do a finer sampling whenever we come across structures/ inhomogeneities. In our case, we tested on a uniform grid spacing to draw comparisons in the matrix structure with finite differences. We then do a non-uniform sampling to show how finite element can be computationally efficient compared to the finite differences. Basing on the idea that we should have finer grid spacing near boundaries, we had the spacing that increase in arithematic progression starting from the edges until the center and then decreasing in the same order until the other edge. We apply this scheme for all the layers. This is schematically shown here.
Note that the above does not plot the depth on a log-scale to show the spacing in between the points, which otherwise gets skewed by the log function. Also, note that one can use any other discretization scheme that fits well with the objective. The one chosen here makes intuitive sense and may not be the best one but is a good one, as will also be shown in the next section.
Since the last layer is treated as an infinite thickness layer, which is not computationally feasible, the common practice is to have its bottom depth extend until a few maximum skin depths for a given frequency. While this post contains the numerical framework, the codes can be accessed here.
Computational Comparison to Finite Difference
Matrix structure
For comparison with the finite differences, we make the grid size uniform for finite elements. The matrix to be inverted in finite element case is
\[\begin{align*} K= & A+ B= \\ & \left[ \begin{array}{cccccc} \frac{1}{\Delta x}+ \frac{k^2 \Delta x}{3} & \frac{-1}{\Delta x}+ \frac{k^2 \Delta x}{6} & 0 & 0 & 0 & 0\\ \frac{-1}{\Delta x}+ \frac{k^2 \Delta x}{6} & \frac{2}{\Delta x}+ \frac{2 k^2 \Delta x}{3} & \frac{-1}{\Delta x+ \frac{k^2 \Delta x}{6}} & 0 & 0 & 0\\ 0 &\frac{-1}{\Delta x}+ \frac{k^2 \Delta x}{6} & \frac{2}{\Delta x}+ \frac{2 k^2 \Delta x}{3} & \frac{-1}{\Delta x}+ \frac{k^2 \Delta x}{6} & 0 & 0\\ 0 & 0 & \frac{-1}{\Delta x}+ \frac{k^2 \Delta x}{6} & \frac{2}{\Delta x}+ \frac{2 k^2 \Delta x}{3} & \frac{-1}{\Delta x}+ \frac{k^2 \Delta x}{6} & 0\\ 0 & 0 & 0 & \frac{-1}{\Delta x}+ \frac{k^2 \Delta x}{6} & \frac{2}{\Delta x}+ \frac{2 k^2 \Delta x}{3} & \frac{-1}{\Delta x+ \frac{k^2 \Delta x}{6}}\\ 0 & 0 & 0 & 0 & \frac{-1}{\Delta x}+ \frac{k^2 \Delta x}{6} & \frac{1}{\Delta x}+ \frac{k^2 \Delta x}{3}\ \end{array} \right] \end{align*}\]In finite differences, the matrix to be inverted had the form:
\[\begin{align*} K= \\ & \left[ \begin{array}{cccccc} \frac{-1}{(\Delta x)^2} & -k^2+ \frac{1}{\Delta x}^2 & 0 & 0 & 0 & 0\\ \frac{1}{\Delta x} & -k^2 - \frac{2}{(\Delta x)^2} & \frac{1}{(\Delta x)^2} & 0 & 0 & 0\\ 0 & \frac{1}{\Delta x} & -k^2 - \frac{2}{(\Delta x)^2} & \frac{1}{(\Delta x)^2} & 0 & 0\\ 0 & 0 & \frac{1}{\Delta x} & -k^2 - \frac{2}{(\Delta x)^2} & \frac{1}{(\Delta x)^2} & 0\\ 0 & 0 & 0 & \frac{1}{\Delta x} & -k^2 - \frac{2}{(\Delta x)^2} & \frac{1}{(\Delta x)^2}\\ 0 & 0 & 0 & 0 & -k^2+ \frac{1}{\Delta x}^2 & \frac{-1}{(\Delta x)^2}\\ \end{array} \right] \end{align*}\]The matrices look very similar, with both of them being tridiagonal. It is interesting to see that in finite element, the \(k^2\) term is also present in the non-diagonal terms while it is only present in the diagonal elements in the finite difference case.
Error analysis
Due to its flexibility, we expect finite element to be computationally efficient and/or providing better accuracy. We do a quantitative analysis to show how much the the difference between the two schemes is. We first define the error metric as:
\[\operatorname{err}(\rho_a)= \frac{1}{N} \sum_i \frac{|\rho_{analytical}^i- \rho_{numerical}^i|}{|\rho_{analytical}^i|} \times 100\] \[\operatorname{err}(\phi)= \frac{1}{N} \sum_i \frac{|\phi_{analytical}^i- \phi_{numerical}^i|}{|\phi_{analytical}^i|} \times 100\]As can be evidently seen here and here, the finite element accumulates less errors compared to the finite difference method. Though the bigger highlight is that the finite element provides a better accuracy when compared to the finite element with more than an order of magnitude less number of points compared to the finite difference.
Resolution loss in discretization
In finite differences, we are almost always constrained to have a uniform spacing. This leads us to choose a certain grid spacing for the whole space. This is usually chosen to be a small percentage of the minimum skin depth for a given frequency. This invariably leads to a loss in resolution because the layers may not necessarily end at a certain multiple of the grid spacing, as shown here
In finite element, since we deal with the element spaces, we have the conductivity value known in all the element spaces. These element spaces are obviously homoegeneous and heterogeneities are captured by the node points. This allows us to get to the exact boundary structure.
The idea can further be propagated to multiple dimensions, wherein we can reduce the difference the resolution loss because of the discretization. As a last note, we realize that this is not a huge difference that can error the forward model for MT very much, especially in 1D, but is definitely worth noting and can be significant when moving on to 2D and 3D.
Conclusion
From the discussion so far, one can safely say that the finite element is more challenging to implement compared to the finite difference scheme. Since it allows the user to get flexible with the descritization, it needs to be implemented in the right way. The advantages, however, outweigh, the disadvantages. We can get a closer approximation to the resistivity distribution being modeled, which implies less resolution loss. The main highlight of finite element is that it allows us to use less number of grid points. Since both finite element and finite difference have the same tridiagonal matrix structure, the computational cost for inverting both the matrices will be the same, implying finite element will be orders of magnitude faster than the finite difference, as can be observed in the figures in the Error analysis section, where the finite element obtains the same accuracy in 10 times or even 100 times less node points.
These advantages become more pronounced in 2D and 3D, where the number of node points increase dramatically, and also allow us to include topography into the model. Therefore, the use of finite elements over finite differences is encouraged.
Acknowledgement
- Thanks a lot to Dong Lai Yang for starting me with FE methods.
- Class notes of Georgia Tech EAS 8803 Electromagnetic Methods course
- MIT 18.085 Computational Science and Engineering I