Computation of shape functions

From Medusa: Coordinate Free Mehless Method implementation
Jump to: navigation, search
edit 

Related papers

J. Slak, G. Kosec; Medusa : A C++ library for solving PDEs using strong form mesh-free methods, ACM transactions on mathematical software, vol. 47, 2021 [DOI: 10.1145/3450966]


Shape functions, often called stencil weights, are functionals, approximating linear partial differential operators at a chosen point. A shape function $\b{\chi}_{\mathcal{L}} (\vec{p})$ for a linear partial differential operator $\mathcal{L}$ in a point $p$ is a vector of stencil weights, such that \[ (\mathcal{L} u)(\vec{p}) \approx \b{\chi}_{\mathcal{L}}(\vec{p})^\T \b{u}, \] where $\b{u}$ is a vector of function values $u(x_i)$ evaluated in the support of $\vec{p}$. For usage examples see the section on Meshless Local Strong Form Method (MLSM) or some examples .

For computation of shape functions augmented with monomials, see the Radial basis function-generated finite differences (RBF-FD) .

Derivation

Let $\mathcal{L}$ be a linear partial differential operator. We wish to approximate $(\mathcal{L}u)(\vec{p})$ using only values of $u$ in support of $p$. To do so, we construct a WLS approximation (see Weighted Least Squares (WLS)) and compute the coefficients $\b{\alpha}$, such that $\hat{u}(\vec{p}) = \b{b}(\vec{p})^\T \b{\alpha}$. The coefficients are computed by solving a least-squares system $WB \b{\alpha} = W\b{u}$, (see Solving overdetermined systems) writing \[ \b{\alpha} = (B^\T W^2 B^\T)^{-1} B^\T W^2 \b{u}. \] Writing down the approximation function $\hat{u}$ we get \[ \hat u (\vec{p}) = \b{b}(\vec{p})^\T \b{\alpha} = \b{b}(\vec{p})^\T (WB)^+W\b{u} \] and applying operator $\mathcal{L}$ we get \[ \mathcal{L} \hat u (\vec{p}) = (\mathcal{L}\b{b})(\vec{p})^\T \b{\alpha} = (\mathcal{L}\b{b})(\vec{p})^\T (B^\T W^2 B)^{-1} B^\T W^2 \b{u} = \b{\chi}_{\mathcal{L}}(\vec{p})^\T \b{u} \]

Depending on what data we have available, we can either compute $\b \alpha$ and obtain a continuous approximation $\hat{u}$, to which $\mathcal{L}$ can be applied at any point, or, if function values $\b u$ are unknown, we can choose to compute $\b \chi_{\mathcal{L}}$ at a point $\vec p$ and approximate the operator for any set of function values. the operator $\mathcal{L}$ at $\vec p$ is approximated as \[ (\mathcal{L}u)(\vec{p}) \approx (\mathcal{L} \hat{u})(p) = \b{\chi}_{\mathcal{L}}(\vec{p})^\T \b{u}. \] Vector $\b{\chi}$ is a vector of stencil weights, also called a shape function. The name comes historically from FEM, and it incorporates the idea of being able to take all the information about shape of the local domain and choice of approximation and store it in a single vector, being able to approximate a function value from given support values $\b{u}$ with a single dot product. For any values $\b{u}$, value $\b{\chi}(p) \b{u}$ gives us the approximation of $(\mathcal{L}u)(\vec{p})$. Mathematically speaking, $\b{\chi}(\vec{p})$ is a Riesz representation of a functional, $\b{\chi}(\vec{p})\colon \R^n \to \R$, mapping $n$-tuples of known function values to approximations of $\mathcal{L}$ in point $\vec{p}$.

For example, take a $1$-dimensional case for approximation of derivatives with weight equal to $1$ and $n=m=3$, with equally spaced support values at distances $h$. We wish to approximate $u''$ in the middle support point, just by making a weighted sum of the values, something like the finite difference \[ u'' \approx \frac{u_1 - 2u_2 + u_3}{h^2}. \] This is exactly the same formula as we would have come to by computing $\b{\chi}$, except that our approach is a lot more general. But one should think about $\b{\chi}$ as one would about the finite difference scheme, it is a rule, telling us how to compute the derivative. \[ u''(s_2) \approx \underbrace{\begin{bmatrix} \frac{1}{h^2} & \frac{-2}{h^2} & \frac{1}{h^2} \end{bmatrix}}_{\b{\chi}^\T} \begin{bmatrix}u_1 \\ u_2 \\ u_3 \end{bmatrix} \]

The fact that $\b{\chi}$ is independent of the function values $\b{u}$ but depends only on domain geometry means that we can just compute the shape functions $\b{\chi}$ for points of interest and then approximate any linear operator of any function, given its values, very fast, using only a single dot product.

Special case when $B$ is square and invertible

The expression \[ \b{\chi}_{\mathcal{L}}(\vec{p})^\T = (\mathcal{L}\b{b})(\vec{p})^\T (B^\T W^2 B)^{-1} B^\T W^2 \] can be simplified to \[ \b \chi_{\mathcal{L}}(\vec{p})^\T = (\mathcal{L}\b{b})(\vec{p})^\T B^{-1}. \]

This gives the approximation as $(\mathcal{L} \hat{u})(p) = (\mathcal{L}\b{b})(\vec{p})^\T B^{-1} \b u$.

Coefficients $\b \alpha$ are the solutions of $B\b \alpha = \b u$. Shape function $\b{\chi}_{\mathcal{L}}(\vec{p})$ is defined as $$ \b{\chi}_{\mathcal{L}}(\vec{p})^\T = (\mathcal{L}\b{b})(\vec{p})^\T B^{-1}, $$ which can be rewritten as a solution of $$ B^\T \b{\chi}_{\mathcal{L}}(\vec{p}) = (\mathcal{L}\b{b})(\vec{p}). $$

This formulation is more suitable for numerical solving and also has a deeper interpretation, as presented in the next section.

Another view on derivation of shape functions

Solving the system $B^\T \b{\chi}_{\mathcal{L}} (\vec{p})^\T = (\mathcal{L}\b{b})(\vec p)$ can be interpreted as using the method of undetermined coefficients.

We wish to find such a combination of function values evaluated in support points $\vec s_i$ that approximates the value of an operator $(\mathcal{L}u)(\vec p)$, i.e., we are looking for coefficients $\chi_{\mathcal{L}, i}$, such that \[ \sum_{i=1}^n \chi_{\mathcal{L}, i} u_i \approx (\mathcal{L}u)(p). \]

We require that $\chi_{\mathcal{L}, i}$ be such, that above approximations holds for a certain space of functions, spanned by the basis $\{b_j\}_{j=1}^m$. By choosing the basis functions $b_j$ we decide for which functions the aproximation $\b{\chi}_{\mathcal{L}, \vec{p}} \b{u} \approx (\mathcal{L}u)(\vec p)$ should be exact and we write the equations $\sum_{i=1}^n \chi_{\mathcal{L}, i} b_j(\vec s_i) = (\mathcal{L}b_j)(\vec p)$ in a system: \[ \begin{bmatrix} b_1(\vec s_1) & \cdots & b_1(\vec s_n) \\ \vdots & \ddots & \vdots \\ b_m(\vec s_1) & \cdots & b_m(\vec s_n) \\ \end{bmatrix} \begin{bmatrix} \chi_{\mathcal{L}, 1} \\ \vdots \\ \chi_{\mathcal{L}, n} \end{bmatrix} = \begin{bmatrix} (\mathcal{L} b_1)(\vec{p}) \\ \vdots \\ (\mathcal{L} b_m)(\vec{p}) \end{bmatrix} \] Note that the matrix above is precisely the transpose of the matrix $B$ from the Weighted Least Squares (WLS) approximation. In this sense, the computation of shape functions is dual to computation of basis coefficients. The system is written in matrix form as $B^\T \b{\chi}_{\mathcal{L}} (\vec{p})^\T = (\mathcal{L}\b{b})(\vec p)$.

For $n=m$ case as in the previous section, the solution is the sought shape function $\b{\chi}_{\mathcal L}$.

If we do not specify enough functions $b_j$, the system above is underdetermined. A common solution is to minimize the weighted norm of $\b \chi_{\mathcal L}$. This gives us a minimization problem $$ \min_{\b \chi_{\mathcal L}} \sum_{i=1}^n (\chi_{\mathcal{L}, i}/w_i)^2, \text{ such that } B^\T \b{\chi}_{\mathcal{L}} (\vec{p})^\T = (\mathcal{L}\b{b})(\vec p). $$ If the central weights are bigger, this also forces the central coefficients to be larger.

Using the solution derived in Solving underdetermined systems and substituting inverse weights $W = W^{-1}$ and matrix $A = B^\T$, we get the solution $$ \b{\chi}_{\mathcal{L}} (\vec{p}) = (W^{-1})^{-2} (B^\T)^\T (B^\T (W^{-1})^{-2}(B^\T)^\T)^{-1} (\mathcal{L}\b{b})(\vec p) = W^2 B (B^\T W^2 B)^{-1} (\mathcal{L}\b{b})(\vec p). $$ After transposing, we get $$ \b{\chi}_{\mathcal{L}} (\vec{p})^\T = (\mathcal{L}\b{b})(\vec p)^\T (B^\T W^2 B)^{-1} B^\T W^2, $$ which is exactly the same formula as derived using Weighted Least Squares (WLS) approximation.

Just as we get the same stencil weights using interpolation and the method of undetermined coefficients, we get the same result using WLS approximation or weighted norm minimization of shape functions.

Numerical calculation of the shape functions

Numerically cheaper and more stable way is to translate the problem of inverting the matrix to solving a linear system of equations. The alternative view above is useful in this aspect.

Invertible $B$

If $B$ is invertible, then $\b{\chi}_{\mathcal{L}}(\vec{p})$ can be thought as a solution of a system $B^\T\b{\chi}_{\mathcal{L}}(\vec{p}) = (\mathcal{L}\b{b})(\vec p)$, which can be solved using LU or Cholesky decomposition for example. For more on this, see Solving linear systems.

General case

In general, the system $B^\T\b{\chi}_{\mathcal{L}}(\vec{p}) = (\mathcal{L}\b{b})(\vec p)$ is underdetermined and must be solved in the least-weighted-norm sense. This means solving the underdetermined system $B^\T W y = (\mathcal{L}\b{b})(\vec p)$ and computing $\b{\chi}_{\mathcal{L}}(\vec{p}) = Wy$. For more details on this, see Solving underdetermined systems.

Depending on our knowledge of the problem, we can use Cholesky ($B^\T W$ is positive definite), $LDL^\T$ if it is symmetric, $LU$ for a general square matrix, $QR$ for full rank overdetermined system and SVD for a general system.

If more shapes need to be calculated using the same matrix $B^\T W$ and only different right hand sides (as is often the case, if shapes for more than one operator are computed), it can be done efficiently by storing the decomposition of $B^\T W$.

Linearity and reduction to only computing the shapes of the operator space basis

Observe that the expression for the shape functions \[ \b{\chi}_{\mathcal{L}}(\vec{p}) = (\mathcal{L}\b{b})(\vec{p})^\T (WB)^+W \] is linear in $\mathcal{L}$. This follows from the fact that if $\mathcal{L} = \mathcal{L}_1 + \mathcal{L}_2$, the vector of derivative values $(\mathcal{L}\b{b})(\vec{p}) = ((\mathcal{L}_1 + \mathcal{L}_2)\b{b})(\vec{p}) = (\mathcal{L}_1\b{b} + \mathcal{L}_2\b{b})(\vec{p}) = (\mathcal{L}_1\b{b})(\vec p) + (\mathcal{L}_2\b{b})(\vec p)$ and consequently $$\b{\chi}_{\mathcal{L}}(\vec{p}) = \b{\chi}_{\mathcal{L}_1}(\vec{p}) + \b{\chi}_{\mathcal{L}_2}(\vec{p}).$$

Say that your equation constitutes of operators up to a certain $k$-th order. Then it is sufficient to compute only shape functions for the basis of that operator space, namely \[ \{\b{\chi}_{\frac{\partial}{\partial x^\alpha}, p}, 1 \leq |\alpha| \leq k \}, \text{ where } \frac{\partial}{\partial x^\alpha} = \frac{\partial^{|\alpha|}}{\partial x_1^{\alpha_1}\cdots \partial x_d^{\alpha_d}} \text{ and } |\alpha| = \sum_{i=1}^d \alpha_i. \] Having the shape functions $\b{\chi}_{\frac{\partial}{\partial x^\alpha}, p}$ available, we write $\mathcal{L}$ in the above basis in point $p$, $\mathcal{L} = \displaystyle \sum_{1 \leq |\alpha| \leq k} a_\alpha(p) \frac{\partial}{\partial x^\alpha}$ and evaluate it's shape function as \[ \b{\chi}_{\mathcal{L}, \vec{p}} = \sum_{1 \leq |\alpha| \leq k} a_\alpha(p) \b{\chi}_{\frac{\partial}{\partial x^\alpha}, p}. \] This saves us space and gives certain elegance to the implementation.