Computation of shape functions

From Medusa: Coordinate Free Meshless Method implementation
{{Box-round|title= Related papers |
[https://e6.ijs.si/ParallelAndDistributedSystems/publications/69777155.pdf J. Slak, G. Kosec; Medusa: A C++ library for solving PDEs using strong form mesh-free methods, ACM Transactions on Mathematical Software, vol. 47, 2021 [DOI: 10.1145/3450966]]
}}

__TOC__

Shape functions, often called stencil weights, are functionals approximating linear partial differential operators at a chosen point.
A shape function $\b{\chi}_{\mathcal{L}}(\vec{p})$ for a linear partial differential operator $\mathcal{L}$
at a point $\vec{p}$ is a vector of stencil weights, such that
\[
(\mathcal{L} u)(\vec{p}) \approx \b{\chi}_{\mathcal{L}}(\vec{p})^\T \b{u},
\]
 
where $\b{u}$ is a vector of function values $u(x_i)$ evaluated in the support of $\vec{p}$.
For usage examples see the section on [[Meshless Local Strong Form Method (MLSM)]] or some [[Poisson's equation | examples]].

For the computation of shape functions augmented with monomials, see [[Radial basis function-generated finite differences (RBF-FD)]].
  
 
== Derivation ==
 
Let $\mathcal{L}$ be a linear partial differential operator.
We wish to approximate $(\mathcal{L}u)(\vec{p})$ using only the values of $u$ in the support of $\vec{p}$. To do so, we construct a
WLS approximation (see [[Weighted Least Squares (WLS)]]) and compute the coefficients $\b{\alpha}$, such that $\hat{u}(\vec{p}) = \b{b}(\vec{p})^\T \b{\alpha}$.
The coefficients are computed by solving the least-squares system $WB \b{\alpha} = W\b{u}$ (see [[Solving overdetermined systems]]), giving
\[ \b{\alpha} = (B^\T W^2 B)^{-1} B^\T W^2 \b{u}. \]
 
 
Writing down the approximation function $\hat{u}$, we get
\[
\hat u (\vec{p}) = \b{b}(\vec{p})^\T \b{\alpha} = \b{b}(\vec{p})^\T (B^\T W^2 B)^{-1} B^\T W^2 \b{u}
\]
and applying the operator $\mathcal{L}$, we get
\[
    \mathcal{L} \hat u (\vec{p}) = (\mathcal{L}\b{b})(\vec{p})^\T \b{\alpha} =
(\mathcal{L}\b{b})(\vec{p})^\T (B^\T W^2 B)^{-1} B^\T W^2 \b{u}
= \b{\chi}_{\mathcal{L}}(\vec{p})^\T \b{u}.
\]
  
Depending on what data we have available, we can either compute $\b \alpha$ and obtain a continuous approximation $\hat{u}$,
to which $\mathcal{L}$ can be applied at any point, or, if the function values $\b u$ are unknown, we can choose to compute
$\b \chi_{\mathcal{L}}$ at a point $\vec p$ and approximate the operator for ''any'' set of function values.
The operator $\mathcal{L}$ at $\vec p$ is then approximated as
\[
   (\mathcal{L}u)(\vec{p}) \approx (\mathcal{L} \hat{u})(\vec{p}) = \b{\chi}_{\mathcal{L}}(\vec{p})^\T \b{u}.
\]
The vector $\b{\chi}$ is a vector of stencil weights, also called a ''shape function''. The name comes historically from FEM, and it captures the idea that all the information
about the shape of the local domain and the choice of approximation can be stored in a single vector, so that
the value $(\mathcal{L}u)(\vec p)$ can be approximated from given support values $\b{u}$ with a single dot product. For any values $\b{u}$, the value $\b{\chi}_{\mathcal{L}}(\vec{p})^\T \b{u}$
gives us the approximation of $(\mathcal{L}u)(\vec{p})$.
Mathematically speaking, $\b{\chi}_{\mathcal{L}}(\vec{p})$ is the Riesz representation of the functional $\R^n \to \R$ that maps $n$-tuples of known function values to
approximations of $(\mathcal{L}u)(\vec{p})$.
  
 
For example, take the $1$-dimensional case of approximating derivatives with weight equal to $1$ and $n=m=3$, with equally spaced support values at distances $h$.
We wish to approximate $u''$ in the middle support point, just by making a weighted sum of the values, something like the finite difference
\[ u'' \approx \frac{u_1 - 2u_2 + u_3}{h^2}. \]
This is exactly the same formula as we would have arrived at by computing $\b{\chi}$, except that our approach is a lot more general. One should think about
$\b{\chi}$ as one would about a finite difference scheme: it is a rule telling us how to compute the derivative,
\[ u''(s_2) \approx \underbrace{\begin{bmatrix} \frac{1}{h^2} & \frac{-2}{h^2} & \frac{1}{h^2} \end{bmatrix}}_{\b{\chi}^\T} \begin{bmatrix}u_1 \\ u_2 \\ u_3 \end{bmatrix}. \]
  
The fact that $\b{\chi}$ is independent of the function values $\b{u}$ and depends only on the domain geometry means that
'''we can just compute the shape functions $\b{\chi}$ for the points of interest and then approximate any linear operator
of any function, given its values, very fast, using only a single dot product.'''
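
To make this concrete, here is a minimal standalone sketch of the $1$-dimensional example above. It uses plain Eigen (an assumption for illustration; this is not Medusa's own API) to evaluate $\b{\chi}_{\mathcal{L}}(\vec{p})^\T = (\mathcal{L}\b{b})(\vec{p})^\T (B^\T W^2 B)^{-1} B^\T W^2$ for the monomial basis $\{1, x, x^2\}$ with unit weights, and then applies the shape with a single dot product.

<syntaxhighlight lang="cpp">
#include <Eigen/Dense>
#include <cmath>
#include <iostream>

int main() {
    double h = 0.1;
    Eigen::Vector3d s(-h, 0.0, h);             // support points, centered at p = 0
    Eigen::Matrix3d B;                         // B(i, j) = b_j(s_i) for b_j(x) = x^j
    for (int i = 0; i < 3; ++i)
        for (int j = 0; j < 3; ++j) B(i, j) = std::pow(s(i), j);
    Eigen::Vector3d Lb(0.0, 0.0, 2.0);         // (d^2/dx^2) of {1, x, x^2} at p = 0
    // chi^T = Lb^T (B^T W^2 B)^{-1} B^T W^2, with W = I here:
    Eigen::RowVector3d chi = Lb.transpose() * (B.transpose() * B).inverse() * B.transpose();
    std::cout << chi << "\n";                  // prints approx [1/h^2, -2/h^2, 1/h^2]
    Eigen::Vector3d u(h*h, 0.0, h*h);          // u(x) = x^2 sampled on the support
    std::cout << chi * u << "\n";              // a single dot product, prints approx 2
}
</syntaxhighlight>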
  
== Special case when $B$ is square and invertible ==
The expression
\[ \b{\chi}_{\mathcal{L}}(\vec{p})^\T = (\mathcal{L}\b{b})(\vec{p})^\T (B^\T W^2 B)^{-1} B^\T W^2 \]
can be simplified, using $(B^\T W^2 B)^{-1} = B^{-1} W^{-2} B^{-\T}$, to
\[
  \b \chi_{\mathcal{L}}(\vec{p})^\T = (\mathcal{L}\b{b})(\vec{p})^\T B^{-1}.
\]
  
This gives the approximation $(\mathcal{L} \hat{u})(\vec{p}) = (\mathcal{L}\b{b})(\vec{p})^\T B^{-1} \b u$.

The coefficients $\b \alpha$ are the solution of $B \b \alpha = \b u$. The shape function $\b{\chi}_{\mathcal{L}}(\vec{p})$ is defined by
$$
\b{\chi}_{\mathcal{L}}(\vec{p})^\T = (\mathcal{L}\b{b})(\vec{p})^\T B^{-1},
$$
which can be rewritten as the solution of
$$ B^\T \b{\chi}_{\mathcal{L}}(\vec{p}) = (\mathcal{L}\b{b})(\vec{p}). $$

This formulation is more suitable for numerical solving and also has a deeper interpretation, as presented in the next section.
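
As an illustration (a sketch in plain Eigen, not Medusa's actual interface), the reformulated system can be solved directly with a matrix decomposition. For the $1$-dimensional monomial basis from before, solving $B^\T \b{\chi}_{\mathcal{L}}(\vec{p}) = (\mathcal{L}\b{b})(\vec p)$ for $\mathcal{L} = \frac{d}{dx}$ recovers the classical central difference weights:

<syntaxhighlight lang="cpp">
#include <Eigen/Dense>
#include <cmath>
#include <iostream>

int main() {
    double h = 0.1;
    Eigen::Vector3d s(-h, 0.0, h);               // support points, p = 0
    Eigen::Matrix3d Bt;                          // Bt(j, i) = b_j(s_i), i.e. B^T
    for (int j = 0; j < 3; ++j)
        for (int i = 0; i < 3; ++i) Bt(j, i) = std::pow(s(i), j);
    Eigen::Vector3d Lb(0.0, 1.0, 0.0);           // (d/dx) of {1, x, x^2} at p = 0
    // Solve B^T chi = Lb with LU instead of forming B^{-1} explicitly.
    Eigen::Vector3d chi = Bt.partialPivLu().solve(Lb);
    std::cout << chi.transpose() << "\n";        // prints approx [-1/(2h), 0, 1/(2h)]
}
</syntaxhighlight>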

== Another view on derivation of shape functions ==
Solving the system $B^\T \b{\chi}_{\mathcal{L}}(\vec{p}) = (\mathcal{L}\b{b})(\vec p)$ can be interpreted as using the method of undetermined coefficients.

We wish to find a combination of function values evaluated at the support points $\vec s_i$ that approximates the value of the operator $(\mathcal{L}u)(\vec p)$, i.e., we are looking for coefficients $\chi_{\mathcal{L}, i}$, such that
\[ \sum_{i=1}^n \chi_{\mathcal{L}, i} u_i \approx (\mathcal{L}u)(\vec p). \]
We require that the coefficients $\chi_{\mathcal{L}, i}$ be such that the above approximation holds exactly on a certain space of functions, spanned by the basis $\{b_j\}_{j=1}^m$.
By choosing the basis functions $b_j$ we decide for which functions the approximation $\b{\chi}_{\mathcal{L}}(\vec{p})^\T \b{u} \approx (\mathcal{L}u)(\vec p)$ should be exact, and we write the equations
$\sum_{i=1}^n \chi_{\mathcal{L}, i} b_j(\vec s_i) = (\mathcal{L}b_j)(\vec p)$ as a system:
\[
\begin{bmatrix}
   b_1(\vec s_1) & \cdots & b_1(\vec s_n) \\
   \vdots & \ddots & \vdots \\
   b_m(\vec s_1) & \cdots & b_m(\vec s_n) \\
\end{bmatrix}
\begin{bmatrix}
  \chi_{\mathcal{L}, 1} \\ \vdots \\ \chi_{\mathcal{L}, n}
\end{bmatrix}
=
\begin{bmatrix}
   (\mathcal{L} b_1)(\vec{p}) \\ \vdots \\ (\mathcal{L} b_m)(\vec{p})
\end{bmatrix}
\]
Note that the matrix above is precisely the transpose of the matrix $B$ from the [[Weighted Least Squares (WLS)]] approximation. In this sense, the computation of
shape functions is dual to the computation of basis coefficients. The system is written in matrix form as $B^\T \b{\chi}_{\mathcal{L}}(\vec{p}) = (\mathcal{L}\b{b})(\vec p)$.

For the $n=m$ case, as in the previous section, the solution is the sought shape function $\b{\chi}_{\mathcal L}$.

If we do not specify enough functions $b_j$, the system above is underdetermined.
A common solution is to minimize a weighted norm of $\b \chi_{\mathcal L}$.
This gives us the minimization problem
$$
\min_{\b \chi_{\mathcal L}} \sum_{i=1}^n (\chi_{\mathcal{L}, i}/w_i)^2, \text{ such that } B^\T \b{\chi}_{\mathcal{L}}(\vec{p}) = (\mathcal{L}\b{b})(\vec p).
$$
If the central weights $w_i$ are bigger, the central coefficients are penalized less and are therefore allowed to be larger.
  
Using the solution derived in [[Solving underdetermined systems]], substituting $W^{-1}$ for the weight matrix and $A = B^\T$, we get the solution
$$
\b{\chi}_{\mathcal{L}}(\vec{p}) = (W^{-1})^{-2} (B^\T)^\T \left(B^\T (W^{-1})^{-2} (B^\T)^\T\right)^{-1} (\mathcal{L}\b{b})(\vec p) = W^2 B (B^\T W^2 B)^{-1} (\mathcal{L}\b{b})(\vec p).
$$
After transposing, we get
$$
\b{\chi}_{\mathcal{L}}(\vec{p})^\T = (\mathcal{L}\b{b})(\vec p)^\T (B^\T W^2 B)^{-1} B^\T W^2,
$$
which is exactly the same formula as derived using the [[Weighted Least Squares (WLS)]] approximation.

Just as we get the same stencil weights using interpolation and the method of undetermined coefficients, we get the same result using
WLS approximation or weighted norm minimization of shape functions.
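
The sketch below (again plain Eigen, with an arbitrary Gaussian weight chosen purely for illustration) demonstrates the underdetermined case with $n = 5$ support points and $m = 3$ monomials: the formula $\b{\chi}_{\mathcal{L}}(\vec p) = W^2 B (B^\T W^2 B)^{-1} (\mathcal{L}\b{b})(\vec p)$ picks the weighted minimum-norm solution, while the exactness constraints $B^\T \b{\chi}_{\mathcal{L}}(\vec p) = (\mathcal{L}\b{b})(\vec p)$ still hold.

<syntaxhighlight lang="cpp">
#include <Eigen/Dense>
#include <cmath>
#include <iostream>

int main() {
    int n = 5, m = 3;
    double h = 0.1;
    Eigen::VectorXd s(n);
    s << -2*h, -h, 0.0, h, 2*h;                    // support points, p = 0
    Eigen::MatrixXd B(n, m);                       // B(i, j) = b_j(s_i)
    for (int i = 0; i < n; ++i)
        for (int j = 0; j < m; ++j) B(i, j) = std::pow(s(i), j);
    Eigen::VectorXd w = (-s.array().square() / (h*h)).exp().matrix();  // Gaussian weights
    Eigen::MatrixXd W2 = w.array().square().matrix().asDiagonal();     // W^2
    Eigen::VectorXd Lb(m);
    Lb << 0.0, 0.0, 2.0;                           // (d^2/dx^2) of {1, x, x^2} at p = 0
    // Weighted minimum-norm solution chi = W^2 B (B^T W^2 B)^{-1} Lb.
    Eigen::VectorXd chi = W2 * B * (B.transpose() * W2 * B).ldlt().solve(Lb);
    std::cout << chi.transpose() << "\n";          // 5 stencil weights
    std::cout << (B.transpose() * chi - Lb).norm() << "\n";  // approx 0: constraints hold
}
</syntaxhighlight>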

== Numerical calculation of the shape functions ==
A numerically cheaper and more stable way is to translate the problem of inverting a matrix into solving a linear system of equations.
The alternative view above is useful in this respect.

=== Invertible $B$ ===
If $B$ is invertible, then $\b{\chi}_{\mathcal{L}}(\vec{p})$ can be thought of as the solution of the system $B^\T\b{\chi}_{\mathcal{L}}(\vec{p}) = (\mathcal{L}\b{b})(\vec p)$, which can be solved using, for example, LU or Cholesky decomposition.
For more on this, see [[Solving linear systems]].

=== General case ===
In general, the system $B^\T\b{\chi}_{\mathcal{L}}(\vec{p}) = (\mathcal{L}\b{b})(\vec p)$ is underdetermined and must be solved in the least-weighted-norm sense.
This means solving the underdetermined system $B^\T W y = (\mathcal{L}\b{b})(\vec p)$ and computing $\b{\chi}_{\mathcal{L}}(\vec{p}) = Wy$.
For more details, see [[Solving underdetermined systems]].

Depending on our knowledge of the problem, we can use Cholesky decomposition ($B^\T W$ is positive definite), $LDL^\T$ if it is symmetric, $LU$ for a general square matrix, $QR$ for a full-rank overdetermined system, and SVD for a general system.

If more shapes need to be calculated using the same matrix $B^\T W$ and only different right-hand sides (as is often the case if shapes for more than one operator are computed),
it can be done efficiently by storing the decomposition of $B^\T W$.
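
A sketch of this reuse in plain Eigen (not Medusa's actual interface): factor $B^\T W$ once and solve for several right-hand sides. Eigen's complete orthogonal decomposition is one decomposition that returns the minimum-norm solution of an underdetermined system directly.

<syntaxhighlight lang="cpp">
#include <Eigen/Dense>
#include <cmath>
#include <iostream>

int main() {
    int n = 5, m = 3;
    double h = 0.1;
    Eigen::VectorXd s(n);
    s << -2*h, -h, 0.0, h, 2*h;                    // support points, p = 0
    Eigen::MatrixXd B(n, m);                       // B(i, j) = b_j(s_i)
    for (int i = 0; i < n; ++i)
        for (int j = 0; j < m; ++j) B(i, j) = std::pow(s(i), j);
    Eigen::VectorXd w = (-s.array().square() / (h*h)).exp().matrix();  // example weights
    Eigen::MatrixXd BtW = B.transpose() * w.asDiagonal();  // m x n matrix B^T W
    // Factor B^T W once; reuse the decomposition for every operator.
    Eigen::CompleteOrthogonalDecomposition<Eigen::MatrixXd> cod(BtW);
    Eigen::VectorXd d1(m), d2(m);
    d1 << 0.0, 1.0, 0.0;                           // (d/dx) of the basis at p = 0
    d2 << 0.0, 0.0, 2.0;                           // (d^2/dx^2) of the basis at p = 0
    // Minimum-norm y solves B^T W y = Lb; the shape is then chi = W y.
    Eigen::VectorXd chi1 = w.asDiagonal() * cod.solve(d1); // shape for d/dx
    Eigen::VectorXd chi2 = w.asDiagonal() * cod.solve(d2); // shape for d^2/dx^2
    std::cout << chi1.transpose() << "\n" << chi2.transpose() << "\n";
}
</syntaxhighlight>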
  
 
== Linearity and reduction to only computing the shapes of the operator space basis ==
Observe that the expression for the shape functions
\[ \b{\chi}_{\mathcal{L}}(\vec{p})^\T = (\mathcal{L}\b{b})(\vec{p})^\T (B^\T W^2 B)^{-1} B^\T W^2 \]
is linear in $\mathcal{L}$. This follows from the fact that for $\mathcal{L} = \mathcal{L}_1 + \mathcal{L}_2$, the vector of
derivative values satisfies $(\mathcal{L}\b{b})(\vec{p}) = ((\mathcal{L}_1 + \mathcal{L}_2)\b{b})(\vec{p}) =
(\mathcal{L}_1\b{b} + \mathcal{L}_2\b{b})(\vec{p}) = (\mathcal{L}_1\b{b})(\vec p) + (\mathcal{L}_2\b{b})(\vec p)$,
and consequently
$$\b{\chi}_{\mathcal{L}}(\vec{p}) = \b{\chi}_{\mathcal{L}_1}(\vec{p}) + \b{\chi}_{\mathcal{L}_2}(\vec{p}).$$

Say that your equation consists of operators up to a certain $k$-th order.
Then it is sufficient to compute only the shape functions for the basis of that operator space, namely
\[
\left\{ \b{\chi}_{\frac{\partial}{\partial x^\alpha}}(\vec p) : 1 \leq |\alpha| \leq k \right\}, \text{ where } \frac{\partial}{\partial x^\alpha} = \frac{\partial^{|\alpha|}}{\partial x_1^{\alpha_1}\cdots \partial x_d^{\alpha_d}} \text{ and } |\alpha| = \sum_{i=1}^d \alpha_i.
\]
Having the shape functions $\b{\chi}_{\frac{\partial}{\partial x^\alpha}}(\vec p)$ available, we write $\mathcal{L}$ in the above basis at the point $\vec p$,
$\mathcal{L} = \displaystyle \sum_{1 \leq |\alpha| \leq k} a_\alpha(\vec p) \frac{\partial}{\partial x^\alpha}$, and evaluate its shape function as
\[ \b{\chi}_{\mathcal{L}}(\vec p) = \sum_{1 \leq |\alpha| \leq k} a_\alpha(\vec p) \b{\chi}_{\frac{\partial}{\partial x^\alpha}}(\vec p). \]
This saves space and gives a certain elegance to the implementation.
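
For instance (a plain Eigen sketch continuing the $1$-dimensional example), the shape of $\mathcal{L} = 3\frac{d}{dx} + \frac{d^2}{dx^2}$ is just a linear combination of the stored basis shapes; no new system needs to be solved:

<syntaxhighlight lang="cpp">
#include <Eigen/Dense>
#include <iostream>

int main() {
    double h = 0.1;
    // Shapes of the operator space basis at p = 0, as computed in the sections above.
    Eigen::RowVector3d chi_dx(-1/(2*h), 0.0, 1/(2*h));       // shape of d/dx
    Eigen::RowVector3d chi_dxx(1/(h*h), -2/(h*h), 1/(h*h));  // shape of d^2/dx^2
    // By linearity: chi_L = 3 * chi_dx + chi_dxx.
    Eigen::RowVector3d chi_L = 3.0 * chi_dx + chi_dxx;
    Eigen::Vector3d u(h*h, 0.0, h*h);    // u(x) = x^2 on the support; (L u)(0) = 2
    std::cout << chi_L * u << "\n";      // prints approx 2
}
</syntaxhighlight>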
