# NonlinearSystem

The NonlinearSystem object holds the equation system created by the normal FEM process (e.g. the Matrix and RHS vector) to be solved. Normally MOOSE uses PETSc to store and solve this system. This object is where you will find the callback routines used by the PETSc solvers.

## Solving Non-linear Systems

Application of the finite element method converts PDE(s) into a system of non-linear equations, to solve for the coefficients .

Newton's method has good convergence properties, we use it to solve this system of nonlinear equations.

Newton's method is a "root finding" method: it finds zeros of nonlinear equations.

Newton's Method in "Update Form" for finding roots of the scalar equation

We don't have just one scalar equation: we have a system of nonlinear equations.

This leads to the following form of Newton's Method:

Where is the Jacobian matrix evaluated at the current iterate:

Note that:

## Jacobian Definition

An efficient Newton solve, e.g. one that requires few "non-linear" iterations, requires an accurate Jacobian matrix or an accurate approximation of its action on a vector. When no explicit matrix is formed for the Jacobian and only its action on a vector is computed, the algorithm is commonly referred to as matrix-free (PETSc jargon) or Jacobian-free (MOOSE jargon). The default solve algorithm in MOOSE is `PJFNK`

, or Preconditioned Jacobian-Free Newton-Krylov. "Krylov" refers to the *linear* solution algorithm used to solve each non-linear iteration of the Newton algorithm. For more information on solving linear systems, please see Solving Linear Systems. Even if a Jacobian-free non-linear algorithm is chosen, typically a good preconditioning matrix is still needed. Building the matrix can be accomplished automatically, using automatic differentiation and/or manually.

### Automatic Differentiation

One can elect to sacrifice some computing speed and calculate Jacobians automatically using automatic differentiation (AD). MOOSE employs the `DualNumber`

class from the MetaPhysicL package in order to enable AD. If the application developer wants to make use of AD, they should inherit from `ADKernel`

as opposed to `Kernel`

. Additionally, when coupling in variables, the `adCoupled*`

methods should be used. For example, to retrieve a coupled value, instead of using `coupledValue("v")`

in the `ADKernel`

constructor, `adCoupledValue("v")`

should be used. `adCoupledGradient`

should replace `coupledGradient`

, etc. An example of coupling in an AD variable can be found in `ADCoupledConvection.C`

and `ADCoupledConvection.h`

. Moreover, material properties that may depend on the non-linear variables should be retrieved using `getADMaterialProperty`

instead of `getMaterialProperty`

. They should be declared in materials using `declareADProperty`

. Example AD material source and header files can be found here and here; example kernel source and header files that use AD material properties can be found here and here. The object central to AD computing objects is `ADReal`

which is defined in `MooseTypes`

.

### Traditional Hand-coded Jacobians

Finite element shape functions are introduced in the documentation section Shape Functions. There we outline how our primary variables are summations of those shape functions multiplied by constant coefficients which are our degrees of freedom. At the end of Solving Non-linear Systems we gave an explicit illustration of how the derivative of a variable `u`

with respect to its jth degree of freedom () is equal to the jth shape function . Similarly the derivative of with respect to is equal to . The code expression `_phi[_j][_qp]`

represents in any MOOSE framework residual and Jacobian computing objects such as kernels and boundary conditions.

Any MOOSE kernel may have an arbitrary number of variables coupled into it. If these coupled variables use the same shape function family and order, then their associated s will be equivalent. However, if `u`

and `v`

use different shape functions then . As a developer, however, you do not *in most cases* have to worry about these differences in . MOOSE automatically updates the object member variable `_phi`

to use the shape functions of the variable for whom the Jacobian is currently being computed. *However*, if the primary variable `u`

is a scalar-valued (single-component) finite element variable and the coupled variable `v`

is a vector-valued (multi-component) finite element variable (or visa versa), then you must introduce an additional member variable to represent the shape functions of the vector-valued (scalar-valued) variable. The name of this variable is up to the developer, but we suggest perhaps a `_standard_`

prefix for scalar valued finite-element variables and `_vector_`

for vector valued finite-element variables. The `_standard_`

prefix is suggested over `_scalar_`

so as not to be confused with a `MooseVariableScalar`

, which only has a single value over the entire spatial domain. An example constructor for a standard kernel that couples in a vector-valued FE variable is shown below:

```
EFieldAdvection::EFieldAdvection(const InputParameters & parameters)
: Kernel(parameters),
_efield_id(coupled("efield")),
_efield(coupledVectorValue("efield")),
_efield_var(*getVectorVar("efield", 0)),
_vector_phi(_assembly.phi(_efield_var)),
_mobility(getParam<Real>("mobility"))
{
}
```

The associated declarations are:

```
const unsigned int _efield_id;
const VectorVariableValue & _efield;
VectorMooseVariable & _efield_var;
const VectorVariablePhiValue & _vector_phi;
const Real _mobility;
Real _sgn;
```

Residual, on-diagonal, and off-diagonal methods are respectively

```
Real
EFieldAdvection::computeQpResidual()
{
return -_grad_test[_i][_qp] * _sgn * _mobility * _efield[_qp] * _u[_qp];
}
```

and

```
Real
EFieldAdvection::computeQpJacobian()
{
return -_grad_test[_i][_qp] * _sgn * _mobility * _efield[_qp] * _phi[_j][_qp];
}
```

and

```
Real
EFieldAdvection::computeQpOffDiagJacobian(unsigned int jvar)
{
if (jvar == _efield_id)
return -_grad_test[_i][_qp] * _sgn * _mobility * _vector_phi[_j][_qp] * _u[_qp];
else
return 0;
}
```
An example constructor for a vector kernel that couples in a
scalar-valued FE variable is shown below:
```
VectorCoupledGradientTimeDerivative::VectorCoupledGradientTimeDerivative(
const InputParameters & parameters)
: VectorKernel(parameters),
_grad_v_dot(coupledGradientDot("v")),
_d_grad_v_dot_dv(coupledDotDu("v")),
_v_id(coupled("v")),
_v_var(*getVar("v", 0)),
_standard_grad_phi(_assembly.gradPhi(_v_var))
{
}
```
The associated declarations are:
```
const VariableGradient & _grad_v_dot;
const VariableValue & _d_grad_v_dot_dv;
const unsigned _v_id;
MooseVariable & _v_var;
const VariablePhiGradient & _standard_grad_phi;
```

Residual and off-diagonal Jacobian methods are respectively:

```
Real
VectorCoupledGradientTimeDerivative::computeQpResidual()
{
return _test[_i][_qp] * _grad_v_dot[_qp];
}
```

and

```
Real
VectorCoupledGradientTimeDerivative::computeQpOffDiagJacobian(unsigned jvar)
{
if (jvar == _v_id)
return _test[_i][_qp] * _d_grad_v_dot_dv[_qp] * _standard_grad_phi[_j][_qp];
else
return 0.;
}
```

Note that only one member is needed to represent shape functions for standard `MooseVariable`

s and `VectorMooseVariable`

s. For example, if the vector-variables `v`

and `w`

are coupled into a standard kernel for `u`

, only a single `_vector_phi`

member needs to be added; there is not need for both a `_v_phi`

and `_w_phi`

. `_vector_phi`

will be automatically updated to represent the shape functions for whichever vector variable the Jacobian is being computed for.

### Newton for a Simple Equation

Consider the convection-diffusion equation with nonlinear , , and :

The component of the residual vector is:

Using the previously-defined rules for and , the entry of the Jacobian is then:

Note that even for this "simple" equation, the Jacobian entries are nontrivial: they depend on the partial derivatives of , , and , which may be difficult or time-consuming to compute analytically.

In a multiphysics setting with many coupled equations and complicated material properties, the Jacobian might be extremely difficult to determine.

### Chain Rule

On the previous slide, the term was used, where was a nonlinear forcing function.

The chain rule allows us to write this term as

If a functional form of is known, e.g. , this formula implies that its Jacobian contribution is given by

### Jacobian-Free Newton-Krylov

is a linear system solved during each Newton step.

For simplicity, we can write this linear system as , where: - - -

We employ an iterative Krylov method (e.g. GMRES) to produce a sequence of iterates ,

and remain

*fixed*during the iterative process.The "linear residual" at step is defined as

MOOSE prints the norm of this vector, , at each iteration, if you set

`print_linear_residuals = true`

in the`Outputs`

block.The "nonlinear residual" printed by MOOSE is .

By iterate , the Krylov method has constructed the subspace

Different Krylov methods produce the iterates in different ways:

Conjugate Gradients: orthogonal to .

GMRES/MINRES: has minimum norm for in .

Biconjugate Gradients: is orthogonal to

is never explicitly needed to construct the subspace, only the action of on a vector is required.

This action can be approximated by:

This form has many advantages: - No need to do analytic derivatives to form - No time needed to compute (just residual computations) - No space needed to store

## Solving Linear Systems

You will commonly hear of two ways to solve an implicit linear system of equations: directly or iteratively. A typical direct solve will perform a LU factorization. Direct solves are a great tool for solving small-medium sized systems; however, they are extremely expensive when applied to large-scale problems. To solve large-scale systems, iterative methods must be used. The most successful iterative methods are Krylov methods. Krylov methods are work by finding a solution to within a space called the Krylov sub-space which is spanned by images of under powers of . Two of the most used Krylov algorithms are Conjugate gradient and GMRES. Conjugate gradient generally only works for symmetric positive-definite matrices. Because of its greater flexibility, GMRES is the default linear solution algorithm in PETSc and consequently for MOOSE.

## Augmenting Sparsity

One such routine is `NonlinearSystemBase::augmentSparsity`

, which as its name suggests augments the sparsity pattern of the matrix. Currently this method adds sparsity coming from MOOSE `Constraint`

objects. It does this by querying geometric connectivity information between secondary and primary boundary pairs, and then querying the `DofMap`

attached to the `NonlinearSystemBase`

(through the libMesh `NonlinearImplicitSystem`

) for the dof indices that exist on the elements attached to the secondary/primary nodes. The geometric connectivity information comes from `NearestNodeLocators`

held by `GeometricSearchData`

objects in the `FEProblemBase`

and `DisplacedProblem`

(the latter only if there are mesh displacements). In the future sparsity augmentation from constraints will occur through `RelationshipManagers`

instead of through the `augmentSparsity`

method.