Table Of Contents

Previous topic

5. Thoughts on finance

Next topic

7. Notes on spreadsheets etc

This Page

6. Mathematical notes

Here mathematical notes are kept.

6.1. Using LaTeX syntax for mathematics in RestructuredText and Sphinx


This document is about using Latex in reStructuredText and Sphinx. At the link below there is a introduction to mathematical symbols in Latex.

6.1.1. Math/Latex in RestructuredText and Sphinx

There are 2 ways of writing math in restructuredText:

Inline ``:math:`SomeLatex```
To insert relatively simple latex for mathematical expressions like \psi(r)=\exp(-2r) in the text write :math:`\psi(r) = \exp(-2r)` Inside the back-tics (`) any Latex code can be written. There must be no space between :math: and the first back-tic.
Complex Latex like eg equation

For producing more complex math like eg an equation* environment in a LaTeX document write:

.. math::
    \psi(r) = e^{-2r}

you will get:

\psi(r) = e^{-2r}


The math markup can be used within reST files (to be parsed by Sphinx) but within your python’s docstring, the slashes need to be escaped ! :math:`\alpha` should therefore be written :math:`\\alpha` or put an “r” before the docstring

6.1.2. Examples

Below are some examples of Latex commands and symbols:

To do square roots like eg \sqrt{x^2-1} use :math:`\sqrt{x^2-1}`.

To do fractions like eg \frac{1}{2} use :math:`\frac{1}{2}`.

In order to insert text into a formula use :math:`k_{\text{B}}T` to get like in k_{\text{B}}T

Use \left and \right for before brackets like in :math:`\left(\frac{1}{2}\right)^n` to get \left(\frac{1}{2}\right)^n.

Displayed math can use \\ and & for line shifts and alignments, respectively.

.. math::
  a & = (x + y)^2 \\
    & = x^2 + 2xy + y^2

The result is:

a & = (x + y)^2 \\
  & = x^2 + 2xy + y^2

The matrix environment can also contain \\ and &. To get like:

\left(\begin{matrix} a & b \\ c & d \end{matrix}\right)


.. math::
  \left(\begin{matrix} a & b \\
    c & d \end{matrix}\right)

Equations are labeled with the label option and referred to using:



(1)a^2 + b^2 = c^2

See equation (1)

is written like:

.. math::
   :label: pythag

    a^2 + b^2 = c^2

See equation :eq:`pythag`

Note that:

  1. No spaces in label name
  2. There should be a blank line before the actual latex code
  3. There should be a space between :label: and the label name
  4. When referred to label name should be surrounded by `

6.2. Splines at Scipy


This chapter is about splines in Scipy and how they can be used for yield curve calculations.

It is shown that splines in scipy is probably b-splines. These splines gives different (but not nessacerily wrong) values when interpolating.

This information is important when using splines from scipy.

6.2.1. Natural cubic splines

The natural cubic spline is already implemented in decimalpy

In order to redo the calculations below you need to install and import the package decimalpy.

>>> import decimalpy as dp

The example will be taken from [Kiusalaas] p. 119. The data are:

>>> x_data = dp.Vector([1, 2, 3, 4, 5])
>>> y_data = dp.Vector([0, 1, 0, 1, 0])

And the natural cubic spline function is instantiated by:

>>> f = dp.NaturalCubicSpline(x_data, y_data)

And the it is possible to do calculations like function values:

>>> print f(1.5), f(4.5)
0.7678571428571428571428571427 0.7678571428571428571428571427


>>> print f(1.5, 1), f(4.5, 1)
1.178571428571428571428571429 -1.178571428571428571428571428

and curvatures:

>>> print f(1.5, 2), f(4.5, 2)
-2.142857142857142857142857142 -2.142857142857142857142857143

and it is quite easy to do a plot:

>>> import matplotlib.pyplot as plt
>>> linspace = lambda start, stop, steps: dp.Vector([(stop - start) / (steps - 1) * i + start for i in range(steps)])
>>> xnew = linspace(1, 5, 40)
>>> x0, spread  = 2, 0.5
>>> x2new = linspace(x0 - spread, x0 + spread, 40)
>>> f0x0, f1x0, f2x0 = f(x0, 0), f(x0, 1), f(x0, 2)
>>> f0Lst = dp.Vector([f(x) for x in xnew])
>>> f1Lst = dp.Vector([f0x0 + f1x0 * (x - x0) for x in x2new])
>>> f2Lst = dp.Vector([f0x0 + f1x0 * (x - x0) + f2x0 * (x - x0)**2 / 2 for x in x2new])
>>> plt.plot(x_data, y_data, 'o', xnew, f0Lst, '-', x2new, f1Lst, '-', x2new, f2Lst, '-')   
>>> plt.legend(['data', 'fitted f', 'tangent', '2. order'], loc='best')   



6.2.2. Splines at Scipy

On the other hand [scipy] is well equipped with spline functionality

First set up scipy and numpy:

>>> from numpy import linspace, array, float64
>>> from scipy.interpolate import InterpolatedUnivariateSpline as spline

Use the same data as before:

>>> x_data = array([1, 2, 3, 4, 5], float64)
>>> y_data = array([0, 1, 0, 1, 0], float64)

Creating the cubic spline function:

>>> cubic = spline(x_data, y_data, k=3)

And calculate the cubic spline at 1.5 and 4.5:

>>> cubic(1.5)
>>> cubic(4.5)

This isn’t the same value as before (0.767857142857).

And if one uses interp1d from scipy.interpolate to get the function cubic2:

>>> from numpy import linspace, array, float64
>>> from scipy.interpolate import interp1d
>>> cubic2 = interp1d(x_data, y_data, kind='cubic')

Using cubic2 one gets:

>>> cubic2(1.5)
>>> cubic2(4.5)

Again a new set of values. What is a bit surprising is that a third set of values appears.

So to check the functions in a graph I do:

>>> xnew = linspace(1, 5, 40)
>>> import matplotlib.pyplot as plt
>>> plt.plot(x_data, y_data, 'o', xnew, f0Lst, '-', xnew, cubic(xnew), '--', xnew, cubic2(xnew), '+-')   
>>> plt.legend(['data', 'natural spline', 'cubic', 'cubic2'], loc='best')   
>>> plt.draw()

to get the graph:


And now it is obvious that scipy isn’t using the natural spline as cubic spline.

What kind of spline is used in scipy? Their code is build upon FITPACK by P. Dierckx.

The main reference is [Dierckx] which is all about b-splines. So a fair guess is that scipy also is b-splines.

This isn’t necessarily bad. Actually I haven’t looked into the differences between the 2 scipy interpolations.


It matters what kind of interpolation one uses.

The difference in values is probably due to different derivations of (cubic) splines where scipy interpolation is based on b-splines.

6.3. Derivation of Cubic Spline


This chapter is about derivation of cubic splines and the use of cubic splines for yield curve calculation.

6.3.1. Deriving the cubic spline function

Assume n points ((x_{i},y_{i}))_{i=1}^{n} and n curvatures (k_{i})_{i=1}^{n} at each point.

To ease notation we introduce:

\triangle x_{i}=x_{i+1}-x_{i}

The point are known, the curvatures are assumed known

f_{i}''(x) & = \frac{x-x_{i}}{\triangle x_{i}}\cdot k_{i+1}+\frac{x_{i+1}-x}{\triangle x_{i}}\cdot k_{i}\\
           & = \frac{(x-x_{i})\cdot k_{i+1}+(x_{i+1}-x)\cdot k_{i}}{\triangle x_{i}}

Because we want to evaluate f,f’,f’’ at x_{i} it makes things easier when they are formulated as functions of (x-x_{i}) and (x-x_{i+1}). Eg it is quite simple to see that:


So the second derivative is continous and piecewise linear.

Also the first derivative f’ can be formulated as (since \int(a\cdot x-b)\partial x=\frac{1}{2a}\cdot(a\cdot x-b)^{2}+c):

f_{i}'(x)=\frac{(x-x_{i})^{2}\cdot k_{i+1}-(x_{i+1}-x)^{2}\cdot k_{i}}{2\cdot\triangle x_{i}}+c_{i}

The function f_{i} which lies x_{i} and x_{i+1} for i=1,\ldots,n-1 must look like (integrating twice):

f_{i}(x) & = \frac{(x-x_{i})^{3}\cdot k_{i+1}+(x_{i+1}-x)^{3}\cdot k_{i}}{6\cdot\triangle x_{i}}+c_{i}\cdot x+d_{i}\\
         & = \frac{(x-x_{i})^{3}\cdot k_{i+1}+(x_{i+1}-x)^{3}\cdot k_{i}}{6\cdot\triangle x_{i}}+a_{i}\cdot(x_{i+1}-x)+b_{i}\cdot(x-x_{i})

where c_{i}=-a_{i}+b_{i} and d_{i}=a_{i}\cdot x_{i+1}-b_{i}\cdot x_{i} or

\left(\begin{array}{c}\begin{array}{c}c_{i}\end{array}\\d_{i}\end{array}\right)=\left(\begin{array}{cc}-1 & 1\\x_{i+1} & -x_{i}\end{array}\right)\cdot\left(\begin{array}{c}a_{i}\\b_{i}\end{array}\right)

This is possible if the determinant x_{i}-x_{i+1}\neq0. This is the case if there isn’t 2 points with the same x-value. An asumption which already must be in place.

And since f must be continous and go through the points ((x_{i},y_{i}))_{i=1}^{n} we have:

f_{i}(x_{i}) & = y_{i}\\
\frac{(x_{i}-x_{i})^{3}\cdot k_{i+1}+\triangle x_{i}^{3}\cdot k_{i}}{6\cdot\triangle x_{i}}+a_{i}\cdot\triangle x_{i}+b_{i}\cdot(x_{i}-x_{i}) & = y_{i}\\
\frac{\triangle x_{i}\cdot k_{i}}{6}+a_{i} & = \frac{y_{i}}{\triangle x_{i}}\\
\frac{y_{i}}{\triangle x_{i}}-\frac{\triangle x_{i}\cdot k_{i}}{6} & = a_{i}


f_{i}(x_{i+1}) & = y_{i+1}\\
\frac{\triangle x_{i}^{3}\cdot k_{i+1}+(x_{i+1}-x_{i+1})^{3}\cdot k_{i}}{6\cdot\triangle x_{i}}+a_{i}\cdot(x_{i+1}-x_{i+1})+b_{i}\cdot\triangle x_{i} & = y_{i+1}\\
\frac{y_{i+1}}{\triangle x_{i}}-\frac{\triangle x_{i}\cdot k_{i+1}}{6} & = b_{i}

This means that the constant c_{i}:

c_{i} & = -a_{i}+b_{i}\\
      & = -(\frac{y_{i}}{\triangle x_{i}}-\frac{\triangle x_{i}\cdot k_{i}}{6})+\frac{y_{i+1}}{\triangle x_{i}}-\frac{\triangle x_{i}\cdot k_{i+1}}{6}\\
      & = \frac{\triangle y_{i}}{\triangle x_{i}}-\frac{\triangle x_{i}\cdot\triangle k_{i}}{6}

6.3.2. Ekstrapolation

The question is what to do when there is a need for extrapolation. This is a situation quite common when talking about yieldcurve. Eg there are observed zero coupons up until year 20 but a zero coupon for year 25 is needed.

Talking about the natural cubic spline and financial cubic spline they are just linear extrapolations of the endpoints with the same slope as in the endpoints.

The only difference between the 2 cubic splines is that financial cubic spline is set to have a slope equal to zero at the endpoint to the right.

6.3.3. Estimation of the curvatures

To estimate the curvatures one uses the first order derivatives:

f_{i}'(x)=\frac{(x-x_{i})^{2}\cdot k_{i+1}-(x_{i+1}-x)^{2}\cdot k_{i}}{2\cdot\triangle x_{i}}+c_{i}

And the assumption that the slopes are continous at the points ((x_{i},y_{i}))_{i=1}^{n}, ie

f_{i-1}'(x_{i}) & = f_{i}'(x_{i})\;,i=2,\ldots,n-1\\
\frac{\triangle x_{i-1}^{2}\cdot k_{i}-(x_{i}-x_{i})^{2}\cdot k_{i-1}}{2\cdot\triangle x_{i-1}}+c_{i-1} & = \frac{(x_{i}-x_{i})^{2}\cdot k_{i+1}-\triangle x_{i}^{2}\cdot k_{i}}{2\cdot\triangle x_{i}}+c_{i}\\
\frac{\triangle x_{i-1}\cdot k_{i}}{2}+\frac{\triangle y_{i-1}}{\triangle x_{i-1}}-\frac{\triangle x_{i-1}\cdot\triangle k_{i-1}}{6} & = \frac{-\triangle x_{i}\cdot k_{i}}{2}+\frac{\triangle y_{i}}{\triangle x_{i}}-\frac{\triangle x_{i}\cdot\triangle k_{i}}{6}\\
\frac{\triangle x_{i-1}\cdot k_{i}}{3}+\frac{\triangle y_{i-1}}{\triangle x_{i}}+\frac{\triangle x_{i-1}\cdot k_{i-1}}{6} & = \frac{-\triangle x_{i}\cdot k_{i}}{3}+\frac{\triangle y_{i}}{\triangle x_{i}}-\frac{\triangle x_{i}\cdot k_{i+1}}{6}\\
\frac{\triangle x_{i-1}\cdot k_{i-1}}{6}+\frac{(\triangle x_{i}+\triangle x_{i-1})\cdot k_{i}}{3}+\frac{\triangle x_{i}\cdot k_{i+1}}{6} & = \frac{\triangle y_{i}}{\triangle x_{i}}-\frac{\triangle y_{i-1}}{\triangle x_{i-1}}\\
\triangle x_{i-1}\cdot k_{i-1}+2\cdot(\triangle x_{i}+\triangle x_{i-1})\cdot k_{i}+\triangle x_{i}\cdot k_{i+1} & = 6\cdot(\frac{\triangle y_{i}}{\triangle x_{i}}-\frac{\triangle y_{i-1}}{\triangle x_{i-1}})\;,i=2,\ldots,n-1

This gives n - 2 linear equations to estimate n curvatures. This means that 2 asumptions are necessary in order to calculate the n curvtures. The Natural Cubic Spline

Here the asumptions are k_{1} = k_{n} = 0. The other equations becomes i=2,\ldots,n-1:

\triangle x_{i-1}\cdot k_{i-1}+2\cdot(\triangle x_{i}+\triangle x_{i-1})\cdot k_{i}+\triangle x_{i}\cdot k_{i+1} & = 6\cdot(\frac{\triangle y_{i}}{\triangle x_{i}}-\frac{\triangle y_{i-1}}{\triangle x_{i-1}})

In all n variables (k_{i})_{i=1}^n and n linear equations The Financial Cubic Spline

Here the asumptions are k_{1} = 0 and f_{n-1}'(x_{n})  = 0 or:

0 & = \frac{\triangle x_{n-1}^{2}\cdot k_{n}-(x_{n}-x_{n})^{2}\cdot k_{n-1}}{2\cdot\triangle x_{n-1}}+c_{n-1}\\
  & = \frac{\triangle x_{n-1}\cdot k_{n}}{2}+\frac{\triangle y_{n-1}}{\triangle x_{n-1}}-\frac{\triangle x_{n-1}\cdot (k_{n} - k_{n-1})}{6}\\
  -6\cdot \frac{\triangle y_{n-1}}{\triangle x_{n-1}} & = \triangle x_{n-1}\cdot k_{n-1} + 2\cdot \triangle x_{n-1}\cdot k_{n}

The other equations becomes i=2,\ldots,n-1:

\triangle x_{i-1}\cdot k_{i-1}+2\cdot(\triangle x_{i}+\triangle x_{i-1})\cdot k_{i}+\triangle x_{i}\cdot k_{i+1} & = 6\cdot(\frac{\triangle y_{i}}{\triangle x_{i}}-\frac{\triangle y_{i-1}}{\triangle x_{i-1}})

Also here there are in all n variables (k_{i})_{i=1}^n and n linear equations

6.4. Using Barycentric Lagrange Interpolation in Finance


Lagrange Interpolation has had a revelation since the Barycentric formulas were developed.

Here Barycentric Lagrange Interpolation will be examined as a tool for finding weights to approximate the first and second order derivative for a function only known by it’s functional values.

The first order is used for the duration in finance while the second order derivative is used for the convexity.

This section is highly inspired on [BerrutTrefethen], [SadiqViswanath] and [Fornberg].

Also as references [Kiusalaas] and [Ralston] has been invaluable.

6.4.1. Lagrange Interpolation

First construct a polynomial with n + 1 different grid values (x_j)_{j=0 \dotsc n}:

l(x) = \prod_{j=0 \dotsc n}(x - x_j)

This polynomial has degree n + 1. And it is obvious then that p(x_j) = 0 \ \forall j.

Consider then the tangent for p in grid value x_j , ie.:

y & = l'(x_j) \cdot (x - x_j) + l(x_j) \\
  & = l'(x_j) \cdot (x - x_j)

Now define n + 1 polynomials as:

l_j(x) = \frac{l(x)}{l'(x_j) \cdot (x - x_j)}

These polynomials all satisfy:

l_j(x_i) & = \begin{cases} 1 \ if \ i = j \\
                         0 \ if \ i \neq j
             \end{cases} \\
         & = \delta_{ij}

In other words l_j(x_i) is Kroenecker’s delta. Note that the l_j‘s all have degree n.

Since l(x) = (x - x_j) \cdot \prod_{i \neq j} (x - x_i) we have:

l'(x) = \prod_{i \neq j} (x - x_i) + (x - x_j) \cdot ( \prod_{i \neq j} (x - x_i) )'

And then when letting the grid values be the argument of l(x) (last part above is zero since (x_j - x_j) is):

l'(x_j) & = \prod_{i \neq j} (x_j - x_i) \\
        & \neq 0

we have that the polynomials l_j(x) are well defined. This is because (x - x_j) can be shortened out and then the polynomials l_j(x) are just scaled with a nonzero scaling constants.

Definition, Lagrange Interpolation and Remainder:

Consider n + 1 points (x_j, f(x_j))_{j=0 \dotsc n}. Then the unique polynomial of the minimal degree n going through all n + 1 points is:

p(x) = \sum_{j=0 \dotsc n} l_j(x) \cdot f(x_j)

Define R(x) = f(x) - p(x) as the Lagrange remainder.

Since l_j(x_i) is Kroenecker’s delta it is obvious that l(x_j) = l_j(x_j) \cdot f(x_j) = f(x_j).

Theorem, Lagrange Remainder:

Let ]a, b[ being an open interval containing all the n+1 grid values (x_j)_{j=0 \dotsc n} and f is a n+1 times continous differentiable function on ]a, b[:

\forall x \in ]a, b[ \quad \exists \zeta \in ]a, b[:
    f(x) &= \sum_{j=0 \dotsc n} l_j(x) \cdot f(x_j)
            + \frac{l(x) \cdot f^{(n+1)}(\zeta)}{(n+1)!} \\
         &= p(x) + R(x)


For x \in ]a,b[:

R(x) &= f(x) - \sum_{j=0 \dotsc n} l_j(x) \cdot f(x_j) \\
     &= f(x) - p(x)

Observe that f(x_j) = p(x_j) \forall j=0 \dotsc n i.e. R(x) has n+1 zeros.

Also observe that since p(x) has degree n R^{(n+1)}(x) = f^{(n+1)}(x) and p^{(n)}(x) = (n+1)!.

Now define:

g(t) = R(t) - \frac{l(t) \cdot R(x)}{l(x)}

Observe that g(t) has both (x_j)_{j=0 \dotsc n} and x as roots, ie n+2 zeros.

According to Rolle’s theorem g^{(1)}(t) has n+1 zeros and successively g^{(2)}(t) has n zeros.

So g^{(n+1)}(t) has 1 zero, ie \exists \zeta :g^{(n+1)}(\zeta) = 0 or

0 &= g^{(n+1)}(\zeta)\\
0 &= R^{(n+1)}(\zeta) - \frac{(n+1)! \cdot R(x)}{l(x)} \\
0 &= f^{(n+1)}(\zeta) - \frac{(n+1)! \cdot R(x)}{l(x)} \\
& \Updownarrow \\
R(x) &= \frac{l(x) \cdot f^{(n+1)}(\zeta)}{(n+1)!}

(see also)


The Lagrange Remainder can be used to limit the overall accuracy for the Lagrange Interpolation if the function f(x) is n+1 times continous differentiable on the closed interval [a,b].

This is because \exists M : |f^{(n+1)}(x)| \le M \forall x \in [a,b], ie:

|f(x) - p(x)| = |R(x)| &\le \frac{l(x) \cdot M}{(n+1)!} \\
                       &\le \frac{(b-a)^{n+1} \cdot M}{(n+1)!}

So if (b-a) converges to zero then the accuracy becomes of order n+1 ie \mathcal{O}((b-a)^{n+1}).


The Lagrange Interpolation can for a set of n+1 grid values (x_j)_{j=0 \dotsc n} can be seen as a way of interpolating a function value f(x) based on a weighted average of the functional values (f(x_j))_{j=0 \dotsc n} where the weights are (l_j(x))_{j=0 \dotsc n} and the accuracy will be of order \mathcal{O}((b-a)^{n+1}).

This can be taken even further. Since only the weights depends on x the first order differentiation of f(x),\ f^{(1)}(x) can be approximated by:

p^{(1)}(x) = \sum_{j=0 \dotsc n} l_j^{(1)}(x) \cdot f(x_j)

So the first order derivative f^{(1)}(x) at x can be approximated by the functional values (f(x_j))_{j=0 \dotsc n} and the weights are (l_j^{(1)}(x))_{j=0 \dotsc n} and the accuracy will be of order \mathcal{O}((b-a)^{n}), ie the accuracy loses 1 degree.

And successively for the second order derivative:

p^{(2)}(x) = \sum_{j=0 \dotsc n} l_j^{(2)}(x) \cdot f(x_j)
            + \mathcal{O}((b-a)^{n-1})

This makes Lagrange Interpolation well suited for finding weighted averages of functional values (f(x_j))_{j=0 \dotsc n} for derivatives of any order with high accuracy, \mathcal{O}((b-a)^{number\ of\ points - degree\ of\ differentiation}).

6.4.2. Barycentric Lagrange Interpolation

Computationally Lagrange Interpolation isn’t that effective. But further analysis shows that the p(x_j)‘s all have the same common factor \prod_{j=0 \dotsc n}(x - x_j) so we have the First Barycentric formula:

(2)p(x) &= \sum_{j=0 \dotsc n} l_j(x) \cdot f(x_j) \\
      &= \prod_{j=0 \dotsc n}(x - x_j) \cdot \sum_{j=0 \dotsc n} \frac{w_j}{x - x_j} \cdot f(x_j) \\
      &= l(x) \cdot \sum_{j=0 \dotsc n} \frac{w_j}{x - x_j} \cdot f(x_j)


w_j &= \frac{1}{l'(x_j)} \\
    &= \frac{1}{\prod_{i \neq j} (x_j - x_i)}

Looking at the constant function 1 gives:

1 = l(x) \cdot \sum_{j=0 \dotsc n} \frac{w_j}{x - x_j}


l(x) = \frac{1}{\sum_{j=0 \dotsc n} \frac{w_j}{x - x_j}}

This way we have the Second Barycentric formula:

(3)p(x) = \frac{\sum_{j=0 \dotsc n} \frac{w_j}{x - x_j} \cdot f(x_j)}{\sum_{j=0 \dotsc n} \frac{w_j}{x - x_j}}

Both formulas shows a remarkable low dependence on the f(x_j)‘s since a change in one f(x_j)‘s will have a multiplicative effect only one place. Algorithms

Interpolation should be done based on the Second Barycentric formula (3).

For updating the weights with an extra point consider a set of basis points (x_j, f(x_j))_{j=0 \dotsc n} to be used as a basis for Barycentric Lagrange interpolation.

When a new grid value x_{n+1} has to be added the algorithm is:

  1. Calculate w_j := \frac{w_j}{x_j - x_{n+1}}\forall j
  2. Calculate w_{n+1} := \frac{1}{\prod_{j=0 \dotsc n+1}(x_{n+1} - x_j)}

It is obvious that there needs to be minimum 2 values. When there are only 2 grid values x_1 and x_2 then w_1 = -w_2. Numerical differentiation at grid points

One of the things to use Barycentric Lagrange interpolation is to evaluate is to differentiate a curve at a point when only a set points are known with a high precision.

Usual formulas are eg, \frac{f(x + h) - f(x - h)}{2 \cdot h} which is considered as a 3 point estimation based on the points with first coordinate (x - h, x, x + h).

The precision or accuracy is here of order 2.

Now consider:

f(x) & = \sum_{j=0 \dotsc n} l_j(x) \cdot f(x_j) \\
& \Updownarrow \\
f^{(1)}(x) & = \sum_{j=0 \dotsc n} l^{(1)}_j(x) \cdot f(x_j)

Now according to Second Barycentric formula (3):

l_j(x) &= \frac{\frac{w_j}{x - x_j}}{\sum_{k=0 \dotsc n} \frac{w_k}{x - x_k}} \\
& \Updownarrow \\
 l_j(x) \cdot \sum_{k=0 \dotsc n} \frac{w_k}{x - x_k} &= \frac{w_j}{x - x_j}

Letting s_i(x) = \sum_{k=0 \dotsc n} \frac{w_k \cdot (x - x_i)}{x - x_k} where x_i is one of the roots (x_k)_{k=0 \dotsc n} (Important assumption) we have:

 l_j(x) \cdot s_i(x) &=& \frac{w_j \cdot (x - x_i)}{x - x_j} \nonumber \\
 & \Downarrow \nonumber \\
 l_j^{(1)}(x) \cdot s_i(x) + l_j(x) \cdot s_i^{(1)}(x) &=& \frac{(x - x_j) - (x - x_i)}{(x - x_j)^2} \cdot w_j \nonumber \\
 & \Updownarrow \nonumber \\
 l_j^{(1)}(x) \cdot s_i(x) + l_j(x) \cdot s_i^{(1)}(x) &=& \frac{x_i - x_j}{(x - x_j)^2} \cdot w_j \nonumber

The last equation will be used several times so it gets a number:

(4)l_j^{(1)}(x) \cdot s_i(x) + l_j(x) \cdot s_i^{(1)}(x) = \frac{x_i - x_j}{(x - x_j)^2} \cdot w_j

Since by design s_i(x_i) = w_i and l_j(x_i) = \delta_{ij} we have for i \neq j in (4):

l_j^{(1)}(x_ i) \cdot s_i(x_i) + l_j(x_i) \cdot s_i^{(1)}(x_i) &= \frac{1}{x_i - x_j} \cdot w_j \\
& \Updownarrow \\
l_j^{(1)}(x_i) \cdot w_i &= \frac{1}{x_i - x_j} \cdot w_j \\
& \Updownarrow \\
l_j^{(1)}(x_i) &= \frac{w_j}{(x_i - x_j) \cdot w_i}

Now for i = j we have:

1 &= \sum_{k=0 \dotsc n} l_k(x) \\
 & \Downarrow \\
0 &= \sum_{k=0 \dotsc n} l^{(1)}_k(x) \\
 & \Downarrow \\
0 &= \sum_{k=0 \dotsc n} l^{(1)}_k(x_j) \\
 & \Updownarrow \\
 l^{(1)}_j(x_j) &= - \sum_{k \neq j} l^{(1)}_k(x_j)

This way we have:

Theorem, Barycentric first order derivative:

Consider a set of grid values (x_i)_{i=0 \dotsc n} and the function values (f(x_i))_{i=0 \dotsc n} = (f_i)_{i=0 \dotsc n}. The formula for the first order derivative at one of the grid values x_i, f^{(1)}(x_i) is:

f^{(1)}(x_i) = \sum_{j=0 \dotsc n} l^{(1)}_j(x_i) \cdot f(x_j)


l_j^{(1)}(x_i) &= \frac{w_j}{(x_i - x_j) \cdot w_i} \ for \  i \neq j \\
l^{(1)}_i(x_i) &= - \sum_{k \neq i} l^{(1)}_k(x_i) \ for \  i = j

Example 1:

The formula for interpolating the first order derivative can used to optimize the approximation of the derivative. Consider eg (x_i)_{i = 0 \dotsc 1}. Then (w_i)_{i = 0 \dotsc 1} = (\frac{1}{x_1 - x_0}, \frac{-1}{x_1 - x_0}) and:

l^{(1)}_j(x_i) &= \left( \begin{matrix}
                \frac{1}{x_0 - x_1}  & \frac{-1}{x_0 - x_1} \\
                \frac{1}{x_0 - x_1} & \frac{-1}{x_0 - x_1}
             \end{matrix} \right) \\
           &= \frac{1}{x_0 - x_1} \cdot
             \left( \begin{matrix}
                1  & -1 \\
                1 & -1
             \end{matrix} \right)

where i is the row index while j is the column index.

And then with:

\left( \begin{matrix} f^{(1)}(x_0) \\ f^{(1)}(x_1) \end{matrix} \right) = l^{(1)}_j(x_i) \cdot \left( \begin{matrix} f(x_0) \\ f(x_1) \end{matrix} \right)

The result is:

f^{(1)}(x_0) = f^{(1)}(x_1) = \frac{f(x_0) - f(x_1)}{x_0 - x_1} + \mathcal{O}(|x_0 -x_1|)

And this is the formula for Newton’s difference quotient.

Example 2:

Consider eg (x_i)_{i = 0 \dotsc 2} = (x - h, x, x + h). Then (w_i)_{i = 0 \dotsc 2} = (\frac{1}{2h^2}, \frac{1}{-h^2}, \frac{1}{2h^2}) and:

l^{(1)}_j(x_i) &= \left( \begin{matrix}
               - \sum_{k \neq (x - h)} l^{(1)}_k(x - h) & \frac{2h^2}{-h^2 \cdot ((x - h) - x)} & \frac{2h^2}{2h^2 \cdot ((x - h) -(x + h))} \\
               \frac{-h^2}{2h^2 \cdot (x - (x - h))} & - \sum_{k \neq x} l^{(1)}_k(x) & \frac{-h^2}{2h^2 \cdot (x - (x + h))} \\
               \frac{2h^2}{2h^2 \cdot ((x + h) - (x - h))} & \frac{2h^2}{-h^2 \cdot ((x + h) - x)} & - \sum_{k \neq (x + h)} l^{(1)}_k(x + h)
            \end{matrix} \right) \\
          &= \frac{1}{h} \cdot \left( \begin{matrix}
               -1 \frac{1}{2} & 2 & \frac{-1}{2} \\
               \frac{-1}{2} & 0 & \frac{1}{2} \\
               \frac{1}{2} & -2 & 1 \frac{1}{2}
            \end{matrix} \right)

These coefficients are to formula 25.3.4 on page 883 in [AbramowitzStegun].

They are all of accuracy \mathcal{O}(h^{2}).

Looking at the middle row the formula for Newton’s difference quotient is found once again:

f^{(1)}(x) &= \frac{\frac{-1}{2} \cdot f(x - h) + \frac{1}{2} \cdot f(x + h)}{h} \\
      &= \frac{f(x + h) - f(x - h)}{(x + h) - (x - h)} \\
      &= \frac{f(x + h) - f(x - h)}{2 \cdot h}

Differentiating (4) leads to:

(5)l_j^{(2)}(x) \cdot s_i(x) + 2 \cdot l_j^{(1)}(x) \cdot s_i^{(1)}(x) + l_j(x) \cdot s_i^{(2)}(x) = -2 \cdot \frac{x_i - x_j}{(x - x_j)^3} \cdot w_j

Again by design s_i(x_i) = w_i and l_j(x_i) = \delta_{ij} we have for i \neq j in (5):

l_j^{(2)}(x_i) \cdot s_i(x_i) + 2 \cdot l_j^{(1)}(x_i) \cdot s_i^{(1)}(x_i) + l_j(x_i) \cdot s_i^{(2)}(x_i) &= \frac{-2 \cdot w_j}{(x_i - x_j)^2} \\
\Updownarrow \\
l_j^{(2)}(x_i) \cdot w_i + 2 \cdot l_j^{(1)}(x_i) \cdot s_i^{(1)}(x_i) &= \frac{-2 \cdot w_j}{(x_i - x_j)^2}

From Barycentric first order derivative Barycentric first order derivative we have l_j^{(1)}(x_i) = \frac{w_j}{(x_i - x_j) \cdot w_i} \ for \  i \neq j so:

l_j^{(2)}(x_i) \cdot w_i + 2 \cdot \frac{w_j}{(x_i - x_j) \cdot w_i} \cdot s_i^{(1)}(x_i) = \frac{-2 \cdot w_j}{(x_i - x_j)^2}

Finally since s_i^{(1)}(x) = \sum_{k \neq i} w_k \cdot \frac{(x - x_k) - (x - x_i)}{(x - x_k)^2} = \sum_{k \neq i} w_k \cdot \frac{(x_i - x_k)}{(x - x_k)^2}:

l_j^{(2)}(x_i) \cdot w_i + 2 \cdot \frac{w_j}{(x_i - x_j) \cdot w_i} \cdot \sum_{k \neq i} \frac{w_k}{x - x_k} &= \frac{-2 \cdot w_j}{(x_i - x_j)^2} \\
& \Updownarrow \\
l_j^{(2)}(x_i) \cdot w_i &= \frac{-2 \cdot w_j}{(x_i - x_j)^2} - 2 \cdot \frac{w_j}{(x_i - x_j) \cdot w_i} \cdot \sum_{k \neq i} \frac{w_k}{x_i - x_k} \\
& \Updownarrow \\
l_j^{(2)}(x_i) &= \frac{ - 2 \cdot w_j}{(x_i - x_j) \cdot w_i} \cdot \left(\frac{1}{x_i - x_j} + \sum_{k \neq i} \frac{w_k}{(x_i - x_k) \cdot w_i} \right) \\
& \Updownarrow \\
l_j^{(2)}(x_i) &= \frac{ - 2 \cdot w_j}{(x_i - x_j) \cdot w_i} \cdot \left(\frac{1}{x_i - x_j} + l_i^{(1)}(x_i) \right)

Now for i = j we have something similar as for the first order case:

1 &= \sum_{k=0 \dotsc n} l_k(x) \\
 & \Downarrow \\
0 &= \sum_{k=0 \dotsc n} l^{(2)}_k(x_j) \\
 & \Updownarrow \\
 l^{(2)}_j(x_j) &= - \sum_{k \neq j} l^{(2)}_k(x_j)

This way we have:

Theorem, Barycentric second order derivative:

Consider a set of grid values (x_i)_{i=0 \dotsc n} and the function values (f(x_i))_{i=0 \dotsc n} = (f_i)_{i=0 \dotsc n}. The formula for the second order derivative at one of the grid values x_i, f^{(1)}(x_i) is:

f^{(2)}(x_i) = \sum_{j=0 \dotsc n} l^{(2)}_j(x_i) \cdot f(x_j)


l_j^{(2)}(x_i) &= \frac{ - 2 \cdot w_j}{(x_i - x_j) \cdot w_i} \cdot \left(\frac{1}{x_i - x_j} + l_i^{(1)}(x_i) \right) \ for \  i \neq j \\
l^{(2)}_i(x_i) &= - \sum_{k \neq i} l^{(2)}_k(x_i) \ for \  i = j


There is a sign error in [BerrutTrefethen] page 513

Example 2, second order:

Consider eg (x_i)_{i = 0 \dotsc 2} = (x - h, x, x + h). Then (w_i)_{i = 0 \dotsc 2} = (\frac{1}{2h^2}, \frac{1}{-h^2}, \frac{1}{2h^2}) and l_x^{(1)}(x) = 0 so:

l^{(2)}_j(x) &= \left( \begin{matrix}
                \frac{1}{h^2}, & \frac{-2}{h^2}, & \frac{1}{h^2} \\
                \end{matrix} \right)

This is usual Newton second order central finite difference approximation of accuracy \mathcal{O}(h), ie:

f^{(2)}(x) = \frac{f(x-h) - 2 \cdot f(x) + f(x+h)}{h^2}

Also l_{x-h}^{(1)}(x-h) = \frac{-3}{2 \cdot h} so:

l^{(2)}_j(x-h) &= \left( \begin{matrix}
                - \sum_{k \neq i} l^{(2)}_k(x_i),
                & \frac{ - 2 \cdot \frac{1}{-h^2}}{h \cdot \frac{1}{2h^2}}
                        \cdot \left(\frac{1}{h} + \frac{-3}{2 \cdot h} \right),
                & \frac{ - 2 \cdot \frac{1}{2h^2}}{h \cdot \frac{1}{2h^2}}
                        \cdot \left(\frac{1}{2 \cdot h} + \frac{-3}{2 \cdot h} \right) \\
                \end{matrix} \right) \\
             &= \left( \begin{matrix}
                - \sum_{k \neq i} l^{(2)}_k(x_i),
                & \frac{4}{h} \cdot \frac{-1}{2 \cdot h},
                & \frac{-2}{h} \cdot \frac{-1}{2 \cdot h} \\
                \end{matrix} \right) \\
             &= \left( \begin{matrix}
                & \frac{-2}{h^2},
                & \frac{1}{h^2} \\
                \end{matrix} \right)

This is usual Newton second order forward finite difference approximation of accuracy \mathcal{O}(h), ie:

f^{(2)}(x - h) = \frac{f(x-h) - 2 \cdot f(x) + f(x+h)}{h^2}

Both results are in accordance with table 25.2 page 914 in [AbramowitzStegun].

[SadiqViswanath] study the fact that the accuracy sometimes is 1 higher than specified by Lagrange, ie the accuracy is boosted. One of their results are:

Corollary 7 page 14:

Look at the grid values (x_i)_{i=0 \dotsc n} = (h \cdot z_i)_{i=0 \dotsc n}. If the grid relative grid values (z_i)_{i=0 \dotsc n} are symmetric about 0 (in other words z is a relative grid value if and only if −z is a relative grid value) and n − m (m is the order of the derivative) is odd, the order of accuracy is boosted by 1.

So according to the corollary the Newton second order central finite difference approximation is actually of accuracy \mathcal{O}(h^2) since (x_i)_{i = 0 \dotsc 2} = (x - h, x, x + h) has symmetric relative grid values and n - m = 3 - 1 = 1 is odd. On the other hand the accuracy of the Newton second order forward finite difference approximation is still \mathcal{O}(h). This is confirmed by table 25.2 page 914 in [AbramowitzStegun].

Decision, the finance package:

In the finance package the derivatives numericcally will approximated such that the accuracy \mathcal{O}(0.0001^4) leading to an accuracy down to the sixteenth decimal.