Friday, August 25, 2017

Convolution


I was wondering whether convolutional neural networks had anything to do with convolution in mathematics. They do. But first, as an introduction, here's what convolution in maths is all about.

Laplace transforms

These are useful in solving differential equations. Boaz defines them as:

L(f) = ∫₀^∞ f(t) e^(-pt) dt = F(p)

If you use the key f(t), you can look up the solution. For instance, if

f(t) = e^(-at)

then

F(p) = 1/(a + p)

You can work this out by simple integration, look it up in a table of transforms, or use software like SymPy:

from sympy.abc import t
from sympy import symbols, exp, laplace_transform

a = symbols('a', positive=True)
p = symbols('p')

# transform f(t) = exp(-a*t) with respect to t, giving a function of p
print(laplace_transform(exp(-a * t), t, p))

which prints out:

(1/(a + p), 0, True)

Where this gets interesting is when we get recursive. Let f(t) = y and for brevity let y' = dy/dt etc. Let's plug this into the definition of a Laplace transformation at the top of the page:

L(y') = ∫₀^∞ y' e^(-pt) dt

By a standard integration by parts (which says ∫u dv = uv - ∫v du), with u = e^(-pt) and dv = dy:

L(y') = ∫₀^∞ y' e^(-pt) dt = ∫₀^∞ e^(-pt) dy = e^(-pt) y |₀^∞ - ∫₀^∞ (-p) y e^(-pt) dt
      = -y(0) + p L(y)

There was no particular reason to choose f(t) = y. In fact, let's choose f(t) = y' and do the same all over again. We get:

L(y'')  = -y'(0) + pL(y')

Substituting the first of these equations into the second eliminates L(y'), giving us:

L(y'') = p² L(y) - p y(0) - y'(0)

and so on for any number of derivatives of y. We have our table of transforms for L(y), so we only need to plug that in.
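
As a sanity check, here's a small SymPy sketch (the choice y(t) = e^(-at) is just an arbitrary concrete example, not anything special) confirming that L(y') = pL(y) - y(0) and L(y'') = p²L(y) - py(0) - y'(0):

from sympy import symbols, exp, diff, laplace_transform, simplify

t = symbols('t', positive=True)
a = symbols('a', positive=True)
p = symbols('p', positive=True)

y = exp(-a * t)                  # arbitrary concrete choice of y(t)
y0 = y.subs(t, 0)                # y(0)
dy0 = diff(y, t).subs(t, 0)      # y'(0)

L_y = laplace_transform(y, t, p, noconds=True)
L_dy = laplace_transform(diff(y, t), t, p, noconds=True)
L_ddy = laplace_transform(diff(y, t, 2), t, p, noconds=True)

# both differences should simplify to zero
print(simplify(L_dy - (p * L_y - y0)))                # L(y')  vs pL(y) - y(0)
print(simplify(L_ddy - (p**2 * L_y - p * y0 - dy0)))  # L(y'') vs p²L(y) - py(0) - y'(0)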

Convolution

Now, as an example, take the equation:

A y''(t) + B y'(t) + C y(t) = f(t) where y(0) = y'(0) = 0

which is a typical physics equation (where a particle is at rest and the force applied to it starts at t=0). YMMV.

Then, applying the Laplace transform to each term:

A p² L(y) + B p L(y) + C L(y) = L(f)

Re-arranging:

L(y) = L(f) / (A p² + B p + C) = L(f) / (A (p + a)(p + b))

where a and b are chosen so that A(p + a)(p + b) = A p² + B p + C (that is, -a and -b are the roots of the quadratic), making the constants B and C disappear.

But, as it happens, this factor is itself a Laplace transform. From Boaz' table:

1/((p + a)(p + b)) = L( (e^(-at) - e^(-bt)) / (b - a) )    (for a ≠ b)

Let's call the whole factor T(p) = 1/(A(p + a)(p + b)).

So, now we have:

L(y) = T(p) L(f)

and we can conclude that y is the inverse transform of the product of two functions whose inverse transforms we know.
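
To make this concrete, here's a small SymPy sketch; the coefficients A=1, B=3, C=2 and the forcing f(t)=1 are arbitrary example values (giving a=1 and b=2), not anything from the derivation above:

from sympy import symbols, S, laplace_transform, inverse_laplace_transform, simplify

t = symbols('t', positive=True)
p = symbols('p')

# example equation: y'' + 3y' + 2y = 1 with y(0) = y'(0) = 0,
# so A=1, B=3, C=2 and the quadratic factors as (p + 1)(p + 2)
T = 1 / ((p + 1) * (p + 2))                        # T(p)
L_f = laplace_transform(S(1), t, p, noconds=True)  # L(f) for f(t) = 1, i.e. 1/p

# y is the inverse transform of T(p) L(f)
y = inverse_laplace_transform(T * L_f, p, t)
print(simplify(y))  # 1/2 - exp(-t) + exp(-2*t)/2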

Ok, that's an example. Let's see the general case. Let G(p) and H(p) be transforms of g(t) and h(t) respectively. Then:

G(p) H(p) = ∫₀^∞ g(σ) e^(-pσ) dσ ∫₀^∞ h(τ) e^(-pτ) dτ

from the definition of a Laplace transformation (where we are using σ and τ to avoid confusion with a duplicated t).

Let's do another change of variables and set σ = t - τ for fixed τ, which means that dσ = dt but the limits change slightly (σ = 0 corresponds to t = τ). So:

G(p) H(p) = ∫_{t=τ}^∞ g(t-τ) e^(-p(t-τ)) dt ∫₀^∞ h(τ) e^(-pτ) dτ
          = ∫_{τ=0}^∞ ∫_{t=τ}^∞ e^(-pt) g(t-τ) h(τ) dt dτ

Since τ ranges from 0 to ∞ and, for each τ, t ranges from τ to ∞, the region we integrate over is the same as if t ranges from 0 to ∞ and τ from 0 to t. Therefore:

G(p) H(p) = L( ∫₀^t g(t-τ) h(τ) dτ ) = L( g * h )

Note that the integral over τ slides one function across the other; this sliding-window idea is what will reappear in convolutional neural nets (part 2).
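
As a quick check of the convolution theorem, here's a SymPy sketch; g(t) = e^(-t) and h(t) = e^(-2t) are arbitrary example functions:

from sympy import symbols, exp, integrate, laplace_transform, simplify

t, tau = symbols('t tau', positive=True)
p = symbols('p', positive=True)

g = exp(-t)      # arbitrary example g(t)
h = exp(-2 * t)  # arbitrary example h(t)

# (g * h)(t): integrate g(t - tau) h(tau) over tau from 0 to t
conv = integrate(g.subs(t, t - tau) * h.subs(t, tau), (tau, 0, t))

lhs = laplace_transform(conv, t, p, noconds=True)       # L(g * h)
rhs = laplace_transform(g, t, p, noconds=True) * \
      laplace_transform(h, t, p, noconds=True)          # G(p) H(p)

print(simplify(lhs - rhs))  # prints 0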

