Backprop Examples

We will look at three different ways of computing gradients, all of which rely on the same basic principle (backprop).

  1. The first method is to code a function and its derivatives manually using the rules of backprop.
  2. The second method is to use automatic differentiation with a tool called autograd.
  3. The third method is to use TensorFlow, a powerful machine learning toolkit from Google, which also includes methods to automatically differentiate expressions defined through TensorFlow primitives.

More about autograd

autograd is a Python package for algorithmic differentiation. It allows you to automatically compute the derivative of functions written in (nearly) native code, which makes computing derivatives very easy. Under the hood, it uses reverse-mode autodiff (backprop).
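
For instance, here is a minimal sketch of the workflow (using the same grad function as the examples below): write an ordinary numpy function, then ask autograd for its derivative.

import autograd.numpy as np  # Thinly wrapped version of numpy
from autograd import grad

def g(x):
    return np.tanh(x)        # ordinary numpy code

g_prime = grad(g)            # function computing dg/dx via reverse-mode autodiff

print(g_prime(0.5))          # matches the analytic derivative 1 - tanh(0.5)**2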

More about TensorFlow

TensorFlow is a powerful machine learning package from Google.

You can define functions as "computation graphs" using TensorFlow operations, and it can automatically perform backprop (a.k.a. reverse-mode automatic differentiation) to compute the gradient for you.

Note: You will need to install TensorFlow if you want to run these examples. It is not hard to install, but takes a bit more work to be able to call from a Jupyter notebook.
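
To give a rough idea of what this looks like, here is a minimal sketch (assuming TensorFlow 2.x, where eager execution and tf.GradientTape are available; this is not the only way to take gradients in TensorFlow):

import tensorflow as tf

x = tf.Variable(2.0)

with tf.GradientTape() as tape:     # records the computation graph
    f = x**2 + tf.sin(x)

print(tape.gradient(f, x).numpy())  # df/dx = 2x + cos(x), evaluated at x = 2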

Backprop Example 1

We will see three different ways to compute the gradient of

$$f(x,y,z) = (2x + y)z$$

Manual backprop

In [8]:
import numpy as np

# Backprop example

# Compute f(x,y,z) = (2*x+y)*z
x = 1.
y = 2.
z = 3.

# Forward pass
q = 2.*x + y   # Node 1
f = q*z        # Node 2

# Backward pass
f_bar = 1
q_bar = z * f_bar  # Node 2 input
z_bar = q * f_bar  # Node 2 input
x_bar = 2 * q_bar  # Node 1 input
y_bar = 1 * q_bar  # Node 1 input

grad = np.array([x_bar, y_bar, z_bar])

print(f)
print(grad)
12.0
[6. 3. 4.]
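
As a sanity check, the analytic partial derivatives give the same gradient:

$$\frac{\partial f}{\partial x} = 2z = 6, \qquad \frac{\partial f}{\partial y} = z = 3, \qquad \frac{\partial f}{\partial z} = 2x + y = 4,$$

which matches the [6. 3. 4.] computed above.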

Autograd

In [9]:
import autograd.numpy as np  # Thinly wrapped version of numpy
from autograd import grad

def f(args):
    x,y,z = args
    return (2*x + y)*z

f_grad = grad(f)  # magic: returns a function that computes the gradient of f

x = 1.
y = 2.
z = 3.

print(f([x, y, z]))
print(f_grad([x, y, z]))
12.0
[array(6.), array(3.), array(4.)]
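
TensorFlow

For completeness, here is a sketch of the third way mentioned above (again assuming TensorFlow 2.x with tf.GradientTape), applied to the same function:

import tensorflow as tf

x = tf.Variable(1.0)
y = tf.Variable(2.0)
z = tf.Variable(3.0)

with tf.GradientTape() as tape:
    f = (2.0*x + y) * z               # forward pass, recorded on the tape

grads = tape.gradient(f, [x, y, z])   # reverse-mode autodiff (backprop)

print(f.numpy())
print([g.numpy() for g in grads])     # should again give [6.0, 3.0, 4.0]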

Backprop Example 2

Here is a slightly more complex example:

$$f(x) = 10\exp(\sin(x)) + \cos^2(x)$$

Manual backprop

In [12]:
import numpy as np

# Backprop example
# f(x) = 10*np.exp(np.sin(x)) + np.cos(x)**2

# Forward pass
x = 1000
a = np.sin(x)   # Node 1
b = np.cos(x)   # Node 2
c = b**2        # Node 3
d = np.exp(a)   # Node 4
f = 10*d + c    # Node 5 (final output)

# Backward pass
f_bar = 1
d_bar = 10 * f_bar            # Node 5 input
c_bar = 1  * f_bar            # Node 5 input
a_bar = np.exp(a) * d_bar     # Node 4 input
b_bar = 2*b * c_bar           # Node 3 input
x_bar = np.cos(x) * a_bar - np.sin(x) * b_bar  # Nodes 1 and 2 input (the two contributions add)

print (f, x_bar)
23.1780070835713 11.92692295225547
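
Again, we can check this against the analytic derivative,

$$f'(x) = 10\exp(\sin(x))\cos(x) - 2\cos(x)\sin(x),$$

which evaluates to the x_bar printed above at x = 1000.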

Autograd

In [16]:
import autograd.numpy as np  # Thinly wrapped version of numpy
from autograd import grad

def f(args):
    # Note: a two-variable variant of the function above, with an extra y**3 factor
    x, y = args
    return y**3 * 10*np.exp(np.sin(x)) + np.cos(x)**2

f_grad = grad(f)

x = 2.
y = 3.

print(f([x, y]))
print(f_grad([x, y]))
670.4691647536183
[array(-278.18475186), array(670.29598656)]
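
For this two-variable version, f(x, y) = 10 y^3 exp(sin(x)) + cos^2(x), the analytic gradient is

$$\frac{\partial f}{\partial x} = 10 y^3 \exp(\sin(x))\cos(x) - 2\cos(x)\sin(x), \qquad \frac{\partial f}{\partial y} = 30 y^2 \exp(\sin(x)),$$

which matches the values printed above at (x, y) = (2, 3).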