This notebook introduces PyTorch as a jumping-off point from NumPy and shows how tensors are used in forward propagation, the basis for building neural networks.

Creating tensors in PyTorch

PyTorch has constructs to create and operate on tensor variables in much the same way as NumPy n-dimensional arrays.

Tensors are arrays of arbitrary dimension. Parameters in neural networks are usually initialized to random weights, which are held in tensors.

import torch
# Set seed for reproducible results
torch.random.manual_seed(42)

# Create a random 3x3 tensor
my_tensor = torch.rand(3,3)

# Print tensor elements and size of tensor
print(f"tensor size aka shape: {my_tensor.shape}")
print("tensor elements:")
print(my_tensor)
tensor size aka shape: torch.Size([3, 3])
tensor elements:
tensor([[0.8823, 0.9150, 0.3829],
        [0.9593, 0.3904, 0.6009],
        [0.2566, 0.7936, 0.9408]])

Multiplication

PyTorch has different tensor multiplication methods, including matrix multiplication and element-wise multiplication.

Matrix multiplication is done with torch.matmul().

Element-wise multiplication is provided by the binary operator *.

Recall that multiplying a matrix by the identity matrix gives back the same matrix.

In element-wise multiplication, you multiply corresponding elements $a_{i,j}$ and $b_{i,j}$ of matrices $A$ and $B$ to get element $c_{i,j} = a_{i,j} \, b_{i,j}$ of the product matrix $C$.

# Use torch.ones(n, m) to create an n x m matrix of all 1s
tensor_of_ones = torch.ones(3, 3)

# Use torch.eye(n) to create an n x n identity matrix
identity_tensor = torch.eye(3)

print(f"tensor of ones:\n{tensor_of_ones}")
print(f"identity tensor:\n{identity_tensor}")


print('"Regular" matrix multiplication')
print(torch.matmul(tensor_of_ones, identity_tensor))

# Element-wise matrix multiplication
print('"Element-wise" multiplication of matrices')
print(tensor_of_ones*identity_tensor)
tensor of ones:
tensor([[1., 1., 1.],
        [1., 1., 1.],
        [1., 1., 1.]])
identity tensor:
tensor([[1., 0., 0.],
        [0., 1., 0.],
        [0., 0., 1.]])
"Regular" matrix multiplication
tensor([[1., 1., 1.],
        [1., 1., 1.],
        [1., 1., 1.]])
"Element-wise" multiplication of matrices
tensor([[1., 0., 0.],
        [0., 1., 0.],
        [0., 0., 1.]])

Forward propagation

Forward propagation, simply put, is giving data a single pass through the neural network's layers.

This step, also called the forward pass, is fundamental to both training and evaluating classifiers. It takes input data, performs operations on the data in the inner layers, and outputs the transformed data. Each layer takes its values from the previous layer.

The cell below does forward propagation of input values. The input nodes are a, b, c, and d. Node e is the sum of a and b, node f is the product of c and d, and output node g is the max of e and f. The inner layer consists of nodes e and f.

a: 5
b: 7
c: 6
d: 9

e: 12 = 5 + 7
f: 54 = 6 * 9

g: 54 = max(12, 54)

While this is a very simplistic example, we'll see later that PyTorch uses computational graphs like this one to make computing derivatives and gradients easier (a small preview follows the example below).

torch.random.manual_seed(42)

a = torch.randint(high=10, size=(1,1))
b = torch.randint(high=10, size=(1,1))
c = torch.randint(high=10, size=(1,1))
d = torch.randint(high=10, size=(1,1))

print("Input nodes a, b, c")
print(a)
print(b)
print(c)
print(d)

e = a + b
f = c * d

print("inner layer e and f")
print(e)
print(f)

g = torch.max(e,f)
print("output g")
print(g)
Input nodes a, b, c, d
tensor([[5]])
tensor([[7]])
tensor([[6]])
tensor([[9]])
inner layer e and f
tensor([[12]])
tensor([[54]])
output g
tensor([[54]])
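As a preview of that graph machinery, here is a minimal sketch (my own illustration, not part of the original example) that rebuilds the same graph with floating-point tensors that track gradients, so autograd can walk the graph backward:

# Same graph as above, but with float tensors that record operations
a = torch.tensor([[5.0]], requires_grad=True)
b = torch.tensor([[7.0]], requires_grad=True)
c = torch.tensor([[6.0]], requires_grad=True)
d = torch.tensor([[9.0]], requires_grad=True)

e = a + b
f = c * d
g = torch.max(e, f)

g.backward()  # traverse the graph backward from g

print(a.grad)  # tensor([[0.]]) -- g took the f branch, so a has no effect on g
print(c.grad)  # tensor([[9.]]) -- along the f branch, dg/dc = d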

Another forward pass example, this time using larger tensors.

Note that torch.rand_like(input) is equivalent to torch.rand(input.size(), dtype=input.dtype, layout=input.layout, device=input.device). This lets us create a random tensor that has the same properties as the source tensor.

a = torch.rand(100, 100)
b = torch.rand_like(a)
c = torch.rand_like(a)
print(f"tensor a\n{a}")
print(f"tensor b\n{b}")
print(f"tensor c\n{c}")

d = torch.matmul(a, b)
print(f"tensor d\n{d}")

e = c * d
mean_e = torch.mean(e)
print(f"tensor e\n{e}")
print(f"size of e {e.shape}")
print(f"average of elements in e {mean_e}")

tensor a
tensor([[0.0711, 0.2805, 0.4312,  ..., 0.2972, 0.9673, 0.6259],
        [0.9810, 0.3363, 0.6546,  ..., 0.3229, 0.6846, 0.2764],
        [0.0909, 0.4161, 0.9713,  ..., 0.0207, 0.7353, 0.2551],
        ...,
        [0.4391, 0.2498, 0.8701,  ..., 0.6475, 0.1560, 0.5830],
        [0.3359, 0.9406, 0.2543,  ..., 0.9428, 0.6894, 0.0765],
        [0.0550, 0.0198, 0.2295,  ..., 0.9022, 0.4749, 0.0778]])
tensor b
tensor([[0.1096, 0.4660, 0.8143,  ..., 0.8509, 0.9870, 0.5720],
        [0.5736, 0.4108, 0.6739,  ..., 0.2601, 0.8542, 0.7942],
        [0.2526, 0.6742, 0.7890,  ..., 0.7083, 0.5828, 0.0388],
        ...,
        [0.6246, 0.9720, 0.6070,  ..., 0.1586, 0.7316, 0.8186],
        [0.2942, 0.1779, 0.7140,  ..., 0.3927, 0.6736, 0.3706],
        [0.2667, 0.3007, 0.7462,  ..., 0.0731, 0.5366, 0.2679]])
tensor c
tensor([[0.6993, 0.6038, 0.8744,  ..., 0.7262, 0.3757, 0.8747],
        [0.2356, 0.3697, 0.1620,  ..., 0.3008, 0.3923, 0.5617],
        [0.1525, 0.7515, 0.8436,  ..., 0.2641, 0.8710, 0.2220],
        ...,
        [0.2834, 0.8367, 0.4250,  ..., 0.9593, 0.3301, 0.7327],
        [0.6308, 0.4047, 0.8923,  ..., 0.7597, 0.8182, 0.6766],
        [0.3003, 0.1302, 0.5621,  ..., 0.3625, 0.8835, 0.0587]])
tensor d
tensor([[25.8951, 23.9216, 23.6148,  ..., 25.6046, 26.3310, 24.9499],
        [23.2617, 22.3270, 23.1557,  ..., 23.6731, 25.1142, 25.8058],
        [25.4751, 24.8487, 22.5181,  ..., 25.1717, 26.1975, 25.5598],
        ...,
        [26.9132, 24.6827, 23.6633,  ..., 24.8387, 26.5065, 26.3920],
        [25.5985, 22.9188, 22.4873,  ..., 25.2619, 26.3397, 27.3959],
        [26.5511, 27.1494, 26.0090,  ..., 27.3040, 28.2337, 27.7980]])
tensor e
tensor([[18.1081, 14.4431, 20.6483,  ..., 18.5930,  9.8915, 21.8248],
        [ 5.4804,  8.2544,  3.7505,  ...,  7.1199,  9.8522, 14.4943],
        [ 3.8854, 18.6749, 18.9955,  ...,  6.6485, 22.8174,  5.6746],
        ...,
        [ 7.6275, 20.6509, 10.0563,  ..., 23.8266,  8.7502, 19.3364],
        [16.1487,  9.2757, 20.0664,  ..., 19.1908, 21.5513, 18.5373],
        [ 7.9739,  3.5353, 14.6193,  ...,  9.8965, 24.9449,  1.6316]])
size of e torch.Size([100, 100])
average of elements in e 12.640617370605469

You’ve now read through the elementary components of PyTorch and how PyTorch uses computational graphs in forward propagation, a fundamental step in building a neural network. Stay tuned for what happens next in training a neural network. 🐱‍💻