Fitting a Sin Curve with a 3-Degree Polynomial Using PyTorch
In this blog, I will take you through the basics of PyTorch and then explain how to fit a sin curve with a 3-degree polynomial.
Let’s start with basics.
Tensors
Tensors are data structures similar to arrays and matrices. PyTorch uses tensors to encode the inputs, outputs and parameters of a model. Tensors can run on GPUs or other specialized hardware to improve performance.
For the rest of the tutorial, let's follow the examples below.
Initialising a tensor:
Let
data = [[1, 2],[3, 4]]
Initialising directly from data
Initialising using NumPy
Initialising using another tensor
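A minimal sketch of all three routes, assuming torch and NumPy are installed:

```python
import torch
import numpy as np

data = [[1, 2], [3, 4]]

# Directly from data
x_data = torch.tensor(data)

# From a NumPy array
np_array = np.array(data)
x_np = torch.from_numpy(np_array)

# From another tensor: keeps the shape (and dtype, unless overridden)
x_ones = torch.ones_like(x_data)
x_rand = torch.rand_like(x_data, dtype=torch.float)
```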
Attributes of Tensors
tensor = torch.rand(3,4) #assume
Shape of tensor: tensor.shape
Datatype of tensor: tensor.dtype
Device tensor is stored on: tensor.device
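Putting those together as a runnable sketch that prints each attribute:

```python
import torch

tensor = torch.rand(3, 4)

print(f"Shape of tensor: {tensor.shape}")
print(f"Datatype of tensor: {tensor.dtype}")
print(f"Device tensor is stored on: {tensor.device}")
```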
Operations on tensors
If a GPU is available, we move the tensor to the GPU.
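A sketch of the standard check-and-move pattern:

```python
import torch

tensor = torch.rand(3, 4)

# Move the tensor to the GPU if one is available
if torch.cuda.is_available():
    tensor = tensor.to("cuda")
```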
Indexing and slicing work just as in NumPy.
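For example:

```python
tensor = torch.ones(4, 4)
print(tensor[0])      # first row
print(tensor[:, 0])   # first column
tensor[:, 1] = 0      # zero out the second column
```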
Joining (or concatenating) tensors
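Continuing with the 4x4 tensor from the previous snippet:

```python
# torch.cat joins a sequence of tensors along an existing dimension
t1 = torch.cat([tensor, tensor, tensor], dim=1)
print(t1.shape)  # torch.Size([4, 12])
```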
Multiplication
tensor * tensor (or tensor.mul(tensor)) computes the element-wise product, while tensor.matmul(tensor) (or tensor @ tensor) performs matrix multiplication; the two are not the same operation.
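Both operations, continuing with the same tensor:

```python
# Element-wise product: the two forms are equivalent
z1 = tensor * tensor
z2 = tensor.mul(tensor)

# Matrix multiplication: the two forms are equivalent
y1 = tensor @ tensor.T
y2 = tensor.matmul(tensor.T)
```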
In-Place Operations
In-place operations on tensors are simple: they are denoted by suffixing a function with _, like x.copy_(y) or x.add_(5).
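A quick sketch of both:

```python
import torch

tensor = torch.ones(4, 4)
tensor.add_(5)   # adds 5 to every element of the tensor in place

x = torch.rand(2, 2)
y = torch.rand(2, 2)
x.copy_(y)       # overwrites x with the contents of y in place
```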
Bridge between NumPy and Tensors
Creating a tensor from a NumPy array:
Any change made to the NumPy array is reflected in the tensor. This is because the tensor and the NumPy array share the same memory locations.
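A minimal sketch (for CPU tensors, torch.from_numpy shares the array's memory):

```python
import numpy as np
import torch

n = np.ones(5)
t = torch.from_numpy(n)

np.add(n, 1, out=n)  # modify the array in place
print(t)             # tensor([2., 2., 2., 2., 2.], dtype=torch.float64)
```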
Let's see how a NumPy array is created from a tensor.
Likewise, any change made to the tensor is reflected in the NumPy array, for the same reason: both share the same memory locations.
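The mirror-image sketch:

```python
import torch

t = torch.ones(5)
n = t.numpy()

t.add_(1)  # modify the tensor in place
print(n)   # [2. 2. 2. 2. 2.]
```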
Why PyTorch
1) It uses GPUs and other accelerators to speed up computations by as much as 50x.
2) It provides an automatic differentiation library for neural networks.
Fitting a sin curve with a 3-degree polynomial
Let's learn PyTorch with an example: fitting a sin curve with a 3-degree polynomial.
If you observe, the sin function between -pi and +pi looks a lot like a 3-degree polynomial. In this tutorial, we shall try to arrive at a 3-degree polynomial that fits the sin curve over -pi to +pi.
So, our approach is to start with a random polynomial with coefficients a, b, c and d. Our loss function is the sum of squared differences between predicted and actual values, the learning rate is 1e-6, and we use gradient descent to arrive at the best fit. Try to work out what the batch size and epoch are for the model discussed below. Learn difference-between-a-batch-and-an-epoch
The GIF below visualises how we arrive at the best-fitting curve after starting from a random 3-degree polynomial.
NumPy approach:
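A sketch of the full NumPy version, in the spirit of the official PyTorch polynomial example (the 2000 sample points and 2000 iterations are arbitrary choices):

```python
import numpy as np
import math

# Create input and output data: x in [-pi, pi], y = sin(x)
x = np.linspace(-math.pi, math.pi, 2000)
y = np.sin(x)

# Randomly initialize the four coefficients
a, b, c, d = np.random.randn(4)

learning_rate = 1e-6
for t in range(2000):
    # Forward pass: y_pred = a + b x + c x^2 + d x^3
    y_pred = a + b * x + c * x ** 2 + d * x ** 3

    # Loss: sum of squared differences
    loss = np.square(y_pred - y).sum()

    # Backprop by hand: gradients of loss w.r.t. a, b, c, d
    grad_y_pred = 2.0 * (y_pred - y)
    grad_a = grad_y_pred.sum()
    grad_b = (grad_y_pred * x).sum()
    grad_c = (grad_y_pred * x ** 2).sum()
    grad_d = (grad_y_pred * x ** 3).sum()

    # Gradient-descent update
    a -= learning_rate * grad_a
    b -= learning_rate * grad_b
    c -= learning_rate * grad_c
    d -= learning_rate * grad_d

print(f'Result: y = {a} + {b} x + {c} x^2 + {d} x^3')
```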
Now, we implement the same thing using tensors. Here we specify the device the tensor should be loaded on for performance.
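The same computation with tensors; a sketch that picks CUDA when available:

```python
import torch
import math

dtype = torch.float
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

x = torch.linspace(-math.pi, math.pi, 2000, device=device, dtype=dtype)
y = torch.sin(x)

# Random scalar (0-dimensional) tensors for the coefficients
a = torch.randn((), device=device, dtype=dtype)
b = torch.randn((), device=device, dtype=dtype)
c = torch.randn((), device=device, dtype=dtype)
d = torch.randn((), device=device, dtype=dtype)

learning_rate = 1e-6
for t in range(2000):
    y_pred = a + b * x + c * x ** 2 + d * x ** 3
    loss = (y_pred - y).pow(2).sum().item()
    if t % 100 == 99:
        print(t, loss)

    # Backprop by hand, exactly as in the NumPy version
    grad_y_pred = 2.0 * (y_pred - y)
    grad_a = grad_y_pred.sum()
    grad_b = (grad_y_pred * x).sum()
    grad_c = (grad_y_pred * x ** 2).sum()
    grad_d = (grad_y_pred * x ** 3).sum()

    a -= learning_rate * grad_a
    b -= learning_rate * grad_b
    c -= learning_rate * grad_c
    d -= learning_rate * grad_d
```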
Auto-grad
As mentioned earlier, PyTorch provides automatic differentiation, called autograd for short.
It is pretty simple to use in practice. Each Tensor represents a node in a computational graph. If x is a Tensor with x.requires_grad=True, then x.grad is another Tensor holding the gradient of some scalar value (typically the loss) with respect to x.
So, we can take a breather and let PyTorch's autograd calculate the complex back-propagation steps for us.
Now, think about which parts of the above code are going to change.
1) Since a, b, c and d are learnable tensors, they must be created with the attribute requires_grad set to True.
2) Once a, b, c and d have requires_grad=True, PyTorch automatically calculates all the gradients required for back-propagation.
This is done by just calling loss.backward(). Now we have the gradients of a, b, c and d in a.grad, b.grad, c.grad and d.grad.
So we update our learnable parameters as in the previous cases, and set the grad attributes of the learnable tensors to None so they can be reused.
So our code looks like this:
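A sketch along the lines of the official PyTorch autograd example (2000 points and iterations are again arbitrary):

```python
import torch
import math

dtype = torch.float
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

x = torch.linspace(-math.pi, math.pi, 2000, device=device, dtype=dtype)
y = torch.sin(x)

# requires_grad=True tells autograd to track operations on these tensors
a = torch.randn((), device=device, dtype=dtype, requires_grad=True)
b = torch.randn((), device=device, dtype=dtype, requires_grad=True)
c = torch.randn((), device=device, dtype=dtype, requires_grad=True)
d = torch.randn((), device=device, dtype=dtype, requires_grad=True)

learning_rate = 1e-6
for t in range(2000):
    y_pred = a + b * x + c * x ** 2 + d * x ** 3
    loss = (y_pred - y).pow(2).sum()

    # Autograd computes gradients of loss w.r.t. a, b, c and d
    loss.backward()

    # Update weights manually; no_grad stops autograd tracking the updates
    with torch.no_grad():
        a -= learning_rate * a.grad
        b -= learning_rate * b.grad
        c -= learning_rate * c.grad
        d -= learning_rate * d.grad

        # Reset the gradients so they don't accumulate across iterations
        a.grad = None
        b.grad = None
        c.grad = None
        d.grad = None

print(f'Result: y = {a.item()} + {b.item()} x + {c.item()} x^2 + {d.item()} x^3')
```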
Till now, we have used autograd functions that are pre-defined.
Now, we are going to define our own autograd functions, with forward and backward functions that operate on tensors. Let's look into the implementation of such a custom autograd function. In this example we define our model as y = a + b*P3(c + d*x) instead of y = a + b*x + c*x^2 + d*x^3, where P3(x) = 1/2(5x^3 − 3x) is the Legendre polynomial of degree three.
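A sketch of the custom Function and how it slots into the training loop; the starting values and the 5e-6 learning rate follow the official example:

```python
import torch
import math

class LegendrePolynomial3(torch.autograd.Function):
    @staticmethod
    def forward(ctx, input):
        # Save the input; we need it to compute the gradient in backward
        ctx.save_for_backward(input)
        return 0.5 * (5 * input ** 3 - 3 * input)

    @staticmethod
    def backward(ctx, grad_output):
        # dP3/dx = 1.5 * (5x^2 - 1), chained with the incoming gradient
        input, = ctx.saved_tensors
        return grad_output * 1.5 * (5 * input ** 2 - 1)

# Use the Function via its .apply method
P3 = LegendrePolynomial3.apply

x = torch.linspace(-math.pi, math.pi, 2000)
y = torch.sin(x)

# Initialise near the known solution so training converges
a = torch.full((), 0.0, requires_grad=True)
b = torch.full((), -1.0, requires_grad=True)
c = torch.full((), 0.0, requires_grad=True)
d = torch.full((), 0.3, requires_grad=True)

learning_rate = 5e-6
for t in range(2000):
    y_pred = a + b * P3(c + d * x)
    loss = (y_pred - y).pow(2).sum()
    loss.backward()

    with torch.no_grad():
        for param in (a, b, c, d):
            param -= learning_rate * param.grad
            param.grad = None
```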
Now that we have defined our custom autograd function, we can use it directly, as in the examples above.
nn Module
When building neural networks we frequently think of arranging the computation into layers, some of which have learnable parameters which will be optimized during learning. PyTorch provides higher-level abstractions over raw computational graphs that are useful for building neural networks.
Here, the nn package serves this purpose. It defines a set of Modules, which are roughly equivalent to neural network layers, as well as a set of useful loss functions that are commonly used when training neural networks.
Let's see this with an example of fitting a sin curve with a 3-degree polynomial:
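A sketch using nn.Sequential; the trick is to precompute (x, x², x³) so that a single Linear layer can learn b, c, d as weights and a as its bias:

```python
import torch
import math

x = torch.linspace(-math.pi, math.pi, 2000)
y = torch.sin(x)

# Precompute the features (x, x^2, x^3), shape (2000, 3)
p = torch.tensor([1, 2, 3])
xx = x.unsqueeze(-1).pow(p)

# Flatten reshapes the (2000, 1) output of Linear to (2000,) to match y
model = torch.nn.Sequential(
    torch.nn.Linear(3, 1),
    torch.nn.Flatten(0, 1)
)
loss_fn = torch.nn.MSELoss(reduction='sum')

learning_rate = 1e-6
for t in range(2000):
    y_pred = model(xx)
    loss = loss_fn(y_pred, y)

    model.zero_grad()
    loss.backward()

    # Still updating parameters by hand; optim replaces this below
    with torch.no_grad():
        for param in model.parameters():
            param -= learning_rate * param.grad
```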
PyTorch : optim
The optim package in PyTorch abstracts the idea of an optimization algorithm and provides implementations of commonly used optimization algorithms. Till now, we have updated the weights of our models by manually mutating the Tensors holding learnable parameters with torch.no_grad().
Let us see where we can enhance the code by replacing things which are being done manually.
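With optim, the manual no_grad update loop collapses to three calls. A sketch continuing the nn example above; RMSprop with lr=1e-3 is one common choice, not the only one:

```python
# Replace the manual parameter updates with an optimizer
optimizer = torch.optim.RMSprop(model.parameters(), lr=1e-3)

for t in range(2000):
    y_pred = model(xx)
    loss = loss_fn(y_pred, y)

    optimizer.zero_grad()  # reset the gradients accumulated by backward()
    loss.backward()        # compute fresh gradients
    optimizer.step()       # update the parameters
```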
Similar to the custom autograd functions discussed above, we can write custom nn Modules as well.
You can define your own Modules by subclassing nn.Module and defining a forward which receives input Tensors and produces output Tensors using other modules or other autograd operations on Tensors.
Let's look at it at a high level, using the same example as earlier.
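A sketch of a custom Module for the same polynomial fit; plain SGD with the lr=1e-6 from earlier is an arbitrary optimizer choice here:

```python
import torch
import math

class Polynomial3(torch.nn.Module):
    def __init__(self):
        super().__init__()
        # Four learnable scalar coefficients, registered as Parameters
        self.a = torch.nn.Parameter(torch.randn(()))
        self.b = torch.nn.Parameter(torch.randn(()))
        self.c = torch.nn.Parameter(torch.randn(()))
        self.d = torch.nn.Parameter(torch.randn(()))

    def forward(self, x):
        # The forward pass may use any tensor operations and other modules
        return self.a + self.b * x + self.c * x ** 2 + self.d * x ** 3

x = torch.linspace(-math.pi, math.pi, 2000)
y = torch.sin(x)

model = Polynomial3()
criterion = torch.nn.MSELoss(reduction='sum')
optimizer = torch.optim.SGD(model.parameters(), lr=1e-6)

for t in range(2000):
    y_pred = model(x)
    loss = criterion(y_pred, y)

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```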
We have arrived at the end of this tutorial on the basics of PyTorch. I wish to write a blog on running a model and improving it intuitively. Don't wait for that, though. Go ahead and explore on your own. Lots and lots of cool stuff out on the internet.