This project implements a PyTorch neural network to perform classification on the FashionMNIST dataset.

The first step is to download the training and test sets of the dataset using the datasets module in torchvision.
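A minimal sketch of this step might look like the following (the root folder name and the ToTensor transform are my own choices, not fixed by the text):

```python
from torchvision import datasets
from torchvision.transforms import ToTensor

# Training split: downloaded to the "data" folder and converted to tensors in [0, 1]
train_data = datasets.FashionMNIST(
    root="data",
    train=True,
    download=True,
    transform=ToTensor(),
)

# Test split
test_data = datasets.FashionMNIST(
    root="data",
    train=False,
    download=True,
    transform=ToTensor(),
)
```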

Also, to guard against unwanted overfitting, I have split the training set into training and validation subsets, as below:
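One way to do that split is torch.utils.data.random_split; the 80/20 ratio below is an assumption, since the exact proportion is not stated:

```python
from torch.utils.data import random_split

# Hold out part of the training data for validation (80/20 split assumed)
val_size = int(0.2 * len(train_data))
train_size = len(train_data) - val_size
train_data, val_data = random_split(train_data, [train_size, val_size])
```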

Alright, now I have three sets: training, validation, and test. During the training phase we will use the training and validation sets, and at the end we apply the resulting model to the test set to verify how well it can classify unseen data.

But before that, let's take a look at the type of data that we have. Using Python's iter function I make the training set iterable and select its first element with the next function.

Note that since the elements of the training set are tuples of images and class labels, I have unpacked the extracted element into two variables, sample and label. Using the shape attribute we can see the size of the sample. The first number indicates the number of channels of the image; since this is a grayscale image there is only one channel. The image itself is 28×28 (28 pixels per row and 28 per column).
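Something along these lines reproduces that check (the variable names sample and label follow the text):

```python
# Grab the first (image, label) tuple from the training set
sample, label = next(iter(train_data))

print(sample.shape)   # torch.Size([1, 28, 28]) -> 1 channel, 28x28 pixels
print(label)          # an integer class index between 0 and 9
```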

Using imshow from the pyplot module we can visualize the image. As you can see, it seems to be a kind of boot.
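A minimal visualization sketch, assuming matplotlib is available:

```python
import matplotlib.pyplot as plt

# squeeze() drops the channel dimension so imshow receives a 28x28 array
plt.imshow(sample.squeeze(), cmap="gray")
plt.title(f"Label: {label}")
plt.show()
```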

So to get an idea of the existing classes in the dataset, I have hard-coded them below.
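The hard-coded list would look roughly like this, with the names in the standard FashionMNIST label order:

```python
# Class names indexed by their integer label (0 through 9)
classes = [
    "T-shirt/top", "Trouser", "Pullover", "Dress", "Coat",
    "Sandal", "Shirt", "Sneaker", "Bag", "Ankle boot",
]
```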

In fact, the labels in the dataset are categorical values from 0 to 9, each one indicating one of the classes. Number 0 refers to "T-shirt/top", number 1 to "Trouser", and so on.

For instance, the label in the example above is 9, which refers to the last class, "Ankle boot".

To make the learning process faster, I am going to use batch learning. So I use the DataLoader class in torch.utils.data to divide the data into batches of size 32. It is possible to choose different numbers, but 32 seems reasonable in our case. Also, to improve how well the model learns, I set shuffle to True for my training data, which means that after each epoch of the training phase the batches are constructed differently from the training data.
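A sketch of the loaders under those settings (only the training loader is shuffled):

```python
from torch.utils.data import DataLoader

batch_size = 32

# Reshuffle the training data at every epoch; keep validation/test order fixed
train_loader = DataLoader(train_data, batch_size=batch_size, shuffle=True)
val_loader = DataLoader(val_data, batch_size=batch_size, shuffle=False)
test_loader = DataLoader(test_data, batch_size=batch_size, shuffle=False)
```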

Model Definition

Ok, now it is time to define the model. I am going to use a sequential neural network with three layers. Since the images are of size 28×28, the input size is 28×28 = 784. For the first hidden layer I am using 300 neurons, for the second one 100 neurons, and since we have 10 classes, the last layer consists of 10 neurons (one for each of classes 0 to 9). These parameters, as well as the learning rate, are given below.
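For concreteness, the sizes described above could be written as follows; the learning rate of 0.01 is only a placeholder assumption, since the text does not give the actual value:

```python
input_size = 28 * 28   # flattened image -> 784 features
hidden_size_1 = 300
hidden_size_2 = 100
output_size = 10       # one neuron per class
learning_rate = 0.01   # placeholder value, not stated in the text
```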

The following snippet illustrates the structure of the model that I construct. I used the ReLU activation function after each hidden layer of the network.
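A sketch of that structure with torch.nn.Sequential (a Flatten layer is added here to turn each 1×28×28 image into a 784-vector; the exact snippet in the original may differ):

```python
import torch.nn as nn

model = nn.Sequential(
    nn.Flatten(),                              # 1x28x28 image -> 784 features
    nn.Linear(input_size, hidden_size_1),      # first hidden layer: 300 neurons
    nn.ReLU(),
    nn.Linear(hidden_size_1, hidden_size_2),   # second hidden layer: 100 neurons
    nn.ReLU(),
    nn.Linear(hidden_size_2, output_size),     # output layer: 10 class logits
)
```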

By typing the name of the model, we can print a summary of its structure.

The other things that should be specified before starting the training phase are the loss function and the optimizer. Due to the nature of our problem, which is multiclass classification, I have chosen the cross-entropy loss function, available in torch.nn. To minimize the loss, I use the stochastic gradient descent algorithm as the optimizer.
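In code, these two choices amount to something like:

```python
import torch.nn as nn
import torch.optim as optim

loss_fn = nn.CrossEntropyLoss()                              # multiclass classification loss
optimizer = optim.SGD(model.parameters(), lr=learning_rate)  # plain stochastic gradient descent
```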

Training and Validation

Now we have all the tools to start training. I will train for 30 epochs; the number of steps per epoch equals the number of training batches, and the total number of validation steps equals the number of validation batches. I also use a dictionary to record metrics such as loss and accuracy during the training phase. These metrics can later be used to visualize the performance of the algorithm.
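A condensed sketch of such a loop is given below; the dictionary keys and variable names are my own, and the bookkeeping in the original may differ:

```python
import torch

n_epochs = 30
history = {"train_loss": [], "train_acc": [], "val_loss": [], "val_acc": []}

for epoch in range(n_epochs):
    # ---- training: one step per training batch ----
    model.train()
    running_loss, correct = 0.0, 0
    for images, labels in train_loader:
        optimizer.zero_grad()
        outputs = model(images)
        loss = loss_fn(outputs, labels)
        loss.backward()
        optimizer.step()
        running_loss += loss.item() * images.size(0)
        correct += (outputs.argmax(dim=1) == labels).sum().item()
    history["train_loss"].append(running_loss / len(train_loader.dataset))
    history["train_acc"].append(correct / len(train_loader.dataset))

    # ---- validation: one step per validation batch, no gradients ----
    model.eval()
    running_loss, correct = 0.0, 0
    with torch.no_grad():
        for images, labels in val_loader:
            outputs = model(images)
            loss = loss_fn(outputs, labels)
            running_loss += loss.item() * images.size(0)
            correct += (outputs.argmax(dim=1) == labels).sum().item()
    history["val_loss"].append(running_loss / len(val_loader.dataset))
    history["val_acc"].append(correct / len(val_loader.dataset))
```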

Using the information recorded in the dictionary, I can now inspect the performance of the model via the loss value and accuracy for both the training set and the validation set.
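For example, the loss curves could be plotted like this (the same pattern applies to accuracy):

```python
import matplotlib.pyplot as plt

epochs = range(1, n_epochs + 1)
plt.plot(epochs, history["train_loss"], label="training loss")
plt.plot(epochs, history["val_loss"], label="validation loss")
plt.xlabel("epoch")
plt.ylabel("cross-entropy loss")
plt.legend()
plt.show()
```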

Applying the model to the test set