akshitha singareddy

Image Classification

Updated: Oct 21, 2022

The aim of the assignment is to build an image classifier to classify three objects. They are:


1. airplane
2. motorbike
3. schooner


This assignment was completed using Python as the programming language.

Libraries used are:

1. TensorFlow - a deep learning module which helps in building deep learning models
2. Matplotlib - a Python module used for visualizations
3. NumPy - a Python module used for array transformations while building models

To accomplish this assignment, we have used convolutional neural networks, otherwise referred to as CNNs. A CNN is a deep learning algorithm mainly used for image inputs. Convolutional layers are the primary building blocks of a CNN. A convolution layer typically takes an image input (a matrix), performs convolution operations with a kernel, and forwards the output matrix to the next layer. Here, the kernel is the set of learnable parameters which are learned over the training time. At each epoch, as the training images are fed, these kernels are updated during backpropagation based on the loss.

Typically, a convolutional neural network has three different types of layers:

1. Convolutional layer - given an input, it performs convolution operations on the input followed by an activation function and passes the result as an output. Typically the activation function is ReLU, which stands for rectified linear unit.
2. Pooling layer - pooling layers are used in the network to decrease the dimensionality of the feature map. Broadly, the different types are MaxPooling, AveragePooling and GlobalAveragePooling.
3. Fully connected layer - after a series of convolution and max pooling layers, the input is flattened out and sent to fully connected layers. In this type of layer, unlike a convolutional layer, every node is connected to every node in the next layer.

For building the model, we are using the TensorFlow module. It is an open-source module developed by Google which eases the process of preparing data, building models and evaluating models. The classifier was programmed in Colab using the GPU provided by Colab.

Below are the steps involved in building the image classifier:

1. Splitting data
2. Building data pipelines
3. Data visualization
4. Create model
5. Train model
6. Evaluate model
7. Save results

Four experiments have been performed in the process of tuning the model.

Splitting data

Stratified splitting has been done on the whole dataset into train, test and validation splits in the ratio of 70, 20 and 10 percent respectively, using a package called split-folders, which is purely written in Python.

Building data pipelines

Keras, which is a part of TensorFlow, provides an interface to load image directory datasets into tensors. The code below demonstrates the process of loading the dataset using Keras data pipelines.
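The original loading code appeared as a screenshot; below is a minimal sketch of the splitting and loading steps described above. The folder names ("dataset", "dataset_split") and the 160x160 image size are illustrative assumptions; the post only states that the split is 70/20/10 and that images are resized.

```python
import splitfolders
import tensorflow as tf

# Stratified split into train (70%), val (10%) and test (20%) sub-folders.
# split-folders expects the ratio in the order (train, val, test).
splitfolders.ratio("dataset", output="dataset_split",
                   seed=42, ratio=(0.7, 0.1, 0.2))

IMG_SIZE = (160, 160)   # resized to reduce the load on the GPU (size assumed)
BATCH_SIZE = 32         # 32 images per weight update

# Labels are one-hot encoded ("categorical") to match categorical cross-entropy.
train_ds = tf.keras.utils.image_dataset_from_directory(
    "dataset_split/train", label_mode="categorical",
    image_size=IMG_SIZE, batch_size=BATCH_SIZE, shuffle=True, seed=42)

val_ds = tf.keras.utils.image_dataset_from_directory(
    "dataset_split/val", label_mode="categorical",
    image_size=IMG_SIZE, batch_size=BATCH_SIZE, shuffle=False)

# The post notes that train and test are shuffled to reduce bias.
test_ds = tf.keras.utils.image_dataset_from_directory(
    "dataset_split/test", label_mode="categorical",
    image_size=IMG_SIZE, batch_size=BATCH_SIZE, shuffle=True, seed=42)
```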


Batch size is set to 32, which indicates that 32 images are fed into the network at once before the weights are updated during backpropagation. Images are resized to decrease the load on the GPU. Shuffling has been done on the train and test sets to decrease the bias.

Data Visualization


Once the data pipelines are built, 10 images were drawn at random and visualized using the Matplotlib library.
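The visualization code itself was shown as an image; a sketch of how 10 random images can be drawn from the training pipeline with Matplotlib is shown below (the grid layout and figure size are my own choices).

```python
import matplotlib.pyplot as plt

class_names = train_ds.class_names  # ["airplane", "motorbike", "schooner"]

plt.figure(figsize=(12, 5))
for images, labels in train_ds.take(1):   # one shuffled batch of 32 images
    for i in range(10):                   # display the first 10 of them
        plt.subplot(2, 5, i + 1)
        plt.imshow(images[i].numpy().astype("uint8"))
        plt.title(class_names[labels[i].numpy().argmax()])  # one-hot label -> class name
        plt.axis("off")
plt.show()
```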

Model Creation

This is the phase where experimentation/tuning of hyperparameters has to be done in order to get the best model. To start with, a basic model has been built with two series of convolution followed by max pooling layers and two dense layers with a sigmoid activation. Four different trials are performed to compare.

Trial I:
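The Trial I code appeared as a screenshot; here is a sketch of an architecture matching the description above (two Conv2D + MaxPooling2D blocks and two Dense layers ending in a sigmoid). The filter counts, kernel sizes, dense-layer width and the Rescaling layer are illustrative assumptions; only the overall structure and the sigmoid output come from the post.

```python
import tensorflow as tf
from tensorflow.keras import layers

NUM_CLASSES = 3  # airplane, motorbike, schooner

model = tf.keras.Sequential([
    layers.Rescaling(1.0 / 255, input_shape=(160, 160, 3)),  # pixel scaling (my addition)
    layers.Conv2D(32, 3, activation="relu"),
    layers.MaxPooling2D(),
    layers.Conv2D(64, 3, activation="relu"),
    layers.MaxPooling2D(),
    layers.Flatten(),
    layers.Dense(128, activation="relu"),
    layers.Dense(NUM_CLASSES, activation="sigmoid"),  # sigmoid output, as described in the post
])

model.summary()  # prints the shapes and parameter counts referred to below
```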


Below is the model summary of shapes and parameters.


Keras and TensorFlow provide an interface to define callbacks. Using callbacks, we can define what happens at the end of each epoch. I have added two callbacks: one to stop early and the other to save the best model at each epoch.
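A sketch of the two callbacks described above; monitoring validation loss, the patience value and the checkpoint path are assumptions, since the original callback code was an image.

```python
from tensorflow.keras.callbacks import EarlyStopping, ModelCheckpoint

callbacks = [
    # Stop training when the validation loss stops improving.
    EarlyStopping(monitor="val_loss", patience=3, restore_best_weights=True),
    # Save the best model seen so far at the end of each epoch.
    ModelCheckpoint("best_model.h5", monitor="val_loss", save_best_only=True),
]
```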

Once the callbacks are defined, the model is compiled using the Adam optimizer, and the loss used is categorical cross-entropy. TensorFlow provides implementations of the optimizer and the loss. Now training is done.
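A sketch of the compile/fit step, reusing `model`, `train_ds`, `val_ds` and `callbacks` from the earlier sketches. The 15 epochs match the plots described below; the learning rate is left at the Adam default, which is an assumption.

```python
import matplotlib.pyplot as plt

model.compile(
    optimizer="adam",
    loss="categorical_crossentropy",
    metrics=["accuracy"],
)

history = model.fit(
    train_ds,
    validation_data=val_ds,
    epochs=15,
    callbacks=callbacks,
)

# Training accuracy / loss curves over the epochs, as shown in the plots below.
plt.plot(history.history["accuracy"], label="training accuracy")
plt.plot(history.history["loss"], label="training loss")
plt.xlabel("epoch")
plt.legend()
plt.show()
```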





Above are the training accuracy and training loss plots over 15 epochs.

Evaluate Model


Once training is completed, the model is evaluated using the test dataset created in the beginning. Below are the metrics for the test dataset.

Test Loss: 0.07167135179042816
Test Accuracy: 0.9940476417541504

Confusion Matrix:
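The confusion matrix was shown as an image; below is a sketch of how the test metrics and the confusion matrix can be computed with TensorFlow and NumPy (the exact evaluation code is not in the post).

```python
import numpy as np
import tensorflow as tf

# Overall test metrics.
test_loss, test_acc = model.evaluate(test_ds)
print("Test Loss:", test_loss, "Test Accuracy:", test_acc)

# Collect true and predicted class indices batch by batch.
y_true, y_pred = [], []
for images, labels in test_ds:
    probs = model.predict(images, verbose=0)
    y_true.extend(np.argmax(labels.numpy(), axis=1))
    y_pred.extend(np.argmax(probs, axis=1))

print(tf.math.confusion_matrix(y_true, y_pred).numpy())
```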



Now, as Trial 2, another convolutional layer followed by a max pooling layer is added. Here is the summary of the model in Trial 2:

Metrics on the test dataset for Trial 2:

Test Loss: 0.09147582948207855
Test Accuracy: 0.9940476417541504

For Trial 3, another convolution layer followed by a max pooling layer is added to the existing model. Below is the model summary.



Test metrics for Trial 3:

Test Loss: 0.10367380827665329
Test Accuracy: 0.9940476417541504

For Trial 4, data augmentation is done using the Keras data generator.
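The generator code appeared as a screenshot; below is a sketch using Keras' ImageDataGenerator. The specific augmentation parameters (rotation, shifts, flips) are illustrative assumptions; the post only states that the Keras data generator was used.

```python
from tensorflow.keras.preprocessing.image import ImageDataGenerator

train_datagen = ImageDataGenerator(
    rescale=1.0 / 255,        # drop this if the model already rescales its input
    rotation_range=20,
    width_shift_range=0.1,
    height_shift_range=0.1,
    horizontal_flip=True,
)

train_gen = train_datagen.flow_from_directory(
    "dataset_split/train",
    target_size=(160, 160),
    batch_size=32,
    class_mode="categorical",
)

# The model is then trained on the augmented stream, e.g. model.fit(train_gen, ...).
```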



Test metrics for Trial 4:

Test Loss: 0.1958087533712387
Test Accuracy: 0.9583333134651184

Training accuracy was 99% in all of the trials. Bar charts of the test accuracies, test losses and training times are plotted for the four trials, as shown below.
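A sketch of how the comparison bar charts can be plotted with Matplotlib, using the test losses and accuracies reported above. The training-time chart would need the times recorded during the runs, which are not listed in the text, so it is omitted here.

```python
import matplotlib.pyplot as plt

trials = ["Trial 1", "Trial 2", "Trial 3", "Trial 4"]
test_acc = [0.9940, 0.9940, 0.9940, 0.9583]   # reported test accuracies
test_loss = [0.0717, 0.0915, 0.1037, 0.1958]  # reported test losses

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))
ax1.bar(trials, test_acc)
ax1.set_title("Test accuracy")
ax2.bar(trials, test_loss)
ax2.set_title("Test loss")
plt.tight_layout()
plt.show()
```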








Contribution:

I have resized the images, which decreased the load on the GPU.

I have decreased the bias by shuffling the train and test sets.

Then I have done hyperparameter tuning to make the model better and experimented with 4 different trials (explained above).

I have also added 2 callbacks: one to save the best model at each epoch and one to stop early. I then compiled the model with the Adam optimizer. After the 4 trials, I conclude that data augmentation did not help much in improving accuracy on the test dataset and required significantly more training time. Almost all trials have given decent accuracies on the test dataset.



