Beginners: Keras Mnist Data Classification using CNN (Part 2)


This is the 2nd part of this tutorial series (Part 1). If you have not gone through the first part yet, please check it out first, because this post continues directly from it.

In the previous part, we covered the data loading and preprocessing steps; in this part we are going to build and train our model. So let’s get started:

4- Declaring model layers

First we need to declare the Keras sequential model layers to build the model graph. We create a Sequential object named model, then add our first convolutional layer by calling model.add with a Convolution2D layer. Its first parameter is the number of filters we are going to apply to the input image, and the second is the filter size; we use a 3x3 filter here, which works well for many problems. The ReLU activation function transforms the output of the convolution operation so that all negative values become zero while positive values remain the same.
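To see concretely what ReLU does, here is a minimal sketch in plain NumPy (for illustration only; inside the model, Keras applies this element-wise to the convolution output):

#ReLU illustration: negatives become zero, positives pass through unchanged
import numpy as np
values = np.array([-2.0, -0.5, 0.0, 1.3, 4.0])
print(np.maximum(0, values))   #[0.  0.  0.  1.3 4. ]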

Then after the first convolutional layer we add a max pooling layer, which reduces the spatial size of the feature maps. We then flatten the output into a 1-D vector and append a dense layer containing 32 neurons. We also add 50% dropout, because the model sometimes tends to overfit; applying dropout helps avoid that. Finally we add another dense layer of size 10, which is our number of classes. The model will therefore predict 10 small numbers, probabilities that all sum up to 1, because of the softmax activation function in the last layer.

#CNN model
import keras
from keras.models import Sequential
from keras.layers import Convolution2D, MaxPooling2D, Flatten, Dense, Dropout, Activation

model = Sequential()
model.add(Convolution2D(32,3,data_format='channels_last',activation='relu',input_shape=(28,28,1)))
model.add(MaxPooling2D(pool_size=(2,2)))
model.add(Flatten())
model.add(Dense(32))
model.add(Dropout(0.5))
model.add(Dense(10))
model.add(Activation('softmax'))
model.compile(loss='categorical_crossentropy', optimizer = 'adam', metrics = ['accuracy'])

After building the graph with sequential layers, we compile the model with the model.compile function, passing the loss function, which in our case is categorical crossentropy, the Adam optimizer, and accuracy as the metric.
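If you want to verify how each layer transforms the data, a quick way (one extra call, not part of the original snippet) is to print a model summary; the shapes in the comments below follow from the default 'valid' padding of the convolutional layer:

#inspect layer output shapes
model.summary()
#Convolution2D, 32 filters of 3x3: (28,28,1)  -> (26,26,32)
#MaxPooling2D, 2x2:                (26,26,32) -> (13,13,32)
#Flatten:                          13*13*32 = 5408 values
#Dense(32) -> 32 neurons, Dense(10) -> 10 class scores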

We use categorical cross-entropy because our problem is multi-class classification, so we need a loss function that penalizes wrong class probabilities and can be optimized easily. You can read more about this loss function here.
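For intuition, here is a rough sketch of the computation in plain NumPy (illustration only; Keras computes this internally). With a one-hot label, categorical cross-entropy reduces to the negative log of the probability the model assigns to the true class:

#categorical cross-entropy by hand (illustration only)
import numpy as np
y_true = np.zeros(10); y_true[7] = 1.0          #one-hot label for digit 7
y_pred = np.full(10, 0.01); y_pred[7] = 0.91    #softmax probabilities, sum to 1
loss = -np.sum(y_true * np.log(y_pred))         #-log(0.91) ~ 0.094
print(loss)

The more confidently the model predicts the correct class, the closer the loss gets to zero.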

5- Training the model

Alright! We have done all the basic steps; now it is time to train the model. We call the model.fit function, which takes our training data x_train and y_train. The validation_split parameter indicates what fraction of the training set the model will hold out for validation during training. The batch_size parameter tells the model how many images to process at a time in a single batch, and epochs is the number of times the model will iterate through the whole dataset.

#training without callbacks
history = model.fit(x_train, y_train, validation_split=0.2, batch_size=256, epochs=100)

Training will take some time: on a Google Colab GPU it will take 1 to 2 minutes, and on a CPU it will take 5 to 8 minutes.

After training the model, we get around 99% accuracy on the training data and around 98% on the validation data. To see the history of training and validation loss and accuracy, we will plot them with matplotlib. Now let’s visualize the loss and accuracy curves for both the training and validation sets.

#this function will plot accuracy and loss curve
import matplotlib.pyplot as plt

def plot_curve(train,val,string1,location):
  plt.plot(train,'b-')
  plt.plot(val,'g-')
  plt.title('model '+string1)
  plt.ylabel(string1)
  plt.xlabel('epoch')
  plt.legend(['train', 'val'], loc=location)
  plt.show()

plot_curve(history.history['accuracy'],history.history['val_accuracy'],'Accuracy','lower right')
plot_curve(history.history['loss'],history.history['val_loss'],'Loss','upper right')

Model Accuracy

Model Loss

If we look at the loss curves, the validation loss curve (green) starts to climb upwards after a certain number of iterations, which means the model starts overfitting, because the training curve keeps going downwards. That means we need to stop training when the model reaches its minimum validation loss; otherwise we will end up with an overfitted model. We also need to save the best model weights at the point where validation loss is lowest. To do this we need to add callbacks.

Callbacks are very handy in the Keras API when it comes to saving the optimal model based on some metric, and also stopping the training process when the model starts overfitting and validation loss does not improve for a number of epochs. So we add the ModelCheckpoint and EarlyStopping callbacks so that the model stops training when validation loss is no longer decreasing. Let’s add the callbacks and train the model again.

#Training with callbacks
my_callbacks = [
    keras.callbacks.EarlyStopping(patience=7, monitor='val_loss'),                    #stop if val_loss has not improved for 7 epochs
    keras.callbacks.ModelCheckpoint(filepath='best_model.h5', save_best_only=True),   #keep only the weights with the best val_loss
]

#training
history = model.fit(x_train, y_train, validation_split=0.2, batch_size=256, epochs=100, callbacks=my_callbacks)

After adding the callbacks and training the model, let’s visualize the history object again and see how it works.

Model Accuracy

Model Loss

We can see that training stops after a bit over 30 epochs, because the validation loss starts increasing again; we can observe this phenomenon by looking at the curves. Now we have trained the model and saved the best weights as well. It is time to load the best model and use it to evaluate on the testing dataset.

#load the best saved weights and evaluate on the test set
model.load_weights('best_model.h5')
loss, acc = model.evaluate(x_test, y_test)
print("Testing Accuracy: ", round(acc*100, 2))
print("Testing Loss: ", round(loss, 2))

We achieved around 98% accuracy on the testing data, which is quite high, but well-tuned models reach over 99% on this dataset, which can be done by improving the layers further. We will improve the model in the next part of this tutorial series.

Code Link: https://github.com/usmanali414/Keras-Code/blob/master/Beginners_Keras_Mnist_Classification_using_Convolutional_neural_network(CNN)_Part_1.ipynb
