Handwritten Digit Recognition – Building a Neural Network in Python

Welcome back, tech enthusiasts! In today’s article, we will explore the exciting world of neural networks by diving into a Python project that focuses on recognizing handwritten digits.

If you’re interested in machine learning and eager to get started with neural networks, this project is perfect for beginners. We will use the “mnist” dataset, which provides us with a collection of handwritten digits in a 28 x 28 pixel format, along with their corresponding labels. This dataset will serve as our training data for the neural network. After training the network, we will test its accuracy and even provide our own handwritten images for prediction.

Getting Started

To start, we need to install a few libraries: numpy, opencv-python, matplotlib, and tensorflow. These libraries will provide the necessary tools for image processing, data visualization, and machine learning. Once installed, we can import the required modules and begin working on our neural network.

import numpy as np
import cv2
import matplotlib.pyplot as plt
import tensorflow as tf
import os

Loading and Preprocessing the Data

The first step is to load the “mnist” dataset, which can be easily accessed through the tf.keras.datasets module. This dataset is already split into training and testing data, making our job easier. We extract the pixel data (x) and the corresponding labels (y) for both the training and testing sets.

(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()

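Before training the neural network, we normalize the pixel values, which start out as integers from 0 to 255. Note that tf.keras.utils.normalize does not simply divide by 255: it applies L2 normalization along the given axis (here axis=1 of the samples x 28 x 28 array), so every 28-value slice along that axis is rescaled to unit length. Because the raw values are non-negative, the results still fall between 0 and 1, and the smaller, more uniform inputs make the network easier to train.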

x_train = tf.keras.utils.normalize(x_train, axis=1)
x_test = tf.keras.utils.normalize(x_test, axis=1)
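
To make the normalization step more concrete, here is a minimal NumPy sketch of L2 normalization along an axis, which is our understanding of what tf.keras.utils.normalize computes. The helper name l2_normalize and the tiny 2 x 2 "image" are made up for illustration. Dividing by 255 is shown alongside as a common alternative, not something the code above does.

```python
import numpy as np

def l2_normalize(x, axis=1):
    """Scale every slice along `axis` to unit Euclidean length (hypothetical helper)."""
    norms = np.linalg.norm(x, axis=axis, keepdims=True)
    norms[norms == 0] = 1  # leave all-zero slices as zeros instead of dividing by zero
    return x / norms

# One fake 2 x 2 "image" with pixel values in the 0-255 range
batch = np.array([[[0, 255], [0, 255]]], dtype=np.uint8)

normalized = l2_normalize(batch, axis=1)  # every value ends up between 0 and 1
scaled = batch / 255.0                    # alternative: divide by the maximum pixel value

print(normalized[0])
print(scaled[0])
```

Both approaches keep the values between 0 and 1. What matters most is that the same preprocessing is applied to the training data, the test data, and any image we later pass to the model.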

Building the Neural Network

Now, it’s time to create our neural network model. We will use a simple architecture called a sequential model, which allows us to add layers one after another.

The first layer we add is a Flatten layer. It converts each 28 x 28 pixel input into a one-dimensional vector of 784 values, which is the shape the dense layers expect.

Next, we add two dense layers, which are the basic building blocks of neural networks. The first dense layer has 128 units and uses the ReLU activation function. The second dense layer has the same configuration.

Finally, we add an output layer with 10 units (representing the digits 0-9) and use the softmax activation function. Softmax ensures that the outputs of the 10 neurons sum up to 1, allowing us to interpret the results as probabilities.

model = tf.keras.models.Sequential()
model.add(tf.keras.layers.Flatten(input_shape=(28, 28)))
model.add(tf.keras.layers.Dense(128, activation='relu'))
model.add(tf.keras.layers.Dense(128, activation='relu'))
model.add(tf.keras.layers.Dense(10, activation='softmax'))
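
To see what these layers compute, the following NumPy sketch pushes one random 28 x 28 input through the same sequence of operations: flatten, two dense ReLU layers, and a softmax output. The weights are random placeholders (a trained Keras model learns its own), and the helper names relu, softmax, and dense are ours, so treat this as an illustration of the arithmetic rather than of how Keras is implemented.

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(z):
    return np.maximum(z, 0)

def softmax(z):
    exp = np.exp(z - z.max())  # subtract the max for numerical stability
    return exp / exp.sum()

def dense(x, weights, bias):
    return x @ weights + bias

# Random placeholder parameters with the same shapes as the model above
layer_shapes = [(784, 128), (128, 128), (128, 10)]
params = [(rng.normal(0, 0.05, shape), np.zeros(shape[1])) for shape in layer_shapes]

image = rng.random((28, 28))                   # stand-in for one normalized digit
x = image.reshape(-1)                          # Flatten: 28 x 28 -> 784 values
x = relu(dense(x, *params[0]))                 # first hidden layer, 128 units
x = relu(dense(x, *params[1]))                 # second hidden layer, 128 units
probabilities = softmax(dense(x, *params[2]))  # 10 class probabilities

# Weights plus biases, summed over the three dense layers
parameter_count = sum(w.size + b.size for w, b in params)
print(probabilities.sum(), parameter_count)
```

The parameter count works out to 784 x 128 + 128, plus 128 x 128 + 128, plus 128 x 10 + 10, which is 118,282 in total. Calling model.summary() on the Keras model should report the same figure.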

Compiling and Training the Model

Before we can train the model, we need to compile it. We specify the optimizer, loss function, and metrics to be used during training. For this project, we will use the Adam optimizer, sparse categorical cross-entropy loss, and accuracy as the metric.

model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
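
As a rough illustration of this loss, sparse categorical cross-entropy takes integer labels (not one-hot vectors) and averages the negative log of the probability the model assigned to each correct class. The NumPy function below is our own sketch with made-up probabilities, not the Keras implementation.

```python
import numpy as np

def sparse_categorical_crossentropy(labels, probabilities, eps=1e-7):
    """Mean negative log-probability of the true class (illustrative sketch)."""
    probabilities = np.clip(probabilities, eps, 1.0)  # avoid log(0)
    picked = probabilities[np.arange(len(labels)), labels]
    return float(-np.log(picked).mean())

# Two made-up predictions over three classes
probabilities = np.array([[0.7, 0.2, 0.1],
                          [0.1, 0.1, 0.8]])
labels = np.array([0, 2])  # integer class labels, in the same form as y_train

loss = sparse_categorical_crossentropy(labels, probabilities)
print(round(loss, 4))  # 0.2899
```

The more probability the model puts on the correct digit, the closer the loss gets to 0. Confident wrong answers are penalized heavily.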

Now, it’s time to train the model using our training data. We fit the model to the training set and specify the number of epochs, where one epoch is a full pass over the entire training set. In this case, we use 3 epochs.

model.fit(x_train, y_train, epochs=3)
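
For a sense of scale, fit splits the data into mini-batches and updates the weights once per batch. Assuming the Keras default batch size of 32 (the call above does not set batch_size) and the 60,000 images in the MNIST training split, the arithmetic looks like this:

```python
import math

train_samples = 60_000  # size of the MNIST training split
batch_size = 32         # Keras default when batch_size is not passed to fit()
epochs = 3

steps_per_epoch = math.ceil(train_samples / batch_size)
total_updates = steps_per_epoch * epochs

print(steps_per_epoch, total_updates)  # 1875 5625
```

The progress bar that Keras prints during training counts these steps, so under these assumptions we would expect it to run to 1875 in each epoch.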

Once the model is trained, we can evaluate its performance on the testing data. This will give us insights into how well the model generalizes to unseen data. We print the loss and accuracy metrics to assess the model’s effectiveness.

loss, accuracy = model.evaluate(x_test, y_test)
print(f"Loss: {loss}, Accuracy: {accuracy}")
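
The accuracy reported by evaluate is the fraction of test images whose most probable class matches the label. Here is a small NumPy sketch of that calculation, using made-up probabilities rather than real model output:

```python
import numpy as np

# Made-up softmax outputs for four samples over three classes
probabilities = np.array([[0.8, 0.1, 0.1],
                          [0.2, 0.5, 0.3],
                          [0.3, 0.3, 0.4],
                          [0.6, 0.3, 0.1]])
labels = np.array([0, 1, 2, 1])

predicted = np.argmax(probabilities, axis=1)    # most probable class per row
accuracy = float((predicted == labels).mean())  # share of correct predictions

print(predicted, accuracy)  # [0 1 2 0] 0.75
```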

Predicting Handwritten Digits

Now comes the fun part! We can test the model’s ability to recognize handwritten digits by providing our own images. We will use the OpenCV library to read and process these images.

To do this, we loop over numbered files in a digits directory (digit1.png, digit2.png, and so on) until a file is missing. Each image should be 28 x 28 pixels with a dark digit on a light background. We read it in grayscale, invert the colors so that it matches the light-on-dark MNIST digits, apply the same normalization used for the training data, and feed the result to the model for prediction.

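image_number = 1
while os.path.isfile(f"digits/digit{image_number}.png"):
    try:
        # Read as grayscale; the image must already be 28 x 28 pixels
        image = cv2.imread(f"digits/digit{image_number}.png", cv2.IMREAD_GRAYSCALE)
        if image is None or image.shape != (28, 28):
            raise ValueError("expected a readable 28 x 28 image")
        # MNIST digits are light on dark, so invert dark-on-light drawings
        image = np.invert(image)
        # Apply the same normalization that was used on the training data
        batch = tf.keras.utils.normalize(np.array([image]), axis=1)
        prediction = model.predict(batch)
        digit = np.argmax(prediction)
        print(f"This digit is probably a {digit}")
        plt.imshow(image, cmap=plt.cm.binary)
        plt.show()
    except Exception as error:
        print(f"Error occurred while predicting the digit: {error}")
    image_number += 1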
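
Before prediction, it is worth checking that a custom image has the shape and value range the model was trained on. Below is a hedged sketch of such a check using only NumPy. prepare_digit is a hypothetical helper, and it assumes a 28 x 28 grayscale array with a dark digit on a light background. The fake canvas stands in for a file that would normally be read with OpenCV.

```python
import numpy as np

def prepare_digit(image):
    """Turn a 28 x 28 uint8 grayscale array into a (1, 28, 28) float batch."""
    if image.shape != (28, 28):
        raise ValueError(f"expected shape (28, 28), got {image.shape}")
    inverted = 255 - image.astype(np.float64)   # dark-on-light -> light-on-dark
    # axis=0 of a single image corresponds to axis=1 of a (samples, 28, 28) batch
    norms = np.linalg.norm(inverted, axis=0, keepdims=True)
    norms[norms == 0] = 1                       # avoid dividing by zero
    return (inverted / norms)[np.newaxis, ...]  # add the batch dimension

# A fake white canvas with a dark vertical stroke, standing in for a scanned digit
canvas = np.full((28, 28), 255, dtype=np.uint8)
canvas[4:24, 14] = 0

batch = prepare_digit(canvas)
print(batch.shape, batch.min(), batch.max())
```

An array prepared this way has the (1, 28, 28) shape that model.predict expects for a single image. Images of any other size raise a clear error instead of failing inside the model.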

And there you have it! You can now experiment with your own handwritten digits and see how accurately the neural network classifies them.

Conclusion

In this article, we explored the fascinating field of neural networks by building a Python project for handwritten digit recognition. We trained a neural network using the “mnist” dataset, evaluated its performance, and predicted handwritten digits of our own. By understanding the basics of neural networks, you now have a solid foundation to explore more complex applications and develop your machine learning skills.

Remember, practice makes perfect. Feel free to experiment with different hyperparameters, add more layers, or try alternative datasets to further enhance your neural network. And as always, stay curious and keep learning!

Check out the “Techal” brand for more insightful articles and tech-related content.
