Building Neural Networks Using NumPy

Hwei-Shin Harriman

Originally created for Olin College of Engineering's Quantitative Engineering Analysis Final Project, Fall 2018

About this Project

This project is a deep exploration of the math behind neural networks, specifically convolutional neural networks. It deliberately avoids high-level packages such as Keras or TensorFlow, since the goal was to understand the math that drives these powerful tools. Part of the original assignment was to create a homework assignment that other students in the class could complete to gain an understanding of some aspect of the technical material necessary for the project. The PDF of this homework assignment can be found here, and the source code that accompanies the assignment can be found here.

Additionally, the final deliverable for this project was a technical write-up detailing the necessary concepts, along with a breakdown of the process. The write-up can be found here.

Running the Networks

  • Before running this program, make sure that you have Python 3 installed, along with NumPy and Matplotlib (pickle ships with the Python standard library)
  • Also follow this link to download the MNIST data set of handwritten digits in .csv format (see the loading sketch after this list)
  • Clone this repository to your computer
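
Each row of the commonly distributed MNIST .csv files holds a digit label followed by 784 pixel values. As a point of reference, a minimal loading sketch with NumPy might look like the following; the file name and the exact row layout are assumptions based on that common .csv distribution, so check the data-loading code in the repository:

```python
import numpy as np

# Minimal loading sketch; "mnist_train.csv" and the label-first row layout are
# assumptions based on the common .csv distribution of MNIST.
data = np.loadtxt("mnist_train.csv", delimiter=",")
labels = data[:, 0].astype(int)  # first column: digit label (0-9)
images = data[:, 1:] / 255.0     # remaining 784 columns: pixel values scaled to [0, 1]
```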

Running the Feedforward Network

  1. Go to the bottom of ff2.py and uncomment stoch.run() (see the sketch after this list)
  2. From your terminal: $ python ff2.py
  3. (Optional) To continue training from the most recently saved weights, set restore=True at the bottom of the file, then re-run the program
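
For reference, the bottom of ff2.py looks roughly like the sketch below; stoch.run() and the restore flag come from the steps above, while the constructor name and its arguments are assumptions for illustration:

```python
# Hypothetical sketch of the bottom of ff2.py; only stoch.run() and the
# restore flag come from the steps above, the rest is illustrative.
stoch = FeedforwardNetwork(restore=False)  # set restore=True to resume from saved weights
stoch.run()  # step 1: uncomment this call before running `python ff2.py`
```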

Guidelines for Changing Hyperparameters

  • At the bottom of the file, you can change the number of epochs, the batch size, and the learning rate (a sketch follows this list):
    • epochs: more epochs mean a longer training time but generally better network results
    • learning rate:
      • too small: each gradient step is tiny, so training is slow and likely to get stuck at a local minimum
      • too large: each gradient step is scaled up too much, so training is likely to diverge
    • batch size:
      • too small: the gradient is updated very often from noisy estimates, so training is likely to converge too early
      • too large: the gradient is updated too infrequently, so the network can't learn the finer details of the data
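
As a concrete illustration of these trade-offs, the hyperparameter block at the bottom of the file might look something like this sketch; the variable names and values are assumptions, not the repository's actual code:

```python
# Illustrative hyperparameter settings; names and values are assumptions and
# may differ from the actual code in ff2.py.
epochs = 30          # more epochs: longer training, generally better results
batch_size = 10      # smaller: frequent but noisy updates; larger: infrequent updates
learning_rate = 0.1  # smaller: slow, may stall at a local minimum; larger: may diverge
```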

Results

Feedforward Network Results

Running the feedforward network produces the training and test results shown below:

(figure: training and test results for the feedforward network)

Convolutional Network Results

Running the convolutional network in its current state on the full MNIST set causes inconsistent behavior: the network is quite sensitive, potentially due to the way the weights are initialized, and is prone to diverging. To illustrate this, the first figure below shows the results of the network learning to recognize 2 MNIST source images; the second figure shows the results of the network learning to recognize 5 MNIST source images.

(figures: convolutional network results on 2 and 5 MNIST source images)

Next Steps

I would like to refactor the convolutional neural network into classes and further examine which factors make the network prone to diverging. I would also like to experiment with different types of convolutions to see if I can make the network more robust.
