An algorithm that facilitates communication between a speech-impaired person and someone who doesn't understand sign language.
Training set: 1080 pictures (64 by 64 pixels) of signs representing numbers from 0 to 5 (180 pictures per number).
Test set: 120 pictures (64 by 64 pixels) of signs representing numbers from 0 to 5 (20 pictures per number).
Here are examples for each number, and corresponding labels converted to one-hot.
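The one-hot conversion mentioned above can be sketched with NumPy as follows. This is a minimal illustration, not the project's own code; the helper name `one_hot` and the default of six classes (digits 0 to 5) are assumptions made for this example.

```python
import numpy as np

def one_hot(labels, num_classes=6):
    """Convert integer labels of shape (m,) to a one-hot matrix of shape (m, num_classes)."""
    labels = np.asarray(labels, dtype=int).reshape(-1)
    # Row i of the identity matrix is the one-hot vector for class i.
    return np.eye(num_classes, dtype=np.float32)[labels]

# Hypothetical labels for three images showing the signs 0, 3 and 5.
example = one_hot([0, 3, 5])
```

Each row of `example` has a single 1 in the column matching the digit, which is the form the cross-entropy cost expects.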
Architecture:
- Input is an image of size 64x64x3 (RGB), which is flattened to a vector of length 12288 and normalized by dividing by 255
- Layer sizes: 12288 -> 25 -> 12 -> 6 (input, two hidden layers, output)
- The output layer, passed through softmax, gives the probability of the image belonging to each of the six classes
- RELU activation function. Cross entropy cost. Adam optimizer
- Mini-batch gradient descent with minibatch_size of 32
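The preprocessing and mini-batch steps listed above could look roughly like this in NumPy. It is a sketch under assumptions: the function names `preprocess` and `random_mini_batches`, the examples-in-rows layout, and the random stand-in data are all hypothetical and only mirror the shapes given in this README.

```python
import numpy as np

def preprocess(images):
    """Flatten (m, 64, 64, 3) uint8 images to (m, 12288) floats in [0, 1]."""
    images = np.asarray(images)
    return images.reshape(images.shape[0], -1).astype(np.float32) / 255.0

def random_mini_batches(X, Y, minibatch_size=32, seed=0):
    """Shuffle the examples, then split them into mini-batches (the last may be smaller)."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(X.shape[0])
    X, Y = X[order], Y[order]
    return [(X[i:i + minibatch_size], Y[i:i + minibatch_size])
            for i in range(0, X.shape[0], minibatch_size)]

# Stand-in data with the training-set shape; these are NOT the real sign images.
fake_images = np.random.default_rng(1).integers(
    0, 256, size=(1080, 64, 64, 3), dtype=np.uint8)
fake_labels = np.eye(6, dtype=np.float32)[np.arange(1080) % 6]

X = preprocess(fake_images)
batches = random_mini_batches(X, fake_labels, minibatch_size=32)
```

With 1080 training examples and a mini-batch size of 32, one epoch consists of 33 full batches plus a final batch of 24 examples.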
The model is LINEAR -> RELU -> LINEAR -> RELU -> LINEAR -> SOFTMAX.
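To make the LINEAR -> RELU -> LINEAR -> RELU -> LINEAR -> SOFTMAX pipeline concrete, here is a minimal NumPy sketch of the forward pass and the cross-entropy cost. It is illustrative only: the function names are hypothetical, the He-style weight initialization is an assumption (the README does not say how weights are initialized), and training with Adam is left out.

```python
import numpy as np

def init_params(layer_sizes=(12288, 25, 12, 6), seed=0):
    """Create (W, b) pairs for each layer. He-style scaling is an assumption."""
    rng = np.random.default_rng(seed)
    params = []
    for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]):
        W = rng.standard_normal((n_in, n_out)) * np.sqrt(2.0 / n_in)
        b = np.zeros(n_out)
        params.append((W, b))
    return params

def softmax(z):
    """Row-wise softmax, shifted by the row maximum for numerical stability."""
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def forward(X, params):
    """LINEAR -> RELU for every layer except the last, then LINEAR -> SOFTMAX."""
    A = X
    for W, b in params[:-1]:
        A = np.maximum(0.0, A @ W + b)
    W, b = params[-1]
    return softmax(A @ W + b)

def cross_entropy(probs, Y, eps=1e-12):
    """Mean cross-entropy between predicted probabilities and one-hot labels."""
    return float(-np.mean(np.sum(Y * np.log(probs + eps), axis=1)))

# Stand-in batch of 4 flattened, normalized images with hypothetical labels.
X_demo = np.random.default_rng(2).random((4, 12288))
Y_demo = np.eye(6)[[0, 1, 2, 5]]
params = init_params()
probs = forward(X_demo, params)
cost = cross_entropy(probs, Y_demo)
```

Each row of `probs` sums to 1, and the predicted digit for an image is the index of the largest entry in its row (`probs.argmax(axis=1)`).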
Outcome:
- Training cost graph
- Train accuracy: 0.999074
- Test accuracy: 0.716667
- TODO: the gap between train and test accuracy indicates overfitting; add L2 or dropout regularization
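The two regularization options named in the TODO are not implemented in this project yet. As a rough sketch of what they involve, the snippet below shows an L2 penalty term to add to the cost and an inverted-dropout mask for hidden activations. The function names, `lambd`, and `keep_prob` are hypothetical, and no values have been tuned against this dataset.

```python
import numpy as np

def l2_penalty(weights, lambd, m):
    """L2 term (lambd / (2 * m)) * sum of squared weights, added to the cost."""
    return (lambd / (2.0 * m)) * sum(float(np.sum(np.square(W))) for W in weights)

def dropout(A, keep_prob, rng):
    """Inverted dropout: zero each unit with probability 1 - keep_prob,
    then rescale so the expected activation is unchanged. Training only."""
    mask = rng.random(A.shape) < keep_prob
    return A * mask / keep_prob

# Tiny worked example with made-up numbers.
penalty = l2_penalty([np.ones((2, 3))], lambd=0.1, m=10)
rng = np.random.default_rng(0)
activations = np.ones((4, 25))
dropped = dropout(activations, keep_prob=0.8, rng=rng)
```

At test time dropout is switched off (equivalent to `keep_prob=1.0`), while the L2 term only affects the training cost and its gradients.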