This repo houses the code for the paper Constant Variance Weight Initialisation.
Weight initialisation is a necessary first step in training any neural network. This work reviews the currently popular data-independent weight initialisation methods and proposes Constant Variance Weight Initialisation. Applied to small neural networks, Constant Variance initialisation is shown to train faster than Xavier initialisation, a result which fails to generalise to larger networks. However, equivalent performance can be achieved on larger networks either by scaling the range of the S-shaped activation function or by reducing the standard deviation of the input and of the forward propagation. Constant Variance initialisation is then compared to He initialisation: it shows no significant difference in training speed on small networks, but trains faster on larger networks.
A Python environment with Jupyter Notebook is required.
The following libraries are used in this project (see requirements.txt for exact versions):
- PyTorch (torch)
- torchvision
- NumPy
- Matplotlib
- scikit-learn (sklearn)
- Pillow
The code has two main parts: generating the activation function coefficients, and testing those coefficients on the chosen problems.
First, to generate the activation function coefficients:
- Run the ActivationCoefficients.ipynb notebook.
This generates two files, coeffs.csv and coeffs_uniform.csv, which contain the coefficients used when applying Constant Variance weight initialisation for each activation function.
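To give a sense of how these coefficients might be consumed, the sketch below scales each layer's weight standard deviation by the coefficient for its activation function. This is a minimal illustration, not the notebooks' actual code: it assumes a two-column CSV of activation name and coefficient, and the fan-in scaling rule shown is an assumption.

```python
import csv

import torch.nn as nn

# Load the per-activation coefficients (assumed layout: name, coefficient).
with open("coeffs.csv", newline="") as f:
    coeffs = {name: float(value) for name, value in csv.reader(f)}

def constant_variance_init_(layer: nn.Linear, activation: str) -> None:
    """Sketch of Constant Variance initialisation: draw weights with a
    standard deviation scaled by the activation's coefficient and the
    layer's fan-in, keeping the forward-propagated variance constant."""
    std = coeffs[activation] / layer.in_features ** 0.5  # assumed scaling rule
    nn.init.normal_(layer.weight, mean=0.0, std=std)
    nn.init.zeros_(layer.bias)

constant_variance_init_(nn.Linear(784, 128), "tanh")  # "tanh" key is assumed
```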
After this, the results can be generated by running:
- XavierFashionMNIST.ipynb | Section 3.1: Constant Variance compared to Xavier initialisation, applied to the FashionMNIST dataset.
- CIFARClassification.ipynb | Section 3.2: Constant Variance compared to Xavier initialisation, applied to the CIFAR dataset.
- HeFashionMNIST.ipynb | Section 4.1: Constant Variance compared to He initialisation, applied to the FashionMNIST dataset.
- ReLUCIFARClassification.ipynb | Section 4.2: Constant Variance compared to He initialisation, applied to the CIFAR dataset.
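For reference, the Xavier and He baselines that these notebooks compare against are available directly in PyTorch. The calls below are standard torch.nn.init functions; whether the notebooks use the uniform or normal variants is not specified here.

```python
import torch.nn as nn

layer = nn.Linear(784, 128)

# Xavier (Glorot) initialisation, the baseline in Sections 3.1 and 3.2.
nn.init.xavier_uniform_(layer.weight)

# He (Kaiming) initialisation, the baseline in Sections 4.1 and 4.2,
# typically paired with ReLU activations.
nn.init.kaiming_uniform_(layer.weight, nonlinearity="relu")
```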
Running these notebooks will put the results in four folders, which must exist before the code is run. The folders containing the raw results are provided in this repository. Respectively, these folders are:
- XavierFashionMNIST_test
- XavierCIFAR_test
- HeConstVarFashionMNIST_test
- HeCIFAR_test
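If the provided folders have been removed, they can be recreated before running the notebooks, for example:

```python
import os

# The notebooks expect these output folders to exist before saving results.
for folder in (
    "XavierFashionMNIST_test",
    "XavierCIFAR_test",
    "HeConstVarFashionMNIST_test",
    "HeCIFAR_test",
):
    os.makedirs(folder, exist_ok=True)
```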
The code for generating the plots/figures from these experiments is mostly included in the same notebooks. There are two additional notebooks for plotting some results:
- ReLUCIFARPlotter2.ipynb | Figure 6: Plots the results for Constant Variance initialisation compared to He initialisation.
- XavierComparason5.ipynb | Figure 3: Plots the forward-propagated variance at each layer.
Where the plotting is included in the same notebook, there is a cell that loads the raw data before generating the plots; it can be found after the cells that generate the raw results. Run this cell to load the raw results, then run the subsequent cells to generate the plots from the loaded data.
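The raw results can also be loaded and plotted outside the notebooks. The sketch below is illustrative only: it assumes each run is saved as a NumPy array of per-epoch accuracies named run_0.npy, run_1.npy, and so on, which may not match the actual file names or format used by the notebooks.

```python
import glob

import matplotlib.pyplot as plt
import numpy as np

# Load every saved run from one results folder (file layout is an assumption).
paths = sorted(glob.glob("XavierFashionMNIST_test/run_*.npy"))
runs = [np.load(path) for path in paths]

# Plot the mean per-epoch accuracy across the repeated runs.
plt.plot(np.mean(runs, axis=0), label="mean of runs")
plt.xlabel("Epoch")
plt.ylabel("Accuracy")
plt.legend()
plt.show()
```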
The generated plots/figures have also been included in the PaperImages folder.
By default, the notebooks run each experiment 5 times. This can be changed by editing the value passed to range in the cell where the results are generated and saved.
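Concretely, the repeat count sits in a loop of roughly this shape. The helper and file names below are hypothetical, not the notebooks' actual code:

```python
import numpy as np

NUM_RUNS = 5  # change this value to repeat each experiment more or fewer times

def train_and_evaluate() -> list:
    """Hypothetical stand-in for one training run; the real notebooks
    return per-epoch results in their own format."""
    return [0.0]

for run in range(NUM_RUNS):
    accuracies = train_and_evaluate()
    np.save(f"XavierFashionMNIST_test/run_{run}.npy", np.asarray(accuracies))
```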