HyperNOMAD¶
HyperNOMAD is a Python package dedicated to the hyperparameter optimization of deep neural networks. The package contains a blackbox specifically designed for this problem and provides a link with the NOMAD software used for the optimization. The blackbox takes as input a list of hyperparameters, builds the corresponding deep neural network, then trains, validates and tests it on a specific data set before returning the test error as a measure of performance. NOMAD is then used to minimize this error. The following appendix provides an overview of how to use the HyperNOMAD package.

Prerequisites¶
In order to run HyperNOMAD correctly, please make sure to have:
- A C++ compiler and the make utility, needed to build the HyperNOMAD binaries
- The NOMAD blackbox optimization software
- PyTorch
Additionally, HyperNOMAD has the following Python requirements:
- Numpy
- Matplotlib
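If needed, these requirements can be installed with pip:
pip install numpy matplotlib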
Check that the requirements are fulfilled¶
Here are simple tests to check that everything is set correctly before installing HyperNOMAD.
PyTorch¶
In order to test that PyTorch is correctly installed, try the following commands:
python
>>> import torch
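If the import succeeds, you can also print the installed version:
>>> print(torch.__version__)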
NOMAD¶
Try running the following command:
nomad -info
For more help on the installation of NOMAD, please refer to the user_guide.
Step by step installation of HyperNOMAD¶
The installation of HyperNOMAD can start once the package is downloaded and the prerequisites are installed. The package contains a Makefile responsible for building the binaries. To start the installation, execute the following command:
make
building HyperNOMAD ...
To be able to run the example
the HYPERNOMAD_HOME environment variable
must be set to the HyperNOMAD home directory
When the compilation is successful, a message appears asking to set the environment variable HYPERNOMAD_HOME. This can be done by adding the following line to your .profile or .bashrc file:
export HYPERNOMAD_HOME=hypernomad_directory
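For the variable to be available in the current shell session, reload the configuration file or run the export command directly:
source ~/.bashrc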
Check that the installation is successful¶
The executable hypernomad.exe is located in the bin directory. You can check that the installation is successful by running the command
$HYPERNOMAD_HOME/bin/./hypernomad.exe -i
which should return the following information:
--------------------------------------------------
HyperNomad - version 1.0
--------------------------------------------------
Using Nomad version 3.9.0 - www.gerad.ca/nomad
--------------------------------------------------
Run : hypernomad.exe parameters_file
Info : hypernomad.exe -i
Help : hypernomad.exe -h
Version : hypernomad.exe -v
Usage : hypernomad.exe -u
Neighboors : hypernomad.exe -n parameters_file
Basic usage¶
HyperNOMAD is a library that aims at optimizing the hyperparameters of a deep neural network for a given application. Note that, at this stage, HyperNOMAD is tailored to convolutional networks only.
In order to start an optimization, some information must be provided to specify the dataset used and, optionally, some details on the search space. This is done in a parameter file that is passed as an argument to hypernomad.exe, as shown in the following example:
$HYPERNOMAD_HOME/bin/./hypernomad.exe parameter_file.txt
Choosing a dataset¶
The parameter file must contain mandatory information such as the name of the data set and the number of times the blackbox is called, which corresponds to the number of different configurations HyperNOMAD is allowed to try. The package comes with the data sets listed in the table below.
Dataset | Training data | Validation data | Test data | Number of classes |
MNIST | 40000 | 10000 | 10000 | 10 |
Fashion MNIST | 40000 | 10000 | 10000 | 10 |
EMNIST | 40000 | 10000 | 10000 | 10 |
KMNIST | 40000 | 10000 | 10000 | 10 |
CIFAR10 | 40000 | 10000 | 10000 | 10 |
CIFAR100 | 40000 | 10000 | 10000 | 100 |
STL10 | 4000 | 1000 | 8000 | 10 |
HyperNOMAD also offers the possibility of using a personal data set, in which case the user is responsible for providing the necessary information to the blackbox. The instructions to do so are provided in the Advanced usage section.
Specifying the search space¶
The user can choose to provide additional information on the search space considered. HyperNOMAD allows for great flexibility in tuning a convolutional network by considering multiple aspects of a network at once, such as the architecture, the dropout rate, the choice of the optimizer and the hyperparameters related to the optimization aspect (learning rate, weight decay, momentum, …), the batch size, etc. The user can choose to optimize all these aspects, or select a few and fix the others to certain values. The user can also change the default range of each hyperparameter.
This information is passed through the parameter file using a specific syntax:
KEYWORD INITIAL_VALUE LOWER_BOUND UPPER_BOUND FIXED/VAR
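For instance, the following illustrative line initializes the batch size at 64, restricts it to the range [16, 256] and lets it vary during the optimization:
BATCH_SIZE 64 16 256 VAR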
The following table lists all the possible keywords, their interpretation, and the default value and range of each one.
Name | Description | Default | Range |
DATASET | name of the dataset used | no default | From previous table, or CUSTOM |
NUMBER_OF_CLASSES | number of classes of the problem | no default | Used only when DATASET is CUSTOM |
MAX_BB_EVAL | number of configurations to try | no default | Integer > 1 |
NUM_CON_LAYERS | number of convolutional layers | 2 | [0, 100] |
OUTPUT_CHANNELS | number of output channels of the layer | 6 | [1, 100] |
KERNELS | size of the kernels applied | 5 | [1, 20] |
STRIDES | step of the kernels | 1 | [1, 3] |
PADDINGS | size of the padding | 0 | [0, 2] |
DO_POOL | whether to apply a pooling or not | 0 | 0, 1 |
NUM_FC_LAYERS | number of fully connected layers | 2 | [0, 500] |
SIZE_FC_LAYER | size of the fully connected layer | 128 | [1, 1000] |
BATCH_SIZE | the batch size | 128 | [1, 400] |
OPTIMIZER_CHOICE | choice of optimizer: 1 = SGD, 2 = Adam, 3 = Adagrad, 4 = RMSProp | 3 | [1, 4] |
OPT_PARAM_1 | learning rate | 0.1 | [0, 1] |
OPT_PARAM_2 | second parameter of the optimizer | 0.9 | [0, 1] |
OPT_PARAM_3 | third parameter of the optimizer | 0.005 | [0, 1] |
OPT_PARAM_4 | fourth parameter of the optimizer | 0 | [0, 1] |
DROPOUT_RATE | probability of dropping a node | 0.5 | [0, 0.95] |
ACTIVATION_FUNCTION | choice of activation: 1 = ReLU, 2 = Sigmoid, 3 = Tanh | 1 | [1, 3] |
REMAINING_HPS | used to fix or vary the hyperparameters not listed in the parameter file | VAR | FIXED, VAR |
Example of a parameter file¶
Here is an example of an acceptable parameter file. First, the dataset MNIST is chosen and we specify that HyperNOMAD is allowed to try a maximum of 100 configurations. Then, the number of convolutional layers is fixed throughout the optimization at 5; the two '-' appearing after the '5' mean that the default lower and upper bounds are not changed. The kernels, number of fully connected layers and activation function are respectively initialized at 3, 6 and 2 (Sigmoid), and the dropout rate is initialized at 0.6 with a new lower bound of 0.3 and upper bound of 0.8. Finally, all the remaining hyperparameters from the table above that are not explicitly mentioned in this file are fixed to their default values during the optimization.
# Mandatory information
DATASET MNIST
MAX_BB_EVAL 100
# Optional information
NUM_CON_LAYERS 5 - - FIXED
KERNELS 3
NUM_FC_LAYERS 6
ACTIVATION_FUNCTION 2
DROPOUT_RATE 0.6 0.3 0.8
REMAINING_HPS FIXED
This parameter file is provided in the 'examples' directory, from which we can execute the following command in order to run HyperNOMAD on this search space:
$HYPERNOMAD_HOME/bin/./hypernomad.exe $HYPERNOMAD_HOME/examples/mnist_first_example.txt
Using X0 as a starting point¶
The main advantage of choosing this method of initialization rather than the previous one, which relies on keywords, is that defining an X0 allows for more flexibility since one can choose a value for each parameter of each layer. For example, using a keyword such as KERNELS means that all the kernels applied on every convolutional layer will have the same initial value, whereas an X0 allows each kernel to be initialized individually.
The order and meaning of the variables in X0 are hardcoded in HyperNOMAD. Let's use the following parameter file as an example:
DATASET MNIST
MAX_BB_EVAL 100
HYPER_DISPLAY 3
# [ CONVOLUTION BLOCK ] [ FULLY CONNECTED BLOCK ] [BATCH] [ OPTIMIZER BLOCK ] [DROPOUT][ACTIVATION]
X0 ( 2 6 5 1 0 1 16 5 1 0 1 2 128 84 128 3 0.1 0.9 0.0005 0 0.2 1 )
#LOWER_BOUND ( 1 1 1 1 0 0 1 1 1 0 0 0 1 1 1 1 0 0 0 0 0 1 )
#UPPER_BOUND ( 100 1000 20 3 2 1 1000 20 3 2 1 500 1000 1000 400 4 1 1 1 1 1 3 )
DROPOUT_RATE 0.5 - - FIXED
KERNELS 10 - - FIXED
REMAINING_HPS VAR
Analysis of the example¶
First, HYPER_DISPLAY sets the level of detail displayed about the steps of HyperNOMAD. The default value is 1, and the maximum is 3. Then, X0 is given as a list of parameters that are respectively grouped into the convolutional block, the fully connected block, the batch size, the optimizer block, the dropout rate and the activation function.
The blocks for the batch size, dropout rate and activation function each contain a single value, which is that of the corresponding hyperparameter.
The first variable of the convolutional block indicates the number of convolutional layers: 2 in this example. Each convolutional layer has 5 associated variables: (number of output channels, kernel, stride, padding, do pooling). Therefore, the first convolutional layer has 6 output channels, a (5,5) kernel, a stride of 1, no padding, and performs a pooling afterwards. The same goes for the second layer.
The first variable of the fully connected block corresponds to the number of fully connected layers. The following variables indicate the size of each fully connected layer.
The first variable of the optimizer block indicates which optimizer is used; here it is Adagrad (value 3). The optimizer block always has 4 associated variables whose meaning changes according to the optimizer chosen. For example, in the case of SGD, the first variable is the learning rate, followed by the momentum, the dampening and the weight decay.
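To make the block structure concrete, here is a minimal Python sketch, not taken from HyperNOMAD's internal code, that decodes the X0 vector above according to the layout described in this section (the integer codes for the optimizer and activation follow the keyword table):
x0 = [2, 6, 5, 1, 0, 1, 16, 5, 1, 0, 1,  # convolution block
      2, 128, 84,                         # fully connected block
      128,                                # batch size
      3, 0.1, 0.9, 0.0005, 0,             # optimizer block
      0.2,                                # dropout rate
      1]                                  # activation function

i = 0
n_conv = int(x0[i]); i += 1
conv_layers = []
for _ in range(n_conv):
    # (output channels, kernel, stride, padding, do pooling) per layer
    out_channels, kernel, stride, padding, do_pool = x0[i:i + 5]
    conv_layers.append((out_channels, kernel, stride, padding, bool(do_pool)))
    i += 5

n_fc = int(x0[i]); i += 1
fc_sizes = x0[i:i + n_fc]; i += n_fc     # one size per fully connected layer

batch_size = x0[i]; i += 1

optimizer = {1: 'SGD', 2: 'Adam', 3: 'Adagrad', 4: 'RMSProp'}[int(x0[i])]
opt_params = x0[i + 1:i + 5]; i += 5     # meaning depends on the optimizer

dropout_rate = x0[i]; i += 1
activation = {1: 'ReLU', 2: 'Sigmoid', 3: 'Tanh'}[int(x0[i])]

print(conv_layers)   # [(6, 5, 1, 0, True), (16, 5, 1, 0, True)]
print(fc_sizes, batch_size, optimizer, opt_params, dropout_rate, activation)
Running this sketch recovers the two convolutional layers, the fully connected layers of sizes 128 and 84, a batch size of 128, the Adagrad optimizer with its four parameters, a dropout rate of 0.2 and the ReLU activation.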
Advantage of using X0¶
In addition to being able to initialize each hyperparameter on its own, we can also define specific lower and upper bounds for each single hyperparameter, as shown in the previous example.
Note
X0 takes precedence over the other keywords; therefore the tags KERNELS and DROPOUT_RATE will not affect this initial starting point.
Using a personal dataset¶
In addition to the datasets embedded in HyperNOMAD, the user can choose to use a personal dataset by specifying the following information in the parameter file:
DATASET CUSTOM
NUMBER_OF_CLASSES 20
When using a CUSTOM dataset, it is mandatory to provide HyperNOMAD with the number of classes. The user is also responsible for plugging the three datasets (training, validation and testing) into the blackbox. In the file blackbox.py, lines 80 to 84 must be completed with the adequate information.
# Load the data
print('> Preparing the data..')
if dataset != 'CUSTOM':
    dataloader = DataHandler(dataset, batch_size)
    image_size, number_classes = dataloader.get_info_data
    trainloader, validloader, testloader = dataloader.get_loaders()
else:
    # Add here the adequate information
    image_size = None
    number_classes = None
    trainloader = None
    validloader = None
    testloader = None
The image size is a tuple of the form (number_input_channels, length_image, width_image). In the case of MNIST, the image size is (1, 28, 28).
The trainloader, validloader and testloader must be instances of torch.utils.data.dataloader.DataLoader.
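As an illustration, here is a minimal sketch of how the CUSTOM branch of blackbox.py could be completed. The tensors below are hypothetical placeholders standing in for the user's actual data, with 20 classes as in the example above:
import torch
from torch.utils.data import DataLoader, TensorDataset

# Hypothetical placeholder data; replace with the user's actual dataset
train_x, train_y = torch.randn(1000, 1, 28, 28), torch.randint(0, 20, (1000,))
valid_x, valid_y = torch.randn(200, 1, 28, 28), torch.randint(0, 20, (200,))
test_x, test_y = torch.randn(200, 1, 28, 28), torch.randint(0, 20, (200,))

image_size = (1, 28, 28)   # (number_input_channels, length_image, width_image)
number_classes = 20        # must match NUMBER_OF_CLASSES in the parameter file
trainloader = DataLoader(TensorDataset(train_x, train_y), batch_size=128, shuffle=True)
validloader = DataLoader(TensorDataset(valid_x, valid_y), batch_size=128)
testloader = DataLoader(TensorDataset(test_x, test_y), batch_size=128)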