Making a Keras model portable
Authors
Soufiane Fadel & Eric T-K Chou
Project description
The goal of this project is to make ML models produced with TensorFlow/Keras more portable.
Since a trained model's main application is making predictions, running it should not require the whole TensorFlow/Keras library.
The aim is to let you run your trained model in Python using only NumPy.
All you need is your model.h5 file. We named this project Keras minimizer (or kerasmizer).
Methodology
The project falls into three main stages:
- Code a function that extracts the information needed to build the model.
- Code custom layers and activation functions from scratch (NumPy / Cython).
- Build the deployment model from the extracted information and custom layers.
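As a sketch of the first stage, the weights and configuration stored in an HDF5 file can be read with h5py. In Keras' saved-model format the architecture is a JSON string in the file's model_config attribute and the weights live under the model_weights group; the function name and returned dictionary layout below are illustrative, not the project's actual API:

```python
import json
import numpy as np
import h5py

def load_model_from_h5(path):
    """Extract the architecture (JSON) and weights from a Keras .h5 file."""
    with h5py.File(path, "r") as f:
        # The model architecture is stored as a JSON string attribute.
        config = json.loads(f.attrs["model_config"])
        # Weight arrays are datasets nested under the model_weights group.
        weights = {}
        def collect(name, obj):
            if isinstance(obj, h5py.Dataset):
                weights[name] = np.array(obj)
        f["model_weights"].visititems(collect)
    return {"config": config, "weights": weights}
```

With the weights held as plain NumPy arrays and the configuration as a plain dictionary, nothing downstream needs TensorFlow at all.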
The code provides the following components:
- load_model_from_h5: Returns a dictionary with the main information (weights and configuration) extracted from an HDF5/h5 file, a file format for storing structured data. Keras saves models in this format because it can hold the weights and the model configuration in a single file.
- Custom layers:
- Conv2d: the 2D convolution layer creates a convolution kernel that is convolved with the layer input to produce an array of outputs.
- Pool2d: this layer downsamples the input representation by taking the maximum or average value over the window defined by pool_size along each spatial dimension.
- Dense: It implements the operation: output = activation(dot(input, kernel) + bias) where activation is the element-wise activation function passed as the activation argument, kernel is a weights matrix created by the layer, and bias is a bias vector created by the layer.
- Flatten: Flattens the input.
- Zeropadding2D: This layer adds rows and columns of zeros at the top, bottom, left, and right of an image array.
- BatchNormalization: Batch normalization applies a transformation that maintains the mean output close to 0 and the output standard deviation close to 1.
- Add: It takes as input a list of arrays, all of the same shape, and returns a single tensor (also of the same shape).
- Activations: ReLU, sigmoid, softmax, softplus, softsign, and tanh.
- Run_model: This function runs inference by applying the custom layers, in the order given by the extracted dictionary, to the input.
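A few of the layers above can be sketched in pure NumPy as follows. These are minimal illustrative versions, not the project's actual implementations (which also handle strides, padding variants, and Cython acceleration); function names and signatures here are assumptions:

```python
import numpy as np

def dense(x, kernel, bias):
    # output = dot(input, kernel) + bias; activation applied separately.
    return x @ kernel + bias

def relu(x):
    return np.maximum(x, 0.0)

def softmax(x):
    # Subtract the row max before exponentiating for numerical stability.
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def batch_norm(x, gamma, beta, mean, var, eps=1e-3):
    # Inference-time batch norm uses the stored moving statistics.
    return gamma * (x - mean) / np.sqrt(var + eps) + beta

def max_pool2d(x, pool=2):
    # x has shape (H, W, C); non-overlapping pool x pool windows.
    h, w, c = x.shape
    x = x[:h - h % pool, :w - w % pool]
    return x.reshape(h // pool, pool, w // pool, pool, c).max(axis=(1, 3))

def run_model(x, layers):
    # 'layers' is a list of callables built from the extracted dictionary.
    for layer in layers:
        x = layer(x)
    return x
```

For example, `run_model(x, [lambda t: dense(t, w, b), relu, softmax])` chains a dense layer, a ReLU, and a softmax, which is exactly the pattern Run_model applies to the full layer list recovered from the h5 file.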
Results
In this section, we benchmark our Keras minimizer on simple models, comparing Keras, pure NumPy, Cython+NumPy, and Cython implementations.
There are two main reasons to deploy the model with NumPy/Cython instead of TensorFlow:
- CPU execution time is faster with NumPy/Cython than with TensorFlow/Keras.
- The model's storage footprint is smaller.
We test our code using the ResNet152 and VGG16 models and were able to reproduce their results exactly. On both models, our code loads the model faster than Keras.
With ResNet152, our code is also faster on the first predictions. For subsequent predictions, however, Keras predicts at a much faster rate.
This is because the Keras prediction function is compiled only once, during the first call. We therefore believe there is still room for improvement in our code. More to come!
The benchmark results are summarized below:
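A timing harness along the lines of the one behind these tables can be sketched as below; the helper name and the stand-in "model" are purely illustrative:

```python
import time
import numpy as np

def time_calls(fn, *args, repeats=5):
    """Time fn(*args), reporting the first call separately from the
    average of subsequent calls (the first call may include one-time
    setup such as Keras compiling its prediction function)."""
    start = time.perf_counter()
    fn(*args)
    first = time.perf_counter() - start
    later = []
    for _ in range(repeats):
        t = time.perf_counter()
        fn(*args)
        later.append(time.perf_counter() - t)
    return first, sum(later) / len(later)

# Stand-in predictor: a single matrix multiply instead of a real model.
w = np.random.rand(256, 256)
predict = lambda x: x @ w
first, avg = time_calls(predict, np.random.rand(1, 256))
```

Separating the first call from the rest is what makes the ResNet152 pattern above visible: Keras' first prediction is slow and later ones fast, while the NumPy version is uniform.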
+--------------------+---------------------------+----------------------------+---------------------+-------------------+
| Model: resnet152v2 | First prediction time avg | Second prediction time avg | Load-model time avg | Prediction output |
+====================+===========================+============================+=====================+===================+
| Keras              | 3.540                     | 0.264                      | 7.290               | 0.942             |
+--------------------+---------------------------+----------------------------+---------------------+-------------------+
| Kerasmin           | 0.781                     | 0.781                      | 0.371               | 0.942             |
+--------------------+---------------------------+----------------------------+---------------------+-------------------+
+--------------------+---------------------------+----------------------------+---------------------+-------------------+
| Model: vgg16       | First prediction time avg | Second prediction time avg | Load-model time avg | Prediction output |
+====================+===========================+============================+=====================+===================+
| Keras              | 0.261                     | 0.167                      | 0.220               | 0.165             |
+--------------------+---------------------------+----------------------------+---------------------+-------------------+
| Kerasmin           | 0.685                     | 0.685                      | 0.027               | 0.165             |
+--------------------+---------------------------+----------------------------+---------------------+-------------------+
Challenges
This project involves understanding the inner workings of neural network models. By packaging and unpackaging Keras NN models, we translate a model from Keras into standalone Python + NumPy code, and we tested its accuracy as well as its efficiency.
This project deepens one's understanding of NN functions, and provides practice in writing code in different languages, including Cython, and in running benchmark tests.
The source code can be obtained from GitHub: fade070/keras_min.
Additional machine learning papers produced by Soufiane can be found on GitHub: fade070/papers.
Disclaimer
The purpose of this website is to promote the use of machine learning in identifying health-related issues that may be of interest to others. The content on this website is provided for educational purposes only.
This website is not intended as and does not constitute medical advice and should not be acted on as such. Use at your own risk: “none of the authors or anyone else connected with this site, in any way whatsoever, can be responsible for your use of the information and tools contained in or linked or generated from these web pages.”