Continual and Transfer Learning
The continual learning module leverages uncertainty-guided continual learning to incrementally train on new assembly case studies, with the learning rate of each parameter modulated by that parameter's standard deviation.
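As a rough, hypothetical illustration of that idea (not the library's actual implementation), the sketch below scales each parameter's update by its normalized posterior standard deviation, so low-uncertainty parameters learned on earlier case studies are changed the least; all names and the base learning rate are assumptions:

    import tensorflow as tf

    def uncertainty_guided_step(params, grads, param_stds, base_lr=1e-3):
        # One SGD step where each parameter's learning rate is scaled by its
        # (normalized) posterior standard deviation: uncertain parameters move
        # freely, well-determined ones are protected on the new case study.
        for p, g, std in zip(params, grads, param_stds):
            scale = std / (tf.reduce_max(std) + 1e-8)  # per-tensor normalization to [0, 1]
            p.assign_sub(base_lr * scale * g)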
The transfer learning module is built on the observation that similar deviation patterns occur across different manufacturing case studies. Within sheet metal assembly systems, clamp movement, repositioning errors, and joining errors have similar deviation signatures that, once learned, can be leveraged across case studies. Similar logic follows for other manufacturing applications such as stamping, machining, and additive manufacturing. Our initial tests have given extremely promising results when transferring models between sheet metal assembly case studies.
Given the success of transfer learning in fields such as radiology and medical scan segmentation using state-of-the-art architectures such as Mask R-CNN, this work aims to do the same for 3D point cloud learning.
The work aims to reproduce similar results on manufacturing 3D cloud-of-point data using key 3D CNN architectures developed for object detection and medical segmentation, such as VoxNet and 3D U-Net.
- Currently the dlmfg library integrates three transfer learning modes, which can be specified in the transfer learning dictionary of the model configuration file (a hypothetical sketch of such a dictionary follows this list):
Full Fine-Tune (full_fine_tune): Replaces the final layer to match the required output of the target case study and then fine-tunes all the weights of the whole model (convolution layers and dense layers) on the small dataset of the target case study.
Variable Learning Rate (variable_lr): Replaces the final layer to match the required output of the target case study and then fine-tunes the convolution layers and dense layers at different learning rates. This is done using the Learning Rate Multiplier extension (refer: https://pypi.org/project/keras-lr-multiplier/), which integrates a learning rate multiplier for each layer in the network. Two additional parameters are taken as input in this case (and can be changed in the model configuration file): conv_layer_m, the convolution layer multiplier (default value: 0.1), which restricts the learning rate of the convolution layers to 10% of the overall learning rate, and dense_layer_m, the dense layer multiplier (default value: 1), which trains the dense layers at the network learning rate.
Feature Extractor (feature_extractor): Replaces the final layer to match the required output of the target case study and then freezes the convolution layers so that they act purely as feature extractors.
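For illustration, a hypothetical transfer learning dictionary in the model configuration file might look like the sketch below; the key names and values are assumptions and should be checked against the shipped assemblyconfig file:

    # Hypothetical transfer_learning section of the model configuration file;
    # the key names are illustrative, not a confirmed schema.
    transfer_learning = {
        'tl_type': 'variable_lr',         # full_fine_tune | variable_lr | feature_extractor
        'tl_base': 'pretrained_model.h5', # pre-trained Keras model to transfer from (assumed name)
        'tl_app': '3d_cnn',               # application tag for the base model (assumed)
        'conv_layer_m': 0.1,              # convolution layers train at 10% of the base learning rate
        'dense_layer_m': 1,               # dense layers train at the full base learning rate
    }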
Refer to the following for more details:
A survey on Deep Learning Advances on Different 3D Data Representations (https://arxiv.org/pdf/1808.01462.pdf)
VoxNet: A 3D Convolutional Neural Network for Real-Time Object Recognition (https://www.ri.cmu.edu/pub_files/2015/9/voxnet_maturana_scherer_iros15.pdf)
3D U-Net: Learning Dense Volumetric Segmentation from Sparse Annotation (https://arxiv.org/abs/1606.06650)
Contains core classes and methods for initializing the transfer learning class and running transfer learning using a pre-trained model and a small training dataset; the inputs are provided in the assemblyconfig file in utilities.
class dlmfg.transfer_learning.tl_core.TransferLearning(tl_type, tl_base, tl_app, model_type, output_dimension, optimizer, loss_function, regularizer_coeff, output_type)

Transfer Learning Class
- Parameters
tl_type (str (required)) – Type of transfer learning to be done: full_fine_tune, variable_lr, or feature_extractor
tl_base (str (required)) – The base model to be used for transfer learning
tl_app (str (required)) – The application for transfer learning
model_type (str (required)) – The type of model, regression or classification
output_dimension (int (required)) – The number of KCCs of the case study to which the pre-trained model is to be transferred; used to reinitialize the last layer
optimizer (keras.optimizer (required)) – The optimizer to be used for model training (https://keras.io/optimizers/)
loss_function (keras.losses (required)) – The loss function to be used for model training (https://keras.io/losses/)
regularizer_coeff (float (required)) – The regularization coefficient for L2 norm regularization of the fully connected layer (https://keras.io/regularizers/)
output_type (str (required)) – The type of output, regression or classification
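A minimal sketch of constructing the class from these parameters, assuming TensorFlow's bundled Keras; every concrete value below is illustrative, in practice they come from the assemblyconfig file:

    from tensorflow import keras
    from dlmfg.transfer_learning.tl_core import TransferLearning

    tl = TransferLearning(
        tl_type='variable_lr',
        tl_base='pretrained_model.h5',  # assumed file name
        tl_app='3d_cnn',                # assumed application tag
        model_type='regression',
        output_dimension=6,             # number of KCCs in the target case study (assumed)
        optimizer=keras.optimizers.Adam(learning_rate=1e-3),
        loss_function=keras.losses.MeanSquaredError(),
        regularizer_coeff=0.01,
        output_type='regression',
    )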
build_transfer_model(model)

The build_transfer_model function takes the pre-trained model, removes the final layer, and adds a new final layer based on the parameters of the new case study; the updated model is then trained on a small dataset obtained from the new case study.
- Parameters
model (keras.model (required)) – keras model with preset parameters
- Returns
Updated model with new final layer
- Return type
keras.model
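Conceptually, the head swap this method describes could look like the following Keras sketch; it assumes a single-output regression model and is not the library's exact code:

    from tensorflow import keras

    def build_transfer_model_sketch(model, output_dimension, regularizer_coeff):
        # Keep everything up to (but excluding) the pre-trained final layer.
        penultimate = model.layers[-2].output
        # New final layer sized to the target case study's KCCs, with L2
        # regularization on the fully connected layer as documented above.
        new_output = keras.layers.Dense(
            output_dimension,
            kernel_regularizer=keras.regularizers.l2(regularizer_coeff),
            name='transferred_output',
        )(penultimate)
        return keras.Model(inputs=model.input, outputs=new_output)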
get_trained_model()

Imports the pre-trained model based on the object initialization; currently supports the Keras modelname.h5 format (refer to https://keras.io/models/model/ for more information on Keras models).
- Returns
Pre-trained model with weights
- Return type
keras.model
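Under the hood this presumably amounts to standard Keras model loading; a minimal equivalent sketch, with the file name assumed:

    from tensorflow import keras

    # Load a pre-trained model saved in the Keras .h5 format,
    # weights included, as get_trained_model is described to do.
    pretrained = keras.models.load_model('pretrained_model.h5')  # assumed path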
set_fixed_train_params(model)

The set_fixed_train_params function freezes the weights of the convolution layers so that the initial part of the network is used only as a feature extractor.
- Parameters
model (keras.model (required)) – keras model with preset parameters
- Returns
Updated model with non trainable convolution layers
- Return type
keras.model
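A minimal sketch of freezing the convolutional front end in Keras; selecting layers by type here is an assumption about how the library identifies them:

    from tensorflow import keras

    def freeze_conv_layers(model):
        # Mark every 3D convolution layer as non-trainable so the initial
        # part of the network acts purely as a feature extractor.
        for layer in model.layers:
            if isinstance(layer, keras.layers.Conv3D):
                layer.trainable = False
        return model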
set_variable_learning_rates(model, conv_layer_m, dense_layer_m)

The set_variable_learning_rates function assigns separate learning rate multipliers to the convolution layers and the dense layers, so that the convolution layers can be fine-tuned more conservatively than the dense layers.
- Parameters
model (keras.model (required)) – keras model with preset parameters
conv_layer_m (float (required)) – Learning rate multiplier for the convolution layers (default value: 0.1)
dense_layer_m (float (required)) – Learning rate multiplier for the dense layers (default value: 1)
- Returns
Updated model with variable learning rates
- Return type
keras.model
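A minimal sketch of the variable learning rate idea using the keras-lr-multiplier extension referenced above; the 'conv' and 'dense' layer-name prefixes are assumptions about how the model's layers are named:

    from keras_lr_multiplier import LRMultiplier

    # 'model' is the transferred keras.Model from build_transfer_model.
    # Wrap the base optimizer so that layers whose names start with the given
    # prefixes train at a fraction of the overall learning rate.
    model.compile(
        optimizer=LRMultiplier('adam', {'conv': 0.1, 'dense': 1.0}),
        loss='mse',
    )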