How to Save a Trained Model in Python

When working on real-world machine learning (ML) use cases, finding the best algorithm/model is not the end of your responsibilities. It is crucial to save, store, and package these models for their future use and deployment to production.

These practices are needed for a number of reasons:

  • Backup: A trained model can be saved as a backup in case the original data is damaged or destroyed.
  • Reusability & reproducibility: Building ML models is time-consuming by nature. To save cost and time, it becomes essential that your model gets you the same results every time you run it. Saving and storing your model the right way takes care of this.
  • Deployment: When deploying a trained model in a real-world setting, it becomes necessary to package it for easy deployment. This makes it possible for other systems and applications to use the same model without much hassle.

To reiterate, while saving and storing ML models enable ease of sharing, reusability, and reproducibility, packaging the models enables quick and painless deployment. These three operations work in harmony to simplify the whole model management process.

In this article, you will learn about different methods of saving, storing, and packaging a trained machine-learning model, along with the pros and cons of each method. But before that, you must understand the distinction between these three terms.

Save vs package vs store ML models

Although all these terms look similar, they are not the same.

Saving vs Storing vs Packaging ML Models | Source: Author

Saving a model refers to the process of saving the model’s parameters, weights, etc., to a file. Usually, all ML and DL models provide some kind of method for saving the models. But you must be aware that save is a single action and gives only a model binary file, so you still need code to make your ML application production-ready.

Packaging, on the other hand, refers to the process of bundling or containerizing the necessary components of a model, such as the model file, dependencies, configuration files, etc., into a single deployable package. The goal of a package is to make it easier to distribute and deploy the ML model in a production environment.

Once packaged, a model can be deployed across different environments, which allows the model to be used in various production settings such as web applications, mobile applications, etc. Docker is one of the tools that lets you do this.
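Docker builds a full container image, which is more than a short snippet can show. As a much smaller illustration of the same bundling idea, here is a minimal sketch that writes a model binary, a dependency list, and a config file into one zip archive. The helper name bundle_model, the file names inside the archive, and the placeholder bytes are all assumptions made for this example, not part of any packaging standard.

```python
import json
import tempfile
import zipfile
from pathlib import Path


def bundle_model(model_bytes: bytes, requirements: list[str], config: dict, out_path: Path) -> Path:
    """Write the model binary, dependency list and config into one zip archive."""
    with zipfile.ZipFile(out_path, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        archive.writestr("model.pkl", model_bytes)
        archive.writestr("requirements.txt", "\n".join(requirements) + "\n")
        archive.writestr("config.json", json.dumps(config, indent=2))
    return out_path


# hypothetical inputs: placeholder bytes stand in for a real serialized model
workdir = Path(tempfile.mkdtemp())
bundle_path = bundle_model(
    model_bytes=b"placeholder for a serialized model",
    requirements=["scikit-learn", "pandas"],
    config={"model_name": "iris_classifier", "n_features": 4},
    out_path=workdir / "iris_bundle.zip",
)

# list what ended up inside the single deployable file
with zipfile.ZipFile(bundle_path) as archive:
    bundled_files = sorted(archive.namelist())
print(bundled_files)
```

A real package would usually also pin dependency versions and include the serving code, which is what container tools automate.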

Storing the ML model refers to the process of saving the trained model files in a centralized storage that can be accessed anytime when needed. When storing a model, you usually choose some sort of storage from where you can fetch your model and use it anytime. The model registry is a category of tools that solve this issue for you.
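To make the idea of a central store concrete, the sketch below uses a plain directory as the storage location and numbered subfolders as versions. It only illustrates the store-and-fetch pattern; it is not how any particular model registry works, and the function names store_model and fetch_latest are hypothetical.

```python
import shutil
import tempfile
from pathlib import Path


def store_model(model_file: Path, store_dir: Path, name: str) -> Path:
    """Copy a saved model file into store_dir/name/v<N>/ and return the new path."""
    model_dir = store_dir / name
    model_dir.mkdir(parents=True, exist_ok=True)
    existing = [int(p.name[1:]) for p in model_dir.iterdir() if p.name[1:].isdigit()]
    version = max(existing, default=0) + 1
    target_dir = model_dir / f"v{version}"
    target_dir.mkdir()
    return Path(shutil.copy2(model_file, target_dir / model_file.name))


def fetch_latest(store_dir: Path, name: str) -> Path:
    """Return the model file stored under the highest version number."""
    versions = sorted((store_dir / name).iterdir(), key=lambda p: int(p.name[1:]))
    return next(versions[-1].iterdir())


# placeholder bytes stand in for two successive saved models
root = Path(tempfile.mkdtemp())
saved_file = root / "iris_classifier_model.pkl"
store = root / "model_store"

saved_file.write_bytes(b"first version")
store_model(saved_file, store, "iris_classifier")
saved_file.write_bytes(b"second version")
store_model(saved_file, store, "iris_classifier")

latest = fetch_latest(store, "iris_classifier")
print(latest.parent.name)
```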

Now let’s see how we can save our model.

How to save a trained model in Python?

In this section, you will see different ways of saving machine learning (ML) as well as deep learning (DL) models. To begin with, let’s create a simple classification model using the famous Iris dataset.

Note: The focus of this article is not to show you how to create the best ML model but to explain how effectively you can save trained models.

You first need to load the required dependencies and the iris dataset as follows:

import pandas as pd 

from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler 
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import classification_report, confusion_matrix

url = "iris.data"

names = ['sepal-length', 'sepal-width', 'petal-length', 'petal-width', 'Class']

dataset = pd.read_csv(url, names=names) 


Next, you need to split the data into training and testing sets and apply the required preprocessing stages, such as feature standardization.

X = dataset.iloc[:, :-1].values
y = dataset.iloc[:, 4].values 

X_train, X_test, y_train, y_test = train_test_split(X, 
                                                    y, test_size=0.20) 

scaler = StandardScaler()
# fit the scaler on the training data only, then apply it to both splits
scaler.fit(X_train)

X_train = scaler.transform(X_train)
X_test = scaler.transform(X_test) 

Finally, you need to train a classification model (feel free to choose any) on training data and check its performance on testing data.

model = KNeighborsClassifier(n_neighbors=5)
model.fit(X_train, y_train) 

y_predict = model.predict(X_test)

print(confusion_matrix(y_test, y_predict))
print(classification_report(y_test, y_predict)) 

Iris classification results | Source: Author

Now you have an ML model that you want to save for future use. The first way to save an ML model is by using a pickle file.

Saving trained model with pickle

The pickle module can be used to serialize and deserialize Python objects. Pickling is the process of converting a Python object hierarchy into a byte stream, while unpickling is the process of converting a byte stream (from a binary file or other bytes-like object) back into an object hierarchy.

To save ML models as pickle files, you need the pickle module, which already comes with the default Python installation.

To save your iris classifier model, you simply need to decide on a filename and dump your model to a pickle file like this:

import pickle

model_pkl_file = "iris_classifier_model.pkl"  

with open(model_pkl_file, 'wb') as file:  
    pickle.dump(model, file)

As you can see, the file is opened in wb (write binary) mode for saving the model as bytes. The dump() method then stores the model in the given pickle file.

You can also load this model using the load() method of the pickle module. This time you need to open the file in rb (read binary) mode to load the saved model.

with open(model_pkl_file, 'rb') as file:  
    model = pickle.load(file)

y_predict = model.predict(X_test)

print(classification_report(y_test, y_predict)) 

Once loaded, you can use this model to make predictions.

Iris classification result | Source: Author

Pros of the Python pickle approach

  • Pickle comes as a standard module in Python, which makes it easy to use for saving and restoring ML models.
  • Pickle files can handle most Python objects, including custom objects, making it a versatile way to save models.
  • For small models, the pickle approach is quite fast and efficient.
  • When an ML model is unpickled, it is restored to its previous state, including any variables or configurations. This makes Python pickle files one of the best options for saving ML models.

Cons of the Python pickle approach

  • If you unpickle untrusted data, pickling can pose a security threat. Unpickling an object can execute malicious code, so it is crucial to only unpickle data from reliable sources.
  • The use of pickled objects may be constrained in some cases, since they are not guaranteed to transfer cleanly between different Python versions or operating systems.
  • For models with a large memory footprint, pickling can result in huge files, which can be problematic.
  • Pickling can make it difficult to track changes to a model over time, especially if the model is updated frequently and it is not feasible to create multiple pickle files for the different versions of models that you try.
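The security point is easier to appreciate with a harmless demonstration. In the sketch below, a class (named Payload here purely for illustration) tells pickle to store a function call instead of its own state; on loading, pickle runs that call. The call used is the harmless int("42"), but a tampered file could name any importable callable.

```python
import pickle


class Payload:
    """Stand-in for a tampered object hidden inside a pickle file."""

    def __reduce__(self):
        # pickle stores "call int('42')" instead of this object's state
        return (int, ("42",))


data = pickle.dumps(Payload())

# the stored callable runs here, during loading
restored = pickle.loads(data)
print(type(restored).__name__, restored)
```

What comes back is the result of the stored call, not a Payload object, which is why pickle files should only be loaded from sources you trust.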

Pickle is best suited to small models and also has some security issues; these reasons are enough to look for an alternative way of saving ML models. Next, let’s discuss Joblib to save and load ML models.

Note: In the upcoming sections, you will see the same iris classifier model saved using different methods.

Saving trained model with Joblib

Joblib is a set of tools (typically part of the SciPy ecosystem) that provide lightweight pipelining in Python. It mainly focuses on disk-caching, memoization, and parallel computing and is used for saving and loading Python objects. Joblib has been specifically optimized for NumPy arrays to make it fast and reliable for ML models that have a lot of parameters.

To save large models with Joblib, you need the joblib package. It is not part of the Python standard library, but it is installed as a dependency of scikit-learn and can also be installed on its own with pip.

import joblib 

filename = 'joblib_model.sav'
joblib.dump(model, filename)

To save the model, you need to define a filename with a ‘.sav’ or ‘.pkl’ extension and call the dump() method from Joblib.

Similar to pickle, Joblib provides the load() method to load the saved ML model.

loaded_model = joblib.load(filename)

# predict with the model that was just loaded from disk
y_predict = loaded_model.predict(X_test)

print(classification_report(y_test, y_predict)) 

After loading the model with Joblib, you are free to use it on the data to make predictions.

Iris classification results | Source: Author

Pros of saving ML models with Joblib

  • Fast and efficient performance is a key feature of Joblib, especially for models with substantial memory requirements.
  • The serialization and deserialization process can be parallelized via Joblib, which can improve performance on multi-core machines.
  • For models that demand a lot of memory, Joblib can memory-map the saved arrays when loading to reduce memory usage.
  • Joblib keeps the same dump()/load() workflow as pickle. Note that it is built on pickle, so it does not add safeguards of its own against untrusted data; only load Joblib files from sources you trust.

Cons of saving ML models with Joblib

  • Joblib is optimized for NumPy arrays and may not work as well with other object types.
  • Joblib offers less flexibility than pickle because there are fewer options available for configuring the serialization process.
  • Compared to pickle, Joblib is less well-known, which can make it harder to find help and documentation for it.
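One configuration option that dump() does offer is the compress argument, which is relevant to the large-model case discussed above. The sketch below is only an illustration: the params dictionary with a zero-filled array is a stand-in for a real model with a large parameter array, and the file names are arbitrary.

```python
import os
import tempfile

import joblib
import numpy as np

# stand-in for a model that carries a large parameter array
params = {"weights": np.zeros((500, 500)), "n_neighbors": 5}

tmp_dir = tempfile.mkdtemp()
raw_path = os.path.join(tmp_dir, "params_raw.joblib")
small_path = os.path.join(tmp_dir, "params_compressed.joblib")

# same object, saved without and with compression
joblib.dump(params, raw_path)
joblib.dump(params, small_path, compress=3)

raw_size = os.path.getsize(raw_path)
small_size = os.path.getsize(small_path)
print(raw_size, small_size)

# loading is the same call in both cases
restored = joblib.load(small_path)
```

How much space compression saves depends heavily on the data; an all-zero array is close to the best case, so real models will shrink less.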

Although Joblib solves the main issues faced by pickle, it has some issues of its own. Next, you will see how to manually save and restore models using JSON.

Saving trained model with JSON

When you want to have full control over the save and restore procedure of your ML model, JSON comes into play. Unlike the other two methods, this method does not directly dump the ML model to a file; instead, you need to explicitly define the different parameters of your model to save them.

To use this method, you need the Python json module, which again comes with the default Python installation. Using the JSON method requires extra effort to write out all the parameters that an ML model contains. To save the model using JSON, let’s create a function like this:

import json 

def save_json(model, filepath, X_train, y_train): 
    # read the hyperparameters once; each key matches the parameter it stores
    params = model.get_params()
    saved_model = {}
    saved_model["algorithm"] = params['algorithm']
    saved_model["leaf_size"] = params['leaf_size']
    saved_model["metric"] = params['metric']
    saved_model["metric_params"] = params['metric_params']
    saved_model["n_jobs"] = params['n_jobs']
    saved_model["n_neighbors"] = params['n_neighbors']
    saved_model["p"] = params['p']
    saved_model["weights"] = params['weights']
    # NumPy arrays are not JSON serializable, so convert them to lists
    saved_model["X_train"] = X_train.tolist() if X_train is not None else "None"
    saved_model["y_train"] = y_train.tolist() if y_train is not None else "None"
    json_txt = json.dumps(saved_model, indent=4)
    with open(filepath, "w") as file: 
        file.write(json_txt)

file_path = 'json_model.json'
save_json(model, file_path, X_train, y_train)

You can see how you need to define each model parameter and the data to store it in JSON. Different models have different methods for inspecting the parameter details. For example, get_params() for KNeighborsClassifier gives the list of all the hyperparameters in the model. You need to save all these hyperparameters and data values in a dictionary, which is then dumped into a file with the ‘.json’ extension.

To read this JSON file, you just need to open it and access the parameters as follows:

def load_json(filepath): 
    with open(filepath, "r") as file:
        saved_model = json.load(file)
    return saved_model

saved_model = load_json('json_model.json')

In the above code, a function load_json() is created that opens the JSON file in read mode and returns all the parameters and data as a dictionary.

JSON loaded model | Source: Author

Unfortunately, you cannot use the saved model directly with JSON; you need to read these parameters and data to retrain the model yourself.
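The retraining step can be sketched as follows. This is a minimal, self-contained illustration rather than the article's own restore code: it uses scikit-learn's bundled copy of the iris data instead of the CSV file, saves only two hyperparameters for brevity, and keeps the JSON in a string instead of a file.

```python
import json

import numpy as np
from sklearn.datasets import load_iris
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)
original = KNeighborsClassifier(n_neighbors=5).fit(X, y)

# what a save step might have written: hyperparameters plus the training data
saved_text = json.dumps({
    "n_neighbors": original.get_params()["n_neighbors"],
    "weights": original.get_params()["weights"],
    "X_train": X.tolist(),
    "y_train": y.tolist(),
})

# restore: read the dictionary back, rebuild the estimator, and refit it
saved_model = json.loads(saved_text)
rebuilt = KNeighborsClassifier(
    n_neighbors=saved_model["n_neighbors"],
    weights=saved_model["weights"],
)
rebuilt.fit(np.array(saved_model["X_train"]), np.array(saved_model["y_train"]))

same_predictions = bool((rebuilt.predict(X) == original.predict(X)).all())
print(same_predictions)
```

For k-nearest neighbors, refitting is cheap because the model is essentially its training data; for models with a costly training phase, this approach means paying that cost again on every restore.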

Pros of saving ML models with JSON

  • Models that need to be exchanged between various systems can be shared using JSON, which is a portable format that can be read by a wide variety of programming languages and platforms.
  • JSON is a text-based format that is easy to read and understand, making it a good choice for models that need to be inspected or edited by humans.
  • In comparison to pickle or Joblib, JSON is a lightweight format that can create smaller files, which can be important for models that need to be transferred over the internet.
  • Unlike pickle, which can execute code during deserialization, JSON is a data-only format, which minimizes security threats.

Cons of saving ML models with JSON

  • Because JSON only supports a small number of data types, it may not be compatible with sophisticated machine learning models that make use of special data types.
  • For large models in particular, JSON serialization and deserialization can be slower than other formats.
  • Compared to other formats, JSON offers less flexibility and may take more effort to tailor the serialization procedure.
  • JSON is a lossy format that may not preserve all the information in the original model, which can be a problem for models that require exact replication.
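The limited type support and the lossy round trip can both be seen with the standard json module alone. The dictionary below is a made-up example, not a real model: it contains a tuple and an integer key, neither of which JSON can represent directly.

```python
import json

# a tuple value and an integer key, neither of which exists in JSON
params = {"hidden_layers": (128, 64), 5: "n_neighbors"}

restored = json.loads(json.dumps(params))
print(restored)

# some types are rejected outright instead of being converted
try:
    json.dumps({"classes": {"setosa"}})
except TypeError as exc:
    error = str(exc)
print(error)
```

After the round trip the tuple has become a list and the integer key has become a string, so code that relies on the original types needs explicit conversion when restoring.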

To combine security with the benefits of JSON or pickle, you can save your model to a dedicated database, which is covered in the section on storing ML models. Next, you will see how to save deep learning models.

Saving deep learning model with TensorFlow Keras

TensorFlow is a popular framework for training DL-based models, and Keras is a high-level API for TensorFlow. Deep learning models are trained using a neural network design with numerous layers and a set of labeled data. These models have two major components, weights and network architecture, that you need to save to restore them for future use. Typically there are two ways to save deep learning models:

  1. Save the model architecture in a JSON or YAML file and the weights in an HDF5 file.
  2. Save both the weights and the architecture in a single HDF5, protobuf, or tflite file.

You can use either way, but the widely used method is to save the model weights and architecture together in an HDF5 file.

To save a deep learning model in TensorFlow Keras, you can use the save() method of the Keras Model object. This method saves the entire model, including the model architecture, optimizer, and weights, in a format that can be loaded later to make predictions.

Here’s an example code snippet that shows how to save a TensorFlow Keras-based DL model:

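from sklearn.preprocessing import LabelEncoder
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

# Keras needs numeric targets, so encode the three class names as 0, 1, 2
y_train_encoded = LabelEncoder().fit_transform(y_train)

model = Sequential()
model.add(Dense(12, input_dim=4, activation='relu'))
model.add(Dense(8, activation='relu'))
# one softmax output unit per iris class
model.add(Dense(3, activation='softmax'))

model.compile(loss='sparse_categorical_crossentropy', optimizer='adam', metrics=['accuracy'])

model.fit(X_train, y_train_encoded, epochs=150, batch_size=10, verbose=0)

# write architecture, weights and optimizer state to a single HDF5 file
model.save('model.h5')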

That’s it; you just need to define the model architecture, train the model with appropriate settings, and finally save it using the save() method.

Loading the saved models with Keras is as easy as reading a file in Python. You just need to call the load_model() method with the model file path, and your model will be loaded.

from tensorflow.keras.models import load_model

model = load_model('model.h5')


Your model is now loaded for use.

TensorFlow loaded model | Source: Author

Pros of saving models with TensorFlow Keras

  • Saving and loading models in TensorFlow Keras is very easy using the save() and load_model() functions. This makes it easy to save and share models with others or to deploy them to production.
  • The whole model architecture, optimizer, and weights are saved in a single file when you save a Keras model. Since you do not need to load the architecture and weights separately, it is simple to load the model and generate predictions.
  • TensorFlow Keras supports multiple file formats for saving models, including the HDF5 format (.h5), the TensorFlow SavedModel format (.pb), and the TensorFlow Lite format (.tflite). This gives you flexibility in choosing the format that best fits your needs.

Cons of saving models with TensorFlow Keras

  • When you save a Keras model, the resulting file can be quite large, especially if you have a large number of layers or parameters. This can make it challenging to share or deploy the model, especially in situations where bandwidth or storage space is limited.
  • Models saved with one version of TensorFlow Keras may not work with another. If you try to load a model that was saved with a different version of Keras or TensorFlow, this may cause problems.
  • Although it is simple to save a Keras model, you are limited to the features that Keras offers for storing models. A different framework or method may be required if you need more flexibility in the way models are saved or loaded.

There is one more widely used framework, PyTorch, for training DL-based models. Let’s check how you can save PyTorch-based deep learning models with Python.

Saving deep learning model with PyTorch

Developed by Facebook, PyTorch is one of the most widely used frameworks for building DL-based solutions. It provides a dynamic computational graph, which lets you modify your model on the fly, making it ideal for research and experimentation. It uses the ‘.pt’ and ‘.pth’ file extensions to save the model architecture and its weights.

To save a deep learning model in PyTorch, you can pass the torch.nn.Module object to the torch.save() function. This saves the entire model, including the model architecture and weights, in a format that can be loaded later to make predictions.

Here’s an example code snippet that shows how to save a PyTorch model:

import torch
import torch.nn as nn
import numpy as np
from sklearn.preprocessing import LabelEncoder

# CrossEntropyLoss expects integer class indices, so encode the class names first
encoder = LabelEncoder()
X_train = torch.FloatTensor(X_train)
X_test = torch.FloatTensor(X_test)
y_train = torch.LongTensor(encoder.fit_transform(y_train))
y_test = torch.LongTensor(encoder.transform(y_test))

class NeuralNetworkClassificationModel(nn.Module):
    def __init__(self,input_dim,output_dim):
        # nn.Module must be initialized before layers are assigned
        super().__init__()
        self.input_layer    = nn.Linear(input_dim,128)
        self.hidden_layer1  = nn.Linear(128,64)
        self.output_layer   = nn.Linear(64,output_dim)
        self.relu = nn.ReLU()
    def forward(self,x):
        out =  self.relu(self.input_layer(x))
        out =  self.relu(self.hidden_layer1(out))
        out =  self.output_layer(out)
        return out

input_dim  = 4 
output_dim = 3
model = NeuralNetworkClassificationModel(input_dim,output_dim)

learning_rate = 0.01
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(),lr=learning_rate)

def train_network(model,optimizer,criterion,X_train,y_train,X_test,y_test,num_epochs,train_losses,test_losses):
    for epoch in range(num_epochs):
        # clear gradients left over from the previous step
        optimizer.zero_grad()
        output_train = model(X_train)

        loss_train = criterion(output_train, y_train)
        # backpropagate and update the weights
        loss_train.backward()
        optimizer.step()

        # evaluate on the test split without tracking gradients
        with torch.no_grad():
            output_test = model(X_test)
            loss_test = criterion(output_test,y_test)

        train_losses[epoch] = loss_train.item()
        test_losses[epoch] = loss_test.item()

        if (epoch + 1) % 50 == 0:
            print(f"Epoch {epoch+1}/{num_epochs}, Train Loss: {loss_train.item():.4f}, Test Loss: {loss_test.item():.4f}")

num_epochs = 1000
train_losses = np.zeros(num_epochs)
test_losses  = np.zeros(num_epochs)
train_network(model,optimizer,criterion,X_train,y_train,X_test,y_test,num_epochs,train_losses,test_losses)

# the file path is empty in the source; supply your own '.pt' or '.pth' path here
torch.save(model, '')

Unlike TensorFlow, PyTorch lets you have more control over the model training, as seen in the above code. After training the model, you can save the weights and architecture using the torch.save() function.

Loading the saved model with PyTorch requires the use of the torch.load() function.

model = torch.load('')


PyTorch loaded model | Source: Author
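A different technique from the whole-model save shown above is to save only the model's state_dict (its learned parameters) and load it into a freshly constructed model. The sketch below assumes a small stand-in network, TinyClassifier, and a temporary file path; neither comes from the article.

```python
import os
import tempfile

import torch
import torch.nn as nn


class TinyClassifier(nn.Module):
    """Small stand-in network with 4 inputs and 3 outputs."""

    def __init__(self, input_dim, output_dim):
        super().__init__()
        self.hidden = nn.Linear(input_dim, 16)
        self.output = nn.Linear(16, output_dim)

    def forward(self, x):
        return self.output(torch.relu(self.hidden(x)))


trained = TinyClassifier(4, 3)

# save only the parameters, not the pickled model object
weights_path = os.path.join(tempfile.mkdtemp(), "tiny_classifier_weights.pt")
torch.save(trained.state_dict(), weights_path)

# restoring needs the class definition plus the saved parameters
restored = TinyClassifier(4, 3)
restored.load_state_dict(torch.load(weights_path))
restored.eval()

sample = torch.rand(2, 4)
with torch.no_grad():
    outputs_match = torch.equal(trained(sample), restored(sample))
print(outputs_match)
```

The trade-off is that the loading code must be able to rebuild the architecture itself, while the saved file contains only tensors.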

Pros of saving models with PyTorch

  • The computational graph used by PyTorch is dynamic, meaning it is built as the program runs. This allows for more flexibility in modifying the model during training or inference.
  • For dynamic models, such as those with variable-length inputs or outputs, which are common in natural language processing (NLP) and computer vision, PyTorch offers improved support.
  • Given that PyTorch is written in Python and works well with other Python libraries like NumPy and pandas, manipulating data both before and after training is easy.

Cons of saving models with PyTorch

  • Even though PyTorch provides an accessible API, there may be a steep learning curve for newcomers to deep learning or Python programming.
  • Since PyTorch is primarily a framework for research, it may not have as many tools for production deployment as other deep learning frameworks like TensorFlow or Keras.

That’s not all; you can use model registry platforms to save DL-based models as well, especially large ones. This makes it easy to deploy and maintain them without requiring extra effort from developers.

You can find the dataset and code used in this article here

How to package ML models?

An ML model is typically optimized for performance on the training dataset and the specific environment in which it is trained. But when it comes to deploying the models in different environments, such as a production environment, there can be various challenges.

These challenges include, but are not limited to, differences in hardware, software, and data inputs. Packaging the model makes it easier to deal with these problems, since it allows the model to be exported or serialized into a standard format that can be loaded and used in various environments.

There are various options available for packaging right now. By packaging the model in a standard format such as PMML (Predictive Model Markup Language), ONNX, TensorFlow SavedModel format, etc., it becomes easier to share and collaborate on a model without worrying about the different libraries and tools used by different teams. Now, let’s check a few examples of packaging an ML model with different frameworks in Python.

Note: For this section as well, you will see the same iris-classification example.

Packaging models with PMML

Using a PMML library in Python, you can export your machine learning models to PMML format and then deploy them as a web service, a batch processing system, or a data integration platform. This can make it easier to share and collaborate on machine learning models, as well as to deploy them in various production environments.

To package an ML model using PMML, you can use different modules like sklearn2pmml, jpmml-sklearn, jpmml-tensorflow, etc.

Note: To use PMML, you must have a Java Runtime installed on your system.

Here is an example code snippet that lets you package the trained iris classifier model using PMML.

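from sklearn2pmml import PMMLPipeline, sklearn2pmml

# wrap the trained classifier in a PMML pipeline, then export it to a file
pipeline = PMMLPipeline([("classifier", model)])
sklearn2pmml(pipeline, "iris_model.pmml")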
In the above code, you simply need to create a PMML pipeline object by passing your model object. Then you need to save the PMML object using the sklearn2pmml() method. That’s it; now you can use this “iris_model.pmml” file across different environments.

Pros of using PMML

  • Since PMML is a platform-independent format, PMML models can be integrated with numerous data processing platforms and used in a variety of production situations.
  • PMML can reduce vendor lock-in, as it allows users to export and import models from different machine-learning platforms.
  • PMML models can be easily deployed in production environments, as they can be integrated with various data processing platforms and systems.

Cons of using PMML

  • Some machine learning models and algorithms may not be exportable to PMML format due to limited support.
  • PMML is an XML-based format that can be verbose and inflexible, which may make it difficult to modify or update models after they have been exported to PMML format.
  • It may be difficult to create PMML models, especially for sophisticated models with several features and interactions.

Packaging models with ONNX

Developed by Microsoft and Facebook, ONNX (Open Neural Network Exchange) is an open format for representing machine learning models. It allows for interoperability between different deep-learning frameworks and tools.

ONNX models can be deployed efficiently on a variety of platforms, including mobile devices, edge devices, and the cloud. It supports a variety of runtimes, including Caffe2, TensorFlow, PyTorch, and MXNet, which lets you deploy your models on different devices and platforms with minimal effort.

To save the model using ONNX, you need to have the onnx and onnxruntime packages installed on your system, as well as onnxmltools, which the conversion code imports.

Here is an example of how you can convert the existing ML model to ONNX format.

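import onnxmltools
import onnxruntime
from skl2onnx.common.data_types import FloatTensorType

# declare the model input: a float tensor named "X" with 4 features per row
initial_types = [("X", FloatTensorType([None, 4]))]
onnx_model = onnxmltools.convert_sklearn(model, initial_types=initial_types)

onnx_file = "iris_knn.onnx"
onnxmltools.utils.save_model(onnx_model, onnx_file)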

You just need to import the required modules and use the convert_sklearn() method to convert the sklearn model to an ONNX model. Once the conversion is done, you can use the save_model() method to store the ONNX model in a file with the “.onnx” extension. Although you see an ML model in this example, ONNX is mostly used for DL models.

You can also load this model using the ONNX Runtime module.

sess = onnxruntime.InferenceSession(onnx_file)

input_data = {"X": X_test[:10].astype('float32')}
# passing None as the first argument returns all model outputs
output = sess.run(None, input_data)

You need to create a session using the InferenceSession() method to load the ONNX model from a file and then use the run() method to make predictions from the model.

Pros of using ONNX

  • With little effort, ONNX models can be deployed on numerous platforms, including mobile devices and the cloud. It is simple to deploy models on various hardware and software platforms thanks to ONNX’s support for a wide range of runtimes.
  • ONNX models are optimized for performance, which means that they can run faster and consume fewer resources than models in other formats.

Cons of using ONNX

  • ONNX is primarily designed for deep learning models and may not be suitable for other kinds of machine learning models.
  • ONNX models may not be compatible with all versions of the different deep learning frameworks, which may require extra effort to ensure compatibility.

Packaging models with TensorFlow SavedModel

TensorFlow’s SavedModel format lets you easily save and load your deep learning models, and it ensures compatibility with other TensorFlow tools and platforms. Additionally, it provides a streamlined and efficient way to deploy your models in production environments.

SavedModel supports a wide range of deployment scenarios, including serving models with TensorFlow Serving, deploying models to mobile devices with TensorFlow Lite, and exporting models to other ML libraries such as ONNX.

It provides a simple and streamlined way to save and load TensorFlow models. The API is easy to use and well-documented, and the format is designed to be efficient and scalable.

Note: You can use the same TensorFlow model trained in the above section.

To save the model in SavedModel format, you can use the following lines of code:

import tensorflow as tf

tf.saved_model.save(model, "my_model")

You can also load the model with the load() method.

loaded_model = tf.saved_model.load("my_model")

Pros of using TensorFlow SavedModel

  • SavedModel is platform-independent and version-compatible, which makes it easy to share and deploy models across different platforms and versions of TensorFlow.
  • A variety of deployment scenarios are supported by SavedModel, including exporting models to other ML libraries like ONNX, serving models with TensorFlow Serving, and distributing models to mobile devices using TensorFlow Lite.
  • SavedModel is optimized for training and inference, with support for distributed training and the ability to use GPUs and TPUs to accelerate training.

Cons of using TensorFlow SavedModel

  • SavedModel files can be large, particularly for complex models, which can make them difficult to store and transfer.
  • Given that SavedModel is specific to TensorFlow, its compatibility with other ML libraries and tools may be limited.
  • The saved model is a binary file that can be difficult to inspect, making it harder to understand the details of the model’s architecture and operation.

Now that you have seen several ways of packaging ML and DL models, you should also be aware that there are various tools available that provide infrastructure to package, deploy, and serve these models. Two of the popular ones are BentoML and MLflow.


BentoML is a flexible framework for building and deploying production-ready machine learning services. It allows data scientists to package their trained models, their dependencies, and the infrastructure code required to serve the model into a reusable package called a “Bento”.

BentoML supports various machine learning frameworks and deployment platforms and provides a unified API for managing the lifecycle of the model. Once a model is packaged as a Bento, it can be deployed to various serving platforms like AWS Lambda, Kubernetes, or Docker. BentoML also offers an API server that can be used to serve the model via a REST API. You can learn more about it here.


MLflow is an open-source platform for managing the end-to-end machine learning lifecycle. It provides a comprehensive set of tools for tracking experiments, packaging code and dependencies, and deploying models.

MLflow allows data scientists to easily package their models in a standard format that can be deployed to various platforms like AWS SageMaker, Azure ML, and Google Cloud AI Platform. The platform also provides a model registry to manage model versions and track their performance over time. Additionally, MLflow offers a REST API for serving models, which can be easily integrated into web applications or other services.

How to store ML models?

Now that we know about saving models, let’s see how we can store them to facilitate their quick and easy retrieval.

Storing ML models in a database

You can also save your ML models in relational databases like PostgreSQL, MySQL, Oracle SQL, etc. or NoSQL databases like MongoDB, Cassandra, etc. The choice of database depends on factors such as the type and volume of data being stored, the performance and scalability requirements, and the specific needs of the application.

PostgreSQL is a popular choice when working with ML models, as it provides support for storing and manipulating structured data. Storing ML models in PostgreSQL provides an easy way to keep track of different versions of a model and manage them in a centralized location.

Additionally, it allows for easy sharing of models across a team or organization. However, it’s important to note that storing large models in a database can increase database size and query times, so you should consider the storage capacity and performance of your database when storing models in PostgreSQL.

To save an ML model in a database like PostgreSQL, you first need to convert the trained model into a serialized format, such as a byte stream (pickle object) or JSON.

import pickle

model_bytes = pickle.dumps(model)
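Before writing the bytes anywhere, it can help to confirm that they deserialize back into an equivalent object and to see how large they are. The sketch below assumes a plain dictionary as a hypothetical stand-in for a trained model; any picklable object would follow the same pattern:

```python
import pickle

# Hypothetical stand-in for a trained model; a real estimator is pickled the same way.
model = {"name": "iris-classifier", "coefficients": [0.4, -1.2, 0.7]}

# Serialize to bytes; this is the value that would go into a binary column.
model_bytes = pickle.dumps(model)

# Deserialize and confirm the restored object is equivalent to the original.
restored = pickle.loads(model_bytes)
round_trip_ok = restored == model

print(type(model_bytes).__name__, len(model_bytes), round_trip_ok)
```

Keep in mind that pickle.loads can execute arbitrary code, so it should only be used on bytes that come from a source you trust.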

Then open a connection to the database and create a table or collection to store the serialized model. For this, you need the psycopg2 library, which lets you connect to a PostgreSQL database from Python. You can install it with pip, the Python package installer.

$ pip install psycopg2-binary

Then you need to establish a connection to the database to store the ML model like this:

import psycopg2

conn = psycopg2.connect(
  database="database-name", user="user-name", password='your-password', host="", port='5432'
)

To perform any operation on the database, you need to create a cursor object that helps you execute queries from your Python program.

cur = conn.cursor()

With the help of this cursor, you can now execute the CREATE TABLE query to create a new table.

cur.execute("CREATE TABLE models (id INT PRIMARY KEY NOT NULL, name CHAR(50), model BYTEA)")

Note: Make sure that the type of the model column is BYTEA.

Finally, you can store the model and other metadata using the INSERT INTO command.

cur.execute("INSERT INTO models (id, name, model) VALUES (%s, %s, %s)", (1, 'iris-classifier', model_bytes))


Once all the operations are done, commit the transaction with conn.commit() so the insert is persisted, then close the cursor and the connection to the database.

Finally, to read the model from the database, you can use the SELECT command, filtering the model either by name or by id.

import psycopg2
import pickle

conn = psycopg2.connect(
  database="database-name", user="user-name", password='your-password', host="", port='5432'
)

cur = conn.cursor()
cur.execute("SELECT model FROM models WHERE name = %s", ('iris-classifier',))
model_bytes = cur.fetchone()[0]

model = pickle.loads(model_bytes)


Once the model is loaded from the database, you can use it to make predictions as follows:

from sklearn.metrics import classification_report

y_predict = model.predict(X_test)

print(classification_report(y_test, y_predict))

That’s it, you have the model saved and loaded from the database.
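The whole round trip can also be tried without a PostgreSQL server. The sketch below swaps in Python’s built-in sqlite3 module as a stand-in, so it is not the psycopg2 code from this section: SQLite uses ? placeholders and a BLOB column where PostgreSQL uses %s and BYTEA. The dictionary standing in for a trained model is hypothetical.

```python
import pickle
import sqlite3

# Hypothetical stand-in for a trained model; any picklable object works.
model = {"name": "iris-classifier", "coefficients": [0.4, -1.2, 0.7]}
model_bytes = pickle.dumps(model)

# In-memory SQLite database used here instead of PostgreSQL.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE models (id INTEGER PRIMARY KEY, name TEXT, model BLOB)")
cur.execute(
    "INSERT INTO models (id, name, model) VALUES (?, ?, ?)",
    (1, "iris-classifier", model_bytes),
)
conn.commit()  # persist the insert before reading or closing

# Read the bytes back by name and rebuild the object.
cur.execute("SELECT model FROM models WHERE name = ?", ("iris-classifier",))
stored_bytes = cur.fetchone()[0]
restored = pickle.loads(stored_bytes)

cur.close()
conn.close()
print(restored == model)
```

The same sequence of create, insert, commit, select, and unpickle applies to PostgreSQL once the connection call, placeholders, and column type are switched back.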

Pros of storing ML models in a database

  • 1
    Storing ML models in a database provides a centralized storage location that can be easily accessed by multiple applications and users.
  • 2
    Since most organizations already have databases in place, integrating ML models into the existing infrastructure becomes easier.
  • 3
    Databases are optimized for data retrieval, which means that retrieving stored ML models can be fast and efficient.
  • 4
    Databases are designed to provide robust security features such as authentication, authorization, and encryption. This helps keep the stored ML models secure.
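The earlier point about keeping track of different versions of a model in one place can be sketched by adding a version column to the table. As before, this uses the built-in sqlite3 module as a stand-in for PostgreSQL; the helper names save_version and load_latest and the table layout are hypothetical, and the dictionaries stand in for trained models.

```python
import pickle
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE models ("
    "name TEXT NOT NULL, version INTEGER NOT NULL, model BLOB NOT NULL, "
    "PRIMARY KEY (name, version))"
)


def save_version(conn, name, obj):
    """Insert obj as the next version for name and return that version number."""
    row = conn.execute(
        "SELECT COALESCE(MAX(version), 0) FROM models WHERE name = ?", (name,)
    ).fetchone()
    version = row[0] + 1
    conn.execute(
        "INSERT INTO models (name, version, model) VALUES (?, ?, ?)",
        (name, version, pickle.dumps(obj)),
    )
    conn.commit()
    return version


def load_latest(conn, name):
    """Return (version, object) for the highest stored version of name."""
    row = conn.execute(
        "SELECT version, model FROM models WHERE name = ? "
        "ORDER BY version DESC LIMIT 1",
        (name,),
    ).fetchone()
    return row[0], pickle.loads(row[1])


# Hypothetical stand-ins for two training runs of the same model.
v1 = save_version(conn, "iris-classifier", {"n_neighbors": 3})
v2 = save_version(conn, "iris-classifier", {"n_neighbors": 5})
latest_version, latest_model = load_latest(conn, "iris-classifier")
print(v1, v2, latest_version, latest_model)
conn.close()
```

A composite primary key on name and version keeps every version addressable while preventing two rows from claiming the same version number.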

Cons of storing ML models in a database

  • 1
    Databases are designed for storing structured data and are not optimized for storing unstructured data such as ML models. As a result, there may be limitations in terms of model size, file formats, and other aspects of ML models that databases cannot accommodate.
  • 2
    Storing ML models in a database can be complex and requires expertise in both database management and machine learning.
  • 3
    If the ML models are large, storing them in a database may lead to scalability issues. Additionally, the retrieval of large models may impact the performance of the database.

While pickle, joblib, and JSON are common ways to save machine learning models, they have limitations when it comes to versioning, sharing, and managing models. This is where ML model registries come to the rescue, addressing many of the issues faced by these alternatives.

Next, you will see how saving ML models in a model registry can help you achieve reproducibility and reusability.

Storing ML models in model registry

  • A model registry is a central repository that can store, version, and manage machine learning models.
  • It typically includes features like model versioning, metadata management, comparing model runs, etc.
  • When working on any ML or DL project, you can save and retrieve the models and their metadata from the model registry anytime you want.
  • Above all, model registries enable close collaboration among team members.
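To make the idea concrete, here is a toy, file-based sketch of what a registry does at its core: store each version of a model together with its metadata and hand back a specific version on request. It is an illustration only and not the API of any real registry; the ToyRegistry class, its methods, and the accuracy values are all hypothetical.

```python
import json
import pickle
import tempfile
import time
from pathlib import Path


class ToyRegistry:
    """Toy illustration: each version is a pickle file plus a JSON metadata file."""

    def __init__(self, root):
        self.root = Path(root)

    def register(self, name, obj, metadata):
        """Save obj as the next version of name and return the version number."""
        model_dir = self.root / name
        model_dir.mkdir(parents=True, exist_ok=True)
        version = len(list(model_dir.glob("v*.pkl"))) + 1
        (model_dir / f"v{version}.pkl").write_bytes(pickle.dumps(obj))
        record = dict(metadata, version=version, registered_at=time.time())
        (model_dir / f"v{version}.json").write_text(json.dumps(record))
        return version

    def load(self, name, version):
        """Return (object, metadata) for a specific version of name."""
        model_dir = self.root / name
        obj = pickle.loads((model_dir / f"v{version}.pkl").read_bytes())
        metadata = json.loads((model_dir / f"v{version}.json").read_text())
        return obj, metadata


# Register two hypothetical versions, then fetch the most recent one.
registry = ToyRegistry(tempfile.mkdtemp())
registry.register("iris-classifier", {"n_neighbors": 3}, {"val_acc": 0.91})
latest = registry.register("iris-classifier", {"n_neighbors": 5}, {"val_acc": 0.93})
restored, info = registry.load("iris-classifier", latest)
print(latest, restored, info["val_acc"])
```

A real registry adds what this sketch leaves out, such as access control, a UI for comparing runs, and shared storage for the whole team.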

Check out this article to learn more about model registry.

There are various options for the model registry, for example, MLflow, Kubeflow, etc. Although all these platforms have some unique features of their own, it’s wise to choose a registry that provides a wide set of features.

In this example, I will use Neptune. It has model registry functionality developed for organizing, storing, and managing machine learning models. It’s a great option for data scientists and ML engineers who need to manage their trained models, because it gives them collaboration features, a user-friendly interface, and model versioning capabilities.

You can set up a free account here or learn more about the tool here.

Register a model to Neptune registry

Once you have created a free account, you can click on the New Project button to start a new project.

Create a new project in Neptune | Source: Author

Once the project is created, you will see a page with different configurations for saving the model. With Neptune, you can work with different frameworks like Scikit-Learn, Keras, TensorFlow, PyTorch, and more.

To store models in the Neptune model registry, you need to install the Neptune library.

Note: Make sure that you have dumped your trained model to a file using the pickle or joblib module so that you can store it in the model registry.

Once the dependency is installed, you need to import it into your program and initialize the Neptune model by providing it a name, a unique key (in capital letters), and your Neptune credentials. You can find all of this information in the model metadata tab of a Neptune project.

import neptune

model = neptune.init_model(
    name="Prediction model",
    # plus your model key and Neptune credentials
)

In the above code, the Neptune dependency is imported and a model (which you want to store and track with Neptune’s model registry) is initialized with the Neptune credentials. Then you need to assign the classification model metadata to the Neptune model object.

model_info = {"size_limit": 7.09, "size_units": "KB"}
model["model"] = model_info

Finally, you can upload the model file to the Neptune model registry using the upload() method.


Additionally, you can track the dataset version using the track_files() method provided by Neptune.



That’s it, your model and the dataset are now saved to the registry. Also, don’t forget to close the session with the stop() method.

Check the stored model in Neptune | Source: Author

Version a model

When you work on a real-world ML project, you tend to try a lot of models and combinations of parameters and hyperparameters. If you don’t keep track of this information, you might not know everything you have tried, which can lead to rework.

This is where the Neptune model registry helps, as you can register different versions of a model with only a few lines of code. To begin, you need to initialize a ModelVersion object as follows:

import neptune
model_version = neptune.init_model_version(
    # your model ID and Neptune credentials
)

Then you can optionally save the model and other metadata in each model version that you register in the Neptune registry.

# Fetch the estimator's parameters once; keys match the estimator's own parameter names
params = clf_model.get_params()
parameters = {
    "algorithm": params["algorithm"],
    "leaf_size": params["leaf_size"],
    "metric": params["metric"],
    "metric_params": params["metric_params"],
    "n_jobs": params["n_jobs"],
    "n_neighbors": params["n_neighbors"],
    "p": params["p"],
    "weights": params["weights"],
}

model_version["model/parameters"] = parameters
model_version["validation/acc"] = 0.93


Once done, you can stop the session with the stop() method.

Model versions view in the Neptune app | Source:

Query model and metadata from registry

Finally, you need to access this saved model and metadata when needed. You can load any specific model version that you have saved to the registry. For this, you need to initialize a ModelVersion object by providing it with the model version ID.

import neptune
import pickle

version_id = 'IR-IRMOD-1' 

model_version = neptune.init_model_version(
    with_id=version_id,
    # plus your Neptune credentials
)

Once done, you can access the different objects that you registered, such as the model, metadata, and dataset details. To begin with, let’s download the model from the registry and save it locally to test its performance on test data.

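if model_version.exists("model/binary"):
    # Download the stored binary to the local path that is opened below
    model_version["model/binary"].download(f"model/{version_id}_model.pkl")

with open(f"model/{version_id}_model.pkl", 'rb') as file:
    clf_model_2 = pickle.load(file)

y_predict = clf_model_2.predict(X_test)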

You can also check the model metadata that you have saved in Neptune.

Checking the model metadata saved in Neptune | Source: Author

You can also download this metadata locally.


That’s it, you now know how to store and load specific models from the Neptune model registry.

You can read more about the Neptune model registry here.

If you want to see how someone does all these things live in Neptune, check this short model registry demo prepared by one of Neptune’s DevRels.

You can also check out this live example project by yourself. It showcases various features of Neptune (including model registry). It’s an open project, so you can play with the Neptune app before you register.

Pros of storing models with model registry

  • 1
    A centralized location for managing, storing, and version-controlling machine learning models.
  • 2
    Metadata about models, such as their version, performance metrics, etc., is usually included in model registries, making it easier to track changes and understand the model’s history.
  • 3
    Model registries allow team members to collaborate on models and share their work easily.
  • 4
    Some model registries provide automated deployment options, which can simplify the process of deploying models to production environments.
  • 5
    Model registries often provide security features such as access control, encryption, and authentication, ensuring that models are kept secure and only accessible to authorized users.

Cons of storing models with model registry

  • 1
    A paid subscription is necessary for some model registries, which raises the cost of machine learning projects.
  • 2
    Model registries often have a learning curve, and it may take time to get up to speed with their functionality and features.
  • 3
    Using a model registry may require integrating with other tools and systems, which can create additional dependencies.

You have now seen different ways of saving an ML model (with the model registry being the most suitable one), so it’s time to check some ways to save Deep Learning (DL) based models.

Best practices

In this section, you will see some of the best practices for saving ML and DL models.

  • Ensure Library Versions: Using different library versions for saving and loading models may create compatibility issues, as a library update can introduce structural changes. Make sure that the library versions used to load a machine learning model are the same as the ones used to save it.
  • Ensure Python Versions: It’s a good practice to use the same Python version across all stages of your ML pipeline development. Sometimes changes in the Python version can create execution issues. For example, TensorFlow 1.x is supported up to Python 3.7, and if you try to use it with later versions, you will face errors.
  • Save Both Model Architecture and Weights: In the case of DL-based models, if you save only the model weights but not the architecture, you cannot reconstruct the model. Saving the model architecture along with the trained weights ensures that the model can be fully reconstructed and used later on.
  • Document the Model: The purpose, inputs, outputs, and expected performance of the model should be documented. This helps others understand the capabilities and constraints of the model.
  • Use Model Registry: Use a model registry to keep track of models, their versions, and metadata and to collaborate with team members.
  • Keep the Saved Model Secure: Keep the saved model secure by encrypting it or storing it in a secure location, especially if it contains sensitive data.
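Several of these practices can be combined by writing a small metadata file next to the saved model. The sketch below records the Python version, the installed versions of chosen packages, and a SHA-256 digest of the pickle file, then checks the digest and Python version before loading. The helper names, the ".json" sidecar naming, and the choice of scikit-learn as the tracked package are assumptions made for illustration. A checksum like this catches accidental corruption or a swapped file; it is not a substitute for encryption or access control.

```python
import hashlib
import json
import pickle
import platform
import tempfile
from importlib import metadata
from pathlib import Path


def installed_version(package):
    """Return the installed version of package, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def save_with_sidecar(obj, model_path, packages):
    """Pickle obj and write a JSON sidecar with versions and a SHA-256 digest."""
    data = pickle.dumps(obj)
    Path(model_path).write_bytes(data)
    sidecar = {
        "python_version": platform.python_version(),
        "packages": {name: installed_version(name) for name in packages},
        "sha256": hashlib.sha256(data).hexdigest(),
    }
    Path(str(model_path) + ".json").write_text(json.dumps(sidecar, indent=2))
    return sidecar


def load_checked(model_path):
    """Load the pickle only if its digest and Python version match the sidecar."""
    sidecar = json.loads(Path(str(model_path) + ".json").read_text())
    data = Path(model_path).read_bytes()
    if hashlib.sha256(data).hexdigest() != sidecar["sha256"]:
        raise ValueError("model file does not match the recorded checksum")
    if sidecar["python_version"] != platform.python_version():
        raise RuntimeError("model was saved with a different Python version")
    return pickle.loads(data)


# Hypothetical stand-in for a trained model.
path = Path(tempfile.mkdtemp()) / "model.pkl"
info = save_with_sidecar({"n_neighbors": 5}, path, packages=["scikit-learn"])
restored = load_checked(path)
print(restored, info["python_version"])
```

The same sidecar is a natural place for the documentation items mentioned above, such as the model’s purpose and expected inputs and outputs.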


In conclusion, saving machine learning models is an important step in the development process, as it allows you to reuse and share your models with others. There are several ways to save machine learning models, each with its own advantages and disadvantages. Some popular methods include using pickle, Joblib, JSON, TensorFlow save, and PyTorch save.

It is important to choose the appropriate file format for your specific use case and to follow best practices for saving and documenting models, such as version control, ensuring consistent language and library versions, and testing the saved model. By following the practices discussed in this article, you can ensure that your machine learning models are saved correctly, are easy to reuse and deploy, and can be effectively shared with others.


