Deploying machine learning (ML) models into production can often be a complex and resource-intensive task, especially for customers without deep ML and DevOps expertise. Amazon SageMaker Canvas simplifies model building by offering a no-code interface, so you can create highly accurate ML models using your existing data sources and without writing a single line of code. But building a model is only half the journey; deploying it efficiently and cost-effectively is just as crucial. Amazon SageMaker Serverless Inference is designed for workloads with variable traffic patterns and idle periods. It automatically provisions and scales infrastructure based on demand, alleviating the need to manage servers or pre-configure capacity.
In this post, we walk through how to take an ML model built in SageMaker Canvas and deploy it using SageMaker Serverless Inference. This solution can help you go from model creation to production-ready predictions quickly, efficiently, and without managing any infrastructure.
Solution overview
To demonstrate serverless endpoint creation for a SageMaker Canvas trained model, let’s explore an example workflow:
- Add the trained model to the Amazon SageMaker Model Registry.
- Create a new SageMaker model with the correct configuration.
- Create a serverless endpoint configuration.
- Deploy the serverless endpoint with the created model and endpoint configuration.
You can also automate the process, as illustrated in the following diagram.

In this example, we deploy a pre-trained classification model to a serverless SageMaker endpoint. This way, we can use our model for variable workloads that don’t require real-time inference.
Prerequisites
As a prerequisite, you must have access to Amazon Simple Storage Service (Amazon S3) and Amazon SageMaker AI. If you don’t already have a SageMaker AI domain configured in your account, you also need permissions to create a SageMaker AI domain.
You must also have a regression or classification model that you have trained. You can train your SageMaker Canvas model as you normally would. This includes creating the Amazon SageMaker Data Wrangler flow, performing necessary data transformations, and choosing the model training configuration. If you don’t already have a trained model, you can follow one of the labs in the Amazon SageMaker Canvas Immersion Day to create one before continuing. For this example, we use a classification model that was trained on the canvas-sample-shipping-logs.csv sample dataset.
Save your model to the SageMaker Model Registry
Complete the following steps to save your model to the SageMaker Model Registry:
- On the SageMaker AI console, choose Studio to launch Amazon SageMaker Studio.
- In the SageMaker Studio interface, launch SageMaker Canvas, which will open in a new tab.

- Locate the model and model version that you want to deploy to your serverless endpoint.
- On the options menu (three vertical dots), choose Add to Model Registry.

You can now exit SageMaker Canvas by logging out. To manage costs and prevent additional workspace charges, you can also configure SageMaker Canvas to automatically shut down when idle.
Approve your model for deployment
After you have added your model to the Model Registry, complete the following steps:
- In the SageMaker Studio UI, choose Models in the navigation pane.
The model you just exported from SageMaker Canvas should be listed with a deployment status of Pending manual approval.
- Choose the model version you want to deploy and update the status to Approved by selecting the deployment status.

- Choose the model version and navigate to the Deploy tab. This is where you will find the information related to the model and associated container.
- Select the container and model location related to the trained model. You can identify it by checking for the presence of the environment variable SAGEMAKER_DEFAULT_INVOCATIONS_ACCEPT.
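The approval and lookup steps above can also be scripted. The following is a minimal sketch rather than the method used in this post: it assumes you pass in a boto3 SageMaker client (for example, boto3.client("sagemaker")) and the ARN of your own model package version. The helper names approve_and_describe and pick_inference_container are hypothetical; update_model_package and describe_model_package are existing SageMaker API calls.

```python
def approve_and_describe(sm_client, model_package_arn):
    """Approve a model package version and return its container details.

    sm_client is assumed to be boto3.client("sagemaker"). The helper name
    and the returned structure are illustrative, not part of any AWS API.
    """
    # Same effect as changing the deployment status to Approved in Studio.
    sm_client.update_model_package(
        ModelPackageArn=model_package_arn,
        ModelApprovalStatus="Approved",
    )
    # Read back the image URI, model artifact location, and environment.
    package = sm_client.describe_model_package(ModelPackageName=model_package_arn)
    containers = package["InferenceSpecification"]["Containers"]
    return [
        {
            "Image": c["Image"],
            "ModelDataUrl": c.get("ModelDataUrl"),
            "Environment": c.get("Environment") or {},
        }
        for c in containers
    ]


def pick_inference_container(containers):
    """Return the first container that defines SAGEMAKER_DEFAULT_INVOCATIONS_ACCEPT."""
    for container in containers:
        if "SAGEMAKER_DEFAULT_INVOCATIONS_ACCEPT" in container["Environment"]:
            return container
    return None
```

The container returned by pick_inference_container carries the same three values you would otherwise copy by hand from the Deploy tab: the image URI, the model data location, and the environment variables.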

Create a new model
Complete the following steps to create a new model:
- Without closing the SageMaker Studio tab, open a new tab and open the SageMaker AI console.
- Choose Models in the Inference section and choose Create model.
- Name your model.
- Leave the container input option as Provide model artifacts and inference image location and use the CompressedModel type.
- Enter the Amazon Elastic Container Registry (Amazon ECR) URI, Amazon S3 URI, and environment variables that you located in the previous step.
The environment variables are shown as a single line in SageMaker Studio, in the following format:
SAGEMAKER_DEFAULT_INVOCATIONS_ACCEPT: text/csv, SAGEMAKER_INFERENCE_OUTPUT: predicted_label, SAGEMAKER_INFERENCE_SUPPORTED: predicted_label, SAGEMAKER_PROGRAM: tabular_serve.py, SAGEMAKER_SUBMIT_DIRECTORY: /opt/ml/model/code
You might have different variables than those in the preceding example. Add every variable listed for your container to your model. Make sure that each environment variable is on its own line when creating your new model.

- Choose Create model.
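If you prefer to script this step, the single-line environment string can be split into key-value pairs and passed to the CreateModel API. This is a sketch under stated assumptions: parse_env_line and create_canvas_model are hypothetical helper names, the parser assumes no value contains a comma, and sm_client is assumed to be boto3.client("sagemaker"). The image URI, S3 URI, and role ARN are values you supply from your own account.

```python
def parse_env_line(line):
    """Split a 'KEY: value, KEY: value' string into a dict.

    Assumes no value contains a comma, which holds for the variables
    shown in this post but is not guaranteed in general.
    """
    env = {}
    for pair in line.split(","):
        if not pair.strip():
            continue
        # Split on the first colon only, so values may contain colons.
        key, _, value = pair.partition(":")
        env[key.strip()] = value.strip()
    return env


def create_canvas_model(sm_client, name, image_uri, model_data_url, env, role_arn):
    """Create a SageMaker model from a single container definition."""
    return sm_client.create_model(
        ModelName=name,
        PrimaryContainer={
            "Image": image_uri,
            "ModelDataUrl": model_data_url,
            "Environment": env,
        },
        ExecutionRoleArn=role_arn,
    )
```

Passing the parsed dictionary as Environment is the scripted equivalent of entering each variable on its own line in the console.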
Create an endpoint configuration
Complete the following steps to create an endpoint configuration:
- On the SageMaker AI console, choose Endpoint configurations to create a new model endpoint configuration.
- Set the type of endpoint to Serverless and set the model variant to the model created in the previous step.

- Choose Create endpoint configuration.
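The same configuration can be expressed through the CreateEndpointConfig API. The sketch below is illustrative only: create_serverless_config is a hypothetical helper, sm_client is assumed to be boto3.client("sagemaker"), and the defaults of 1024 MB and a concurrency of 20 simply mirror the defaults of the CloudFormation template later in this post.

```python
# Memory sizes accepted for serverless endpoints, in MB.
VALID_MEMORY_SIZES_MB = (1024, 2048, 3072, 4096, 5120, 6144)


def create_serverless_config(sm_client, config_name, model_name,
                             memory_mb=1024, max_concurrency=20):
    """Create an endpoint configuration with a single serverless variant."""
    if memory_mb not in VALID_MEMORY_SIZES_MB:
        raise ValueError(f"memory_mb must be one of {VALID_MEMORY_SIZES_MB}")
    if not 1 <= max_concurrency <= 200:
        raise ValueError("max_concurrency must be between 1 and 200")
    return sm_client.create_endpoint_config(
        EndpointConfigName=config_name,
        ProductionVariants=[{
            "VariantName": "AllTraffic",
            "ModelName": model_name,
            # ServerlessConfig replaces the instance type and count
            # that a provisioned variant would specify.
            "ServerlessConfig": {
                "MemorySizeInMB": memory_mb,
                "MaxConcurrency": max_concurrency,
            },
        }],
    )
```

Validating the two values before the call surfaces a bad setting immediately instead of as an API error.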
Create an endpoint
Complete the following steps to create an endpoint:
- On the SageMaker AI console, choose Endpoints in the navigation pane and create a new endpoint.
- Name the endpoint.
- Select the endpoint configuration created in the previous step and choose Select endpoint configuration.
- Choose Create endpoint.

The endpoint might take a few minutes to be created. When the status is updated to InService, you can begin calling the endpoint.
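If you create the endpoint from code instead of the console, you can poll its status until it reaches InService. The following sketch assumes sm_client is boto3.client("sagemaker"); create_and_wait is a hypothetical helper, and the polling interval and timeout are arbitrary choices.

```python
import time


def create_and_wait(sm_client, endpoint_name, config_name,
                    poll_seconds=15, timeout_seconds=900, sleep=time.sleep):
    """Create an endpoint and block until it is InService.

    Raises RuntimeError if creation fails and TimeoutError if the
    endpoint is still not ready after timeout_seconds.
    """
    sm_client.create_endpoint(
        EndpointName=endpoint_name,
        EndpointConfigName=config_name,
    )
    waited = 0
    while True:
        status = sm_client.describe_endpoint(EndpointName=endpoint_name)["EndpointStatus"]
        if status == "InService":
            return status
        if status == "Failed":
            raise RuntimeError(f"Endpoint {endpoint_name} failed to deploy")
        if waited >= timeout_seconds:
            raise TimeoutError(f"Endpoint {endpoint_name} is still {status}")
        sleep(poll_seconds)
        waited += poll_seconds
```

boto3 also ships a built-in waiter for this, sm_client.get_waiter("endpoint_in_service"), which you might prefer over a hand-written loop. The explicit version is shown here only to make the status transitions visible.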
The following sample code demonstrates how you can call an endpoint from a Jupyter notebook located in your SageMaker Studio environment:
import ast
import boto3
import csv
from io import StringIO
import time

def invoke_shipping_prediction(features):
    sagemaker_client = boto3.client('sagemaker-runtime')
    # Convert to CSV string format
    output = StringIO()
    csv.writer(output).writerow(features)
    payload = output.getvalue()
    response = sagemaker_client.invoke_endpoint(
        EndpointName="canvas-shipping-data-model-1-serverless-endpoint",
        ContentType="text/csv",
        Accept="text/csv",
        Body=payload
    )
    response_body = response['Body'].read().decode()
    reader = csv.reader(StringIO(response_body))
    result = list(reader)[0]  # Get first row
    # Parse the response into a more usable format.
    # ast.literal_eval parses the list-valued columns without executing code.
    prediction = {
        'predicted_label': result[0],
        'confidence': float(result[1]),
        'class_probabilities': ast.literal_eval(result[2]),
        'possible_labels': ast.literal_eval(result[3])
    }
    return prediction

# Features for inference
features_set_1 = [
    "Bell",
    "Base",
    14,
    6,
    11,
    11,
    "GlobalFreight",
    "Bulk Order",
    "Atlanta",
    "2020-09-11 00:00:00",
    "Express",
    109.25199890136719
]
features_set_2 = [
    "Bell",
    "Base",
    14,
    6,
    15,
    15,
    "MicroCarrier",
    "Single Order",
    "Seattle",
    "2021-06-22 00:00:00",
    "Standard",
    155.0483856201172
]

# Invoke the SageMaker endpoint for feature set 1
start_time = time.time()
result = invoke_shipping_prediction(features_set_1)
# Print output and timing
end_time = time.time()
total_time = end_time - start_time
print(f"Total response time with endpoint cold start: {total_time:.3f} seconds")
print(f"Prediction for feature set 1: {result['predicted_label']}")
print(f"Confidence for feature set 1: {result['confidence']*100:.2f}%")
print("\nProbabilities for feature set 1:")
for label, prob in zip(result['possible_labels'], result['class_probabilities']):
    print(f"{label}: {prob*100:.2f}%")
print("---------------------------------------------------------")

# Invoke the SageMaker endpoint for feature set 2
start_time = time.time()
result = invoke_shipping_prediction(features_set_2)
# Print output and timing
end_time = time.time()
total_time = end_time - start_time
print(f"Total response time with warm endpoint: {total_time:.3f} seconds")
print(f"Prediction for feature set 2: {result['predicted_label']}")
print(f"Confidence for feature set 2: {result['confidence']*100:.2f}%")
print("\nProbabilities for feature set 2:")
for label, prob in zip(result['possible_labels'], result['class_probabilities']):
    print(f"{label}: {prob*100:.2f}%")
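The parsing step in the sample can be exercised offline, without calling the endpoint. The sketch below assumes the response is one CSV row with four columns (predicted label, confidence, class probabilities, possible labels), as the sample code implies. The sample row, including its labels and probabilities, is made up for illustration and is not real endpoint output, and parse_prediction is a hypothetical helper.

```python
import ast
import csv
from io import StringIO


def parse_prediction(response_body):
    """Turn one CSV response row into a dictionary."""
    row = next(csv.reader(StringIO(response_body)))
    return {
        "predicted_label": row[0],
        "confidence": float(row[1]),
        # The list-valued columns arrive as text; literal_eval parses
        # them as Python literals without executing arbitrary code.
        "class_probabilities": ast.literal_eval(row[2]),
        "possible_labels": ast.literal_eval(row[3]),
    }


# Hypothetical response body; labels and numbers are invented.
sample_body = "Late,0.82,\"[0.82, 0.18]\",\"['Late', 'On Time']\"\n"
prediction = parse_prediction(sample_body)
for label, prob in zip(prediction["possible_labels"], prediction["class_probabilities"]):
    print(f"{label}: {prob * 100:.2f}%")
```

Keeping the parser separate from the network call makes it straightforward to check against a captured response before wiring it to a live endpoint.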
Automate the process
To automatically create serverless endpoints each time a new model is approved, you can use the following YAML file with AWS CloudFormation. This file automates the creation of SageMaker endpoints with the configuration you specify.
This sample CloudFormation template is provided solely for illustrative purposes and is not intended for direct production use. Developers should thoroughly test this template according to their organization’s security guidelines before deployment.
AWSTemplateFormatVersion: "2010-09-09"
Description: Template for creating Lambda function to handle SageMaker model
  package state changes and create serverless endpoints
Parameters:
  MemorySizeInMB:
    Type: Number
    Default: 1024
    Description: Memory size in MB for the serverless endpoint (between 1024 and 6144)
    MinValue: 1024
    MaxValue: 6144
  MaxConcurrency:
    Type: Number
    Default: 20
    Description: Maximum number of concurrent invocations for the serverless endpoint
    MinValue: 1
    MaxValue: 200
  AllowedRegion:
    Type: String
    Default: "us-east-1"
    Description: AWS region where SageMaker resources will be created
  AllowedDomainId:
    Type: String
    Description: SageMaker Studio domain ID that can trigger deployments
    NoEcho: true
  AllowedDomainIdParameterName:
    Type: String
    Default: "/sagemaker/serverless-deployment/allowed-domain-id"
    Description: SSM Parameter name containing the SageMaker Studio domain ID that can trigger deployments
Resources:
  AllowedDomainIdParameter:
    Type: AWS::SSM::Parameter
    Properties:
      Name: !Ref AllowedDomainIdParameterName
      Type: String
      Value: !Ref AllowedDomainId
      Description: SageMaker Studio domain ID that can trigger deployments
  SageMakerAccessPolicy:
    Type: AWS::IAM::ManagedPolicy
    Properties:
      Description: Managed policy for SageMaker serverless endpoint creation
      PolicyDocument:
        Version: "2012-10-17"
        Statement:
          - Effect: Allow
            Action:
              - sagemaker:CreateModel
              - sagemaker:CreateEndpointConfig
              - sagemaker:CreateEndpoint
              - sagemaker:DescribeModel
              - sagemaker:DescribeEndpointConfig
              - sagemaker:DescribeEndpoint
              - sagemaker:DeleteModel
              - sagemaker:DeleteEndpointConfig
              - sagemaker:DeleteEndpoint
            Resource: !Sub "arn:aws:sagemaker:${AllowedRegion}:${AWS::AccountId}:*"
          - Effect: Allow
            Action:
              - sagemaker:DescribeModelPackage
            Resource: !Sub "arn:aws:sagemaker:${AllowedRegion}:${AWS::AccountId}:model-package/*/*"
          - Effect: Allow
            Action:
              - iam:PassRole
            Resource: !Sub "arn:aws:iam::${AWS::AccountId}:role/service-role/AmazonSageMaker-ExecutionRole-*"
            Condition:
              StringEquals:
                "iam:PassedToService": "sagemaker.amazonaws.com"
          - Effect: Allow
            Action:
              - ssm:GetParameter
            Resource: !Sub "arn:aws:ssm:${AllowedRegion}:${AWS::AccountId}:parameter${AllowedDomainIdParameterName}"
  LambdaExecutionRole:
    Type: AWS::IAM::Role
    Properties:
      AssumeRolePolicyDocument:
        Version: "2012-10-17"
        Statement:
          - Effect: Allow
            Principal:
              Service: lambda.amazonaws.com
            Action: sts:AssumeRole
      ManagedPolicyArns:
        - arn:aws:iam::aws:policy/service-role/AWSLambdaBasicExecutionRole
        - !Ref SageMakerAccessPolicy
  ModelDeploymentFunction:
    Type: AWS::Lambda::Function
    Properties:
      Handler: index.handler
      Role: !GetAtt LambdaExecutionRole.Arn
      Code:
        ZipFile: |
          import os
          import json
          import boto3

          sagemaker_client = boto3.client('sagemaker')
          ssm_client = boto3.client('ssm')

          def handler(event, context):
              print(f"Received event: {json.dumps(event, indent=2)}")
              try:
                  # Get details directly from the event
                  detail = event['detail']
                  print(f'detail: {detail}')
                  # Get allowed domain ID from SSM Parameter Store
                  parameter_name = os.environ.get('ALLOWED_DOMAIN_ID_PARAMETER_NAME')
                  try:
                      response = ssm_client.get_parameter(Name=parameter_name)
                      allowed_domain = response['Parameter']['Value']
                  except Exception as e:
                      print(f"Error retrieving parameter {parameter_name}: {str(e)}")
                      allowed_domain = '*'  # Default fallback
                  # Check if domain ID is allowed
                  if allowed_domain != '*':
                      created_by_domain = detail.get('CreatedBy', {}).get('DomainId')
                      if created_by_domain != allowed_domain:
                          print(f"Domain {created_by_domain} not allowed. Allowed: {allowed_domain}")
                          return {'statusCode': 403, 'body': 'Domain not authorized'}
                  # Get the model package ARN from the event resources
                  model_package_arn = event['resources'][0]
                  # Get the model package details from SageMaker
                  model_package_response = sagemaker_client.describe_model_package(
                      ModelPackageName=model_package_arn
                  )
                  # Parse model name and version from ModelPackageName
                  model_name, version = detail['ModelPackageName'].split('/')
                  serverless_model_name = f"{model_name}-{version}-serverless"
                  # Get all container details directly from the event
                  container_defs = detail['InferenceSpecification']['Containers']
                  # Get the execution role from the event and convert to proper IAM role ARN format
                  assumed_role_arn = detail['CreatedBy']['IamIdentity']['Arn']
                  execution_role_arn = (
                      assumed_role_arn.replace(':sts:', ':iam:')
                      .replace('assumed-role', 'role/service-role')
                      .rsplit('/', 1)[0]
                  )
                  # Prepare containers configuration for the model
                  containers = []
                  for i, container_def in enumerate(container_defs):
                      # Get environment variables from the model package for this container
                      environment_vars = model_package_response['InferenceSpecification']['Containers'][i].get('Environment', {}) or {}
                      containers.append({
                          'Image': container_def['Image'],
                          'ModelDataUrl': container_def['ModelDataUrl'],
                          'Environment': environment_vars
                      })
                  # Create model with all containers
                  if len(containers) == 1:
                      # Use PrimaryContainer if there's only one container
                      create_model_response = sagemaker_client.create_model(
                          ModelName=serverless_model_name,
                          PrimaryContainer=containers[0],
                          ExecutionRoleArn=execution_role_arn
                      )
                  else:
                      # Use Containers parameter for multiple containers
                      create_model_response = sagemaker_client.create_model(
                          ModelName=serverless_model_name,
                          Containers=containers,
                          ExecutionRoleArn=execution_role_arn
                      )
                  # Create endpoint config
                  endpoint_config_name = f"{serverless_model_name}-config"
                  create_endpoint_config_response = sagemaker_client.create_endpoint_config(
                      EndpointConfigName=endpoint_config_name,
                      ProductionVariants=[{
                          'VariantName': 'AllTraffic',
                          'ModelName': serverless_model_name,
                          'ServerlessConfig': {
                              'MemorySizeInMB': int(os.environ.get('MEMORY_SIZE_IN_MB')),
                              'MaxConcurrency': int(os.environ.get('MAX_CONCURRENT_INVOCATIONS'))
                          }
                      }]
                  )
                  # Create endpoint
                  endpoint_name = f"{serverless_model_name}-endpoint"
                  create_endpoint_response = sagemaker_client.create_endpoint(
                      EndpointName=endpoint_name,
                      EndpointConfigName=endpoint_config_name
                  )
                  return {
                      'statusCode': 200,
                      'body': json.dumps({
                          'message': 'Serverless endpoint deployment initiated',
                          'endpointName': endpoint_name
                      })
                  }
              except Exception as e:
                  print(f"Error: {str(e)}")
                  raise
      Runtime: python3.12
      Timeout: 300
      MemorySize: 128
      Environment:
        Variables:
          MEMORY_SIZE_IN_MB: !Ref MemorySizeInMB
          MAX_CONCURRENT_INVOCATIONS: !Ref MaxConcurrency
          ALLOWED_DOMAIN_ID_PARAMETER_NAME: !Ref AllowedDomainIdParameterName
  EventRule:
    Type: AWS::Events::Rule
    Properties:
      Description: Rule to trigger Lambda when SageMaker Model Package state changes
      EventPattern:
        source:
          - aws.sagemaker
        detail-type:
          - SageMaker Model Package State Change
        detail:
          ModelApprovalStatus:
            - Approved
          UpdatedModelPackageFields:
            - ModelApprovalStatus
      State: ENABLED
      Targets:
        - Arn: !GetAtt ModelDeploymentFunction.Arn
          Id: ModelDeploymentFunction
  LambdaInvokePermission:
    Type: AWS::Lambda::Permission
    Properties:
      FunctionName: !Ref ModelDeploymentFunction
      Action: lambda:InvokeFunction
      Principal: events.amazonaws.com
      SourceArn: !GetAtt EventRule.Arn
Outputs:
  LambdaFunctionArn:
    Description: ARN of the Lambda function
    Value: !GetAtt ModelDeploymentFunction.Arn
  EventRuleArn:
    Description: ARN of the EventBridge rule
    Value: !GetAtt EventRule.Arn
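The Lambda function in the template derives the model, endpoint configuration, and endpoint names, plus the execution role ARN, from fields of the approval event. The sketch below isolates that string handling so you can see what a given event would produce. derive_deployment_names is a hypothetical helper, and the sample detail (account ID, role name, session name) is invented for illustration; only the transformation itself follows the template.

```python
def derive_deployment_names(detail):
    """Reproduce the naming and role-ARN logic of the template's handler."""
    # ModelPackageName has the form "<group name>/<version>".
    model_name, version = detail["ModelPackageName"].split("/")
    serverless_model_name = f"{model_name}-{version}-serverless"

    # Convert the STS assumed-role ARN of the approver into the ARN of
    # the underlying IAM role by swapping the service, rewriting the
    # resource type, and dropping the trailing session name.
    assumed_role_arn = detail["CreatedBy"]["IamIdentity"]["Arn"]
    execution_role_arn = (
        assumed_role_arn.replace(":sts:", ":iam:")
        .replace("assumed-role", "role/service-role")
        .rsplit("/", 1)[0]
    )
    return {
        "model": serverless_model_name,
        "endpoint_config": f"{serverless_model_name}-config",
        "endpoint": f"{serverless_model_name}-endpoint",
        "execution_role_arn": execution_role_arn,
    }


# Hypothetical event detail; all identifiers are placeholders.
sample_detail = {
    "ModelPackageName": "canvas-shipping-data-model/1",
    "CreatedBy": {
        "IamIdentity": {
            "Arn": "arn:aws:sts::111122223333:assumed-role/"
                   "AmazonSageMaker-ExecutionRole-20240101T000000/SageMaker"
        }
    },
}
names = derive_deployment_names(sample_detail)
print(names["endpoint"])
print(names["execution_role_arn"])
```

Note that the conversion assumes the approving role lives under the service-role path, which matches the iam:PassRole resource in the template's policy. A role created under a different path would produce an ARN that the policy does not cover.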
This stack limits automated serverless endpoint creation to a specific AWS Region and domain. You can find your domain ID when accessing SageMaker Studio from the SageMaker AI console, or by running the following command: aws sagemaker list-domains --region [your-region]
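The domain lookup and stack creation can also be done from Python. This is a sketch under stated assumptions: sm_client and cfn_client are assumed to be boto3.client("sagemaker") and boto3.client("cloudformation"), the three helper names are hypothetical, and the stack name and template text are yours to supply. The parameter keys match the template's Parameters section, and the lookup reads only the first page of list_domains results.

```python
def find_domain_ids(sm_client):
    """Map SageMaker domain names to domain IDs (first page of results only)."""
    response = sm_client.list_domains()
    return {d["DomainName"]: d["DomainId"] for d in response["Domains"]}


def stack_parameters(domain_id, region="us-east-1", memory_mb=1024, max_concurrency=20):
    """Build the Parameters list for the template; values are passed as strings."""
    values = {
        "AllowedDomainId": domain_id,
        "AllowedRegion": region,
        "MemorySizeInMB": str(memory_mb),
        "MaxConcurrency": str(max_concurrency),
    }
    return [{"ParameterKey": k, "ParameterValue": v} for k, v in values.items()]


def deploy_stack(cfn_client, stack_name, template_body, parameters):
    """Create the stack; CAPABILITY_IAM is needed because it creates IAM resources."""
    return cfn_client.create_stack(
        StackName=stack_name,
        TemplateBody=template_body,
        Parameters=parameters,
        Capabilities=["CAPABILITY_IAM"],
    )
```

AllowedDomainIdParameterName is left out of stack_parameters so that the template's default SSM parameter name applies.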
Clean up
To manage costs and prevent additional workspace charges, make sure that you have logged out of SageMaker Canvas. If you tested your endpoint using a Jupyter notebook, you can shut down your JupyterLab instance by choosing Stop or by configuring automated shutdown for JupyterLab.
Conclusion

In this post, we showed how to deploy a SageMaker Canvas model to a serverless endpoint using SageMaker Serverless Inference. By using this serverless approach, you can quickly and efficiently serve predictions from your SageMaker Canvas models without needing to manage the underlying infrastructure.
This seamless deployment experience is just one example of how AWS services like SageMaker Canvas and SageMaker Serverless Inference simplify the ML journey, helping businesses of different sizes and technical proficiencies unlock the value of AI and ML. As you continue exploring the SageMaker ecosystem, be sure to check out how you can unlock data governance for no-code ML with Amazon DataZone, and seamlessly transition between no-code and code-first model development using SageMaker Canvas and SageMaker Studio.
About the authors
Nadhya Polanco is a Solutions Architect at AWS based in Brussels, Belgium. In this role, she helps organizations looking to incorporate AI and Machine Learning into their workloads. In her free time, Nadhya enjoys indulging in her passion for coffee and traveling.
Brajendra Singh is a Principal Solutions Architect at Amazon Web Services, where he partners with enterprise customers to design and implement innovative solutions. With a strong background in software development, he brings deep expertise in Data Analytics, Machine Learning, and Generative AI.