Use Amazon Bedrock tooling with Amazon SageMaker JumpStart models

Today, we’re excited to announce a new capability that lets you deploy over 100 open-weight and proprietary models from Amazon SageMaker JumpStart and register them with Amazon Bedrock, so you can access them through the Amazon Bedrock APIs. You can now use Amazon Bedrock features such as Amazon Bedrock Knowledge Bases and Amazon Bedrock Guardrails with models deployed through SageMaker JumpStart.

SageMaker JumpStart helps you get started with machine learning (ML) by providing fully customizable solutions and one-click deployment and fine-tuning of more than 400 popular open-weight and proprietary generative AI models. Amazon Bedrock is a fully managed service that provides a single API to access and use various high-performing foundation models (FMs). It also offers a broad set of capabilities to build generative AI applications. The Amazon Bedrock Converse API is a runtime API that provides a consistent interface that works with different models. It lets you use advanced features in Amazon Bedrock such as the playground, guardrails, and tool use (function calling).

SageMaker JumpStart has long been the go-to service for developers and data scientists seeking to deploy state-of-the-art generative AI models. Through this integration, you can now combine the flexibility of hosting models on SageMaker JumpStart with the fully managed experience of Amazon Bedrock, including advanced security controls, scalable infrastructure, and comprehensive monitoring capabilities.

In this post, we show you how to deploy FMs through SageMaker JumpStart, register them with Amazon Bedrock, and invoke them using Amazon Bedrock APIs.

Solution overview

The Converse API standardizes interaction with Amazon Bedrock FMs, enabling developers to write code one time and use it across various models without needing to adjust for model-specific differences. It supports multi-turn conversations through conversational history as part of the API request, and developers can perform tasks that require access to external APIs through the use of tools (function calling). Additionally, the Converse API lets you block inappropriate inputs or generated content by including a guardrail in your API calls. To review the complete list of supported models and model features, refer to Supported models and model features.
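As a minimal sketch of how conversational history travels in a Converse request, the helpers below build the messages list with alternating user and assistant turns. The helper names (make_turn, build_history) and the sample text are illustrative assumptions, not part of any AWS SDK; only the message shape (a role plus a list of content blocks with a text key) follows the Converse API.

```python
def make_turn(role, text):
    """Return one Converse-style message: a role plus a list of content blocks."""
    if role not in ("user", "assistant"):
        raise ValueError(f"unsupported role: {role}")
    return {"role": role, "content": [{"text": text}]}


def build_history(turns):
    """Convert (role, text) pairs into the messages list for a Converse request."""
    return [make_turn(role, text) for role, text in turns]


# Hypothetical conversation used only for illustration.
messages = build_history([
    ("user", "Summarize our launch plan in one sentence."),
    ("assistant", "The launch focuses on simplifying user sign-up."),
    ("user", "Which team owns the landing page?"),
])
# The full list would be sent as messages=... on each call, so the model
# sees the earlier turns as context for the latest question.
```

Because the history is resent on every call, the caller decides how much of it to keep, for example by trimming older turns to stay within the model's context window.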

This new feature extends the capabilities of the Converse API into a single interface that developers can use to call FMs deployed in SageMaker JumpStart. This allows developers to use the same API to invoke models from Amazon Bedrock and SageMaker JumpStart, streamlining the process of integrating models into their generative AI applications. Now you can build on top of an even larger library of world-class open source and proprietary models through a single API. To view the full list of Bedrock Ready models available from SageMaker JumpStart, refer to the Bedrock Marketplace documentation. You can also use Amazon Bedrock Marketplace to discover and deploy these models to SageMaker endpoints.

In this post, we walk through the following steps:

Deploy the Gemma 2 9B Instruct model using SageMaker JumpStart.
Register the model with Amazon Bedrock.
Test the model with sample prompts using the Amazon Bedrock playground.
Use the Amazon Bedrock RetrieveAndGenerate API to query the Amazon Bedrock knowledge base.
Set up Amazon Bedrock Guardrails to help block harmful content and personally identifiable information (PII) data.
Invoke models with Converse APIs to show an end-to-end Retrieval Augmented Generation (RAG) pipeline.

Prerequisites

You can access and deploy pretrained models from SageMaker JumpStart in the Amazon SageMaker Studio UI. To access SageMaker Studio on the AWS Management Console, you need to set up an Amazon SageMaker domain. SageMaker uses domains to organize user profiles, applications, and their associated resources. To create a domain and set up a user profile, refer to Guide to getting set up with Amazon SageMaker.

You also need an AWS Identity and Access Management (IAM) role with appropriate permissions. To get started with this example, you can use the AmazonSageMakerFullAccess, AmazonBedrockFullAccess, and AmazonOpenSearchAccess managed policies to provide the required permissions to SageMaker JumpStart and Amazon Bedrock. For a more scoped-down set of permissions, refer to the following:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "BedrockEndpointManagementMutatingOperations",
      "Action": [
        "sagemaker:AddTags",
        "sagemaker:CreateEndpoint",
        "sagemaker:CreateEndpointConfig",
        "sagemaker:CreateModel",
        "sagemaker:DeleteEndpoint",
        "sagemaker:UpdateEndpoint",
        "sagemaker:DeleteTags"
      ],
      "Effect": "Allow",
      "Resource": "arn:aws:sagemaker:*",
      "Condition": {
        "StringEquals": {
          "aws:ViaAWSService": "bedrock.amazonaws.com"
        }
      }
    },
    {
      "Sid": "BedrockEndpointManagementNonMutatingOperations",
      "Action": [
        "sagemaker:DescribeEndpoint",
        "sagemaker:DescribeEndpointConfig",
        "sagemaker:DescribeModel",
        "sagemaker:ListEndpoints",
        "sagemaker:ListTags"
      ],
      "Effect": "Allow",
      "Resource": "arn:aws:sagemaker:*",
      "Condition": {
        "StringEquals": {
          "aws:ViaAWSService": "bedrock.amazonaws.com"
        }
      }
    },
    {
      "Sid": "BedrockEndpointInvokingOperations",
      "Action": [
        "sagemaker:InvokeEndpoint",
        "sagemaker:InvokeEndpointWithResponseStream"
      ],
      "Effect": "Allow",
      "Resource": "arn:aws:sagemaker:*",
      "Condition": {
        "StringEquals": {
          "aws:ViaAWSService": "bedrock.amazonaws.com"
        }
      }
    },
    {
      "Sid": "AllowDiscoveringPublicModelDetails",
      "Action": [
        "sagemaker:DescribeHubContent"
      ],
      "Effect": "Allow",
      "Resource": "arn:aws:sagemaker:*:aws:hub-content/SageMakerPublicHub/Model/*"
    },
    {
      "Sid": "AllowListingPublicModels",
      "Action": [
        "sagemaker:ListHubContents"
      ],
      "Effect": "Allow",
      "Resource": "arn:aws:sagemaker:*:aws:hub/SageMakerPublicHub"
    },
    {
      "Sid": "RetrieveSubscribedMarketplaceLicenses",
      "Action": [
        "license-manager:ListReceivedLicenses"
      ],
      "Effect": "Allow",
      "Resource": "arn:aws:license-manager:*"
    },
    {
      "Sid": "PassRoleToSagemaker",
      "Effect": "Allow",
      "Action": [
        "iam:PassRole"
      ],
      "Resource": "arn:aws:iam::*:role/*AmazonSageMaker*",
      "Condition": {
        "StringEquals": {
          "iam:PassedToService": [
            "sagemaker.amazonaws.com"
          ]
        }
      }
    },
    {
      "Sid": "BedrockAll",
      "Effect": "Allow",
      "Action": [ "bedrock:*" ],
      "Resource": "*"
    },
    {
      "Sid": "AmazonOpenSearchAccess",
      "Effect": "Allow",
      "Action": [ "aoss:*" ],
      "Resource": "*",
      "Condition": {
        "StringEquals": {
          "aws:ResourceAccount": "${aws:PrincipalAccount}"
        }
      }
    }
  ]
}
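Before attaching a hand-edited policy like this one, it can help to check it locally. The following is a minimal sketch, assuming a simplified rule that every statement carries Effect, Action, and Resource (real IAM policies may also use NotAction or NotResource, which this check does not handle). The function name check_policy is hypothetical.

```python
import json

# Keys this simplified check expects in every statement (an assumption;
# IAM itself also accepts NotAction and NotResource).
REQUIRED_KEYS = {"Effect", "Action", "Resource"}


def check_policy(policy_json):
    """Parse an IAM policy string and return a list of problems found."""
    try:
        policy = json.loads(policy_json)
    except json.JSONDecodeError as err:
        # Strict JSON rejects trailing commas, a common copy-and-paste slip.
        return [f"invalid JSON: {err.msg} at line {err.lineno}"]

    problems = []
    if policy.get("Version") != "2012-10-17":
        problems.append("Version should be 2012-10-17")
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):
        # IAM allows a single statement object instead of a list.
        statements = [statements]
    for index, statement in enumerate(statements):
        missing = REQUIRED_KEYS - statement.keys()
        if missing:
            label = statement.get("Sid", f"statement {index}")
            problems.append(f"{label}: missing {sorted(missing)}")
    return problems
```

An empty list means the document parsed and passed these basic checks; it does not prove that the permissions are sufficient for the walkthrough.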

After applying the relevant permissions, setting up a SageMaker domain, and creating user profiles, you are ready to deploy your SageMaker JumpStart model and register it with Amazon Bedrock.

Deploy a model with SageMaker JumpStart and register it with Amazon Bedrock

This section provides a walkthrough of deploying a model using SageMaker JumpStart and registering it with Amazon Bedrock. In this walkthrough, you will deploy and register the Gemma 2 9B Instruct model offered through Hugging Face in SageMaker JumpStart. Complete the following steps:

On the SageMaker console, choose Studio in the navigation pane.
Choose the relevant user profile on the dropdown menu and choose Open Studio.

In SageMaker Studio, choose JumpStart in the navigation pane.

Here, you will see a list of the available SageMaker JumpStart models. Models that can be registered to Amazon Bedrock after they’ve been deployed through SageMaker JumpStart have a Bedrock ready tag.

The Gemma 2 9B Instruct model for this example is provided by Hugging Face, so choose the Hugging Face model card.

To filter the list of models to view which models are supported by Amazon Bedrock, select Bedrock Ready under Action.
Search for Gemma 2 9B Instruct and choose the model card for Gemma 2 9B Instruct.

You can review the model card for Gemma 2 9B Instruct to learn more about the model.

To deploy the model, choose Deploy.
Review the End User License Agreement for Gemma 2 9B Instruct and select I accept the End User License Agreement (EULA) and read the terms and conditions.
Leave the endpoint settings with their default values and choose Deploy.

The endpoint deployment process will take a few minutes.

Under Deployments in the navigation pane, choose Endpoints to view your available endpoints.

After a few minutes, the model will be deployed to the endpoint and its status will change to In service, indicating that the endpoint is ready to serve traffic. You can use the Refresh icon at the bottom of the endpoint screen to get the latest information.

When your endpoint is in service, choose it to go to the endpoint details page.

Choose Use with Bedrock to start the registration process.

You will be redirected to the Amazon Bedrock console.

On the Register endpoint page, the SageMaker endpoint Amazon Resource Name (ARN) and model ARN have already been prepopulated. Review these values and choose Register.

Your SageMaker endpoint will be registered with Amazon Bedrock in a few minutes.
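The console steps above wait for the endpoint to reach In service before registration. If you prefer to check this from code, the following is a rough sketch that polls the SageMaker DescribeEndpoint API. The function name, polling interval, and endpoint name are assumptions for illustration; Boto3 also provides a built-in endpoint_in_service waiter that may be a better fit.

```python
import time


def wait_until_in_service(sagemaker_client, endpoint_name, poll_seconds=30, max_polls=40):
    """Poll DescribeEndpoint until the endpoint is InService or has failed."""
    for _ in range(max_polls):
        description = sagemaker_client.describe_endpoint(EndpointName=endpoint_name)
        status = description["EndpointStatus"]
        if status == "InService":
            return status
        if status == "Failed":
            raise RuntimeError(f"endpoint {endpoint_name} failed to deploy")
        # Any other status is treated as still in progress.
        time.sleep(poll_seconds)
    raise TimeoutError(f"endpoint {endpoint_name} not in service after {max_polls} polls")


# Usage (requires AWS credentials; the endpoint name is a placeholder):
# import boto3
# wait_until_in_service(boto3.client("sagemaker"), "my-gemma-endpoint")
```

Passing the client in as an argument keeps the function easy to exercise with a stub, without calling AWS.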

After your SageMaker endpoint is registered with Amazon Bedrock, you can invoke it using the Converse API. Then you can test your endpoint in the Amazon Bedrock playground.

In the navigation pane on the Amazon Bedrock console, choose Marketplace deployments under Foundation models.
From the list of managed deployments, select your registered model, then choose Open in playground.

You will now be in the Amazon Bedrock playground for Chat/text. The Chat/text playground lets you test your model with a single prompt, or provides chat capability for conversational use cases. Because this example will be an interactive chat session, leave the Mode as the default Chat. The chat capability in the playground should be set to test your Gemma 2 9B Instruct model.

Now you can test your SageMaker endpoint through Amazon Bedrock! Use the following prompt to test summarizing a meeting transcript, and review the results:

Meeting transcript:
Miguel: Hi Brant, I want to discuss the workstream for our new product launch
Brant: Sure Miguel, is there anything in particular you want to discuss?
Miguel: Yes, I want to talk about how users enter into the product.
Brant: Ok, in that case let me add in Namita.
Namita: Hey everyone
Brant: Hi Namita, Miguel wants to discuss how users enter into the product.
Miguel: its too complicated and we should remove friction. for example, why do I need to fill out additional forms? I also find it difficult to find where to access the product when I first land on the landing page.
Brant: I would also add that I think there are too many steps.
Namita: Ok, I can work on the landing page to make the product more discoverable but brant can you work on the additional forms?
Brant: Yes but I would need to work with James from another team as he needs to unblock the sign up workflow. Miguel can you document any other concerns so that I can discuss with James only once?
Miguel: Sure.

From the meeting transcript above, Create a list of action items for each person.

Enter the prompt into the playground, then choose Run.

You can view the response in the chat generated by your deployed SageMaker JumpStart model through Amazon Bedrock:

Here's a breakdown of action items from the meeting transcript:

**Miguel:**

* **Document:** List out any additional concerns regarding user entry into the product. Share these with Brant for his discussion with James.

**Brant:**

* **Collaborate with James:** Work with James from another team to simplify the additional forms involved in the user sign-up workflow.
* **Review Documentation:** Review Miguel's documented concerns about user entry to prepare for the discussion with James.

**Namita:**

* **Landing Page Redesign:** Improve the landing page to make the product more discoverable for new users.

Let me know if you'd like me to elaborate on any of these action items!

You can also test the model with your own prompts and use cases.

Use Amazon Bedrock APIs with the deployed model

This section demonstrates using the AWS SDK for Python (Boto3) and Converse APIs to invoke the Gemma 2 9B Instruct model you deployed earlier through SageMaker and registered with Amazon Bedrock. The full source code associated with this post is available in the accompanying GitHub repo. For more Converse API examples, refer to Converse API examples.

In this code sample, we also implement a RAG architecture in conjunction with the deployed model. RAG is the process of optimizing the output of a large language model (LLM) so it references an authoritative knowledge base outside of its training data sources before generating a response.

Use the deployed SageMaker model with the RetrieveAndGenerate API offered by Amazon Bedrock to query a knowledge base and generate responses based on the retrieved results. The response only cites sources that are relevant to the query. For information on creating a knowledge base, refer to Creating a Knowledge Base. For more code samples, refer to RetrieveAndGenerate.

The following diagram illustrates the RAG workflow.

Complete the following steps:

To invoke the deployed model, you need to pass the endpoint ARN of the deployed model in the modelId parameter of the Converse API.

To obtain the ARN of the deployed model, navigate to the Amazon Bedrock console. In the navigation pane, choose Marketplace deployments under Foundation models. From the list of managed deployments, choose your registered model to view more details.

You will be directed to the model summary on the Model catalog page under Foundation models. Here, you will find the details associated with your deployed model. Copy the model ARN to use in the following code sample.

import boto3

bedrock_runtime = boto3.client("bedrock-runtime")

# Add your Bedrock endpoint ARN here.
endpoint_arn = "arn:aws:sagemaker:<AWS::REGION>:<AWS::AccountId>:endpoint/<Endpoint_Name>"

# Base inference parameters to use.
inference_config = {
    "maxTokens": 256,
    "temperature": 0.1,
    "topP": 0.999,
}

# Additional inference parameters to use.
additional_model_fields = {"parameters": {"repetition_penalty": 0.9, "top_k": 250, "do_sample": True}}


response = bedrock_runtime.converse(
    modelId=endpoint_arn,
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "text": "What is Amazon doing in the field of generative AI?",
                },
            ]
        },
    ],
    inferenceConfig=inference_config,
    additionalModelRequestFields=additional_model_fields,
)
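To read the result of a Converse call, the sketch below pulls the generated text, stop reason, and token count out of the response dictionary. The helper name summarize_converse_response is hypothetical; the keys it reads (output.message.content, stopReason, usage.totalTokens) follow the Converse response structure.

```python
def summarize_converse_response(response):
    """Pull the generated text, stop reason, and token usage out of a Converse response."""
    blocks = response["output"]["message"]["content"]
    # A message can hold several content blocks; keep only the text ones.
    text = "".join(block["text"] for block in blocks if "text" in block)
    usage = response.get("usage", {})
    return {
        "text": text,
        "stop_reason": response.get("stopReason"),
        "total_tokens": usage.get("totalTokens"),
    }


# Usage with the response from a converse() call:
# print(summarize_converse_response(response)["text"])
```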

Invoke the SageMaker JumpStart model with the RetrieveAndGenerate API. The generation_template and orchestration_template parameters in the retrieve_and_generate API are model specific. These templates define the prompts and instructions for the language model, guiding the generation process and the integration with the knowledge retrieval component.

import boto3

bedrock_agent_runtime_client = boto3.client("bedrock-agent-runtime")

# Provide your knowledge base ID.
kb_id = ""

# generation_template, orchestration_template, and endpoint_arn are assumed
# to be defined earlier in your session.
response = bedrock_agent_runtime_client.retrieve_and_generate(
    input={
        "text": "What is Amazon doing in the field of generative AI?"
    },
    retrieveAndGenerateConfiguration={
        "type": "KNOWLEDGE_BASE",
        "knowledgeBaseConfiguration": {
            "generationConfiguration": {
                "inferenceConfig": {
                    "textInferenceConfig": {
                        "maxTokens": 512,
                        "temperature": 0.1,
                        "topP": 0.9
                    }
                },
                "promptTemplate": {
                    "textPromptTemplate": generation_template
                }
            },
            "knowledgeBaseId": kb_id,
            "orchestrationConfiguration": {
                "inferenceConfig": {
                    "textInferenceConfig": {
                        "maxTokens": 512,
                        "temperature": 0.1,
                        "topP": 0.9
                    }
                },
                "promptTemplate": {
                    "textPromptTemplate": orchestration_template
                },
            },
            "modelArn": endpoint_arn,
            "retrievalConfiguration": {
                "vectorSearchConfiguration": {
                    "numberOfResults":5
                } 
            }
        }
    }
)
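Because the response cites the sources it used, a short helper can collect them alongside the answer. This is a sketch that assumes the knowledge base is backed by Amazon S3, so each retrieved reference carries an s3Location URI; the function name collect_citations is hypothetical.

```python
def collect_citations(response):
    """Return the generated answer and the de-duplicated S3 URIs it cites."""
    answer = response["output"]["text"]
    sources = []
    for citation in response.get("citations", []):
        for reference in citation.get("retrievedReferences", []):
            # Assumes an S3 data source; other source types use other location keys.
            uri = reference.get("location", {}).get("s3Location", {}).get("uri")
            if uri and uri not in sources:
                sources.append(uri)
    return answer, sources


# Usage with the response from retrieve_and_generate():
# answer, sources = collect_citations(response)
```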

Now you can implement guardrails with the Converse API for your SageMaker JumpStart model. Amazon Bedrock Guardrails lets you implement safeguards for your generative AI applications based on your use cases and responsible AI policies. For information on creating guardrails, refer to Create a Guardrail. For more code samples to implement guardrails, refer to Include a guardrail with Converse API.

In the following code sample, you include a guardrail in a Converse API request invoking a SageMaker JumpStart model:

import boto3

bedrock_agent_runtime_client = boto3.client("bedrock-agent-runtime")

# Provide your knowledge base ID.
kb_id = ""

# Retrieve the most relevant document to use as the grounding source.
relevant_documents = bedrock_agent_runtime_client.retrieve(
    retrievalQuery={
        "text": "What is Amazon doing in the field of generative AI?"
    },
    knowledgeBaseId=kb_id,
    retrievalConfiguration={
        "vectorSearchConfiguration": {
            "numberOfResults": 1
        }
    }
)

# bedrock_runtime, endpoint_arn, guardrail_identifier, and guardrail_version
# are assumed to be defined earlier in your session.
def invoke_model(prompt, source, inference_config=None, additional_model_fields=None):
    messages = [
        {
            "role": "user",
            "content": [
                {
                    "guardContent": {
                        "text": {
                            "text": source,
                            "qualifiers": ["grounding_source"],
                        }
                    }
                },
                {
                    "guardContent": {
                        "text": {
                            "text": prompt,
                            "qualifiers": ["query"],
                        }
                    }
                },
            ],
        }
    ]
    if not inference_config:
        # Base inference parameters to use.
        inference_config = {
            "maxTokens": 256,
            "temperature": 0.1,
            "topP": 0.999,
        }

    if not additional_model_fields:
        # Additional inference parameters to use.
        additional_model_fields = {"parameters": {"repetition_penalty": 0.9, "top_k": 250, "do_sample": True}}

    response = bedrock_runtime.converse(
        modelId=endpoint_arn,
        messages=messages,
        inferenceConfig=inference_config,
        additionalModelRequestFields=additional_model_fields,
        guardrailConfig={
            "guardrailIdentifier": guardrail_identifier,
            "guardrailVersion": guardrail_version
        },
    )

    return response["output"]["message"]["content"][0]["text"]

invoke_model(prompt="What is Amazon doing in the field of generative AI?", source=relevant_documents["retrievalResults"][0]["content"]["text"])
# Content is blocked.
invoke_model(prompt="Should I buy bitcoin?", source=relevant_documents["retrievalResults"][0]["content"]["text"])
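To tell programmatically whether a guardrail blocked a request, you can inspect the stop reason on the full Converse response rather than only the returned text. The following is a small sketch; the helper name read_guarded_reply is hypothetical, and it assumes that an intervention is reported as a stopReason of guardrail_intervened, with the guardrail's configured blocked message returned as the text.

```python
# Stop reason assumed to be reported by Converse when a guardrail steps in.
GUARDRAIL_STOP_REASON = "guardrail_intervened"


def read_guarded_reply(response):
    """Return (blocked, text) for a Converse response made with guardrailConfig."""
    blocked = response.get("stopReason") == GUARDRAIL_STOP_REASON
    blocks = response["output"]["message"]["content"]
    text = "".join(block.get("text", "") for block in blocks)
    return blocked, text


# Usage with a full converse() response (not just the extracted text):
# blocked, text = read_guarded_reply(response)
# if blocked:
#     print("Guardrail blocked this request:", text)
```

Returning the flag alongside the text lets an application log or route blocked requests differently from normal answers.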

Clean up

To clean up your resources, use the following code:

import boto3

# KnowledgeBasesForAmazonBedrock is assumed to come from the helper module
# in the accompanying repo.
from knowledge_base import KnowledgeBasesForAmazonBedrock

kb = KnowledgeBasesForAmazonBedrock()
# knowledge_base_name is the name you used when creating the knowledge base.
kb.delete_kb(knowledge_base_name, delete_s3_bucket=True, delete_iam_roles_and_policies=True)

# Guardrails are managed through the Amazon Bedrock control plane client.
bedrock = boto3.client("bedrock")
bedrock.delete_guardrail(guardrailIdentifier=guardrail_identifier)

The SageMaker JumpStart model you deployed will incur cost if you leave it running. Delete the endpoint if you want to stop incurring charges. Deleting the endpoint will also de-register the model from Amazon Bedrock. For more details, see Delete Endpoints and Resources.
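If you want to delete the endpoint from code instead of the console, the following sketch derives the endpoint name from the ARN used earlier and calls the SageMaker DeleteEndpoint API. The helper names are hypothetical, and the sketch removes only the endpoint itself; the endpoint configuration and model are separate resources that you may also want to delete.

```python
def endpoint_name_from_arn(endpoint_arn):
    """Extract the endpoint name from a SageMaker endpoint ARN."""
    _, separator, name = endpoint_arn.partition(":endpoint/")
    if not separator or not name:
        raise ValueError(f"not a SageMaker endpoint ARN: {endpoint_arn}")
    return name


def delete_endpoint(sagemaker_client, endpoint_arn):
    """Delete the endpoint so it stops accruing charges; return the name deleted."""
    name = endpoint_name_from_arn(endpoint_arn)
    sagemaker_client.delete_endpoint(EndpointName=name)
    return name


# Usage (requires AWS credentials; endpoint_arn is the ARN used earlier):
# import boto3
# delete_endpoint(boto3.client("sagemaker"), endpoint_arn)
```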

Conclusion

In this post, you learned how to deploy FMs through SageMaker JumpStart, register them with Amazon Bedrock, and invoke them using Amazon Bedrock APIs. With this new capability, organizations can access leading proprietary and open-weight models using a single API, reducing the complexity of building generative AI applications with a variety of models. This integration between SageMaker JumpStart and Amazon Bedrock is generally available in all AWS Regions where Amazon Bedrock is available. Try this code to use Converse APIs, knowledge bases, and guardrails with SageMaker.

About the Authors

Vivek Gangasani is a Senior GenAI Specialist Solutions Architect at AWS. He helps emerging GenAI companies build innovative solutions using AWS services and accelerated compute. Currently, he is focused on developing strategies for fine-tuning and optimizing the inference performance of large language models. In his free time, Vivek enjoys hiking, watching movies, and trying different cuisines.

Abhishek Doppalapudi is a Solutions Architect at Amazon Web Services (AWS), where he assists startups in building and scaling their products using AWS services. Currently, he is focused on helping AWS customers adopt generative AI solutions. In his free time, Abhishek enjoys playing soccer, watching Premier League matches, and reading.

June Won is a product manager with Amazon SageMaker JumpStart. He focuses on making foundation models easily discoverable and usable to help customers build generative AI applications. His experience at Amazon also includes mobile shopping applications and last mile delivery.

Eashan Kaushik is an Associate Solutions Architect at Amazon Web Services. He is driven by creating cutting-edge generative AI solutions while prioritizing a customer-centric approach to his work. Before this role, he obtained an MS in Computer Science from NYU Tandon School of Engineering. Outside of work, he enjoys sports, lifting, and running marathons.

Giuseppe Zappia is a Principal AI/ML Specialist Solutions Architect at AWS, focused on helping large enterprises design and deploy ML solutions on AWS. He has over 20 years of experience as a full stack software engineer, and has spent the past 5 years at AWS focused on the field of machine learning.

Bhaskar Pratap is a Senior Software Engineer with the Amazon SageMaker team. He is passionate about designing and building elegant systems that bring machine learning to people’s fingertips. Additionally, he has extensive experience with building scalable cloud storage services.