Meta SAM 2.1 is now available in Amazon SageMaker JumpStart

This blog post is co-written with George Orlin from Meta.
Today, we're excited to announce that Meta's Segment Anything Model (SAM) 2.1 vision segmentation model is publicly available through Amazon SageMaker JumpStart to deploy and run inference. Meta SAM 2.1 provides state-of-the-art video and image segmentation capabilities in a single model. This cutting-edge model supports long-context processing, complex segmentation scenarios, and fine-grained analysis, making it ideal for automating processes across industries such as medical imaging in healthcare, satellite imagery for environmental monitoring, and object segmentation for autonomous systems. Meta SAM 2.1 is well suited for zero-shot object segmentation and accurate object detection based on simple prompts such as point coordinates and bounding boxes in a frame, for video tracking and image masking.
This model was predominantly trained on AWS, and AWS will also be the first cloud provider to make it available to customers. In this post, we walk through how to discover and deploy the Meta SAM 2.1 model using SageMaker JumpStart.
Meta SAM 2.1 overview
Meta SAM 2.1 is a state-of-the-art vision segmentation model designed for high-performance computer vision tasks, enabling advanced object detection and segmentation workflows. Building upon its predecessor, version 2.1 introduces enhanced segmentation accuracy, robust generalization across diverse datasets, and scalability for production-grade applications. These features enable AI researchers and developers in computer vision, image processing, and data-driven research to improve tasks that require detailed segmentation analysis across multiple fields.
Meta SAM 2.1 has a streamlined architecture that is optimized for integration with popular model-serving frameworks like TorchServe and can be deployed on Amazon SageMaker AI to power real-time or batch inference pipelines. Meta SAM 2.1 empowers organizations to achieve precise segmentation results in vision-centric workflows with minimal configuration and maximum efficiency.
Meta SAM 2.1 offers several variants (Tiny, Small, Base Plus, and Large), available now on SageMaker JumpStart, balancing model size, speed, and segmentation performance to cater to diverse application needs.
SageMaker JumpStart overview
SageMaker JumpStart provides access to a broad selection of publicly available foundation models (FMs). These pre-trained models serve as powerful starting points that can be deeply customized to address specific use cases. You can now use state-of-the-art model architectures, such as language models, computer vision models, and more, without having to build them from scratch.
With SageMaker JumpStart, you can deploy models in a secure environment. Models hosted on JumpStart can be provisioned on dedicated SageMaker Inference instances, including AWS Trainium and AWS Inferentia based instances, and are isolated within your virtual private cloud (VPC). This enforces data security and compliance, because the models operate under your own VPC controls rather than in a shared public environment. After deploying an FM, you can further customize and fine-tune it using the extensive capabilities of SageMaker AI, including SageMaker Inference for deploying models and container logs for improved observability. With SageMaker AI, you can streamline the entire model deployment process.
Prerequisites
Make sure you have the following prerequisites to deploy Meta SAM 2.1 and run inference:
- An AWS account that will contain all your AWS resources.
- An AWS Identity and Access Management (IAM) role to access SageMaker AI. To learn more about how IAM works with SageMaker AI, refer to Identity and Access Management for Amazon SageMaker AI.
- Access to Amazon SageMaker Studio, a SageMaker notebook instance, or an interactive development environment (IDE) such as PyCharm or Visual Studio Code. We recommend using SageMaker Studio for straightforward deployment and inference.
- Access to accelerated instances (GPUs) for hosting the model.
Discover Meta SAM 2.1 in SageMaker JumpStart
SageMaker JumpStart provides FMs through two primary interfaces: SageMaker Studio and the SageMaker Python SDK. This gives you multiple options to discover and use hundreds of models for your specific use case.
SageMaker Studio is a comprehensive IDE that offers a unified, web-based interface for performing all aspects of the machine learning (ML) development lifecycle. From preparing data to building, training, and deploying models, SageMaker Studio provides purpose-built tools to streamline the entire process. In SageMaker Studio, you can access SageMaker JumpStart to discover and explore the extensive catalog of FMs available for deployment to inference capabilities on SageMaker Inference.
You can access the SageMaker JumpStart UI through either Amazon SageMaker Unified Studio or SageMaker Studio. To deploy Meta SAM 2.1 using the SageMaker JumpStart UI, complete the following steps:
- In SageMaker Unified Studio, on the Build menu, choose JumpStart models.
- If you're already on the SageMaker Studio console, choose JumpStart in the navigation pane.
- You will be prompted to create a project, after which you can begin deployment.
Alternatively, you can use the SageMaker Python SDK to programmatically access and use SageMaker JumpStart models. This approach allows for greater flexibility and integration with existing AI/ML workflows and pipelines. By providing multiple access points, SageMaker JumpStart helps you seamlessly incorporate pre-trained models into your AI/ML development efforts, regardless of your preferred interface or workflow.
Deploy Meta SAM 2.1 for inference using SageMaker JumpStart
On the SageMaker JumpStart landing page, you can discover the public pre-trained models offered by SageMaker AI. You can choose the Meta model provider tab to discover the available Meta models.
If you're using SageMaker Studio and don't see the SAM 2.1 models, update your SageMaker Studio version by shutting down and restarting. For more information about version updates, refer to Shut down and Update Studio Classic Apps.
You can choose the model card to view details about the model such as its license, the data used to train it, and how to use it. You can also find two buttons, Deploy and Open Notebook, which help you use the model.
When you choose Deploy, you are taken to a screen where you can choose an endpoint name and instance type to initiate deployment.
After defining your endpoint settings, you can proceed to the next step to use the model.
Deploy Meta SAM 2.1 vision segmentation model for inference using the Python SDK
When you choose Deploy, model deployment starts. Alternatively, you can deploy through the example notebook by choosing Open Notebook. The notebook provides end-to-end guidance on how to deploy the model for inference and clean up resources.
To deploy using a notebook, you start by selecting an appropriate model, specified by the model_id. You can deploy any of the selected models on SageMaker AI.
You can deploy a Meta SAM 2.1 vision segmentation model using SageMaker JumpStart with the following SageMaker Python SDK code:
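The following is a minimal sketch of that deployment, using the Meta SAM 2.1 Tiny model ID from the table later in this post; whether deploy() requires accept_eula depends on the model's license terms.

```python
from sagemaker.jumpstart.model import JumpStartModel

# Pick a Meta SAM 2.1 variant; the table later in this post lists all model IDs
model = JumpStartModel(model_id="meta-vs-sam-2-1-hiera-tiny")

# Deploys to the model's default instance type; pass instance_type=... to override.
# accept_eula=True may be required depending on the model's license terms.
predictor = model.deploy(accept_eula=True)
```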
This deploys the model on SageMaker AI with default configurations, including the default instance type and default VPC configurations. You can change these configurations by specifying non-default values in JumpStartModel. After the model is deployed, you can run inference against the endpoint through the SageMaker predictor. Three tasks are available with this endpoint: automatic mask generator, image predictor, and video predictor. We provide a code snippet for each later in this post. To use the predictor, a specific payload schema needs to be followed. The endpoint has sticky sessions enabled, so to start inference, you need to send a start_session payload:
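As a sketch of what a start_session call can look like through the SageMaker Runtime API with sticky session routing (the payload keys below are assumptions; the example notebook documents the exact schema):

```python
import base64
import json

import boto3

runtime = boto3.client("sagemaker-runtime")
endpoint_name = predictor.endpoint_name

# Base64 encode the media to segment (an image here; use "video" for MP4 input)
with open("truck.jpg", "rb") as f:
    encoded_media = base64.b64encode(f.read()).decode("utf-8")

# Assumed payload keys -- verify against the example notebook
payload = {
    "type": "start_session",
    "media_type": "image",  # "image" or "video"
    "media": encoded_media,
}

# SessionId="NEW_SESSION" asks SageMaker to open a new sticky session
response = runtime.invoke_endpoint(
    EndpointName=endpoint_name,
    ContentType="application/json",
    Body=json.dumps(payload),
    SessionId="NEW_SESSION",
)
session_id = response["NewSessionId"]
```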
The start_session invocation needs an input media type of either image or video along with the base64 encoded data of the media. This launches a session with an instance of the model and loads the media to be segmented.
To close a session, send a close_session invocation:
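A matching sketch for closing the session (again with assumed payload keys):

```python
# Assumed payload shape -- verify against the example notebook
close_payload = {"type": "close_session"}

response = runtime.invoke_endpoint(
    EndpointName=endpoint_name,
    ContentType="application/json",
    Body=json.dumps(close_payload),
    SessionId=session_id,
)

# Surfaced from the x-amzn-sagemaker-closed-session-id response header
print("closed:", response.get("ClosedSessionId"))
```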
If x-amzn-sagemaker-closed-session-id exists as a header, the session has been successfully closed.
To continue a session and retrieve the session ID of the existing session, the response header will contain the x-amzn-sagemaker-session-id key with the current session ID for any operation that is not start_session or close_session. Operations other than start_session and close_session need to be invoked with a response stream, because the resulting payload is larger than what SageMaker real-time endpoints can return.
This is a basic example of interacting with the SAM 2.1 SageMaker JumpStart endpoint with sticky sessions. The following examples for each of the tasks reference these operations without repeating them. The returned data is of MIME type JSONL. For more complete examples, refer to the example notebooks for Meta SAM 2.1 on SageMaker JumpStart.
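To keep the later task examples short, assume a small helper along these lines that invokes the endpoint with a response stream and assembles the streamed JSONL into parsed records (a sketch built on the standard SageMaker response-stream API):

```python
def invoke_with_stream(payload, session_id):
    """Invoke a non-session operation and parse the streamed JSONL response."""
    response = runtime.invoke_endpoint_with_response_stream(
        EndpointName=endpoint_name,
        ContentType="application/json",
        Body=json.dumps(payload),
        SessionId=session_id,
    )
    # The body arrives as a stream of PayloadPart chunks; join, then parse JSON Lines
    raw = b"".join(event["PayloadPart"]["Bytes"] for event in response["Body"])
    return [json.loads(line) for line in raw.decode("utf-8").splitlines() if line]
```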
Recommended instances and benchmarks
The following table lists the Meta SAM 2.1 models available in SageMaker JumpStart along with the model_id, default instance type, and supported instance types for each model. The default instance type supports a total image or video payload of up to 5.5 MB; for larger inputs, you can modify the default instance type in the SageMaker JumpStart UI.
| Model Name | Model ID | Default Instance Type | Supported Instance Types |
| --- | --- | --- | --- |
| Meta SAM 2.1 Tiny | meta-vs-sam-2-1-hiera-tiny | ml.g6.24xlarge (5.5 MB total image or video size) | ml.g5.24xlarge, ml.g5.48xlarge, ml.g6.24xlarge, ml.g6.48xlarge, ml.p4d.24xlarge, ml.p4de.24xlarge |
| Meta SAM 2.1 Small | meta-vs-sam-2-1-hiera-small | ml.g6.24xlarge (5.5 MB total image or video size) | ml.g5.24xlarge, ml.g5.48xlarge, ml.g6.24xlarge, ml.g6.48xlarge, ml.p4d.24xlarge, ml.p4de.24xlarge |
| Meta SAM 2.1 Base Plus | meta-vs-sam-2-1-hiera-base-plus | ml.g6.24xlarge (5.5 MB total image or video size) | ml.g5.24xlarge, ml.g5.48xlarge, ml.g6.24xlarge, ml.g6.48xlarge, ml.p4d.24xlarge, ml.p4de.24xlarge |
| Meta SAM 2.1 Large | meta-vs-sam-2-1-hiera-large | ml.g6.24xlarge (5.5 MB total image or video size) | ml.g5.24xlarge, ml.g5.48xlarge, ml.g6.24xlarge, ml.g6.48xlarge, ml.p4d.24xlarge, ml.p4de.24xlarge |
Meta SAM 2.1 use cases: Inference and prompt examples
After you deploy the model using SageMaker JumpStart, you can access a reference Jupyter notebook that contains the parser and helper functions needed to begin using Meta SAM 2.1. After you follow those cells in the notebook, you're ready to use the model's vision segmentation capabilities.
Meta SAM 2.1 offers support for three different tasks (automatic mask generator, image predictor, and video predictor) to generate masks for various objects in images, as well as object tracking in videos. In the following examples, we demonstrate how to use the automatic mask generator and image predictor on a JPG of a truck. This truck.jpg file is stored in the jumpstart-cache-prod bucket; you can access it with the following code:
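For example, the download might look like the following; the object key here is an assumption, so check the example notebook for the exact path:

```python
import boto3

region = boto3.Session().region_name
s3 = boto3.client("s3")

# Assumed key within the public JumpStart cache bucket for your Region
s3.download_file(
    f"jumpstart-cache-prod-{region}",
    "inference-notebook-assets/truck.jpg",
    "truck.jpg",
)
```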
After you have your image and it is encoded, you can create masks for objects in the image. For use cases where you want to generate masks for every object in the image, you can use the automatic mask generator task.
Automatic mask generator
The automatic mask generator is well suited for computer vision tasks and applications such as medical imaging and diagnostics, where it can automatically segment regions of interest like tumors or specific organs to provide more accurate diagnostic support. It can also be particularly helpful in the autonomous vehicle space, where it can segment elements in a camera feed such as pedestrians, vehicles, and other objects. Let's use the automatic mask generator to generate masks for all the objects in truck.jpg.
The following code is the prompt to generate masks for your base64 encoded image:
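A sketch of that prompt follows, reusing the invoke_with_stream helper from earlier; the operation name and parameters are assumptions modeled on SAM's automatic mask generator options:

```python
# Assumed operation name and keys -- verify against the example notebook
payload = {
    "type": "generate_automatic_masks",
    "points_per_side": 32,  # density of the point grid sampled over the image
}
results = invoke_with_stream(payload, session_id)
```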
We receive the following output (parsed and visualized).
Image predictor
Alternatively, you can choose which objects in the provided image to mask by adding points inside those objects for Meta SAM 2.1 to use. The image predictor can be valuable for tasks related to design and modeling, automating processes that usually require manual effort. For example, it can help turn 2D images into 3D models by analyzing 2D images of blueprints, sketches, or floor plans and generating preliminary 3D models. This is one of many examples of how the image predictor can act as a bridge between 2D and 3D construction across many different tasks. We use the following image with the points that we used to prompt Meta SAM 2.1 for masking the object.
The following code is used to prompt Meta SAM 2.1 and plot the coordinates:
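A sketch of the image predictor prompt, with assumed keys; point_coords are pixel coordinates inside the target object, and point_labels marks each point as foreground (1) or background (0):

```python
# Assumed operation name and keys -- verify against the example notebook
payload = {
    "type": "predict_masks",
    "point_coords": [[500, 375], [1125, 625]],  # pixel coordinates on the object
    "point_labels": [1, 1],  # 1 = foreground point, 0 = background point
}
results = invoke_with_stream(payload, session_id)
```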
We receive the following output (parsed and visualized).
Video predictor
We now demonstrate how to prompt Meta SAM 2.1 for object tracking on video. One use case is ergonomic data collection and training: you can use the video predictor to analyze the movement and posture of individuals in real time, helping reduce injury and improve performance by setting alarms for poor posture or movements. Let's start by accessing the basketball-layup.mp4 file [1] from the jumpstart-cache-prod S3 bucket, as in the following code:
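As with the truck image, the download might look like this (the object key is an assumption):

```python
# Assumed key within the public JumpStart cache bucket for your Region
s3.download_file(
    f"jumpstart-cache-prod-{region}",
    "inference-notebook-assets/basketball-layup.mp4",
    "basketball-layup.mp4",
)
```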
Video:
The following code shows how to set up the prompt format to track objects in the video. The first object uses one coordinate to track and one to exclude from tracking, and the second object tracks a single coordinate.
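A sketch of that prompt, with assumed keys; each tracked object gets its own ID, and point_labels distinguishes points to track (1) from points to exclude (0):

```python
# Assumed operation name and keys -- verify against the example notebook
payload = {
    "type": "predict_masks_video",
    "frame_idx": 0,  # frame on which the prompts are placed
    "objects": [
        {   # first object: one point to track, one point to exclude
            "obj_id": 1,
            "point_coords": [[520, 300], [400, 350]],
            "point_labels": [1, 0],
        },
        {   # second object: a single point to track
            "obj_id": 2,
            "point_coords": [[300, 200]],
            "point_labels": [1],
        },
    ],
}
results = invoke_with_stream(payload, session_id)
```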
We receive the following output (parsed and visualized).
Video:
Here we can see that Meta SAM 2.1 Tiny was able to successfully track the objects based on the coordinates provided in the prompt.
Clean up
To avoid incurring unnecessary costs, when you're done, delete the SageMaker AI endpoints using the following code:
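With the SageMaker Python SDK predictor from earlier, the standard cleanup looks like this:

```python
# Delete the model and endpoint created by model.deploy()
predictor.delete_model()
predictor.delete_endpoint()
```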
Alternatively, to use the SageMaker AI console, complete the following steps:
- On the SageMaker AI console, under Inference in the navigation pane, choose Endpoints.
- Search for your Meta SAM 2.1 endpoints.
- On the endpoint details page, choose Delete.
- Choose Delete again to confirm.
Conclusion
In this post, we explored how SageMaker JumpStart empowers data scientists and ML engineers to discover, access, and deploy a wide range of pre-trained FMs for inference, including Meta's most advanced and capable models to date. Get started with SageMaker JumpStart and the Meta SAM 2.1 models today. For more information about SageMaker JumpStart, see SageMaker JumpStart pretrained models and Getting started with Amazon SageMaker JumpStart.
Sources:
[1] Erčulj F, Štrumbelj E (2015) Basketball Shot Types and Shot Success in Different Levels of Competitive Basketball. PLOS ONE 10(6): e0128885. https://doi.org/10.1371/journal.pone.0128885
About the Authors
Marco Punio is a Sr. Specialist Solutions Architect focused on generative AI strategy, applied AI solutions, and conducting research to help customers hyper-scale on AWS. As a member of the 3rd Party Model Provider Applied Sciences Solutions Architecture team at AWS, he is a Global Lead for the Meta – AWS Partnership and technical strategy. Based in Seattle, WA, Marco enjoys writing, reading, exercising, and building applications in his free time.
Deepak Rupakula is a Principal GTM lead in the specialists group at AWS. He focuses on developing GTM strategy for large language models like Meta across AWS services like Amazon Bedrock and Amazon SageMaker AI. With over 15 years of experience in the tech industry, his experience includes leadership roles in product management, customer success, and analytics.
Harish Rao is a Senior Solutions Architect at AWS, specializing in large-scale distributed AI training and inference. He empowers customers to harness the power of AI to drive innovation and solve complex challenges. Outside of work, Harish embraces an active lifestyle, enjoying the tranquility of hiking, the intensity of racquetball, and the mental clarity of mindfulness practices.
Baladithya Balamurugan is a Solutions Architect at AWS focused on ML deployments for inference and using AWS Neuron to accelerate training and inference. He works with customers to enable and accelerate their ML deployments on services such as Amazon SageMaker AI and Amazon EC2. Based in San Francisco, Baladithya enjoys tinkering, developing applications, and building his homelab in his free time.
Banu Nagasundaram leads product, engineering, and strategic partnerships for Amazon SageMaker JumpStart, SageMaker AI's machine learning and generative AI hub. She is passionate about building solutions that help customers accelerate their AI journey and unlock business value.
Naman Nandan is a software development engineer at AWS, specializing in enabling large-scale AI/ML inference workloads on Amazon SageMaker AI using TorchServe, a project jointly developed by AWS and Meta. In his free time, he enjoys playing tennis and going on hikes.