TII Falcon-H1 models now available on Amazon Bedrock Marketplace and Amazon SageMaker JumpStart


This post was co-authored with Jingwei Zuo from TII.

We’re excited to announce the availability of the Technology Innovation Institute (TII)’s Falcon-H1 models on Amazon Bedrock Marketplace and Amazon SageMaker JumpStart. With this launch, developers and data scientists can now use six instruction-tuned Falcon-H1 models (0.5B, 1.5B, 1.5B-Deep, 3B, 7B, and 34B) on AWS, and have access to a comprehensive suite of hybrid architecture models that combine traditional attention mechanisms with State Space Models (SSMs) to deliver exceptional performance with unprecedented efficiency.

In this post, we present an overview of Falcon-H1 capabilities and show how to get started with TII’s Falcon-H1 models on both Amazon Bedrock Marketplace and SageMaker JumpStart.

Overview of TII and AWS collaboration

TII is a leading research institute based in Abu Dhabi. As part of the UAE’s Advanced Technology Research Council (ATRC), TII focuses on advanced technology research and development across AI, quantum computing, autonomous robotics, cryptography, and more. TII employs international teams of scientists, researchers, and engineers in an open and agile environment, aiming to drive technological innovation and position Abu Dhabi and the UAE as a global research and development hub in alignment with the UAE National Strategy for Artificial Intelligence 2031.

TII and Amazon Web Services (AWS) are collaborating to expand access to made-in-the-UAE AI models across the globe. By combining TII’s technical expertise in building large language models (LLMs) with AWS Cloud-based AI and machine learning (ML) services, professionals worldwide can now build and scale generative AI applications using the Falcon-H1 series of models.

About Falcon-H1 models

The Falcon-H1 architecture implements a parallel hybrid design, using components from the Mamba and Transformer architectures to combine the faster inference and lower memory footprint of SSMs like Mamba with the effectiveness of the Transformer attention mechanism in understanding context, along with enhanced generalization capabilities. The architecture scales across multiple configurations ranging from 0.5 billion to 34 billion parameters and provides native support for 18 languages. According to TII, the Falcon-H1 family demonstrates notable efficiency, with published metrics indicating that smaller model variants achieve performance parity with larger models. Some of the benefits of the Falcon-H1 series include:

  • Performance – The hybrid attention-SSM model has optimized parameters with adjustable ratios between attention and SSM heads, leading to faster inference, lower memory usage, and strong generalization capabilities. According to TII benchmarks published in Falcon-H1’s technical blog post and technical report, Falcon-H1 models demonstrate superior performance across multiple scales against other leading Transformer models of similar or larger scales. For example, Falcon-H1-0.5B delivers performance similar to typical 7B models from 2024, and Falcon-H1-1.5B-Deep rivals many of the current leading 7B-10B models.
  • Wide range of model sizes – The Falcon-H1 series includes six sizes: 0.5B, 1.5B, 1.5B-Deep, 3B, 7B, and 34B, with both base and instruction-tuned variants. The Instruct models are now available in Amazon Bedrock Marketplace and SageMaker JumpStart.
  • Multilingual by design – The models support 18 languages natively (Arabic, Czech, German, English, Spanish, French, Hindi, Italian, Japanese, Korean, Dutch, Polish, Portuguese, Romanian, Russian, Swedish, Urdu, and Chinese) and can scale to over 100 languages according to TII, thanks to a multilingual tokenizer trained on diverse language datasets.
  • Context length of up to 256,000 tokens – The Falcon-H1 series enables applications in long-document processing, multi-turn dialogue, and long-range reasoning, showing a distinct advantage over competitors in practical long-context applications like Retrieval Augmented Generation (RAG).
  • Robust data and training strategy – Training of Falcon-H1 models employs an innovative approach that introduces complex data early on, contrary to traditional curriculum learning. It also implements strategic data reuse based on careful memorization window analysis. Additionally, the training process scales smoothly across model sizes through a customized Maximal Update Parametrization (µP) recipe, specifically adapted for this novel architecture.
  • Balanced performance in science and knowledge-intensive domains – Through a carefully designed data mixture and regular evaluations during training, the model achieves strong general capabilities and broad world knowledge while minimizing unintended specialization or domain-specific biases.

In line with its mission to foster AI accessibility and collaboration, TII has released the Falcon-H1 models under the Falcon LLM license, which offers the following benefits:

  • Open source nature and accessibility
  • Multi-language capabilities
  • Cost-effectiveness compared to proprietary models
  • Energy efficiency

About Amazon Bedrock Marketplace and SageMaker JumpStart

Amazon Bedrock Marketplace offers access to over 100 popular, emerging, specialized, and domain-specific models, so you can find the best proprietary and publicly available models for your use case based on factors such as accuracy, flexibility, and cost. On Amazon Bedrock Marketplace, you can discover models in a single place and access them through unified and secure Amazon Bedrock APIs. You can also select your desired number of instances and the instance type to meet the demands of your workload and optimize your costs.

SageMaker JumpStart helps you quickly get started with machine learning. It provides access to state-of-the-art model architectures, such as language models, computer vision models, and more, without having to build them from scratch. With SageMaker JumpStart, you can deploy models in a secure environment by provisioning them on SageMaker inference instances and isolating them within your virtual private cloud (VPC). You can also use Amazon SageMaker AI to further customize and fine-tune the models and streamline the entire model deployment process.

Solution overview

This post demonstrates how to deploy a Falcon-H1 model using both Amazon Bedrock Marketplace and SageMaker JumpStart. Although we use Falcon-H1-0.5B as an example, you can apply these steps to other models in the Falcon-H1 series. For help determining which deployment option (Amazon Bedrock Marketplace or SageMaker JumpStart) best suits your specific requirements, see Amazon Bedrock or Amazon SageMaker AI?

Deploy Falcon-H1-0.5B-Instruct with Amazon Bedrock Marketplace

In this section, we show how to deploy the Falcon-H1-0.5B-Instruct model in Amazon Bedrock Marketplace.

Prerequisites

To try the Falcon-H1-0.5B-Instruct model in Amazon Bedrock Marketplace, you must have access to an AWS account that will contain your AWS resources. Prior to deploying Falcon-H1-0.5B-Instruct, verify that your AWS account has sufficient quota allocation for ml.g6.xlarge instances. The default quota for endpoints using several instance types and sizes is 0, so attempting to deploy the model without a higher quota will trigger a deployment failure.

To request a quota increase, open the AWS Service Quotas console and search for Amazon SageMaker. Locate ml.g6.xlarge for endpoint usage and choose Request quota increase, then specify your required limit value. After the request is approved, you can proceed with the deployment.
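If you prefer to check the quota from code rather than the console, the following is a minimal sketch using the boto3 Service Quotas client. The helper names (find_quota, needs_increase, check_endpoint_quota) are illustrative and not part of any AWS SDK, and the sketch assumes the quota appears under the service code sagemaker with the same name shown in the console.

```python
QUOTA_NAME = "ml.g6.xlarge for endpoint usage"  # assumed to match the console label


def find_quota(quotas, quota_name):
    """Return the first quota dict whose QuotaName matches, or None."""
    for quota in quotas:
        if quota.get("QuotaName") == quota_name:
            return quota
    return None


def needs_increase(quota, desired_value):
    """True when the current quota value is below the value you need."""
    return quota["Value"] < desired_value


def check_endpoint_quota(desired_value=1):
    """Look up the quota with boto3 and request an increase if it is too low."""
    import boto3  # imported lazily so the helpers above work without AWS access

    client = boto3.client("service-quotas")
    quotas = []
    paginator = client.get_paginator("list_service_quotas")
    for page in paginator.paginate(ServiceCode="sagemaker"):
        quotas.extend(page["Quotas"])

    quota = find_quota(quotas, QUOTA_NAME)
    if quota is None:
        raise LookupError(f"No quota named {QUOTA_NAME!r} was found")
    if needs_increase(quota, desired_value):
        client.request_service_quota_increase(
            ServiceCode="sagemaker",
            QuotaCode=quota["QuotaCode"],
            DesiredValue=float(desired_value),
        )
        return "increase requested"
    return "quota already sufficient"
```

Calling check_endpoint_quota(1) with configured AWS credentials submits a request only when the current value is below 1. The request still needs to be approved before you deploy.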

Deploy the model using the Amazon Bedrock Marketplace UI

To deploy the model using Amazon Bedrock Marketplace, complete the following steps:

  1. On the Amazon Bedrock console, under Discover in the navigation pane, choose Model catalog.
  2. Filter for Falcon-H1 as the model name and choose Falcon-H1-0.5B-Instruct.

The model overview page includes information about the model’s license terms, features, setup instructions, and links to further resources.

  3. Review the model license terms, and if you agree with the terms, choose Deploy.

  4. For Endpoint name, enter an endpoint name or leave it as the default pre-populated name.
  5. To minimize costs while experimenting, set the Number of instances to 1.
  6. For Instance type, choose from the list of compatible instance types. Falcon-H1-0.5B-Instruct is an efficient model, so ml.g6.xlarge is sufficient for this exercise.

Although the default configurations are typically sufficient for basic needs, you can customize advanced settings like VPC, service access permissions, encryption keys, and resource tags. These advanced settings might require adjustment for production environments to maintain compliance with your organization’s security protocols.

  7. Choose Deploy.
  8. A prompt asks you to stay on the page while the AWS Identity and Access Management (IAM) role is being created. If your AWS account lacks sufficient quota for the selected instance type, you’ll receive an error message. In this case, refer to the preceding prerequisite section to increase your quota, then try the deployment again.

While deployment is in progress, you can choose Marketplace model deployments in the navigation pane to monitor the deployment progress in the Managed deployment section. When the deployment is complete, the endpoint status will change from Creating to In Service.
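The same status change can be watched from code. The following is a minimal sketch that polls the SageMaker DescribeEndpoint API until the endpoint is ready. It assumes you pass a boto3 SageMaker client and the endpoint name you chose during deployment; the function name and default intervals are illustrative. The API reports the status as InService (no space), unlike the console label.

```python
import time


def wait_until_in_service(sagemaker_client, endpoint_name,
                          poll_seconds=30, timeout_seconds=1800):
    """Poll describe_endpoint until the endpoint is InService, fails, or times out."""
    deadline = time.monotonic() + timeout_seconds
    while True:
        description = sagemaker_client.describe_endpoint(EndpointName=endpoint_name)
        status = description["EndpointStatus"]
        if status == "InService":
            return status
        if status == "Failed":
            reason = description.get("FailureReason", "no reason reported")
            raise RuntimeError(f"Endpoint {endpoint_name} failed: {reason}")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Endpoint {endpoint_name} is still {status}")
        time.sleep(poll_seconds)  # wait before asking again


# Example usage (requires AWS credentials; the endpoint name is a placeholder):
#   import boto3
#   wait_until_in_service(boto3.client("sagemaker"), "{ENDPOINT_NAME}")
```

boto3 also ships a built-in waiter for this purpose (endpoint_in_service); the explicit loop is shown here only to make the status transitions visible.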

Interact with the model in the Amazon Bedrock Marketplace playground

You can now test Falcon-H1 capabilities directly in the Amazon Bedrock playground by selecting the managed deployment and choosing Open in playground.

You can now use the Amazon Bedrock Marketplace playground to interact with Falcon-H1-0.5B-Instruct.

Invoke the model using code

In this section, we demonstrate how to invoke the model using the Amazon Bedrock Converse API.

Replace the placeholder in the following code with the endpoint’s Amazon Resource Name (ARN), which begins with arn:aws:sagemaker. You can find this ARN on the endpoint details page in the Managed deployments section.

import boto3

bedrock_runtime = boto3.client("bedrock-runtime")
endpoint_arn = "{ENDPOINT ARN}"  # Replace with the endpoint ARN
response = bedrock_runtime.converse(
    modelId=endpoint_arn,
    messages=[{"role": "user", "content": [{"text": "What is generative AI?"}]}],
    inferenceConfig={"temperature": 0.1, "topP": 0.1},
)

print(response["output"]["message"]["content"][0]["text"])

To learn more about the detailed steps and example code for invoking the model using Amazon Bedrock APIs, refer to Submit prompts and generate response using the API.
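The single-turn call above extends to a multi-turn conversation by resending the accumulated message history on each request. The following is a minimal sketch under that assumption. The ask helper is illustrative (not part of boto3), and the maxTokens value is an arbitrary choice for the example.

```python
def ask(bedrock_runtime, endpoint_arn, history, user_text, max_tokens=256):
    """Send one user turn through the Converse API and record both turns in history."""
    if not endpoint_arn.startswith("arn:aws:sagemaker"):
        raise ValueError("Expected the SageMaker endpoint ARN of the deployment")

    user_message = {"role": "user", "content": [{"text": user_text}]}
    response = bedrock_runtime.converse(
        modelId=endpoint_arn,
        messages=history + [user_message],  # earlier turns plus the new one
        inferenceConfig={"maxTokens": max_tokens, "temperature": 0.1, "topP": 0.1},
    )
    assistant_message = response["output"]["message"]

    # Only update the history after a successful call.
    history.extend([user_message, assistant_message])
    return assistant_message["content"][0]["text"]


# Example usage (requires AWS credentials and a deployed endpoint):
#   import boto3
#   runtime = boto3.client("bedrock-runtime")
#   history = []
#   print(ask(runtime, "{ENDPOINT ARN}", history, "What is generative AI?"))
#   print(ask(runtime, "{ENDPOINT ARN}", history, "Give one example use case."))
```

Keeping the history in a plain list makes it easy to truncate or summarize older turns if a conversation approaches the model’s context limit.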

Deploy Falcon-H1-0.5B-Instruct with SageMaker JumpStart

You can access foundation models (FMs) in SageMaker JumpStart through Amazon SageMaker Studio, the SageMaker SDK, and the AWS Management Console. In this walkthrough, we demonstrate how to deploy Falcon-H1-0.5B-Instruct using the SageMaker Python SDK. Refer to Deploy a model in Studio to learn how to deploy the model through SageMaker Studio.

Prerequisites

To deploy Falcon-H1-0.5B-Instruct with SageMaker JumpStart, you must have the following prerequisites:

  • An AWS account that will contain your AWS resources.
  • An IAM role to access SageMaker AI. To learn more about how IAM works with SageMaker AI, see Identity and Access Management for Amazon SageMaker AI.
  • Access to SageMaker Studio with a JupyterLab space, or an interactive development environment (IDE) such as Visual Studio Code or PyCharm.

Deploy the model programmatically using the SageMaker Python SDK

Before deploying Falcon-H1-0.5B-Instruct using the SageMaker Python SDK, make sure you have installed the SDK and configured your AWS credentials and permissions.

The following code example demonstrates how to deploy the model:

import sagemaker
from sagemaker.jumpstart.model import JumpStartModel

# Initialize SageMaker session
session = sagemaker.Session()
role = sagemaker.get_execution_role()

# Specify model parameters
model_id = "huggingface-llm-falcon-h1-0-5b-instruct"
instance_type = "ml.g6.xlarge"  # Choose an appropriate instance type based on your needs

# Create the model
model = JumpStartModel(
    model_id=model_id,
    role=role,
    instance_type=instance_type,
    model_version="*",  # Latest version
)

# Deploy the model
predictor = model.deploy(
    initial_instance_count=1,
    accept_eula=True,  # Required for deploying foundation models
)

print("Endpoint name:")
print(predictor.endpoint_name)

Perform inference using the SageMaker Python API

When the previous code segment completes successfully, the Falcon-H1-0.5B-Instruct model deployment is complete and available on a SageMaker endpoint. Note the endpoint name shown in the output; you’ll replace the placeholder in the following code segment with this value.

The following code demonstrates how to prepare the input data, make the inference API call, and process the model’s response:

import json
import boto3

session = boto3.Session()  # Make sure your AWS credentials are configured
sagemaker_runtime = session.client("sagemaker-runtime")

endpoint_name = "{ENDPOINT_NAME}"  # Replace with the endpoint name from the deployment output

payload = {
    "messages": [{"role": "user", "content": "What is generative AI?"}],
    "parameters": {"max_tokens": 256, "temperature": 0.1, "top_p": 0.1},
}

# Perform inference
response = sagemaker_runtime.invoke_endpoint(
    EndpointName=endpoint_name,
    ContentType="application/json",
    Body=json.dumps(payload),
)

# Parse the response
result = json.loads(response["Body"].read().decode("utf-8"))
generated_text = result["choices"][0]["message"]["content"].strip()
print("Generated Response:")
print(generated_text)
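If you call the endpoint from several places, it can help to separate building the request from parsing the response. The following is a minimal sketch that wraps the payload and response shapes used in the preceding example. The helper names build_payload and extract_text are illustrative, and the sketch assumes the endpoint keeps returning the choices/message/content structure shown above.

```python
import json


def build_payload(prompt, max_tokens=256, temperature=0.1, top_p=0.1):
    """Build the JSON-serializable request body for a single user prompt."""
    return {
        "messages": [{"role": "user", "content": prompt}],
        "parameters": {
            "max_tokens": max_tokens,
            "temperature": temperature,
            "top_p": top_p,
        },
    }


def extract_text(raw_body):
    """Return the generated text from a raw response body (bytes or str)."""
    result = json.loads(raw_body)
    try:
        return result["choices"][0]["message"]["content"].strip()
    except (KeyError, IndexError, TypeError) as exc:
        # Surface the unexpected body instead of an opaque lookup error.
        raise ValueError(f"Unexpected response shape: {result!r}") from exc


# Example usage with the client from the preceding code:
#   response = sagemaker_runtime.invoke_endpoint(
#       EndpointName=endpoint_name,
#       ContentType="application/json",
#       Body=json.dumps(build_payload("What is generative AI?")),
#   )
#   print(extract_text(response["Body"].read()))
```

Raising a ValueError that includes the parsed body makes it easier to diagnose cases where the endpoint returns an error document rather than a completion.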

Clean up

To avoid ongoing charges for AWS resources used while experimenting with Falcon-H1 models, make sure to delete all deployed endpoints and their associated resources when you’re finished. To do so, complete the following steps:

  1. Delete Amazon Bedrock Marketplace resources:
    1. On the Amazon Bedrock console, choose Marketplace model deployments in the navigation pane.
    2. Under Managed deployments, choose the Falcon-H1 model endpoint you deployed earlier.
    3. Choose Delete and confirm the deletion if you no longer need to use this endpoint in Amazon Bedrock Marketplace.
  2. Delete SageMaker endpoints:
    1. On the SageMaker AI console, in the navigation pane, choose Endpoints under Inference.
    2. Select the endpoint associated with the Falcon-H1 models.
    3. Choose Delete and confirm the deletion. This stops the endpoint and avoids further compute charges.
  3. Delete SageMaker models:
    1. On the SageMaker AI console, choose Models under Inference.
    2. Select the model associated with your endpoint and choose Delete.

Always verify that all endpoints are deleted after experimentation to optimize costs. Refer to the Amazon SageMaker documentation for additional guidance on managing resources.
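For the endpoint deployed through SageMaker JumpStart, the SageMaker console steps can also be scripted. The following is a minimal sketch that looks up the endpoint configuration and models behind an endpoint before deleting all three. The function name is illustrative, and the sketch assumes you pass a boto3 SageMaker client and that no other endpoint shares the same configuration or models.

```python
def delete_endpoint_resources(sagemaker_client, endpoint_name):
    """Delete an endpoint, its endpoint configuration, and the models it references."""
    endpoint = sagemaker_client.describe_endpoint(EndpointName=endpoint_name)
    config_name = endpoint["EndpointConfigName"]

    config = sagemaker_client.describe_endpoint_config(EndpointConfigName=config_name)
    model_names = [
        variant["ModelName"]
        for variant in config["ProductionVariants"]
        if "ModelName" in variant
    ]

    # Delete the endpoint first so compute charges stop as early as possible.
    sagemaker_client.delete_endpoint(EndpointName=endpoint_name)
    sagemaker_client.delete_endpoint_config(EndpointConfigName=config_name)
    for model_name in model_names:
        sagemaker_client.delete_model(ModelName=model_name)

    return {
        "endpoint": endpoint_name,
        "endpoint_config": config_name,
        "models": model_names,
    }


# Example usage (requires AWS credentials; the endpoint name is a placeholder):
#   import boto3
#   print(delete_endpoint_resources(boto3.client("sagemaker"), "{ENDPOINT_NAME}"))
```

The returned dictionary lists what was removed, which you can compare against the console to confirm that nothing is left running.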

Conclusion

The availability of Falcon-H1 models in Amazon Bedrock Marketplace and SageMaker JumpStart helps developers, researchers, and businesses build cutting-edge generative AI applications with ease. Falcon-H1 models offer multilingual support (18 languages) across various model sizes (from 0.5B to 34B parameters) and support up to 256K context length, thanks to their efficient hybrid attention-SSM architecture.

By using the seamless discovery and deployment capabilities of Amazon Bedrock Marketplace and SageMaker JumpStart, you can accelerate your AI innovation while benefiting from the secure, scalable, and cost-effective AWS Cloud infrastructure.

We encourage you to explore the Falcon-H1 models in Amazon Bedrock Marketplace or SageMaker JumpStart. You can use these models in AWS Regions where Amazon Bedrock or SageMaker JumpStart and the required instance types are available.

For further learning, explore the AWS Machine Learning Blog, SageMaker JumpStart GitHub repository, and Amazon Bedrock User Guide. Start building your next generative AI application with Falcon-H1 models and unlock new possibilities with AWS!

Special thanks to everyone who contributed to the launch: Evan Kravitz, Varun Morishetty, and Yotam Moss.


About the authors

Mehran Nikoo leads the Go-to-Market strategy for Amazon Bedrock and agentic AI in EMEA at AWS, where he has been driving the development of AI systems and cloud-native solutions over the last four years. Prior to joining AWS, Mehran held leadership and technical positions at Trainline, McLaren, and Microsoft. He holds an MBA from Warwick Business School and an MRes in Computer Science from Birkbeck, University of London.

Mustapha Tawbi is a Senior Partner Solutions Architect at AWS, specializing in generative AI and ML, with 25 years of enterprise technology experience across AWS, IBM, Sopra Group, and Capgemini. He has a PhD in Computer Science from Sorbonne and a Master’s degree in Data Science from Heriot-Watt University Dubai. Mustapha leads generative AI technical collaborations with AWS partners throughout the MENAT region.

Jingwei Zuo is a Lead Researcher at the Technology Innovation Institute (TII) in the UAE, where he leads the Falcon Foundational Models team. He received his PhD in 2022 from the University of Paris-Saclay, where he was awarded the Plateau de Saclay Doctoral Prize. He holds an MSc (2018) from the University of Paris-Saclay, an Engineer degree (2017) from Sorbonne Université, and a BSc from Huazhong University of Science & Technology.

John Liu is a Principal Product Manager for Amazon Bedrock at AWS. Previously, he served as the Head of Product for AWS Web3/Blockchain. Prior to joining AWS, John held various product leadership roles at public blockchain protocols and financial technology (fintech) companies for 14 years. He also has nine years of portfolio management experience at several hedge funds.

Hamza MIMI is a Solutions Architect for partners and strategic deals in the MENAT region at AWS, where he bridges cutting-edge technology with impactful business outcomes. With expertise in AI and a passion for sustainability, he helps organizations architect innovative solutions that drive both digital transformation and environmental responsibility, transforming complex challenges into opportunities for growth and positive change.
