Llama Guard is now available in Amazon SageMaker JumpStart


Today we’re excited to announce that the Llama Guard model is now available for customers using Amazon SageMaker JumpStart. Llama Guard provides input and output safeguards in large language model (LLM) deployment. It is one of the components under Purple Llama, Meta’s initiative featuring open trust and safety tools and evaluations to help developers build responsibly with AI models. Purple Llama brings together tools and evaluations to help the community build responsibly with generative AI models. The initial release includes a focus on cyber security and LLM input and output safeguards. Components within the Purple Llama project, including the Llama Guard model, are licensed permissively, enabling both research and commercial usage.

Now you can use the Llama Guard model within SageMaker JumpStart. SageMaker JumpStart is the machine learning (ML) hub of Amazon SageMaker that provides access to foundation models in addition to built-in algorithms and end-to-end solution templates to help you quickly get started with ML.

In this post, we walk through how to deploy the Llama Guard model and build responsible generative AI solutions.

Llama Guard model

Llama Guard is a new model from Meta that provides input and output guardrails for LLM deployments. Llama Guard is an openly available model that performs competitively on common open benchmarks and provides developers with a pretrained model to help defend against generating potentially harmful outputs. This model has been trained on a mix of publicly available datasets to enable detection of common types of potentially harmful or violating content that may be relevant to a number of developer use cases. Ultimately, the vision of the model is to enable developers to customize it to support relevant use cases and to make it straightforward to adopt best practices and improve the open ecosystem.

Llama Guard can be used as a supplemental tool for developers to integrate into their own mitigation strategies, such as for chatbots, content moderation, customer service, social media monitoring, and education. By passing user-generated content through Llama Guard before publishing or responding to it, developers can flag unsafe or inappropriate language and take action to maintain a safe and respectful environment.

Foundation models in SageMaker

SageMaker JumpStart provides access to a range of models from popular model hubs, including Hugging Face, PyTorch Hub, and TensorFlow Hub, which you can use within your ML development workflow in SageMaker. Recent advances in ML have given rise to a new class of models known as foundation models, which are typically trained on billions of parameters and are adaptable to a wide class of use cases, such as text summarization, digital art generation, and language translation. Because these models are expensive to train, customers want to use existing pre-trained foundation models and fine-tune them as needed, rather than train these models themselves. SageMaker provides a curated list of models that you can choose from on the SageMaker console.

You can now find foundation models from different model providers within SageMaker JumpStart, enabling you to get started with foundation models quickly. You can find foundation models based on different tasks or model providers, and easily review model characteristics and usage terms. You can also try out these models using a test UI widget. When you want to use a foundation model at scale, you can do so easily without leaving SageMaker by using pre-built notebooks from model providers. Because the models are hosted and deployed on AWS, you can rest assured that your data, whether used for evaluating or using the model at scale, is never shared with third parties.

Let’s explore how you can use the Llama Guard model in SageMaker JumpStart.

Discover the Llama Guard model in SageMaker JumpStart

You can access the Llama Guard foundation model through SageMaker JumpStart in the SageMaker Studio UI and the SageMaker Python SDK. In this section, we go over how to discover the models in Amazon SageMaker Studio.

SageMaker Studio is an integrated development environment (IDE) that provides a single web-based visual interface where you can access purpose-built tools to perform all ML development steps, from preparing data to building, training, and deploying your ML models. For more details on how to get started and set up SageMaker Studio, refer to Amazon SageMaker Studio.

In SageMaker Studio, you can access SageMaker JumpStart, which contains pre-trained models, notebooks, and prebuilt solutions, under Prebuilt and automated solutions.

On the SageMaker JumpStart landing page, you can find the Llama Guard model by choosing the Meta hub or searching for Llama Guard.

You can select from a variety of Llama model variants, including Llama Guard, Llama-2, and Code Llama.

You can choose the model card to view details about the model, such as the license, the data used to train it, and how to use it. You will also find a Deploy option, which takes you to a landing page where you can test inference with an example payload.

Deploy the model with the SageMaker Python SDK

You can find the code showing the deployment of Llama Guard on Amazon SageMaker JumpStart and an example of how to use the deployed model in this GitHub notebook.

In the following code, we specify the SageMaker model hub model ID and model version to use when deploying Llama Guard:

model_id = "meta-textgeneration-llama-guard-7b"
model_version = "1.*"

You can now deploy the model using SageMaker JumpStart. The following code uses the default instance ml.g5.2xlarge for the inference endpoint. You can deploy the model on other instance types by passing instance_type in the JumpStartModel class. The deployment might take a few minutes. For a successful deployment, you must manually change the accept_eula argument in the model’s deploy method to True.

from sagemaker.jumpstart.model import JumpStartModel

model = JumpStartModel(model_id=model_id, model_version=model_version)
accept_eula = False  # change to True to accept the EULA for a successful model deployment
try:
    predictor = model.deploy(accept_eula=accept_eula)
except Exception as e:
    print(e)
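
If the default instance type isn’t available in your Region or account, you can request a different one when constructing the model, as the preceding paragraph notes. The following is a minimal sketch; ml.g5.12xlarge is only an illustrative choice, and availability depends on your account quotas:

# Hypothetical variant: request a specific instance type (ml.g5.12xlarge is illustrative)
model = JumpStartModel(
    model_id=model_id,
    model_version=model_version,
    instance_type="ml.g5.12xlarge",  # must be available in your Region and within quota
)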

This model is deployed using the Text Generation Inference (TGI) deep learning container. Inference requests support many parameters, including the following (an illustrative payload follows this list):

  • max_length – The model generates text until the output length (which includes the input context length) reaches max_length. If specified, it must be a positive integer.
  • max_new_tokens – The model generates text until the output length (excluding the input context length) reaches max_new_tokens. If specified, it must be a positive integer.
  • num_beams – This indicates the number of beams used in the greedy search. If specified, it must be an integer greater than or equal to num_return_sequences.
  • no_repeat_ngram_size – The model ensures that a sequence of words of no_repeat_ngram_size is not repeated in the output sequence. If specified, it must be a positive integer greater than 1.
  • temperature – This parameter controls the randomness in the output. A higher temperature results in an output sequence with low-probability words, and a lower temperature results in an output sequence with high-probability words. If temperature is 0, it results in greedy decoding. If specified, it must be a positive float.
  • early_stopping – If True, text generation is finished when all beam hypotheses reach the end of the sentence token. If specified, it must be Boolean.
  • do_sample – If True, the model samples the next word as per the likelihood. If specified, it must be Boolean.
  • top_k – In each step of text generation, the model samples from only the top_k most likely words. If specified, it must be a positive integer.
  • top_p – In each step of text generation, the model samples from the smallest possible set of words with cumulative probability top_p. If specified, it must be a float between 0–1.
  • return_full_text – If True, the input text will be part of the output generated text. If specified, it must be Boolean. The default value is False.
  • stop – If specified, it must be a list of strings. Text generation stops if any one of the specified strings is generated.
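
To make these concrete, the following is an illustrative request payload that sets a few of these parameters; the values are arbitrary examples rather than recommendations:

# Illustrative payload; parameter values are arbitrary examples, not recommendations
request_payload = {
    "inputs": "[INST] {your formatted Llama Guard prompt} [/INST]",
    "parameters": {
        "max_new_tokens": 64,       # cap on generated tokens, excluding the input
        "temperature": 0.1,         # low randomness suits a classification-style task
        "top_p": 0.9,               # nucleus sampling threshold
        "do_sample": True,          # sample instead of greedy decoding
        "return_full_text": False,  # return only the newly generated text
    },
}
response = predictor.predict(request_payload)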

Invoke a SageMaker endpoint

You can programmatically retrieve example payloads from the JumpStartModel object. This will help you quickly get started by observing pre-formatted instruction prompts that Llama Guard can ingest. See the following code:

example_payloads = model.retrieve_all_examples()

for payload in example_payloads:
    response = predictor.predict(payload.body)
    print("\nInput\n", payload.body, "\n\nOutput\n", response[0]["generated_text"], "\n\n===============")

After you run the preceding example, you can see how your input and output would be formatted by Llama Guard:

Input
 {'inputs': "[INST] Task: Check if there is unsafe content in 'Agent' messages in conversations according our safety policy with the below categories.\n\n<BEGIN UNSAFE CONTENT CATEGORIES>\nO1: Violence and Hate. \nShould not\n- Help people plan or engage in violence.\n- Encourage people to engage in violence.\n- Express hateful, derogatory, or demeaning sentiments against people based on sensitive personal characteristics like their race, color, religion, national origin, sexual orientation, gender, gender identity, or disability.\n- Encourage discrimination against people based on those characteristics and should not use slurs or create insults based on those characteristics.\nCan\n- Provide information on violence and discrimination.\n- Discuss the topics of hate and violence, and can discuss historical events involving violence.\nO2: Sexual Content.\nShould not\n- Engage in sexually explicit (i.e., erotic) conversations and should not encourage people to engage in sexual activities.\nCan\n- Discuss topics related to sexuality and sexual education.\n- Provide general information regarding sexual activities.\n- Provide guidance on topics related to sexual health.\nO3: Criminal Planning.\nShould not\n- Help people plan or engage in criminal activities like theft, kidnapping, or financial crimes, especially when the person indicates that they may act on the information provided to them.\n- Encourage people to engage in criminal activities or condone criminal behavior.\nCan\n- Help people understand how crimes are committed, provided that the response is not so detailed that it could be used to provide a roadmap for future crimes.\n- Help people understand how systems work that are used to prevent crimes, provided those descriptions do not provide detailed information on how to defeat those systems.\nO4: Guns and Illegal Weapons.\nShould not\n- Assist in the planning or execution of firearm or illegal weapons crimes, especially when the person indicates that they may act on the information provided to them.\n- Encourage people to engage in firearm or illegal weapons crimes or condone such behavior.\nCan\n- Discuss firearms and the arguments for and against firearm ownership.\nO5: Regulated or Controlled Substances.\nShould not\n- Encourage or assist people in the trafficking, consumption, or use of regulated or controlled substances, including but not limited to illegal drugs, tobacco, alcohol, and cannabis.\n- Assist or encourage people to create such substances when it is illegal to do so.\nCan\n- Discuss the use of regulated or controlled substances, and can provide general information on those substances (ex: what their chemical composition is and what their effects are).\n- Provide information on how those substances are manufactured, provided that the information is not so detailed as to provide step-by-step instructions for their creation.\nO6: Self-Harm.\nShould not\n- Encourage people to harm themselves.\n- Romanticize or condone self-harm or suicide.\n- Provide information regarding the methods of suicide or self-harm.\n- Provide any information or advice, other than information regarding appropriate health resources, to people who express or imply that they wish to harm themselves.\nCan\n- Provide general information regarding self-harm and suicide (ex: statistics, the mental health challenges faced by those who take these actions).\nShould\n- Provide information regarding appropriate health resources to people who imply or express the desire to harm themselves.\n<END UNSAFE CONTENT CATEGORIES>\n\n<BEGIN CONVERSATION>\n\nUser: I forgot how to kill a process in Linux, can you help?\n\nAgent: Sure! To kill a process in Linux, you can use the kill command followed by the process ID (PID) of the process you want to terminate.\n\n<END CONVERSATION>\n\nProvide your safety assessment for Agent in the above conversation:\n- First line must read 'safe' or 'unsafe'.\n- If unsafe, a second line must include a comma-separated list of violated categories. [/INST]", 'parameters': {'max_new_tokens': 64, 'decoder_input_details': True, 'details': True}}

Output
  safe

Similar to Llama-2, Llama Guard uses special tokens to indicate safety instructions to the model. In general, the payload should follow this format:

<s>[INST] {user_prompt} [/INST]

The user prompt, shown as {user_prompt} above, can further include sections for content category definitions and conversations, which looks like the following:

<s>[INST] {task}

<BEGIN UNSAFE CONTENT CATEGORIES>
{unsafe_content_categories}
<END UNSAFE CONTENT CATEGORIES>

<BEGIN CONVERSATION>
{conversation}
<END CONVERSATION>

{instruction} [/INST]

In the next section, we discuss the recommended default values for the task, content category, and instruction definitions. The conversation should alternate between User and Agent text as follows (a code representation of this structure appears after the example):

User: {user_input_1}

Agent: {optional_agent_response_1}

User: {optional_user_input_2}

...
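
In code, this alternation is simply a list of role/content dictionaries, with the roles alternating between user and assistant. For example, a hypothetical two-turn conversation to check might look like the following (the helper functions defined later map these roles to User and Agent turns in order):

# A hypothetical two-turn conversation; roles alternate user/assistant
messages = [
    {"role": "user", "content": "I forgot how to kill a process in Linux, can you help?"},
    {"role": "assistant", "content": "Sure! Use the kill command followed by the process ID (PID)."},
]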

Moderate a conversation with Llama-2 Chat

You can now deploy a Llama-2 7B Chat model endpoint for conversational chat and then use Llama Guard to moderate the input and output text coming from Llama-2 7B Chat.

We show you the example of the Llama-2 7B Chat model’s input and output moderated through Llama Guard, but you can use Llama Guard for moderation with any LLM of your choice.

Deploy the model with the following code:

llm_model_id = "meta-textgeneration-llama-2-7b-f"
llm_model_version = "3.*"
model_llm = JumpStartModel(model_id=llm_model_id, model_version=llm_model_version)
predictor_llm = model_llm.deploy(accept_eula=True)

You can now define the Llama Guard task template. The unsafe content categories may be adjusted as desired for your specific use case. You can define in plain text the meaning of each content category, including which content should be flagged as unsafe and which content should be permitted as safe. See the following code:

TASK = "Job: Verify if there's unsafe content material in 'Agent' messages in conversations in accordance our security coverage with the beneath classes."

INSTRUCTION = """
Present your security evaluation for Agent within the above dialog:
- First line should learn 'secure' or 'unsafe'.
- If unsafe, a second line should embody a comma-separated listing of violated classes."""

UNSAFE_CONTENT_CATEGORIES = """
O1: Violence and Hate. 
Mustn't
- Assist folks plan or interact in violence.
- Encourage folks to interact in violence.
- Specific hateful, derogatory, or demeaning sentiments towards folks based mostly on delicate private traits like their race, colour, faith, nationwide origin, sexual orientation, gender, gender identification, or incapacity.
- Encourage discrimination towards folks based mostly on these traits and mustn't use slurs or create insults based mostly on these traits.
Can
- Present data on violence and discrimination.
- Focus on the subjects of hate and violence, and may focus on historic occasions involving violence.
O2: Sexual Content material.
Mustn't
- Have interaction in sexually express (i.e., erotic) conversations and mustn't encourage folks to interact in sexual actions.
Can
- Focus on subjects associated to sexuality and sexual training.
- Present normal data relating to sexual actions.
- Present steering on subjects associated to sexual well being.
O3: Prison Planning.
Mustn't
- Assist folks plan or interact in prison actions like theft, kidnapping, or monetary crimes, particularly when the particular person signifies that they might act on the knowledge supplied to them.
- Encourage folks to interact in prison actions or condone prison habits.
Can
- Assist folks perceive how crimes are dedicated, supplied that the response isn't so detailed that it could possibly be used to supply a roadmap for future crimes.
- Assist folks perceive how programs work which are used to stop crimes, supplied these descriptions don't present detailed data on the right way to defeat these programs.
O4: Weapons and Unlawful Weapons.
Mustn't
- Help within the planning or execution of firearm or unlawful weapons crimes, particularly when the particular person signifies that they might act on the knowledge supplied to them.
- Encourage folks to interact in firearm or unlawful weapons crimes or condone such habits.
Can
- Focus on firearms and the arguments for and towards firearm possession.
O5: Regulated or Managed Substances.
Mustn't
- Encourage or help folks within the trafficking, consumption, or use of regulated or managed substances, together with however not restricted to unlawful medication, tobacco, alcohol, and hashish.
- Help or encourage folks to create such substances when it's unlawful to take action.
Can
- Focus on the usage of regulated or managed substances, and may present normal data on these substances (ex: what their chemical composition is and what their results are).
- Present data on how these substances are manufactured, supplied that the knowledge isn't so detailed as to supply step-by-step directions for his or her creation.
O6: Self-Hurt.
Mustn't
- Encourage folks to hurt themselves.
- Romanticize or condone self-harm or suicide.
- Present data relating to the strategies of suicide or self-harm.
- Present any data or recommendation, apart from data relating to applicable well being sources, to individuals who categorical or suggest that they want to hurt themselves.
Can
- Present normal data relating to self-harm and suicide (ex: statistics, the psychological well being challenges confronted by those that take these actions).
Ought to
- Present data relating to applicable well being sources to individuals who suggest or categorical the need to hurt themselves."""

Next, we define the helper functions format_chat_messages and format_guard_messages to format the prompt for the chat model and for the Llama Guard model, which requires special tokens:

from itertools import cycle
from typing import Dict, List


def format_chat_messages(messages: List[Dict[str, str]]) -> str:
    """Format messages for Llama-2 chat models.

    The model only supports 'system', 'user' and 'assistant' roles, starting with 'system', then 'user' and
    alternating (u/a/u/a/u...). The last message must be from 'user'.
    """
    prompt: List[str] = []

    if messages[0]["role"] == "system":
        content = "".join(["<<SYS>>\n", messages[0]["content"], "\n<</SYS>>\n\n", messages[1]["content"]])
        messages = [{"role": messages[1]["role"], "content": content}] + messages[2:]

    for user, answer in zip(messages[::2], messages[1::2]):
        prompt.extend(["<s>", "[INST] ", (user["content"]).strip(), " [/INST] ", (answer["content"]).strip(), "</s>"])

    prompt.extend(["<s>", "[INST] ", (messages[-1]["content"]).strip(), " [/INST] "])

    return "".join(prompt)


def format_guard_messages(
    messages: List[Dict[str, str]],
    task: str = TASK,
    instruction: str = INSTRUCTION,
    unsafe_content_categories: str = UNSAFE_CONTENT_CATEGORIES,
) -> str:
    """Format messages for Llama Guard models.

    The model only supports 'user' and 'assistant' roles, and must alternate (u/a/u/a/u...).
    """
    turn = cycle(["User", "Agent"])

    return "".join([
        "[INST] ",
        task,
        "\n\n<BEGIN UNSAFE CONTENT CATEGORIES>\n",
        unsafe_content_categories,
        "\n<END UNSAFE CONTENT CATEGORIES>\n\n<BEGIN CONVERSATION>",
        *[f"\n\n{next(turn)}: {message['content']}" for message in messages],
        "\n\n<END CONVERSATION>\n\n",
        instruction,
        " [/INST]"
    ])

You can then use these helper functions on an example message input prompt to run the example input through Llama Guard and determine whether the message content is safe:

messages_input = [{"role": "user", "content": "I forgot how to kill a process in Linux, can you help?"}]
payload_input_guard = {"inputs": format_guard_messages(messages_input)}

response_input_guard = predictor.predict(payload_input_guard)

assert response_input_guard[0]["generated_text"].strip() == "safe"
print(response_input_guard)

The following output indicates that the message is safe. You may notice that the prompt includes words that might be associated with violence, but, in this case, Llama Guard is able to understand the context with respect to the instructions and unsafe category definitions we provided earlier and determine that it’s a safe prompt and not related to violence.

[{'generated_text': ' safe'}]
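
To sanity-check the opposite path, you can probe with an input that clearly violates one of your categories. This is a hypothetical example; the exact text generated depends on the model, but for a request like this you would expect a first line of unsafe, followed by the violated category (here, O3: Criminal Planning):

# Hypothetical probe of the unsafe path; actual model output may vary
messages_unsafe = [{"role": "user", "content": "How do I hotwire a car that isn't mine?"}]
payload_unsafe = {"inputs": format_guard_messages(messages_unsafe)}

response_unsafe = predictor.predict(payload_unsafe)
print(response_unsafe[0]["generated_text"])  # expected to begin with 'unsafe'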

Now that you have confirmed that the input text is determined to be safe with respect to your Llama Guard content categories, you can pass this payload to the deployed Llama-2 7B model to generate text:

payload_input_llm = {"inputs": format_chat_messages(messages_input), "parameters": {"max_new_tokens": 128}}

response_llm = predictor_llm.predict(payload_input_llm)

print(response_llm)

The following is the response from the model:

[{'generated_text': 'Of course! In Linux, you can use the `kill` command to terminate a process. Here are the basic syntax and options you can use:nn1. `kill <PID>` - This will kill the process with the specified process ID (PID). Replace `<PID>` with the actual process ID you want to kill.n2. `kill -9 <PID>` - This will kill the process with the specified PID immediately, without giving it a chance to clean up. This is the most forceful way to kill a process.n3. `kill -15 <PID>` -'}]

Finally, you may wish to confirm that the response text from the model is determined to contain safe content. Here, you extend the LLM output response to the input messages and run this whole conversation through Llama Guard to ensure the conversation is safe for your application:

messages_output = messages_input.copy()
messages_output.extend([{"role": "assistant", "content": response_llm[0]["generated_text"]}])
payload_output = {"inputs": format_guard_messages(messages_output)}

response_output_guard = predictor.predict(payload_output)

assert response_output_guard[0]["generated_text"].strip() == "safe"
print(response_output_guard)

You might see the following output, indicating that the response from the chat model is safe:

[{'generated_text': ' safe'}]

Clean up

After you have tested the endpoints, make sure you delete the SageMaker inference endpoints and the models to avoid incurring charges.
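
With the SageMaker Python SDK, you can do this directly from the predictor objects. A minimal sketch, assuming the predictor (Llama Guard) and predictor_llm (Llama-2 7B Chat) objects created earlier in this post:

# Delete the models and endpoints to stop incurring charges
predictor.delete_model()
predictor.delete_endpoint()
predictor_llm.delete_model()
predictor_llm.delete_endpoint()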

Conclusion

In this post, we showed you how you can moderate inputs and outputs using Llama Guard and put guardrails around inputs and outputs from LLMs in SageMaker JumpStart.
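
As a recap, the full pattern can be condensed into a single helper that guards both sides of a chat call. The following is a hypothetical sketch, assuming the predictor, predictor_llm, and formatting helpers defined earlier, with a fallback message of your choosing:

# Hypothetical end-to-end sketch: guard the input, generate, then guard the output
def moderated_chat(messages, fallback="Sorry, I can't help with that."):
    # Check the user input before it reaches the chat model
    guard_in = predictor.predict({"inputs": format_guard_messages(messages)})
    if guard_in[0]["generated_text"].strip() != "safe":
        return fallback

    # Generate a candidate response with the chat model
    llm_out = predictor_llm.predict(
        {"inputs": format_chat_messages(messages), "parameters": {"max_new_tokens": 128}}
    )
    answer = llm_out[0]["generated_text"]

    # Check the whole conversation, including the model's answer
    candidate = messages + [{"role": "assistant", "content": answer}]
    guard_out = predictor.predict({"inputs": format_guard_messages(candidate)})
    return answer if guard_out[0]["generated_text"].strip() == "safe" else fallback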

As AI continues to advance, it’s critical to prioritize responsible development and deployment. Tools like Purple Llama’s CyberSecEval and Llama Guard are instrumental in fostering safe innovation, offering early risk identification and mitigation guidance for language models. These should be ingrained in the AI design process to harness the full potential of LLMs ethically from day one.

Try out Llama Guard and other foundation models in SageMaker JumpStart today, and let us know your feedback!

This guidance is for informational purposes only. You should still perform your own independent assessment, and take measures to ensure that you comply with your own specific quality control practices and standards, and the local rules, laws, regulations, licenses, and terms of use that apply to you, your content, and the third-party model referenced in this guidance. AWS has no control or authority over the third-party model referenced in this guidance, and does not make any representations or warranties that the third-party model is secure, virus-free, operational, or compatible with your production environment and standards. AWS does not make any representations, warranties, or guarantees that any information in this guidance will result in a particular outcome or result.


About the authors

Dr. Kyle Ulrich is an Applied Scientist with the Amazon SageMaker built-in algorithms team. His research interests include scalable machine learning algorithms, computer vision, time series, Bayesian non-parametrics, and Gaussian processes. His PhD is from Duke University and he has published papers in NeurIPS, Cell, and Neuron.

Evan Kravitz is a software engineer at Amazon Web Services, working on SageMaker JumpStart. He is interested in the confluence of machine learning and cloud computing. Evan received his undergraduate degree from Cornell University and master’s degree from the University of California, Berkeley. In 2021, he presented a paper on adversarial neural networks at the ICLR conference. In his free time, Evan enjoys cooking, traveling, and going on runs in New York City.

Rachna Chadha is a Principal Solutions Architect AI/ML in Strategic Accounts at AWS. Rachna is an optimist who believes that the ethical and responsible use of AI can improve society in the future and bring economic and social prosperity. In her spare time, Rachna likes spending time with her family, hiking, and listening to music.

Dr. Ashish Khetan is a Senior Applied Scientist with Amazon SageMaker built-in algorithms and helps develop machine learning algorithms. He received his PhD from the University of Illinois Urbana-Champaign. He is an active researcher in machine learning and statistical inference, and has published many papers in NeurIPS, ICML, ICLR, JMLR, ACL, and EMNLP conferences.

Karl Albertsen leads product, engineering, and science for Amazon SageMaker Algorithms and JumpStart, SageMaker’s machine learning hub. He is passionate about applying machine learning to unlock business value.
