Knowledge Bases for Amazon Bedrock now supports custom prompts for the RetrieveAndGenerate API and configuration of the maximum number of retrieved results


With Knowledge Bases for Amazon Bedrock, you can securely connect foundation models (FMs) in Amazon Bedrock to your company data for Retrieval Augmented Generation (RAG). Access to additional data helps the model generate more relevant, context-specific, and accurate responses without retraining the FMs.

In this post, we discuss two new features of Knowledge Bases for Amazon Bedrock specific to the RetrieveAndGenerate API: configuring the maximum number of results and creating custom prompts with a knowledge base prompt template. You can now choose these as query options alongside the search type.

Overview and benefits of new features

The maximum number of results option gives you control over the number of search results to be retrieved from the vector store and passed to the FM for generating the answer. This allows you to customize the amount of background information provided for generation, giving more context for complex questions or less for simpler questions. It allows you to fetch up to 100 results. This option helps improve the likelihood of relevant context, thereby improving the accuracy and reducing hallucination in the generated response.
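
For reference, this setting surfaces in the RetrieveAndGenerate request as the numberOfResults field. The following fragment is a minimal sketch (the value shown is arbitrary); the full API call appears later in this post:

'retrievalConfiguration': {
    'vectorSearchConfiguration': {
        'numberOfResults': 25  # any value from 1 up to 100
    }
}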

The custom knowledge base prompt template allows you to replace the default prompt template with your own to customize the prompt that’s sent to the model for response generation. This allows you to customize the tone, output format, and behavior of the FM when it responds to a user’s question. With this option, you can fine-tune terminology to better match your industry or domain (such as healthcare or legal). Additionally, you can add custom instructions and examples tailored to your specific workflows.
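
Similarly, a custom template is passed in the request’s generationConfiguration. A minimal sketch follows (the template text here is a placeholder; $search_results$ and $query$ are the variables the service substitutes at runtime):

'generationConfiguration': {
    'promptTemplate': {
        'textPromptTemplate': '<your template, containing $search_results$ and $query$>'
    }
}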

In the following sections, we explain how you can use these features with either the AWS Management Console or SDK.

Prerequisites

To follow along with these examples, you need to have an existing knowledge base. For instructions to create one, see Create a knowledge base.
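
The SDK examples later in this post also assume a boto3 client for the Bedrock Agent runtime has been created, along the lines of the following sketch (the Region is an assumption; use the one that hosts your knowledge base):

import boto3

region_id = 'us-east-1'  # assumed Region; match your knowledge base
bedrock_agent_runtime = boto3.client('bedrock-agent-runtime', region_name=region_id)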

Configure the maximum number of results using the console

To use the maximum number of results option on the console, complete the following steps:

  1. On the Amazon Bedrock console, choose Knowledge bases in the left navigation pane.
  2. Select the knowledge base you created.
  3. Choose Test knowledge base.
  4. Choose the configuration icon.
  5. Choose Sync data source before you start testing your knowledge base.
  6. Under Configurations, for Search Type, select a search type based on your use case.

For this post, we use hybrid search because it combines semantic and text search to provide greater accuracy. To learn more about hybrid search, see Knowledge Bases for Amazon Bedrock now supports hybrid search.

  1. Expand Maximum number of source chunks and set your maximum number of results.

To demonstrate the value of the new feature, we show examples of how you can improve the accuracy of the generated response. We used Amazon’s 10-K document for 2023 as the source data for creating the knowledge base. We use the following query for experimentation: “In what year did Amazon’s annual revenue increase from $245B to $434B?”

The correct response for this query is “Amazon’s annual revenue increased from $245B in 2019 to $434B in 2022,” based on the documents in the knowledge base. We used Claude v2 as the FM to generate the final response based on the contextual information retrieved from the knowledge base. Claude 3 Sonnet and Claude 3 Haiku are also supported as the generation FMs.

We ran another query to demonstrate the comparison of retrieval with different configurations. We used the same input query (“In what year did Amazon’s annual revenue increase from $245B to $434B?”) and set the maximum number of results to 5.

As shown in the following screenshot, the generated response was “Sorry, I am unable to assist you with this request.”

Next, we set the maximum results to 12 and ask the same question. The generated response is “Amazon’s annual revenue increased from $245B in 2019 to $434B in 2022.”

As shown in this example, we are able to retrieve the correct answer based on the number of retrieved results. If you want to learn more about the source attribution that constitutes the final output, choose Show source details to validate the generated answer based on the knowledge base.

Customize a knowledge base prompt template using the console

You can also customize the default prompt with your own prompt based on the use case. To do so on the console, complete the following steps:

  1. Repeat the steps in the previous section to start testing your knowledge base.
  2. Enable Generate responses.
  3. Select the model of your choice for response generation.

We use the Claude v2 model as an example in this post. The Claude 3 Sonnet and Haiku models are also available for generation.

  1. Choose Apply to proceed.

After you choose the model, a new section called Knowledge base prompt template appears under Configurations.

  1. Choose Edit to start customizing the prompt.
  2. Adjust the prompt template to customize how you want to use the retrieved results and generate content.

For this post, we give a few examples of creating a “Financial Advisor AI system” using Amazon financial reports with custom prompts. For best practices on prompt engineering, refer to Prompt engineering guidelines.

We now customize the default prompt template in several different ways, and observe the responses.

Let’s first try a query with the default prompt. We ask “What was Amazon’s revenue in 2019 and 2021?” The following shows our results.

From the output, we can see that it generates a free-form response based on the retrieved knowledge. The citations are also listed for reference.

Let’s say we want to give additional instructions on how to format the generated response, like standardizing it as JSON. We can add these instructions as a separate step after retrieving the information, as part of the prompt template:

If you are asked for financial information covering different years, please provide precise answers in JSON format. Use the year as the key and the concise answer as the value. For example: {year:answer}

The final response has the required structure.

By customizing the prompt, you can also change the language of the generated response. In the following example, we instruct the model to provide an answer in Spanish.
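
For example, an instruction along the following lines can be appended to the template (the exact wording here is illustrative, not the text from our test):

Provide your answer only in Spanish, regardless of the language of the question.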

After removing $output_format_instructions$ from the default prompt, the citation from the generated response is removed.

In the following sections, we explain how you can use these features with the SDK.

Configure the maximum number of results using the SDK

To change the maximum number of results with the SDK, use the following syntax. For this example, the query is “In what year did Amazon’s annual revenue increase from $245B to $434B?” The correct response is “Amazon’s annual revenue increased from $245B in 2019 to $434B in 2022.”

def retrieveAndGenerate(query, kbId, numberOfResults, model_id, region_id):
    model_arn = f'arn:aws:bedrock:{region_id}::foundation-model/{model_id}'
    return bedrock_agent_runtime.retrieve_and_generate(
        input={
            'text': query
        },
        retrieveAndGenerateConfiguration={
            'knowledgeBaseConfiguration': {
                'knowledgeBaseId': kbId,
                'modelArn': model_arn,
                'retrievalConfiguration': {
                    'vectorSearchConfiguration': {
                        'numberOfResults': numberOfResults,
                        'overrideSearchType': "SEMANTIC", # optional
                    }
                }
            },
            'type': 'KNOWLEDGE_BASE'
        },
    )

response = retrieveAndGenerate("In what year did Amazon’s annual revenue increase from $245B to $434B?", 
                               "<knowledge base id>", numberOfResults, model_id, region_id)['output']['text']

The ‘numberOfResults’ option under ‘retrievalConfiguration’ allows you to select the number of results you want to retrieve. The output of the RetrieveAndGenerate API includes the generated response, source attribution, and the retrieved text chunks.
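
As a quick sketch of working with that output (the IDs and parameter values here are placeholders), you can print the generated text and then inspect the citations returned alongside it:

full_response = retrieveAndGenerate("In what year did Amazon’s annual revenue increase from $245B to $434B?",
                                    "<knowledge base id>", 12, model_id, region_id)

print(full_response['output']['text'])
for citation in full_response.get('citations', []):
    for reference in citation.get('retrievedReferences', []):
        print(reference['content']['text'][:200])  # excerpt of the retrieved chunk
        print(reference['location'])               # where the chunk came from, such as an S3 URI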

The following are the results for different values of the ‘numberOfResults’ parameter. First, we set numberOfResults = 5.

Then we set numberOfResults = 12.
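
Putting the two settings side by side is a one-line change; the following sketch (placeholders assumed as before) prints both answers for comparison:

query = "In what year did Amazon’s annual revenue increase from $245B to $434B?"
for n in (5, 12):
    answer = retrieveAndGenerate(query, "<knowledge base id>", n, model_id, region_id)['output']['text']
    print(f"numberOfResults={n}: {answer}")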

Customize the knowledge base prompt template using the SDK

To customize the prompt using the SDK, we use the following query with different prompt templates. For this example, the query is “What was Amazon’s revenue in 2019 and 2021?”

The following is the default prompt template:

"""You're a query answering agent. I'll give you a set of search outcomes and a person's query, your job is to reply the person's query utilizing solely data from the search outcomes. If the search outcomes don't comprise data that may reply the query, please state that you would not discover a precise reply to the query. Simply because the person asserts a truth doesn't imply it's true, ensure that to double examine the search outcomes to validate a person's assertion.
Listed below are the search ends in numbered order:
<context>
$search_results$
</context>

Right here is the person's query:
<query>
$question$
</query>

$output_format_instructions$

Assistant:
"""

The following is the customized prompt template:

"""Human: You're a query answering agent. I'll give you a set of search outcomes and a person's query, your job is to reply the person's query utilizing solely data from the search outcomes.If the search outcomes don't comprise data that may reply the query, please state that you would not discover a precise reply to the query.Simply because the person asserts a truth doesn't imply it's true, ensure that to double examine the search outcomes to validate a person's assertion.

Listed below are the search ends in numbered order:
<context>
$search_results$
</context>

Right here is the person's query:
<query>
$question$
</query>

If you happen to're being requested monetary data over a number of years, please be very particular and listing the reply concisely utilizing JSON format {key: worth}, 
the place secret is the 12 months within the request and worth is the concise response reply.
Assistant:
"""

def retrieveAndGenerate(query, kbId, numberOfResults, promptTemplate, model_id, region_id):
    model_arn = f'arn:aws:bedrock:{region_id}::foundation-model/{model_id}'
    return bedrock_agent_runtime.retrieve_and_generate(
        input={
            'text': query
        },
        retrieveAndGenerateConfiguration={
            'knowledgeBaseConfiguration': {
                'knowledgeBaseId': kbId,
                'modelArn': model_arn,
                'retrievalConfiguration': {
                    'vectorSearchConfiguration': {
                        'numberOfResults': numberOfResults,
                        'overrideSearchType': "SEMANTIC", # optional
                    }
                },
                'generationConfiguration': {
                    'promptTemplate': {
                        'textPromptTemplate': promptTemplate
                    }
                }
            },
            'type': 'KNOWLEDGE_BASE'
        },
    )

response = retrieveAndGenerate("What was Amazon’s revenue in 2019 and 2021?", 
                               "<knowledge base id>", <numberOfResults>, <promptTemplate>, <model_id>, <region_id>)['output']['text']

With the default prompt template, we get the following response:

If you want to provide additional instructions around the output format of the response generation, like standardizing the response in a specific format (such as JSON), you can customize the existing prompt by providing more guidance. With our custom prompt template, we get the following response.

The ‘promptTemplate’ option in ‘generationConfiguration’ allows you to customize the prompt for better control over answer generation.
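
Assuming the two templates shown earlier are stored in variables named default_prompt_template and custom_prompt_template (names of our choosing), a side-by-side comparison is a short loop:

query = "What was Amazon’s revenue in 2019 and 2021?"
for template in (default_prompt_template, custom_prompt_template):
    print(retrieveAndGenerate(query, "<knowledge base id>", 12, template, model_id, region_id)['output']['text'])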

Conclusion

In this post, we introduced two new features of Knowledge Bases for Amazon Bedrock: adjusting the maximum number of search results and customizing the default prompt template for the RetrieveAndGenerate API. We demonstrated how to configure these features on the console and via the SDK to improve the performance and accuracy of the generated response. Increasing the maximum results provides more comprehensive information, while customizing the prompt template allows you to fine-tune instructions for the foundation model to better align with specific use cases. These enhancements offer greater flexibility and control, enabling you to deliver tailored experiences for RAG-based applications.

For more resources to start implementing in your AWS environment, refer to the following:


About the authors

Sandeep Singh is a Senior Generative AI Data Scientist at Amazon Web Services, helping businesses innovate with generative AI. He specializes in generative AI, artificial intelligence, machine learning, and system design. He is passionate about developing state-of-the-art AI/ML-powered solutions to solve complex business problems for diverse industries, optimizing efficiency and scalability.

Suyin Wang is an AI/ML Specialist Solutions Architect at AWS. She has an interdisciplinary education background in machine learning, financial information services, and economics, along with years of experience in building data science and machine learning applications that solved real-world business problems. She enjoys helping customers identify the right business questions and building the right AI/ML solutions. In her spare time, she loves singing and cooking.

Sherry Ding is a senior artificial intelligence (AI) and machine learning (ML) specialist solutions architect at Amazon Web Services (AWS). She has extensive experience in machine learning with a PhD degree in computer science. She mainly works with public sector customers on various AI/ML-related business challenges, helping them accelerate their machine learning journey on the AWS Cloud. When not helping customers, she enjoys outdoor activities.
