Build a generative AI Slack chat assistant using Amazon Bedrock and Amazon Kendra


Despite the proliferation of knowledge and information in enterprise environments, employees and stakeholders often find themselves searching for information and struggling to get their questions answered quickly and efficiently. This can lead to productivity losses, frustration, and delays in decision-making.

A generative AI Slack chat assistant can help address these challenges by providing a readily available, intelligent interface for users to interact with and obtain the information they need. By using the natural language processing and generation capabilities of generative AI, the chat assistant can understand user queries, retrieve relevant information from various data sources, and provide tailored, contextual responses.

By harnessing the power of generative AI and Amazon Web Services (AWS) services Amazon Bedrock, Amazon Kendra, and Amazon Lex, this solution provides a sample architecture to build an intelligent Slack chat assistant that can streamline information access, enhance user experiences, and drive productivity and efficiency within organizations.

Why use Amazon Kendra for building a RAG application?

Amazon Kendra is a fully managed service that provides out-of-the-box semantic search capabilities for state-of-the-art ranking of documents and passages. You can use Amazon Kendra to quickly build high-accuracy generative AI applications on enterprise data and supply the most relevant content and documents to maximize the quality of your Retrieval Augmented Generation (RAG) payload, yielding better large language model (LLM) responses than using conventional or keyword-based search solutions. Amazon Kendra offers easy-to-use deep learning search models that are pre-trained on 14 domains and don't require machine learning (ML) expertise. Amazon Kendra can index content from a wide range of sources, including databases, content management systems, file shares, and web pages.

Further, the FAQ feature in Amazon Kendra complements the broader retrieval capabilities of the service, allowing the RAG system to seamlessly switch between providing prewritten FAQ responses and dynamically generating responses by querying the larger knowledge base. This makes it well-suited for powering the retrieval component of a RAG system, allowing the model to access a broad knowledge base when generating responses. By integrating the FAQ capabilities of Amazon Kendra into a RAG system, the model can use a curated set of high-quality, authoritative answers for commonly asked questions. This can improve the overall response quality and user experience, while also reducing the burden on the language model to generate these basic responses from scratch.
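For orientation, FAQs for Amazon Kendra can be supplied as a simple CSV file. The following minimal sketch mirrors the SampleFAQ.csv used later in this post; the header row with the reserved _question, _answer, and _source_uri column names follows the Kendra FAQ CSV layout:

      _question,_answer,_source_uri
      Which AWS service has 11 nines of durability?,Amazon S3,https://aws.amazon.com/s3/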

This solution balances retaining customizations in terms of model selection, prompt engineering, and adding FAQs with not having to deal with word embeddings, document chunking, and other lower-level complexities typically required for RAG implementations.

Solution overview

The chat assistant is designed to assist users by answering their questions and providing information on a variety of topics. The purpose of the chat assistant is to be an internal-facing Slack application that can help employees and stakeholders find the information they need.

The architecture uses Amazon Lex for intent recognition, AWS Lambda for processing queries, Amazon Kendra for searching through FAQs and web content, and Amazon Bedrock for generating contextual responses powered by LLMs. By combining these services, the chat assistant can understand natural language queries, retrieve relevant information from multiple data sources, and provide humanlike responses tailored to the user's needs. The solution showcases the power of generative AI in creating intelligent virtual assistants that can streamline workflows and enhance user experiences based on model choices, FAQs, and modifying system prompts and inference parameters.

Architecture diagram

The following diagram illustrates a RAG approach where the user sends a query through the Slack application and receives a generated response based on the data indexed in Amazon Kendra. In this post, we use Amazon Kendra Web Crawler as the data source and include FAQs stored on Amazon Simple Storage Service (Amazon S3). See Data source connectors for a list of supported data source connectors for Amazon Kendra.

ML-16837-arch-diag

The step-by-step workflow for the architecture is the following:

  1. The user sends a query such as What is the AWS Well-Architected Framework? through the Slack app.
  2. The query goes to Amazon Lex, which identifies the intent.
  3. Currently two intents are configured in Amazon Lex (Welcome and FallbackIntent).
  4. The welcome intent is configured to respond with a greeting when a user enters a greeting such as "hi" or "hello." The assistant responds with "Hello! I can help you with queries based on the documents provided. Ask me a question."
  5. The fallback intent is fulfilled with a Lambda function.
    1. The Lambda function searches Amazon Kendra FAQs through the search_Kendra_FAQ method by taking the user query and Amazon Kendra index ID as inputs. If there's a match with a high confidence score, the answer from the FAQ is returned to the user.
      import boto3

      def search_Kendra_FAQ(question, kendra_index_id):
          """
          This function takes in the question from the user, and checks if the question exists in the Kendra FAQs.
          :param question: The question the user is asking that was asked via the frontend input text box.
          :param kendra_index_id: The Kendra index containing the documents and FAQs.
          :return: If found in FAQs, returns the answer along with any relevant links. If not, returns False and the
          kendra_retrieve_document function is called next.
          """
          kendra_client = boto3.client('kendra')
          response = kendra_client.query(IndexId=kendra_index_id, QueryText=question, QueryResultTypeFilter="QUESTION_ANSWER")
          for item in response['ResultItems']:
              score_confidence = item['ScoreAttributes']['ScoreConfidence']
              # Take answers only from FAQs that have a very high confidence score
              if score_confidence == 'VERY_HIGH' and len(item['AdditionalAttributes']) > 1:
                  text = item['AdditionalAttributes'][1]['Value']['TextWithHighlightsValue']['Text']
                  url = "None"
                  if item['DocumentURI'] != '':
                      url = item['DocumentURI']
                  return (text, url)
          return (False, False)

    2. If there isn't a match with a high enough confidence score, relevant documents from Amazon Kendra with a high confidence score are retrieved through the kendra_retrieve_document method and sent to Amazon Bedrock as the context for generating a response.
      import boto3

      def kendra_retrieve_document(question, kendra_index_id):
          """
          This function takes in the question from the user, and retrieves relevant passages based on the default PageSize of 10.
          :param question: The question the user is asking that was asked via the frontend input text box.
          :param kendra_index_id: The Kendra index containing the documents and FAQs.
          :return: Returns the context to be sent to the LLM and document URIs to be returned as relevant data sources.
          """
          kendra_client = boto3.client('kendra')
          documents = kendra_client.retrieve(IndexId=kendra_index_id, QueryText=question)
          text = ""
          uris = set()
          if len(documents['ResultItems']) > 0:
              for i in range(len(documents['ResultItems'])):
                  score_confidence = documents['ResultItems'][i]['ScoreAttributes']['ScoreConfidence']
                  if score_confidence == 'VERY_HIGH' or score_confidence == 'HIGH':
                      # Join high-confidence passages into one context string, separated by newlines
                      text += documents['ResultItems'][i]['Content'] + "\n"
                      uris.add(documents['ResultItems'][i]['DocumentURI'])
          return (text, uris)
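    Note that this method uses the Amazon Kendra Retrieve API rather than the Query API used for FAQ matching; Retrieve is designed to return longer, semantically relevant passages, which tend to work better as LLM context.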

    3. The response is generated from Amazon Bedrock with the invokeLLM method. The following is a snippet of the invokeLLM method within the fulfillment function. Read more on inference parameters and system prompts to modify the parameters that are passed into the Amazon Bedrock invoke model request.
      import json
      import boto3

      def invokeLLM(question, context, modelId):
          """
          This function takes in the question from the user, along with the Kendra responses as context, to generate an answer
          for the user on the frontend.
          :param question: The question the user is asking that was asked via the frontend input text box.
          :param context: The response from the Kendra document retrieve query, used as context to generate a better
          answer.
          :return: Returns the final answer that will be provided to the end user of the application who asked the original
          question.
          """
          # Set up the Bedrock runtime client
          bedrock = boto3.client('bedrock-runtime')

          # Body of data with parameters that is passed into the Bedrock invoke model request
          body = json.dumps({"max_tokens": 350,
                  "system": "You are a truthful AI assistant. Your goal is to provide informative and substantive responses to queries based on the documents provided. If you do not know the answer to a question, you truthfully say you do not know.",
                  "messages": [{"role": "user", "content": "Answer this user query:" + question + " with the following context:" + context}],
                  "anthropic_version": "bedrock-2023-05-31",
                  "temperature": 0,
                  "top_k": 250,
                  "top_p": 0.999})

          # Invoking the Bedrock model with your specifications
          response = bedrock.invoke_model(body=body,
                                          modelId=modelId)
          # The body of the response that was generated
          response_body = json.loads(response.get('body').read())
          # The 'content' field holds a list of content blocks; the answer text is in the first block
          answer = response_body.get('content')[0].get('text')
          # Returning the answer as a final result, which ultimately gets returned to the end user
          return answer

    4. Finally, the response generated from Amazon Bedrock, along with the relevant referenced URLs, is returned to the end user. (A sketch of how these pieces fit together follows this list.)
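Putting these steps together, the following is a minimal sketch of a fallback-intent fulfillment handler. It is illustrative only: the handler wiring, placeholder index ID, and response wording are assumptions, while the sessionState/dialogAction/messages shape is the standard Amazon Lex V2 fulfillment response format.

      import boto3

      def lambda_handler(event, context):
          """Illustrative sketch: try the FAQs first, then fall back to retrieval plus generation."""
          question = event['inputTranscript']  # The raw user utterance passed by Amazon Lex
          kendra_index_id = "YOUR_KENDRA_INDEX_ID"  # Placeholder; the CloudFormation stack wires in the real ID
          model_id = "anthropic.claude-3-sonnet-20240229-v1:0"

          # 1. Check the curated FAQs first
          answer, url = search_Kendra_FAQ(question, kendra_index_id)
          if not answer:
              # 2. No high-confidence FAQ match; retrieve documents and generate a response
              context_text, uris = kendra_retrieve_document(question, kendra_index_id)
              answer = invokeLLM(question, context_text, model_id) if context_text else "No relevant documents found"

          # 3. Return the answer in the Lex V2 fulfillment response format
          return {
              "sessionState": {
                  "dialogAction": {"type": "Close"},
                  "intent": {"name": event['sessionState']['intent']['name'], "state": "Fulfilled"}
              },
              "messages": [{"contentType": "PlainText", "content": str(answer)}]
          }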

When selecting websites to index, adhere to the AWS Acceptable Use Policy and other AWS terms. Remember that you can only use Amazon Kendra Web Crawler to index your own web pages or web pages that you have authorization to index. Visit the Amazon Kendra Web Crawler data source documentation to learn more about using the web crawler as a data source. Using Amazon Kendra Web Crawler to aggressively crawl websites or web pages you don't own is not considered acceptable use.

Supported features

The chat assistant supports the following features:

1. Support for the following Anthropic models on Amazon Bedrock (see the model ID note after this list):
  • claude-v2
  • claude-3-haiku-20240307-v1:0
  • claude-instant-v1
  • claude-3-sonnet-20240229-v1:0
2. Support for FAQs and the Amazon Kendra Web Crawler data source
3. Returns FAQ answers only if the confidence score is VERY_HIGH
4. Retrieves only documents from Amazon Kendra that have a HIGH or VERY_HIGH confidence score
5. If documents with a high confidence score aren't found, the chat assistant returns "No relevant documents found"
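When these models are invoked through the Amazon Bedrock API, their full model IDs carry an anthropic. prefix. The following minimal sketch shows how one of the supported models could be passed to the invokeLLM method; the question string and index ID are illustrative:

      # Full Amazon Bedrock model ID for one of the supported Anthropic models
      model_id = "anthropic.claude-3-sonnet-20240229-v1:0"
      # kendra_index_id is a placeholder for your own Amazon Kendra index ID
      context_text, uris = kendra_retrieve_document("What is the AWS Well-Architected Framework?", kendra_index_id)
      answer = invokeLLM("What is the AWS Well-Architected Framework?", context_text, model_id)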

Prerequisites

To implement the solution, you must have the following prerequisites:

• Basic knowledge of AWS
• An AWS account with access to Amazon S3 and Amazon Kendra
• An S3 bucket to store your documents. For more information, see Step 1: Create your first S3 bucket and the Amazon S3 User Guide.
• A Slack workspace to integrate the chat assistant
• Permission to install Slack apps in your Slack workspace
• Seed URLs for the Amazon Kendra Web Crawler data source (a configuration sketch follows this list)
  • You'll need authorization to crawl and index any websites provided
• AWS CloudFormation for deploying the solution resources
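In this solution, the CloudFormation template provisions the web crawler data source for you. For orientation only, the following is a minimal sketch of how an Amazon Kendra Web Crawler data source with seed URLs could be created through boto3; the data source name, index ID, role ARN, and seed URL are placeholders:

      import boto3

      kendra = boto3.client('kendra')
      # Sketch only: the CloudFormation template performs the equivalent setup in this solution
      kendra.create_data_source(
          Name='web-crawler-data-source',
          IndexId='YOUR_KENDRA_INDEX_ID',
          Type='WEBCRAWLER',
          RoleArn='arn:aws:iam::123456789012:role/YourKendraDataSourceRole',
          Configuration={
              'WebCrawlerConfiguration': {
                  'Urls': {
                      'SeedUrlConfiguration': {
                          'SeedUrls': ['https://aws.amazon.com/bedrock/']
                      }
                  }
              }
          }
      )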

Build a generative AI Slack chat assistant

To build the Slack application, use the following steps:

1. Request model access on Amazon Bedrock for all Anthropic models
2. Create an S3 bucket in the us-east-1 (N. Virginia) AWS Region
3. Upload the AIBot-LexJson.zip and SampleFAQ.csv files to the S3 bucket
4. Launch the CloudFormation stack in the us-east-1 (N. Virginia) AWS Region to create the solution resources (a scripted alternative using boto3 is sketched after this list)
5. Enter a Stack name of your choice
6. For S3BucketName, enter the name of the S3 bucket created in Step 2
7. For S3KendraFAQKey, enter the name of the SampleFAQ.csv file uploaded to the S3 bucket in Step 3
8. For S3LexBotKey, enter the name of the Amazon Lex .zip file uploaded to the S3 bucket in Step 3
9. For SeedUrls, enter up to 10 URLs for the web crawler as a comma-delimited list. In the example in this post, we give the publicly available Amazon Bedrock service page as the seed URL
10. Leave the rest as defaults. Choose Next. Choose Next again on the Configure stack options page
11. Acknowledge by selecting the box and choose Submit, as shown in the following screenshot
  ML-16837-cfn-checkbox
12. Wait for the stack creation to complete
13. Verify all resources are created
14. Test on the AWS Management Console for Amazon Lex
  1. On the Amazon Lex console, choose your chat assistant ${YourStackName}-AIBot
  2. Choose Intents
  3. Choose Version 1 and choose Test, as shown in the following screenshot
    ML-16837-lex-version1
  4. Select the AIBotProdAlias and choose Confirm, as shown in the following screenshot. If you want to make changes to the chat assistant, you can use the draft version, publish a new version, and assign the new version to the AIBotProdAlias. Learn more about Versioning and Aliases.
  5. Test the chat assistant with questions such as "Which AWS service has 11 nines of durability?" and "What is the AWS Well-Architected Framework?" and verify the responses. The following table shows the three FAQs in the sample .csv file.

    _question | _answer | _source_uri
    Which AWS service has 11 nines of durability? | Amazon S3 | https://aws.amazon.com/s3/
    What is the AWS Well-Architected Framework? | The AWS Well-Architected Framework enables customers and partners to evaluate their architectures using a consistent approach and provides guidance to improve designs over time. | https://aws.amazon.com/architecture/well-architected/
    In what Regions is Amazon Kendra available? | Amazon Kendra is currently available in the following AWS Regions: Northern Virginia, Oregon, and Ireland | https://aws.amazon.com/about-aws/global-infrastructure/regional-product-services/

  6. The following screenshot shows the question "Which AWS service has 11 nines of durability?" and its response. You can observe that the response is the same as in the FAQ file and includes a link.
    ML-16837-Q1inLex
  7. Based on the pages you have crawled, ask a question in the chat. For this example, the publicly available Amazon Bedrock page was crawled and indexed. The following screenshot shows the question, "What are agents in Amazon Bedrock?" and a generated response that includes relevant links.
    ML-16837-Q2inLex
15. For integration of the Amazon Lex chat assistant with Slack, see Integrating an Amazon Lex V2 bot with Slack. Choose the AIBotProdAlias under Alias in the Channel Integrations.
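If you prefer to script the deployment instead of clicking through the console, the following is a minimal boto3 sketch. The stack name, template URL, bucket name, and object keys are placeholders for your own values, and the required capabilities may differ depending on the template:

      import boto3

      cf = boto3.client('cloudformation')
      cf.create_stack(
          StackName='aibot-stack',  # Placeholder stack name
          TemplateURL='https://your-bucket.s3.amazonaws.com/AIBot-template.yaml',  # Placeholder template location
          Parameters=[
              {'ParameterKey': 'S3BucketName', 'ParameterValue': 'your-bucket'},
              {'ParameterKey': 'S3KendraFAQKey', 'ParameterValue': 'SampleFAQ.csv'},
              {'ParameterKey': 'S3LexBotKey', 'ParameterValue': 'AIBot-LexJson.zip'},
              {'ParameterKey': 'SeedUrls', 'ParameterValue': 'https://aws.amazon.com/bedrock/'},
          ],
          Capabilities=['CAPABILITY_NAMED_IAM'],  # Assumption: the template creates named IAM resources
      )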

Run sample queries to test the solution

1. In Slack, go to the Apps section. In the dropdown menu, choose Manage and select Browse apps.
  ML-16837-slackBrowseApps
2. Search for ${AIBot} in the App Directory and choose the chat assistant. This will add the chat assistant to the Apps section in Slack. You can now start asking questions in the chat. The following screenshot shows the question "Which AWS service has 11 nines of durability?" and its response. You can observe that the response is the same as in the FAQ file and includes a link.
  ML-16837-Q1slack
3. The following screenshot shows the question, "What is the AWS Well-Architected Framework?" and its response.
  ML-16837-Q2slack
4. Based on the pages you have crawled, ask a question in the chat. For this example, the publicly available Amazon Bedrock page was crawled and indexed. The following screenshot shows the question, "What are agents in Amazon Bedrock?" and a generated response that includes relevant links.
  ML-16837-Q3slack
5. The following screenshot shows the question, "What is amazon polly?" Because there is no Amazon Polly documentation indexed, the chat assistant responds with "No relevant documents found," as expected.
  ML-16837-Q4slack

These examples show how the chat assistant retrieves documents from Amazon Kendra and provides answers based on the documents retrieved. If no relevant documents are found, the chat assistant responds with "No relevant documents found."

Clean up

To clean up the resources created by this solution:

1. Delete the CloudFormation stack by navigating to the CloudFormation console
2. Select the stack you created for this solution and choose Delete
3. Confirm the deletion by entering the stack name in the provided field. This will remove all the resources created by the CloudFormation template, including the Amazon Kendra index, the Amazon Lex chat assistant, the Lambda function, and other related resources.
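If you want to script the cleanup as well, a minimal boto3 sketch follows; the stack name is a placeholder for the name you chose at deployment:

      import boto3

      cf = boto3.client('cloudformation')
      cf.delete_stack(StackName='aibot-stack')  # Placeholder: use your stack name
      # Optionally block until the deletion finishes
      cf.get_waiter('stack_delete_complete').wait(StackName='aibot-stack')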

Conclusion

This post describes the development of a generative AI Slack application powered by Amazon Bedrock and Amazon Kendra. It is designed to be an internal-facing Slack chat assistant that helps answer questions related to the indexed content. The solution architecture includes Amazon Lex for intent identification, a Lambda function for fulfilling the fallback intent, Amazon Kendra for FAQ searches and indexing crawled web pages, and Amazon Bedrock for generating responses. The post walks through the deployment of the solution using a CloudFormation template, provides instructions for running sample queries, and discusses the steps for cleaning up the resources. Overall, this post demonstrates how to use various AWS services to build a powerful generative AI-powered chat assistant application.

This solution demonstrates the power of generative AI in building intelligent chat assistants and search assistants. Explore the generative AI Slack chat assistant: invite your teams to a Slack workspace and start getting answers to questions about your indexed content and FAQs. Experiment with different use cases and see how you can harness the capabilities of services like Amazon Bedrock and Amazon Kendra to enhance your business operations. For more information about using Amazon Bedrock with Slack, refer to Deploy a Slack gateway for Amazon Bedrock.


About the authors

Kruthi Jayasimha Rao is a Partner Solutions Architect with a focus on AI and ML. She provides technical guidance to AWS Partners in following best practices to build secure, resilient, and highly available solutions in the AWS Cloud.

Mohamed Mohamud is a Partner Solutions Architect with a focus on Data Analytics. He specializes in streaming analytics, helping partners build real-time data pipelines and analytics solutions on AWS. With expertise in services like Amazon Kinesis, Amazon MSK, and Amazon EMR, Mohamed enables data-driven decision-making through streaming analytics.
