Announcing the updated Microsoft OneDrive connector (V2) for Amazon Kendra


Amazon Kendra is an intelligent search service powered by machine learning (ML), enabling organizations to provide relevant information to customers and employees when they need it.

Amazon Kendra uses ML algorithms to let users search with natural language queries for information scattered across multiple data sources in an enterprise, including commonly used document storage systems like Microsoft OneDrive.

OneDrive is an online cloud storage service that allows you to host your content and have it automatically sync across multiple devices. Amazon Kendra can index document formats like Microsoft OneNote, HTML, PDF, Microsoft Word, Microsoft PowerPoint, Microsoft Excel, Rich Text, JSON, XML, CSV, XSLT, and plain text.

We’re excited to announce that we have updated the OneDrive connector for Amazon Kendra to add even more capabilities. For example, we have added support to search OneNote documents. Additionally, you can now choose to use identity or ACL information to make your searches more granular.

The connector indexes documents together with their access control information, so that search results are limited to only those documents the user is allowed to access. To show search results based on user access rights using only the user information, the connector provides an identity crawler that loads principal information, such as user and group mappings, into a principal store.

In this post, we demonstrate how to configure multiple data sources in Amazon Kendra to provide a central place to search across your document repository.

Solution overview

For our solution, we demonstrate how to index a OneDrive repository or folder using the Amazon Kendra connector for OneDrive. The solution consists of the following steps:

  1. Create and configure an app on the Microsoft Azure Portal and get the authentication credentials.
  2. Create a OneDrive data source through the Amazon Kendra console.
  3. Index the data in the OneDrive repository.
  4. Run a sample query to get the information.
  5. Filter the query by users or groups.

Prerequisites

To try out the Amazon Kendra connector for OneDrive, you need the following:

Configure an Azure application and assign connection permissions

Before we set up the OneDrive data source, we need a few details about the OneDrive repository. Complete the following steps:

  1. Log in to Azure.
  2. After logging in with your account credentials, choose App registrations, then choose New registration.
  3. Give an appropriate name to your application and register the application.
  4. Collect the information about the client ID, tenant ID, and other details of the application.
  5. To get a client secret, choose Add a certificate or secret under Client credentials.
  6. Choose New client secret and provide the proper description and expiry.
  7. Note the client-id, tenant-id, and secret-id values. We use these for authenticating the OAuth2 application.
  8. Navigate to App, choose API permissions in the navigation pane, and choose Add a permission.
  9. Choose Microsoft Graph.
  10. Under Application permissions, enter File in the search bar and under Files, select Files.Read.All.
  11. Choose Add permissions.
  12. Similarly, add the following permissions on the Microsoft Graph option for the application you created:
    1. Group.Read.All
    2. Notes.Read.All

On completion, the API permissions will look like the following screenshot.
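To sanity-check the credentials before handing them to Amazon Kendra, you could request an app-only token from the Microsoft identity platform using the OAuth2 client credentials flow. The following is a minimal sketch rather than part of the connector setup: the helper name `build_token_request` and the placeholder IDs are illustrative assumptions, and the code only builds the request without sending it.

```python
from typing import Tuple
from urllib.parse import urlencode

# Microsoft identity platform v2.0 token endpoint (client credentials flow).
TOKEN_URL_TEMPLATE = "https://login.microsoftonline.com/{tenant_id}/oauth2/v2.0/token"
# The .default scope requests the application permissions granted in the portal.
GRAPH_DEFAULT_SCOPE = "https://graph.microsoft.com/.default"


def build_token_request(tenant_id: str, client_id: str, client_secret: str) -> Tuple[str, str]:
    """Return the (url, form-encoded body) for an app-only token request.

    Hypothetical helper for illustration; it does not perform any network call.
    """
    url = TOKEN_URL_TEMPLATE.format(tenant_id=tenant_id)
    body = urlencode({
        "grant_type": "client_credentials",
        "client_id": client_id,
        "client_secret": client_secret,
        "scope": GRAPH_DEFAULT_SCOPE,
    })
    return url, body


# Placeholder values; substitute the IDs noted from the app registration.
url, body = build_token_request("my-tenant-id", "my-client-id", "my-client-secret")
print(url)
```

Sending `body` as an `application/x-www-form-urlencoded` POST to `url` should return a JSON payload containing an access token if the client ID, tenant ID, and secret are valid.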

Configure the Amazon Kendra connector for OneDrive

To configure the Amazon Kendra connector, complete the following steps:

  1. On the Amazon Kendra console, choose Create an Index.
  2. For Index name, enter a name for the index (for example, my-onedrive-index).
  3. Enter an optional description.
  4. Choose Create a new role.
  5. For Role name, enter an IAM role name.
  6. Configure optional encryption settings and tags.
  7. Choose Next.
  8. In the Configure user access control section, select Yes under Access control settings.
  9. For Token type, choose JSON on the drop-down menu.
  10. Leave the remaining values as their default values.
  11. Choose Next.
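If you prefer to script this part, the console choices above roughly correspond to the parameters of the Amazon Kendra `CreateIndex` API. The following is a minimal sketch, not the method used in this post. It assumes that the default JSON token fields are named `username` and `groups`, that the role ARN is a placeholder, and that you use the Developer edition selected later in this walkthrough. The helper name `build_create_index_request` is hypothetical.

```python
def build_create_index_request(name: str, role_arn: str, description: str = "") -> dict:
    """Build keyword arguments for the Kendra create_index call.

    Hypothetical helper; the token field names below are assumed defaults.
    """
    request = {
        "Name": name,
        "Edition": "DEVELOPER_EDITION",
        "RoleArn": role_arn,
        # Enforce token-based access control on queries.
        "UserContextPolicy": "USER_TOKEN",
        "UserTokenConfigurations": [
            {
                "JsonTokenTypeConfiguration": {
                    "UserNameAttributeField": "username",  # assumed default
                    "GroupAttributeField": "groups",       # assumed default
                }
            }
        ],
    }
    if description:
        request["Description"] = description
    return request


# Placeholder ARN; replace with the role created for the index.
request = build_create_index_request(
    "my-onedrive-index",
    "arn:aws:iam::123456789012:role/AmazonKendra-us-west-2-onedrive",
)
print(request["UserContextPolicy"])
```

With boto3 installed and AWS credentials configured, the request could be submitted with `boto3.client("kendra").create_index(**request)`.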

Before we move to the next configuration step, we need to provide Amazon Kendra with a role that has the permissions necessary for connecting to the site. These include permission to get and decrypt the AWS Secrets Manager secret that contains the application ID and secret key necessary to connect to the OneDrive site.

  1. Open another tab for the AWS account, and on the IAM console, navigate to the role that you created earlier (for example, AmazonKendra-us-west-2-onedrive).
  2. Choose Add permissions and Create inline policy.
  3. For Service, choose Kendra.
  4. For Actions, select Write and specify BatchPutDocument.
  5. For Resources, choose All resources.
  6. Choose Review policy.
  7. For Name, enter a name (for example, BatchPutPolicy).
  8. Choose Create policy.
  9. Add this policy to the role you created.
  10. Additionally, attach the SecretsManagerReadWrite AWS managed policy to the role.
  11. Return to the Amazon Kendra tab.
  12. Select Developer edition and choose Create.

This creates and propagates the IAM role and then creates the Amazon Kendra index, which can take up to 30 minutes.
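The inline policy built in the IAM steps above can also be written as a JSON policy document. The following sketch shows one way to express it and attach it with boto3. The function names are illustrative assumptions, and `attach_policies` needs boto3 and AWS credentials, so it is defined here but not called.

```python
import json


def build_batch_put_policy(resource: str = "*") -> dict:
    """Build an inline policy document allowing kendra:BatchPutDocument.

    The default "*" mirrors the All resources choice in the console steps.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["kendra:BatchPutDocument"],
                "Resource": resource,
            }
        ],
    }


def attach_policies(role_name: str, policy_name: str = "BatchPutPolicy") -> None:
    """Attach the inline policy and the SecretsManagerReadWrite managed policy."""
    import boto3  # imported lazily so the policy builder runs without boto3

    iam = boto3.client("iam")
    iam.put_role_policy(
        RoleName=role_name,
        PolicyName=policy_name,
        PolicyDocument=json.dumps(build_batch_put_policy()),
    )
    iam.attach_role_policy(
        RoleName=role_name,
        PolicyArn="arn:aws:iam::aws:policy/SecretsManagerReadWrite",
    )


print(json.dumps(build_batch_put_policy(), indent=2))
```

In a production setup you may want to pass the index ARN as `resource` instead of `*` to narrow the permission.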

  1. Return to the Amazon Kendra console, choose Data sources in the navigation pane, and choose Add data source.
  2. Under OneDrive connector V2.0, choose Add connector.
  3. For Data source name, enter a name (for example, my-onedrive).
  4. Enter an optional description.
  5. Choose Next.
  6. For OneDrive Tenant ID, enter the tenant ID you gathered earlier.
  7. For Configure VPC and security group, leave the default (No VPC).
  8. Keep Identity crawler is on selected. This imports identity information into the index.
  9. For IAM role, choose Create a new role.
  10. Enter a role name, such as AmazonKendra-us-west-2-onedrive, then choose Next.
  11. In the Authentication section, choose Create and add a secret.
  12. Create a secret with clientId and clientSecret as keys.
  13. Add their respective values with the information you collected earlier.
  14. Choose Next.
  15. In the Configure sync settings section, add the OneDrive users whose documents you want to index.
  16. Select the sync mode for the index. For this post, we select New, modified or deleted content sync.
  17. Choose the frequency of indexing as Run on demand, then choose Next.
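The secret created in the Authentication section above is a small JSON document with the `clientId` and `clientSecret` keys. As a rough sketch, the same secret could be created programmatically with AWS Secrets Manager. The function names and the example secret name are assumptions for illustration, and `create_onedrive_secret` requires boto3 and AWS credentials, so it is defined but not invoked here.

```python
import json


def build_onedrive_secret(client_id: str, client_secret: str) -> str:
    """Serialize the credentials using the key names the connector expects."""
    return json.dumps({"clientId": client_id, "clientSecret": client_secret})


def create_onedrive_secret(name: str, client_id: str, client_secret: str) -> str:
    """Create the secret in AWS Secrets Manager and return its ARN."""
    import boto3  # imported lazily so the builder above runs without boto3

    secrets = boto3.client("secretsmanager")
    response = secrets.create_secret(
        Name=name,  # hypothetical name, e.g. "my-onedrive-credentials"
        SecretString=build_onedrive_secret(client_id, client_secret),
    )
    return response["ARN"]


# Placeholder values; use the client ID and secret from the Azure app registration.
print(build_onedrive_secret("my-client-id", "my-client-secret"))
```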

Field mappings allow you to set the searchability and relevance of fields. For example, the lastUpdatedAt field can be used to sort or boost the ranking of documents based on how recently they were updated.

  1. Keep all the defaults in the Set field mappings section and choose Next.
  2. On the review page, choose Add data source.
  3. Choose Sync now.

The sync can take up to 30 minutes to complete.
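Instead of choosing Sync now and watching the console, you could start the sync and poll its status through the Amazon Kendra API. The sketch below takes the client as a parameter so that it can be exercised without AWS access. The function name `run_sync`, the `TIMED_OUT` sentinel, and the default of 30 polls at 60-second intervals (matching the 30-minute estimate above) are assumptions, and the code assumes the new job appears on the first page of the sync history.

```python
import time

# Statuses after which a sync job no longer changes.
TERMINAL_STATES = {"SUCCEEDED", "FAILED", "ABORTED", "INCOMPLETE"}


def run_sync(kendra, index_id: str, data_source_id: str,
             poll_seconds: float = 60, max_polls: int = 30) -> str:
    """Start a data source sync job and poll until it reaches a terminal state.

    `kendra` is expected to behave like boto3.client("kendra").
    Returns the final status, or "TIMED_OUT" (a sentinel defined by this
    sketch, not a Kendra status) if the job is still running after max_polls.
    """
    execution_id = kendra.start_data_source_sync_job(
        Id=data_source_id, IndexId=index_id
    )["ExecutionId"]

    for _ in range(max_polls):
        history = kendra.list_data_source_sync_jobs(
            Id=data_source_id, IndexId=index_id
        )["History"]
        # Look up our job by execution ID rather than relying on list order.
        status = next(
            (job["Status"] for job in history if job["ExecutionId"] == execution_id),
            None,
        )
        if status in TERMINAL_STATES:
            return status
        time.sleep(poll_seconds)
    return "TIMED_OUT"
```

With boto3 installed, a call might look like `run_sync(boto3.client("kendra"), "<index-id>", "<data-source-id>")`, where both IDs are placeholders for the values shown on the Amazon Kendra console.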

Test the solution

Now that you have indexed the content from OneDrive, you can test it by querying the index.

  1. Go to your index on the Amazon Kendra console and choose Search indexed content in the navigation pane.
  2. Enter a search term and press Enter.

Notice that without a token, the ACLs prevent a search result from being returned.

  1. Expand Test query with an access token and choose Apply token.
  2. Enter the appropriate token with a user who has permissions to read the file and choose Apply.
  3. Search for information present in OneDrive again.

You can verify that Amazon Kendra presents the ranked results as expected.
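The same token-scoped search can be issued through the Amazon Kendra `Query` API by passing the token in `UserContext`. The following is a hedged sketch: it assumes the index uses the JSON token type with the field names `username` and `groups`, and the helper name `build_query_request`, the index ID, and the user shown are placeholders. The code only builds the request so that it runs without AWS access.

```python
import json
from typing import Iterable, Optional


def build_query_request(index_id: str, query_text: str, username: str,
                        groups: Optional[Iterable[str]] = None) -> dict:
    """Build keyword arguments for a Kendra query filtered by user context.

    The token field names are assumed to match the index's JSON token settings.
    """
    token = json.dumps({"username": username, "groups": list(groups or [])})
    return {
        "IndexId": index_id,
        "QueryText": query_text,
        "UserContext": {"Token": token},
    }


# Placeholder values for illustration.
request = build_query_request(
    "my-index-id", "quarterly report", "user@example.com", ["finance"]
)
print(request["UserContext"]["Token"])
```

With boto3 installed, `boto3.client("kendra").query(**request)` should return only the documents that the user or their groups are allowed to read. Each entry in the response's `ResultItems` list carries the document title under `DocumentTitle`.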

Congratulations, you have configured Amazon Kendra to index and search documents in OneDrive and control access to them using ACLs.

Conclusion

With the Microsoft OneDrive V2 connector for Amazon Kendra, organizations can tap into commonly used enterprise document stores securely, using intelligent search powered by Amazon Kendra. You can enhance the search experience by integrating the data source with the Custom Document Enrichment (CDE) capability in Amazon Kendra to perform additional attribute mapping logic and even custom content transformation during ingestion.


About the authors

Pravinchandra Varma is a Senior Customer Delivery Architect with the AWS Professional Services team and is passionate about applications of machine learning and artificial intelligence services.

Supratim Barat is a Software Developer Engineer with AWS Kendra Yellowbadge Group and is a blockchain and cybersecurity enthusiast.
