Index your Confluence content using the new Confluence connector V2 for Amazon Kendra


Amazon Kendra is a highly accurate and simple-to-use intelligent search service powered by machine learning (ML). Amazon Kendra offers a suite of data source connectors to simplify the process of ingesting and indexing your content, wherever it resides.

Valuable data in organizations is stored in both structured and unstructured repositories. An enterprise search solution should be able to pull together data across several structured and unstructured repositories to index and search on.

One such unstructured data repository is Confluence. Confluence is a team workspace that gives knowledge worker teams a place to create, capture, and collaborate on any project or idea. Team spaces help teams structure, organize, and share work, so every team member has visibility into institutional knowledge and access to the information they need.

There are two Confluence offerings:

  • Cloud – This is offered as a software as a service (SaaS) product. It's always on, continuously updated, and highly secure.
  • Data Center (self-managed) – Here, you host Confluence on your infrastructure, which could be on premises or in the cloud. This allows you to keep data within your network and manage it yourself.

We're excited to announce that you can now use the new Amazon Kendra connector V2 for Confluence to search information stored in your Confluence account, both in the cloud and in your data center. In this post, we show how to index information stored in Confluence and use the Amazon Kendra intelligent search function. In addition, the ML-powered intelligent search can accurately find information in unstructured documents with natural language narrative content, for which keyword search is not very effective.

What's new in this version

This version supports OAuth 2.0 authentication in addition to basic authentication for the Cloud edition. For the Data Center (on-premises) edition, we have added OAuth2 in addition to basic authentication and personal access tokens for showing search results based on user access rights. You can benefit from the following features:

  • You can now crawl comments in addition to spaces, pages, blogs, and attachments
  • You now have fine-grained choices for your sync scope: you can specify pages, blogs, comments, and attachments
  • You can choose to import identities (or not)
  • This version offers regex support for choosing entity titles as well as file types
  • You have the choice of multiple sync modes
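
As a rough illustration of how regex patterns can narrow a sync by entity title or file type, the following Python sketch filters a few sample names. The patterns, the sample titles and file names, and the use of Python's re module are all assumptions made for illustration; the connector applies its own pattern matching during the crawl, and its exact regex dialect is not covered here.

```python
import re

# Hypothetical patterns: titles starting with "Runbook", and PDF or DOCX files.
TITLE_PATTERN = re.compile(r"^Runbook\b.*")
FILE_TYPE_PATTERN = re.compile(r".*\.(pdf|docx)$", re.IGNORECASE)


def select(names, pattern):
    """Return the names that fully match the given pattern."""
    return [name for name in names if pattern.fullmatch(name)]


# Made-up sample data, not taken from a real Confluence site.
titles = ["Runbook: failover", "Team offsite notes", "Runbook - patching"]
files = ["design.PDF", "notes.txt", "plan.docx"]

print(select(titles, TITLE_PATTERN))
print(select(files, FILE_TYPE_PATTERN))
```

Trying patterns on a handful of known titles before a sync is a cheap way to catch an overly broad or overly narrow expression.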

Solution overview

With Amazon Kendra, you can configure multiple data sources to provide a central place to search across your document repository. For our solution, we demonstrate how to index a Confluence repository using the Amazon Kendra connector for Confluence. The solution consists of the following steps:

  1. Choose an authentication mechanism.
  2. Configure an app on Confluence and get the connection details.
  3. Store the details in AWS Secrets Manager.
  4. Create a Confluence data source V2 via the Amazon Kendra console.
  5. Index the data in the Confluence repository.
  6. Run a sample query to test the solution.

Prerequisites

To try out the Amazon Kendra connector for Confluence, you need the following:

Choose an authentication mechanism

Choose your preferred authentication method:

  • Basic – This works on both the Cloud and Data Center editions. You need a user ID and a password to configure this method.
  • Personal access token – This option only works for the Data Center edition.
  • OAuth2 – This is more involved and works for both Cloud and Data Center editions.

Gather authentication details

In this section, we show the steps to gather your authentication details depending on your authentication method.

Basic authentication

For basic authentication with the Data Center edition, all you need is your login and password. Make sure your login has privileges to gather all content.

For the Cloud edition, your user ID serves as your user login. For your password, you need to get a token. Complete the following steps:

  1. Log in to https://id.atlassian.com/manage-profile/security/api-tokens and choose Create API token.
  2. For Label, enter a name for the token.
  3. Choose Create.
  4. Copy the value and save it to use as your password.
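
If you want to sanity-check the user ID and token pair outside Amazon Kendra, the following sketch shows how the two values combine into a standard HTTP Basic Authorization header (user ID and password joined by a colon, then Base64 encoded). The helper name and the placeholder values are hypothetical, and which Confluence endpoint you call with the header is left to you.

```python
import base64


def basic_auth_header(user_id: str, api_token: str) -> dict:
    """Build an HTTP Basic Authorization header from a user ID and API token."""
    raw = f"{user_id}:{api_token}".encode("utf-8")
    return {"Authorization": "Basic " + base64.b64encode(raw).decode("ascii")}


# Placeholder values, not real credentials.
headers = basic_auth_header("user@example.com", "YOUR_API_TOKEN")
print(headers)
```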

Personal access token

This authentication method works for on premises (Data Center) only. Complete the following steps to acquire authentication details:

  1. Log in to your Confluence URL using the user ID and password that you want Amazon Kendra to use while retrieving content.
  2. Choose the profile icon and choose Settings.
  3. Choose Personal Access Tokens in the navigation pane, then choose Create token.
  4. For Token name, enter a name.
  5. For Expiry date, deselect Automatic expiry.
  6. Choose Create.
  7. Copy the token and save it in a safe place.

To configure Secrets Manager, we use the login URL and this value.

OAuth2 authentication for Confluence Cloud edition

This authentication method follows the full OAuth 2.0 (3LO) documentation from Confluence. We first create and configure an app on Confluence and enable it for OAuth2. The process is slightly different for the Cloud and Data Center editions. We then get an authorization code and exchange it for an access token. Finally, we get the client ID, client secret, and client code. Complete the following steps:

  1. Log in to the Confluence app.
  2. Navigate to https://developer.atlassian.com/.
  3. Next to My apps, choose Create and choose OAuth2 Integration.

  1. For Name, enter a name.
  2. Choose Create.

  1. Choose Authorization in the navigation pane.
  2. Choose Add next to your authorization type.

  1. For Callback URL, enter the URL you use to log in to Confluence.
  2. Choose Save changes.

  1. Under Authorization URL generator, choose Add APIs.

  1. Next to User identity API, choose Add, then choose Configure.

  1. Choose Edit Scopes to configure read scopes for the app.
  2. Select View active user profile and View user profiles.

  1. Choose Permissions in the navigation pane.
  2. Next to Confluence API, choose Add, then choose Configure.
  3. On the Classic scopes tab, choose Edit Scopes.
  4. Select all read, search, and download scopes.
  5. Choose Save.

  1. On the Granular scopes tab, choose Edit Scopes.
  2. Search for read and select all the scopes found.
  3. Choose Save.

  1. Choose Authorization in the navigation pane.
  2. Next to your authorization type, choose Configure.

You should see three URLs listed.

  1. Copy the code for Granular Confluence API authorization URL.

The following is example code:

https://auth.atlassian.com/authorize?
audience=api.atlassian.com
&client_id=YOUR_CLIENT_ID
&scope=REQUESTED_SCOPE%20REQUESTED_SCOPE_TWO
&redirect_uri=https://YOUR_APP_CALLBACK_URL
&state=YOUR_USER_BOUND_VALUE
&response_type=code
&prompt=consent

  1. If you want to generate a refresh token so that you don't have to repeat this process, add offline_access (or %20offline_access) to the end of all the scopes in the URL (for example, &scope=REQUESTED_SCOPE%20REQUESTED_SCOPE_TWO%20offline_access).
  2. If you're okay generating a new token each time, just enter the URL in your browser.
  3. Choose Accept.

You're redirected to your Confluence home page.

  1. Check the browser URL and locate code=xxxxx.
  2. Copy this code and save it.

This is the authorization code that we exchange for the access token.
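
The URL handling in these steps can be sketched in Python as follows: one helper assembles the authorization URL (optionally appending offline_access to the scopes), and another pulls the code parameter out of the URL the browser is redirected to. The function names are hypothetical and the scope, client ID, state, and redirect values are the same placeholders used in the example URL; this is an illustration of the URL format, not a replacement for the URL generated in the developer console.

```python
from urllib.parse import parse_qs, quote, urlencode, urlparse

AUTHORIZE_ENDPOINT = "https://auth.atlassian.com/authorize"


def build_authorize_url(client_id, scopes, redirect_uri, state, offline=False):
    """Assemble the authorization URL; offline=True appends offline_access."""
    scope_list = list(scopes) + (["offline_access"] if offline else [])
    params = {
        "audience": "api.atlassian.com",
        "client_id": client_id,
        "scope": " ".join(scope_list),
        "redirect_uri": redirect_uri,
        "state": state,
        "response_type": "code",
        "prompt": "consent",
    }
    # quote_via=quote encodes the spaces between scopes as %20 rather than "+".
    return AUTHORIZE_ENDPOINT + "?" + urlencode(params, quote_via=quote)


def extract_code(redirected_url):
    """Return the value of the code query parameter, or None if it is absent."""
    values = parse_qs(urlparse(redirected_url).query).get("code", [])
    return values[0] if values else None


url = build_authorize_url(
    "YOUR_CLIENT_ID",
    ["REQUESTED_SCOPE", "REQUESTED_SCOPE_TWO"],
    "https://YOUR_APP_CALLBACK_URL",
    "YOUR_USER_BOUND_VALUE",
    offline=True,
)
print(url)
```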

  1. Return to the Atlassian developer console and choose Settings in the navigation pane.
  2. Copy the values of the client ID and secret and save them.

We need these values to make a call to exchange the authorization code for the access token.

Next, we use the Postman utility to post the authorization code to get the access token. You can use alternate tools like curl to do this as well.

  1. The URL to post the authorization code is https://auth.atlassian.com/oauth/token.
  2. The JSON body to post is as follows:
    {"grant_type": "authorization_code",
    "client_id": "YOUR_CLIENT_ID",
    "client_secret": "YOUR_CLIENT_SECRET",
    "code": "YOUR_AUTHORIZATION_CODE",
    "redirect_uri": "https://YOUR_APP_CALLBACK_URL"}

The grant_type parameter is hard-coded. We collected the values for client_id and client_secret in a previous step. The value for code is the authorization code we collected earlier.

A successful response returns the access token. If you added offline_access to the URL earlier, you also get a refresh token.
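
For readers who prefer a script to Postman or curl, the following is a minimal sketch of the same POST using only the Python standard library. It prepares the request without sending it; the function name is hypothetical and all credential values are placeholders you would replace with your own.

```python
import json
import urllib.request

TOKEN_URL = "https://auth.atlassian.com/oauth/token"


def build_token_request(client_id, client_secret, code, redirect_uri):
    """Prepare (but do not send) the POST that exchanges the code for a token."""
    body = {
        "grant_type": "authorization_code",
        "client_id": client_id,
        "client_secret": client_secret,
        "code": code,
        "redirect_uri": redirect_uri,
    }
    return urllib.request.Request(
        TOKEN_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_token_request(
    "YOUR_CLIENT_ID",
    "YOUR_CLIENT_SECRET",
    "YOUR_AUTHORIZATION_CODE",
    "https://YOUR_APP_CALLBACK_URL",
)
# To send it: tokens = json.load(urllib.request.urlopen(request))
```

Keeping the request construction separate from the network call makes it easy to inspect the body before any secret leaves your machine.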

  1. Save the access token to use when setting up Secrets Manager.

The access token is valid for only 1 hour. If you need a new token, you can start all over again. However, if you have the refresh token, use Postman as before to post to the following URL: https://auth.atlassian.com/oauth/token. Use the following JSON format for the body of the request:

{"grant_type": "refresh_token",
"client_id": "YOUR_CLIENT_ID",
"client_secret": "YOUR_CLIENT_SECRET",
"refresh_token": "YOUR_REFRESH_TOKEN"}

The call returns a new access token.
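
Because the token lasts only 1 hour, a script that manages it needs to decide when to refresh. The sketch below pairs a simple expiry check with a builder for the refresh body shown above. The function names and the 5-minute safety margin are assumptions for illustration, not values taken from Atlassian or Amazon Kendra.

```python
import json
import time

TOKEN_LIFETIME_SECONDS = 3600  # the 1-hour validity described above


def needs_refresh(issued_at, now=None, margin=300):
    """True when the token is within `margin` seconds of its 1-hour expiry."""
    now = time.time() if now is None else now
    return now >= issued_at + TOKEN_LIFETIME_SECONDS - margin


def refresh_body(client_id, client_secret, refresh_token):
    """Return the JSON body for the refresh call, in the format shown above."""
    return json.dumps({
        "grant_type": "refresh_token",
        "client_id": client_id,
        "client_secret": client_secret,
        "refresh_token": refresh_token,
    })


print(refresh_body("YOUR_CLIENT_ID", "YOUR_CLIENT_SECRET", "YOUR_REFRESH_TOKEN"))
```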

OAuth2 authentication for Confluence Data Center edition

If using the Data Center edition with OAuth2 authentication, complete the following steps:

  1. Log in to Confluence Data Center edition.
  2. Choose the gear icon, then choose General configuration.
  3. In the navigation pane, choose Application links, then choose Create link.
  4. In the Create link pop-up window, select External application and Incoming, then choose Continue.
  5. For Name, enter a name.
  6. For Redirect URL, enter https://httpbin.org/.
  7. Choose Save.
  8. Copy and save the values for the client ID and client secret.
  9. On a separate browser tab, open the URL https://example-app.com/pkce.
  10. Choose Generate Random String and Calculate Hash.
  11. Copy the value under Code Challenge.
  12. Return to your original tab.
  13. Use the following URL to get the authorization code:
    https://<confluence url>/rest/oauth2/latest/authorize
    ?client_id=CLIENT_ID
    &redirect_uri=REDIRECT_URI
    &response_type=code
    &scope=SCOPE
    &code_challenge=CODE_CHALLENGE
    &code_challenge_method=S256

Use the client ID you copied earlier, and https://httpbin.org for the redirect URI. For CODE_CHALLENGE, enter the code you copied earlier.
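
If you would rather not rely on a web page for the code challenge, the S256 method named in the URL can be computed locally. The following sketch generates a random code verifier and derives the challenge as the unpadded base64url encoding of its SHA-256 hash, which is how the PKCE specification (RFC 7636) defines S256. The helper names are hypothetical.

```python
import base64
import hashlib
import secrets


def make_code_verifier(num_bytes=32):
    """Random URL-safe string; keep it private until the token request."""
    return secrets.token_urlsafe(num_bytes)


def make_code_challenge(verifier):
    """S256 challenge: base64url(SHA-256(verifier)) without '=' padding."""
    digest = hashlib.sha256(verifier.encode("ascii")).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")


verifier = make_code_verifier()
challenge = make_code_challenge(verifier)
print("CODE_VERIFIER: ", verifier)
print("CODE_CHALLENGE:", challenge)
```

Save both values: the challenge goes into the authorization URL, and the verifier is needed later when requesting the token.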

  1. Choose Allow.

You're redirected to httpbin.org.

  1. Save the code to use in the next step.

  1. To get the access token and refresh token, use a tool such as curl or Postman to post the following values to https://<your confluence URL>/rest/oauth2/latest/token:
    grant_type: authorization_code
    client_id: YOUR_CLIENT_ID
    client_secret: YOUR_CLIENT_SECRET
    code: YOUR_AUTHORIZATION_CODE
    code_verifier: CODE_VERIFIER
    redirect_uri: YOUR_REDIRECT_URL

Use the client ID, client secret, and authorization code you saved earlier. For CODE_VERIFIER, enter the value from when you generated the code challenge.
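
As an alternative to curl or Postman, the request can be prepared with the Python standard library. This sketch assumes the token endpoint accepts the values as form-encoded fields, which is an assumption on our part; adjust the encoding if your Confluence version expects something else. The function name, base URL, and all credential values are placeholders.

```python
import urllib.request
from urllib.parse import urlencode


def build_dc_token_request(base_url, client_id, client_secret, code,
                           code_verifier, redirect_uri):
    """Prepare (but do not send) the Data Center token request."""
    form = urlencode({
        "grant_type": "authorization_code",
        "client_id": client_id,
        "client_secret": client_secret,
        "code": code,
        "code_verifier": code_verifier,
        "redirect_uri": redirect_uri,
    })
    return urllib.request.Request(
        base_url.rstrip("/") + "/rest/oauth2/latest/token",
        data=form.encode("ascii"),
        # Assumption: the endpoint reads form-encoded parameters.
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


dc_request = build_dc_token_request(
    "https://confluence.example.com",  # placeholder for your Confluence URL
    "YOUR_CLIENT_ID",
    "YOUR_CLIENT_SECRET",
    "YOUR_AUTHORIZATION_CODE",
    "CODE_VERIFIER",
    "https://httpbin.org/",
)
```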

  1. Copy the access token and refresh token to use later.

The access token and refresh token are valid for only 1 hour. To refresh the token, post the following values to the same URL to get new values:

grant_type: refresh_token
client_id: YOUR_CLIENT_ID
client_secret: YOUR_CLIENT_SECRET
refresh_token: REFRESH_TOKEN
redirect_uri: YOUR_REDIRECT_URL

The new tokens are valid for 1 hour.

Store Confluence credentials in Secrets Manager

To store your Confluence credentials in Secrets Manager, complete the following steps:

  1. On the Secrets Manager console, choose Store a new secret.
  2. Select Other type of secret.

  1. Depending on the type of secret, enter the key-values as follows:
    • For Confluence Cloud basic authentication, enter the following key-value pairs (note that the password is not the login password, but the token you created earlier):
      "username": "<your login username>",
      "password": "<your token value>"

    • For Confluence Cloud OAuth authentication, enter the following key-value pairs:
      "confluenceAppKey": "<your client ID>",
      "confluenceAppSecret": "<your client secret>",
      "confluenceAccessToken": "<your access token>",
      "confluenceRefreshToken": "<your refresh token>"

    • For Confluence Data Center basic authentication, enter the following key-value pairs:
      "username": "<login username>",
      "password": "<login password>"

    • For Confluence Data Center personal access token authentication, enter the following key-value pairs:
      "patToken": "<your personal access token>"

    • For Confluence Data Center OAuth authentication, enter the following key-value pairs:
      "confluenceAppKey": "<your client ID>",
      "confluenceAppSecret": "<your client secret>",
      "confluenceAccessToken": "<your access token>",
      "confluenceRefreshToken": "<your refresh token>"

  1. Choose Next.

  1. For Secret name, enter a name (for example, AmazonKendra-my-confluence-secret).
  2. Enter an optional description.
  3. Choose Next.

  1. In the Configure rotation section, keep all settings at their defaults and choose Next.

  1. On the Review page, choose Store.
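
A mistyped key name is an easy mistake to make when entering the key-value pairs listed above. The following sketch checks a set of values against the keys for each authentication option and serializes them as a JSON string. The grouping names ("basic", "pat", "oauth2") and the helper are hypothetical; only the key names come from this post. If you script the secret creation instead of using the console, the resulting string is the kind of value you could pass as SecretString to the Secrets Manager create_secret call in boto3.

```python
import json

# Key names from the key-value pairs listed above; the group labels are ours.
REQUIRED_KEYS = {
    "basic": {"username", "password"},
    "pat": {"patToken"},
    "oauth2": {
        "confluenceAppKey",
        "confluenceAppSecret",
        "confluenceAccessToken",
        "confluenceRefreshToken",
    },
}


def build_secret_string(auth_type, values):
    """Validate the keys for auth_type and return the secret as a JSON string."""
    required = REQUIRED_KEYS[auth_type]
    missing = required - set(values)
    unexpected = set(values) - required
    if missing or unexpected:
        raise ValueError(
            f"missing={sorted(missing)} unexpected={sorted(unexpected)}"
        )
    return json.dumps(values)


# Placeholder value, not a real token.
secret_string = build_secret_string("pat", {"patToken": "YOUR_PERSONAL_ACCESS_TOKEN"})
print(secret_string)
```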

Configure the Amazon Kendra connector for Confluence

To configure the Amazon Kendra connector, complete the following steps:

  1. On the Amazon Kendra console, choose Create an Index.

  1. For Index name, enter a name for the index (for example, my-confluence-index).
  2. Enter an optional description.
  3. For Role name, enter an IAM role name.
  4. Configure optional encryption settings and tags.
  5. Choose Next.

  1. In the Configure user access control section, leave the settings at their defaults and choose Next.

  1. In the Specify provisioning section, select Developer edition and choose Next.

  1. On the review page, choose Create.

This creates and propagates the IAM role and then creates the Amazon Kendra index, which can take up to 30 minutes.
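
The console steps above can also be approximated in code. The following is a minimal sketch, assuming kendra_client is a boto3 Kendra client (boto3.client("kendra")) and that an IAM role for the index already exists; unlike the console flow, this sketch does not create the role for you. The helper names and the role ARN are placeholders.

```python
def index_request(name, role_arn, description=""):
    """Keyword arguments for the Kendra CreateIndex call (Developer edition)."""
    kwargs = {"Name": name, "Edition": "DEVELOPER_EDITION", "RoleArn": role_arn}
    if description:
        kwargs["Description"] = description
    return kwargs


def create_index(kendra_client, name, role_arn, description=""):
    """Create the index and return its ID.

    kendra_client is assumed to be boto3.client("kendra").
    """
    response = kendra_client.create_index(**index_request(name, role_arn, description))
    return response["Id"]


# Example (requires AWS credentials and boto3):
#   import boto3
#   index_id = create_index(boto3.client("kendra"), "my-confluence-index",
#                           "arn:aws:iam::111122223333:role/YOUR_KENDRA_ROLE")
print(index_request("my-confluence-index", "arn:aws:iam::111122223333:role/YOUR_KENDRA_ROLE"))
```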

Create a Confluence data source

Complete the following steps to create your data source:

  1. On the Amazon Kendra console, choose Data sources in the navigation pane.
  2. Under Confluence connector V2.0, choose Add connector.

  1. For Data source name, enter a name (for example, my-Confluence-data-source).
  2. Enter an optional description.
  3. Choose Next.

  1. Choose either Confluence Cloud or Confluence Server depending on your data source.
  2. For Authentication, choose your authentication option.
  3. Select Identity crawler is on.
  4. For IAM role, choose Create a new role.
  5. For Role name, enter a name (for example, AmazonKendra-my-confluence-datasource-role).
  6. Choose Next.

For Confluence Data Center and Cloud editions, we can add additional optional information (not shown) like the VPC. For Data Center edition only, we can add additional information for the web proxy. There's also an additional authentication option if using a personal access token, which is valid only for Data Center and not Cloud edition.

  1. For Sync scope, select all the content to sync.
  2. For Sync mode, select Full sync.
  3. For Frequency, choose Run on demand.
  4. Choose Next.

  1. Optionally, you can set mapping fields.

Mapping fields is a useful exercise where you can replace field names with values that are user-friendly and fit your organization's vocabulary.

  1. For this post, keep all defaults and choose Next.

  1. Review the settings and choose Add data source.
  2. To sync the data source, choose Sync now.

A banner message appears when the sync is complete.
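
If you prefer to trigger the sync from a script rather than choosing Sync now, the following sketch starts a sync job and polls its history until the job leaves the in-progress states. It assumes kendra_client is a boto3 Kendra client; the function name, the polling interval, and the choice of which statuses count as in progress are our assumptions.

```python
import time

IN_PROGRESS = ("SYNCING", "SYNCING_INDEXING", "STOPPING")


def sync_and_wait(kendra_client, index_id, data_source_id,
                  poll_seconds=60, sleep=time.sleep):
    """Start a sync job and poll until it is no longer in progress.

    kendra_client is assumed to be boto3.client("kendra").
    Returns the final status string reported for the job.
    """
    execution_id = kendra_client.start_data_source_sync_job(
        Id=data_source_id, IndexId=index_id
    )["ExecutionId"]
    while True:
        history = kendra_client.list_data_source_sync_jobs(
            Id=data_source_id, IndexId=index_id
        )["History"]
        # Treat a job that is not yet listed as still syncing.
        status = next(
            (job["Status"] for job in history if job["ExecutionId"] == execution_id),
            "SYNCING",
        )
        if status not in IN_PROGRESS:
            return status
        sleep(poll_seconds)
```

The sleep parameter is injectable so the loop can be exercised without real waiting, for example in a unit test.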

Test the solution

Now that you have ingested the content from your Confluence account into your Amazon Kendra index, you can test some queries. For the purposes of our test, we have created a Confluence website with two teams: team1 with the member Analyst1 and team2 with the member Analyst2.

  1. On the Amazon Kendra console, navigate to your index and choose Search indexed content.
  2. Enter a sample search query and review your search results (your results will vary based on the contents of your account).

The Confluence connector also crawls local identity information from Confluence. You can use this feature to narrow down your query by user. Confluence offers comprehensive visibility options. Users can choose their content to be visible to other users, at a space level, or by groups. When you filter your searches by users, the query returns only those documents that the user has access to at the time of ingestion.

  1. To use this feature, expand Test query with user name or groups and choose Apply user name or groups.
  2. Enter the user name of your user and choose Apply.

Note that for Confluence Data Center edition, the user name is the email ID.

Rerun your search query.

This brings you a filtered set of results. Notice we bring back just 62 results.

We now go back and restrict Bob Straham to just be able to access his workspace and run the search again.

Notice that we get just a subset of the results because the search is restricted to just Bob's content.

When fronting Amazon Kendra with an application, such as one built using Experience Builder, you can pass the user identity (in the form of the email ID for Cloud edition or user name for Data Center edition) to Amazon Kendra to ensure that each user only sees content specific to their user ID. Alternately, you can use AWS IAM Identity Center (successor to AWS Single Sign-On) to control user context being passed to Amazon Kendra to limit queries by user.
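
Passing the user identity from your own application can look roughly like the following sketch, which sends the user ID (and optionally groups) in the UserContext of a Kendra query. It assumes kendra_client is a boto3 Kendra client; the function name and the sample user ID are hypothetical.

```python
def query_as_user(kendra_client, index_id, query_text, user_id, groups=None):
    """Run a query filtered by user context and return the result titles.

    kendra_client is assumed to be boto3.client("kendra").
    """
    user_context = {"UserId": user_id}
    if groups:
        user_context["Groups"] = list(groups)
    response = kendra_client.query(
        IndexId=index_id,
        QueryText=query_text,
        UserContext=user_context,
    )
    return [
        item.get("DocumentTitle", {}).get("Text", "")
        for item in response.get("ResultItems", [])
    ]


# Example (requires AWS credentials and boto3):
#   import boto3
#   titles = query_as_user(boto3.client("kendra"), "YOUR_INDEX_ID",
#                          "YOUR QUERY", "analyst1@example.com")
```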

Congratulations! You have successfully used Amazon Kendra to surface answers and insights based on the content indexed from your Confluence account.

Clean up

To avoid incurring future costs, clean up the resources you created as part of this solution. If you created a new Amazon Kendra index while testing this solution, delete it. If you only added a new data source using the Amazon Kendra connector for Confluence V2, delete that data source.
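
The two cleanup cases can be expressed as a short helper. This is a sketch, assuming kendra_client is a boto3 Kendra client; the function name and its flag are hypothetical, and the deletions are permanent, so double-check the IDs before running anything like it.

```python
def clean_up(kendra_client, index_id, data_source_id=None, created_new_index=False):
    """Delete the index if it was created for this test, else only the data source.

    kendra_client is assumed to be boto3.client("kendra").
    Returns a short label describing what was deleted, or None.
    """
    if created_new_index:
        kendra_client.delete_index(Id=index_id)
        return "index"
    if data_source_id:
        kendra_client.delete_data_source(Id=data_source_id, IndexId=index_id)
        return "data source"
    return None
```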

Conclusion

With the new Confluence connector V2 for Amazon Kendra, organizations can tap into the repository of information stored in their account securely using intelligent search powered by Amazon Kendra.

To learn about these possibilities and more, refer to the Amazon Kendra Developer Guide. For more information on how you can create, modify, or delete metadata and content when ingesting your data from Confluence, refer to Enriching your documents during ingestion and Enrich your content and metadata to enhance your search experience with custom document enrichment in Amazon Kendra.


About the author

Ashish Lagwankar is a Senior Enterprise Solutions Architect at AWS. His core interests include AI/ML, serverless, and container technologies. Ashish is based in the Boston, MA, area and enjoys reading, the outdoors, and spending time with his family.
