Secure Amazon SageMaker Studio presigned URLs Part 3: Multi-account private API access to Studio


Enterprise customers have multiple lines of business (LOBs), each with its own groups and teams. These customers need to balance governance, security, and compliance against the need for machine learning (ML) teams to quickly access their data science environments in a secure manner. Enterprise customers that are starting to adopt AWS, expanding their footprint on AWS, or planning to enhance an established AWS environment need to ensure they have a strong foundation for their cloud environment. One important aspect of this foundation is to organize their AWS environment following a multi-account strategy.

In the post Secure Amazon SageMaker Studio presigned URLs Part 2: Private API with JWT authentication, we demonstrated how to build a private API to generate Amazon SageMaker Studio presigned URLs that are only accessible by an authenticated end-user within the corporate network from a single account. In this post, we show how you can extend that architecture to multiple accounts to support multiple LOBs. We demonstrate how you can use Studio presigned URLs in a multi-account environment to secure and route access from different personas to their appropriate Studio domain. We explain the process and network flow, and how to easily scale this architecture to multiple accounts and Amazon SageMaker domains. The proposed solution also ensures that all network traffic stays within AWS’s private network and communication happens in a secure way.

Although we demonstrate using two different LOBs, each with a separate AWS account, this solution can scale to multiple LOBs. We also introduce a logical construct of a shared services account that plays a key role in governance, administration, and orchestration.

Solution overview

We can achieve communication between all LOBs’ SageMaker VPCs and the shared services account VPC using either VPC peering or AWS Transit Gateway. In this post, we use a transit gateway because it provides a simpler VPC-to-VPC communication mechanism than VPC peering when there are many VPCs involved. We also use Amazon Route 53 forwarding rules together with inbound and outbound resolvers to resolve all DNS queries to the shared services account VPC endpoints. The networking architecture has been designed using the following patterns:

Let’s look at the two main architecture components, the information flow and network flow, in more detail.

Information flow

The following diagram illustrates the architecture of the information flow.

The workflow steps are as follows:

  1. The user authenticates with the Amazon Cognito user pool and receives a token to consume the Studio access API.
  2. The user calls the API to access Studio and includes the token in the request.
  3. When this API is invoked, the custom AWS Lambda authorizer is triggered to validate the token with the identity provider (IdP), and returns the correct permissions for the user.
  4. After the call is authorized, a Lambda function is triggered.
  5. This Lambda function uses the user’s name to retrieve their LOB name and the LOB account from the following Amazon DynamoDB tables that store these relationships:
    1. Users table – This table holds the relationship between users and their LOB.
    2. LOBs table – This table holds the relationship between the LOBs and the AWS account where the SageMaker domain for that LOB exists.
  6. With the account ID, the Lambda function assumes the PresignedUrlGenerator role in that account (each LOB account has a PresignedURLGenerator role that can only be assumed by the Lambda function in charge of generating the presigned URLs).
  7. Finally, the function invokes the SageMaker create-presigned-domain-url API call for that user in their LOB’s SageMaker domain.
  8. The presigned URL is returned to the end-user, who consumes it via the Studio VPC endpoint.

Steps 1–4 are covered in more detail in Part 2 of this series, where we explain how the custom Lambda authorizer works and takes care of the authorization process in the access API Gateway.
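
Steps 5–7 can be sketched in Python as follows. This is a minimal illustration, not the code from the repository: the table names (users, lobs), the attribute names (username, lob, account_id), and the choice of taking the first domain returned by list_domains are assumptions made for the example. The AWS clients are passed in as arguments so the logic can be exercised without AWS credentials.

```python
from typing import Any, Callable, Dict

ROLE_NAME = "PresignedUrlGenerator"  # role name used in this post


def generate_presigned_url(
    username: str,
    dynamodb: Any,
    sts: Any,
    make_sagemaker_client: Callable[[Dict[str, str]], Any],
) -> str:
    """Resolve the user's LOB account and return a Studio presigned URL."""
    # Step 5: user -> LOB, then LOB -> account ID.
    # Table and attribute names are hypothetical.
    user = dynamodb.get_item(
        TableName="users", Key={"username": {"S": username}}
    )["Item"]
    lob = user["lob"]["S"]
    lob_item = dynamodb.get_item(TableName="lobs", Key={"lob": {"S": lob}})["Item"]
    account_id = lob_item["account_id"]["S"]

    # Step 6: assume the role that lives in the LOB account.
    creds = sts.assume_role(
        RoleArn=f"arn:aws:iam::{account_id}:role/{ROLE_NAME}",
        RoleSessionName=f"studio-{username}",
    )["Credentials"]

    # Step 7: create the presigned URL with the temporary credentials.
    sagemaker = make_sagemaker_client(creds)
    # Assumption: the LOB account has a single domain in this Region.
    domain_id = sagemaker.list_domains()["Domains"][0]["DomainId"]
    response = sagemaker.create_presigned_domain_url(
        DomainId=domain_id,
        UserProfileName=username,
    )
    return response["AuthorizedUrl"]
```

With boto3, dynamodb and sts would be boto3.client("dynamodb") and boto3.client("sts"), and make_sagemaker_client could build boto3.client("sagemaker", aws_access_key_id=..., aws_secret_access_key=..., aws_session_token=...) from the returned credentials. A user or LOB missing from the tables raises KeyError in this sketch; a real function would map that to an error response.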

Network flow

All network traffic flows in a secure and private manner using AWS PrivateLink, as shown in the following diagram.

The steps are as follows:

  1. When the user calls the access API, it happens via the VPC endpoint for Amazon API Gateway in the networking VPC in the shared services account. This API is set as private, and has a policy that allows its consumption only via this VPC endpoint, as described in Part 2 of this series.
  2. The whole authorization process happens privately between API Gateway, Lambda, and Amazon Cognito.
  3. After authorization is granted, API Gateway triggers the Lambda function in charge of generating the presigned URLs using AWS’s private network.
  4. Then, because the routing Lambda function lives in a VPC, all calls to different services happen through their respective VPC endpoints in the shared services account. The function performs the following actions:
    1. Retrieve the credentials to assume the role via the AWS Security Token Service (AWS STS) VPC endpoint in the shared services account.
    2. Call DynamoDB to retrieve user and LOB information through the DynamoDB VPC endpoint.
    3. Call the SageMaker API to create a presigned URL for the user in their SageMaker domain through the SageMaker API VPC endpoint.
  5. The user finally consumes the presigned URL via the Studio VPC endpoint in the networking VPC in the shared services account, because this VPC endpoint has been specified during the creation of the presigned URL.
  6. All further communications between Studio and AWS services happen via Studio’s elastic network interface (ENI) inside the LOB account’s SageMaker VPC. For example, to allow SageMaker to call Amazon Elastic Container Registry (Amazon ECR), the Amazon ECR interface VPC endpoint can be provisioned in the shared services account VPC, and a forwarding rule is shared with the SageMaker accounts that need to consume it. This allows SageMaker queries to Amazon ECR to be resolved to this endpoint, and the Transit Gateway routing will do the rest.
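
To make step 1 more concrete, the following Python sketch builds a resource policy of the kind commonly attached to a private API Gateway API so that it can only be invoked through one VPC endpoint. The endpoint ID is a placeholder, and the exact policy used by the solution is the one described in Part 2 and in the repository; treat this as an illustration of the pattern only.

```python
import json
from typing import Any, Dict


def private_api_policy(vpc_endpoint_id: str) -> Dict[str, Any]:
    """Deny any invocation that does not arrive through the given VPC endpoint."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Explicit deny for traffic from any other source.
                "Effect": "Deny",
                "Principal": "*",
                "Action": "execute-api:Invoke",
                "Resource": "execute-api:/*",
                "Condition": {
                    "StringNotEquals": {"aws:SourceVpce": vpc_endpoint_id}
                },
            },
            {
                # Allow everything that survives the deny above.
                "Effect": "Allow",
                "Principal": "*",
                "Action": "execute-api:Invoke",
                "Resource": "execute-api:/*",
            },
        ],
    }


# Placeholder endpoint ID, for illustration only.
print(json.dumps(private_api_policy("vpce-0123456789abcdef0"), indent=2))
```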

Prerequisites

To represent a multi-account environment, we use one shared services account and two different LOBs:

  • Shared services account – Where the VPC endpoints and the Studio access Gateway API live
  • SageMaker account LOB A – The account for the SageMaker domain for LOB A
  • SageMaker account LOB B – The account for the SageMaker domain for LOB B

For more information on how to create an AWS account, refer to How do I create and activate a new AWS account.

LOB accounts are logical entities that are business, department, or domain specific. We assume one account per logical entity. However, there may be different accounts per environment (development, test, production). For each environment, you typically have a separate shared services account (based on compliance requirements) to restrict the blast radius.

You can use the templates and instructions in the GitHub repository to set up the needed infrastructure. This repository is structured into folders for the different accounts and different parts of the solution.

Infrastructure setup

For large companies with many Studio domains, it’s also advisable to have a centralized endpoint architecture. This can result in cost savings as the architecture scales and more domains and accounts are created. The networking.yml template in the shared services account deploys the VPC endpoints and needed Route 53 resources, and the Transit Gateway infrastructure to scale out the proposed solution.

Detailed deployment instructions can be found in the README.md file in the GitHub repository. The full deployment includes the following resources:

  • Two AWS CloudFormation templates in the shared services account: one for networking infrastructure and one for the AWS Serverless Application Model (AWS SAM) Studio access Gateway API
  • One CloudFormation template for the infrastructure in the SageMaker account LOB A
  • One CloudFormation template for the infrastructure in the SageMaker account LOB B
  • Optionally, an on-premises simulator can be deployed in the shared services account to test the end-to-end deployment

After everything is deployed, navigate to the Transit Gateway console for each SageMaker account (LOB accounts) and confirm that the transit gateway has been correctly shared and the VPCs are associated with it.

Optionally, if any forwarding rules have been shared with the accounts, they can be associated with the SageMaker accounts’ VPC. The basic rules to make the centralized VPC endpoints solution work are automatically shared with the LOB account during deployment. For more information about this approach, refer to Centralized access to VPC private endpoints.

Populate the data

Run the following script to populate the DynamoDB tables and Amazon Cognito user pool with the required information:

./scripts/setup/fill-data.sh

The script performs the required API calls using the AWS Command Line Interface (AWS CLI) and the previously configured parameters and profiles.

Amazon Cognito users

This step works the same as in Part 2 of this series, but has to be performed for users in all LOBs, and the user names should match their user profile in SageMaker, regardless of which LOB they belong to. For this post, we have one user in a Studio domain in LOB A (user-lob-a) and one user in a Studio domain in LOB B (user-lob-b). The following table lists the users populated in the Amazon Cognito user pool.

User          Password
user-lob-a    UserLobA1!
user-lob-b    UserLobB1!

Note that these passwords have been configured for demo purposes.

DynamoDB tables

The access application uses two DynamoDB tables to direct requests from the different users to their LOB’s Studio domain.

The users table holds the relationship between users and their LOB.

Primary Key   LOB
user-lob-a    lob-a
user-lob-b    lob-b

The LOB table holds the relationship between the LOB and the AWS account where the SageMaker domain for that LOB exists.

LOB      ACCOUNT_ID
lob-a    <YOUR_LOB_A_ACCOUNT_ID>
lob-b    <YOUR_LOB_B_ACCOUNT_ID>

Note that these user names must be consistent across the Studio user profiles and the names of the users we previously added to the Amazon Cognito user pool.
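
The two tables above could be expressed as DynamoDB items along the following lines. This is a sketch under assumptions: the attribute names (username, lob, account_id) are hypothetical and the account IDs are placeholders. The helper also checks that every user points at a LOB that has an account mapping, which is the consistency the routing function depends on.

```python
from typing import Dict, List, Tuple

AttributeMap = Dict[str, Dict[str, str]]


def build_items(
    user_to_lob: Dict[str, str],
    lob_to_account: Dict[str, str],
) -> Tuple[List[AttributeMap], List[AttributeMap]]:
    """Return (users items, LOBs items) in DynamoDB attribute-value format."""
    # Every LOB referenced by a user must exist in the LOBs table.
    missing = sorted(
        {lob for lob in user_to_lob.values() if lob not in lob_to_account}
    )
    if missing:
        raise ValueError(f"LOBs without an account mapping: {missing}")

    users = [
        {"username": {"S": user}, "lob": {"S": lob}}
        for user, lob in user_to_lob.items()
    ]
    lobs = [
        {"lob": {"S": lob}, "account_id": {"S": account}}
        for lob, account in lob_to_account.items()
    ]
    return users, lobs


users_items, lobs_items = build_items(
    {"user-lob-a": "lob-a", "user-lob-b": "lob-b"},
    {"lob-a": "111111111111", "lob-b": "222222222222"},  # placeholder account IDs
)
```

Each resulting dictionary has the shape expected by the Item parameter of the DynamoDB put_item call.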

Test the deployment

At this point, we can test the deployment by going to API Gateway and checking what the API responds for any of the users. We get a presigned URL in the response; however, consuming that URL in the browser will give an auth token error.

For this demo, we have set up a simulated on-premises environment with a bastion host and a Windows application. We install Firefox in the Windows instance and use the dev tools to add authorization headers to our requests and test the solution. More detailed information on how to set up the on-premises simulated environment is available in the associated GitHub repository.

The following diagram shows our test architecture.

We have two users, one for LOB A (User A) and another one for LOB B (User B), and we show how the Studio domain changes just by changing the authorization key retrieved from Amazon Cognito when logging in as User A and User B.

Complete the following steps to test the deployment:

  1. Retrieve the session token for User A, as shown in Part 2 of the series and also in the instructions in the GitHub repository.

We use the following example command to get the user credentials from Amazon Cognito:

aws cognito-idp initiate-auth \
--auth-flow USER_PASSWORD_AUTH \
--client-id <your-cognito-client-id> \
--auth-parameters USERNAME=user-lob-a,PASSWORD=UserLobA1! \
--region <your-region>
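
The same call can be made from Python. The sketch below mirrors the CLI command with the boto3 initiate_auth operation; the client is passed in so the function can be tried with a stub. Returning the ID token is an assumption for illustration: use whichever token the access API expects according to Part 2.

```python
from typing import Any


def get_id_token(cognito: Any, client_id: str, username: str, password: str) -> str:
    """Authenticate against the Cognito user pool and return a token."""
    response = cognito.initiate_auth(
        AuthFlow="USER_PASSWORD_AUTH",
        ClientId=client_id,
        AuthParameters={"USERNAME": username, "PASSWORD": password},
    )
    # Cognito may answer with a challenge (for example, a forced password
    # change) instead of tokens; this sketch does not handle challenges.
    if "AuthenticationResult" not in response:
        challenge = response.get("ChallengeName", "unknown")
        raise RuntimeError(f"Authentication not completed, challenge: {challenge}")
    # Assumption: the API expects the ID token; AccessToken is also returned.
    return response["AuthenticationResult"]["IdToken"]
```

With boto3 installed, cognito would be boto3.client("cognito-idp", region_name="<your-region>").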

  2. For this demo, we use a simulated Windows on-premises application. To connect to the Windows instance, you can follow the same approach specified in Secure access to Amazon SageMaker Studio with AWS SSO and a SAML application.
  3. Firefox should be installed in the instance. If not, once in the instance, we can install Firefox.
  4. Open Firefox and try to access the API of Studio with either user-lob-a or user-lob-b as the API path parameter.

You get a not authorized message.

  5. Open the developer tools of Firefox and on the Network tab, choose (right-click) the previous API call, and choose Edit and Resend.

  6. Here we add the token as an authorization header in the Firefox developer tools and make the request to the Studio access Gateway API again.

This time, we see in the developer tools that the URL is returned along with a 302 redirect.

  7. Although the redirect won’t work when using the developer tools, you can still choose it to access the LOB SageMaker domain for that user.

  8. Repeat for User B with its corresponding token and check that they get redirected to a different Studio domain.

If you perform these steps correctly, you can access both domains at the same time.

In our on-premises Windows application, we can have both domains consumed via the Studio VPC endpoint through our VPC peering connection.

Let’s explore some other testing scenarios.

If you edit the API call again and change the path to the opposite LOB user, resending it returns an error in the API response: a forbidden response from API Gateway.
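
One way a Lambda authorizer can produce this behavior is to return a policy that allows only the path belonging to the user named in the token. The sketch below shows the idea using the standard authorizer response shape (principalId plus policyDocument); the assumption that the user name is the only path segment is hypothetical, and the actual authorizer logic is the one described in Part 2.

```python
from typing import Any, Dict


def build_authorizer_response(token_username: str, method_arn: str) -> Dict[str, Any]:
    """Allow invoking only the API path that matches the authenticated user."""
    # method_arn looks like:
    # arn:aws:execute-api:<region>:<account>:<api-id>/<stage>/<verb>/<path>
    arn_prefix, stage, verb = method_arn.split("/")[:3]
    # Hypothetical layout: the user name is the single path segment.
    allowed_resource = f"{arn_prefix}/{stage}/{verb}/{token_username}"
    return {
        "principalId": token_username,
        "policyDocument": {
            "Version": "2012-10-17",
            "Statement": [
                {
                    "Action": "execute-api:Invoke",
                    "Effect": "Allow",
                    "Resource": allowed_resource,
                }
            ],
        },
    }
```

Under this sketch, a request for user-lob-b sent with the token of user-lob-a is not covered by the Allow statement, so API Gateway rejects it as forbidden.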

Trying to take the returned URL for the correct user and consume it in your laptop’s browser will also fail, because it won’t be consumed via the internal Studio VPC endpoint. This is the same error we saw when testing with API Gateway. It returns an “Auth token containing insufficient permissions” error.

Taking too long to consume the presigned URL will result in an “Invalid or Expired Auth Token” error.

Scale domains

Every time a new SageMaker domain is added, you must complete the following networking and access steps:

  1. Share the transit gateway with the new account using AWS Resource Access Manager (AWS RAM).
  2. Attach the VPC to the transit gateway in the LOB account (this is done in AWS CloudFormation).

In our scenario, the transit gateway was set with automatic association to the default route table and automatic propagation enabled. In a real-world use case, you may need to complete three additional steps:

  1. In the shared services account, associate the attached Studio VPC to the respective Transit Gateway route table for SageMaker domains.
  2. Propagate the associated VPC routes to Transit Gateway.
  3. Finally, add the account ID along with the LOB name to the LOBs DynamoDB table.
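
Each new LOB account also needs the PresignedUrlGenerator role described in the information flow, which can only be assumed by the Lambda function that generates the presigned URLs. A trust policy for such a role could look like the following sketch; the execution role ARN is a placeholder, and the templates in the repository remain the reference for what is actually deployed.

```python
from typing import Any, Dict


def presigned_url_generator_trust_policy(lambda_role_arn: str) -> Dict[str, Any]:
    """Trust policy letting only the routing Lambda's execution role assume the role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                # Execution role of the Lambda function in the shared services account.
                "Principal": {"AWS": lambda_role_arn},
                "Action": "sts:AssumeRole",
            }
        ],
    }


# Placeholder ARN, for illustration only.
trust_policy = presigned_url_generator_trust_policy(
    "arn:aws:iam::333333333333:role/studio-access-lambda-role"
)
```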

Clean up

Complete the following steps to clean up your resources:

  1. Delete the VPC peering connection.
  2. Remove the associated VPCs from the private hosted zones.
  3. Delete the on-premises simulator template from the shared services account.
  4. Delete the Studio CloudFormation templates from the SageMaker accounts.
  5. Delete the access CloudFormation template from the shared services account.
  6. Delete the networking CloudFormation template from the shared services account.

Conclusion

In this post, we walked through how you can set up multi-account private API access to Studio. We explained how the networking and application flows work, as well as how you can easily scale this architecture to multiple accounts and SageMaker domains. Head over to the GitHub repository to begin your journey. We’d love to hear your feedback!


About the Authors

Neelam Koshiya is an Enterprise Solutions Architect at AWS. Her current focus is helping enterprise customers with their cloud adoption journey for strategic business outcomes. In her spare time, she enjoys reading and being outdoors.

Alberto Menendez is an Associate DevOps Consultant in Professional Services at AWS. He helps accelerate customers’ journeys to the cloud. In his free time, he enjoys playing sports, especially basketball and padel, spending time with family and friends, and learning about technology.

Rajesh Ramchander is a Senior Data & ML Engineer in Professional Services at AWS. He helps customers migrate big data and AI/ML workloads to AWS.

Ram Vittal is a machine learning solutions architect at AWS. He has over 20 years of experience architecting and building distributed, hybrid, and cloud applications. He’s passionate about building secure and scalable AI/ML and big data solutions to help enterprise customers with their cloud adoption and optimization journey to improve their business outcomes. In his spare time, he enjoys tennis and photography.
