Set up cross-account Amazon S3 access for Amazon SageMaker notebooks in VPC-only mode using Amazon S3 Access Points


Advancements in artificial intelligence (AI) and machine learning (ML) are revolutionizing the financial industry for use cases such as fraud detection, credit worthiness assessment, and trading strategy optimization. To develop models for such use cases, data scientists need access to various datasets like credit decision engines, customer transactions, risk appetite, and stress testing. Managing appropriate access control for these datasets among the data scientists working on them is crucial to meet stringent compliance and regulatory requirements. Typically, these datasets are aggregated in a centralized Amazon Simple Storage Service (Amazon S3) location from various business applications and enterprise systems. Data scientists across business units working on model development using Amazon SageMaker are granted access to relevant data, which can lead to the requirement of managing prefix-level access controls. As use cases and datasets grow, managing cross-account access per application through bucket policy statements becomes too complex and too long for a single bucket policy to accommodate.

Amazon S3 Access Points simplify managing and securing data access at scale for applications using shared datasets on Amazon S3. You can create unique hostnames using access points to enforce distinct and secure permissions and network controls for any request made through the access point.
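As a rough illustration of how an access point is addressed, the following sketch builds an access point ARN and its access-point-specific hostname. The helper function names are hypothetical, the account ID, Region, and access point name are the example values used later in this post, and the hostname pattern assumes the standard (non-dual-stack) endpoint in the commercial AWS partition.

```python
def access_point_arn(region: str, account_id: str, name: str) -> str:
    """Return the ARN that identifies an S3 access point."""
    return f"arn:aws:s3:{region}:{account_id}:accesspoint/{name}"


def access_point_hostname(region: str, account_id: str, name: str) -> str:
    """Return the virtual-hosted-style hostname for an S3 access point.

    Assumption: standard endpoint in the commercial "aws" partition.
    """
    return f"{name}-{account_id}.s3-accesspoint.{region}.amazonaws.com"


# Example values from this post (Account B owns the access point).
arn = access_point_arn("us-east-1", "222222222222", "test-ap-1")
host = access_point_hostname("us-east-1", "222222222222", "test-ap-1")
print(arn)
print(host)
```

Because each access point has its own ARN and hostname, permissions and network controls can be attached per access point rather than to the bucket as a whole.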

S3 Access Points simplifies the management of access permissions specific to each application accessing a shared dataset. It enables secure, high-speed data copy between same-Region access points using AWS internal networks and VPCs. S3 Access Points can restrict access to VPCs, enabling you to firewall data within private networks, test new access control policies without impacting existing access points, and configure VPC endpoint policies to restrict access to specific account ID-owned S3 buckets.

This post walks through the steps involved in configuring S3 Access Points to enable cross-account access from a SageMaker notebook instance.

Solution overview

For our use case, we have two accounts in an organization: Account A (111111111111), which is used by data scientists to develop models using a SageMaker notebook instance, and Account B (222222222222), which has the required datasets in the S3 bucket test-bucket-1. The following diagram illustrates the solution architecture.

To implement the solution, complete the following high-level steps:

  1. Configure Account A, including the VPC, subnet, security group, VPC gateway endpoint, and SageMaker notebook.
  2. Configure Account B, including the S3 bucket, access point, and bucket policy.
  3. Configure AWS Identity and Access Management (IAM) permissions and policies in Account A.

You should repeat these steps for each SageMaker account that needs access to the shared dataset from Account B.

The names for each resource mentioned in this post are examples; you can replace them with other names as per your use case.

Configure Account A

Complete the following steps to configure Account A:

  1. Create a VPC called DemoVPC.
  2. Create a subnet called DemoSubnet in the VPC DemoVPC.
  3. Create a security group called DemoSG.
  4. Create a VPC S3 gateway endpoint called DemoS3GatewayEndpoint.
  5. Create the SageMaker execution role.
  6. Create a notebook instance called DemoNotebookInstance, following the security guidelines outlined in How to configure security in Amazon SageMaker.
    1. Specify the SageMaker execution role you created.
    2. For the notebook network settings, specify the VPC, subnet, and security group you created.
    3. Make sure that Direct Internet access is disabled.

You assign permissions to the role in subsequent steps after you create the required dependencies.

Configure Account B

To configure Account B, complete the following steps:

  1. In Account B, create an S3 bucket called test-bucket-1 following Amazon S3 security guidance.
  2. Upload your file to the S3 bucket.
  3. Create an access point called test-ap-1 in Account B.
    1. Don't change or edit any Block Public Access settings for this access point (all public access should be blocked).
  4. Attach the following policy to your access point:
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {
                "AWS": "arn:aws:iam::111111111111:role/demo"
            },
            "Action": ["s3:GetObject", "s3:GetObjectVersion", "s3:PutObject", "s3:PutObjectAcl"],
            "Resource": [
                "arn:aws:s3:us-east-1:222222222222:accesspoint/test-ap-1",
                "arn:aws:s3:us-east-1:222222222222:accesspoint/test-ap-1/object/*"
            ]
        }
    ]
}

The actions defined in the preceding code are sample actions for demonstration purposes. You can define the actions as per your requirements or use case.

  5. Add the following bucket policy permissions to allow access through the access point:
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {
                "AWS": "arn:aws:iam::111111111111:role/demo"
            },
            "Action": ["s3:GetObject", "s3:ListBucket"],
            "Resource": ["arn:aws:s3:::test-bucket-1", "arn:aws:s3:::test-bucket-1/*"],
            "Condition": {
                "StringEquals": {
                    "s3:DataAccessPointAccount": "222222222222"
                }
            }
        }
    ]
}

The preceding actions are examples. You can define the actions as per your requirements.
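Because these steps are repeated for every SageMaker account that needs access, it can help to generate the two Account B policies from a few parameters instead of editing JSON by hand. The following is a minimal sketch that only builds the policy documents as Python dictionaries; the function names are hypothetical, and the role ARN, bucket name, and access point ARN are the example values from this post.

```python
import json


def build_access_point_policy(role_arn, access_point_arn, actions):
    """Policy attached to the access point: allow the role through this access point."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"AWS": role_arn},
                "Action": list(actions),
                "Resource": [
                    access_point_arn,
                    f"{access_point_arn}/object/*",
                ],
            }
        ],
    }


def build_bucket_policy(role_arn, bucket_name, access_point_account, actions):
    """Bucket policy: allow the role only for requests made through
    access points owned by access_point_account."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"AWS": role_arn},
                "Action": list(actions),
                "Resource": [
                    f"arn:aws:s3:::{bucket_name}",
                    f"arn:aws:s3:::{bucket_name}/*",
                ],
                "Condition": {
                    "StringEquals": {
                        "s3:DataAccessPointAccount": access_point_account
                    }
                },
            }
        ],
    }


# Example values from this post.
ROLE_ARN = "arn:aws:iam::111111111111:role/demo"
AP_ARN = "arn:aws:s3:us-east-1:222222222222:accesspoint/test-ap-1"

ap_policy = build_access_point_policy(
    ROLE_ARN,
    AP_ARN,
    ["s3:GetObject", "s3:GetObjectVersion", "s3:PutObject", "s3:PutObjectAcl"],
)
bucket_policy = build_bucket_policy(
    ROLE_ARN, "test-bucket-1", "222222222222", ["s3:GetObject", "s3:ListBucket"]
)

print(json.dumps(ap_policy, indent=4))
print(json.dumps(bucket_policy, indent=4))
```

The printed JSON can be pasted into the console or supplied to whatever tooling you use to apply policies; the sketch itself makes no AWS calls.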

Configure IAM permissions and policies

Complete the following steps in Account A:

  1. Confirm that the SageMaker execution role has the AmazonSagemakerFullAccess custom IAM inline policy, which looks like the following code:
{
    "Sid": "VisualEditor2",
    "Effect": "Allow",
    "Action": ["s3:GetObject", "s3:GetObjectVersion", "s3:PutObject", "s3:PutObjectAcl"],
    "Resource": [
        "arn:aws:s3:us-east-1:222222222222:accesspoint/test-ap-1",
        "arn:aws:s3:us-east-1:222222222222:accesspoint/test-ap-1/object/*",
        "arn:aws:s3:::test-bucket-1",
        "arn:aws:s3:::test-bucket-1/*"
    ]
}

The actions in the policy code are sample actions for demonstration purposes.

  2. Go to the DemoS3GatewayEndpoint endpoint you created and add the following permissions:
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "AllowCrossAccountAccessThroughAccessPoint",
            "Effect": "Allow",
            "Principal": "*",
            "Action": [
                "s3:Get*",
                "s3:List*",
                "s3:Put*"
            ],
            "Resource": [
                "arn:aws:s3:us-east-1:222222222222:accesspoint/test-ap-1",
                "arn:aws:s3:us-east-1:222222222222:accesspoint/test-ap-1/object/*",
                "arn:aws:s3:::test-bucket-1",
                "arn:aws:s3:::test-bucket-1/*"
            ]
        }
    ]
}

  3. To get the prefix list for Amazon S3, run the AWS Command Line Interface (AWS CLI) describe-prefix-lists command:
aws ec2 describe-prefix-lists

  4. In Account A, go to the security group DemoSG for the target SageMaker notebook instance.
  5. Under Outbound rules, create an outbound rule with All traffic or All TCP, and then specify the destination as the prefix list ID you retrieved.

This completes the setup in both accounts.
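The prefix list lookup and the outbound rule from the last steps can be sketched as follows. This is an illustration under stated assumptions: the sample JSON mimics the shape of the describe-prefix-lists output, but its prefix list IDs and CIDR ranges are made-up placeholders, and the function names are hypothetical. The resulting rule structure follows the IpPermissions shape used by the EC2 API for security group egress rules, where protocol "-1" means all traffic.

```python
import json

# Placeholder output in the shape returned by `aws ec2 describe-prefix-lists`.
# The IDs and CIDR ranges below are invented for illustration only.
SAMPLE_OUTPUT = """
{
    "PrefixLists": [
        {
            "PrefixListId": "pl-00000000000000001",
            "PrefixListName": "com.amazonaws.us-east-1.dynamodb",
            "Cidrs": ["192.0.2.0/24"]
        },
        {
            "PrefixListId": "pl-00000000000000002",
            "PrefixListName": "com.amazonaws.us-east-1.s3",
            "Cidrs": ["198.51.100.0/24", "203.0.113.0/24"]
        }
    ]
}
"""


def find_s3_prefix_list_id(describe_output: dict, region: str) -> str:
    """Pick the prefix list ID for Amazon S3 in the given Region."""
    wanted = f"com.amazonaws.{region}.s3"
    for entry in describe_output.get("PrefixLists", []):
        if entry.get("PrefixListName") == wanted:
            return entry["PrefixListId"]
    raise LookupError(f"No prefix list named {wanted} in the output")


def egress_rule_for_prefix_list(prefix_list_id: str) -> list:
    """Build an 'All traffic' outbound rule whose destination is the prefix list."""
    return [
        {
            "IpProtocol": "-1",  # -1 = all protocols (All traffic)
            "PrefixListIds": [{"PrefixListId": prefix_list_id}],
        }
    ]


prefix_list_id = find_s3_prefix_list_id(json.loads(SAMPLE_OUTPUT), "us-east-1")
rule = egress_rule_for_prefix_list(prefix_list_id)
print(prefix_list_id)
print(json.dumps(rule, indent=4))
```

Scoping the outbound rule to the S3 prefix list keeps notebook traffic limited to Amazon S3 through the gateway endpoint rather than opening egress to arbitrary destinations.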

Test the solution

To validate the solution, go to the SageMaker notebook instance terminal and enter the following commands to list and retrieve objects through the access point:

  • To list the objects successfully through S3 access point test-ap-1:
aws s3 ls arn:aws:s3:us-east-1:222222222222:accesspoint/test-ap-1

  • To get the objects successfully through S3 access point test-ap-1:
aws s3api get-object --bucket arn:aws:s3:us-east-1:222222222222:accesspoint/test-ap-1 --key sample2.csv test2.csv
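If you prefer to run the same checks from a notebook cell rather than the terminal, a minimal Python sketch might look like the following. It assumes the boto3 library is available in the notebook environment and relies on the S3 client accepting an access point ARN in place of a bucket name; the helper names list_keys and fetch_object are hypothetical. The client is passed in as an argument, so nothing here contacts AWS until you call the helpers.

```python
ACCESS_POINT_ARN = "arn:aws:s3:us-east-1:222222222222:accesspoint/test-ap-1"


def list_keys(s3_client, access_point_arn, prefix=""):
    """Return all object keys visible through the access point."""
    paginator = s3_client.get_paginator("list_objects_v2")
    keys = []
    # The access point ARN is supplied where a bucket name would normally go.
    for page in paginator.paginate(Bucket=access_point_arn, Prefix=prefix):
        keys.extend(obj["Key"] for obj in page.get("Contents", []))
    return keys


def fetch_object(s3_client, access_point_arn, key, destination):
    """Download one object through the access point to a local file."""
    response = s3_client.get_object(Bucket=access_point_arn, Key=key)
    with open(destination, "wb") as handle:
        handle.write(response["Body"].read())


# Usage inside the notebook (requires boto3 and the permissions set up above):
#   import boto3
#   s3 = boto3.client("s3", region_name="us-east-1")
#   print(list_keys(s3, ACCESS_POINT_ARN))
#   fetch_object(s3, ACCESS_POINT_ARN, "sample2.csv", "test2.csv")
```

An AccessDenied error from either call usually points back to one of the policies configured earlier: the access point policy, the bucket policy, the execution role policy, or the gateway endpoint policy.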

Clean up

When you're done testing, delete any S3 access points and S3 buckets. Also, delete any SageMaker notebook instances to stop incurring charges.

Conclusion

In this post, we showed how S3 Access Points enables cross-account access to large, shared datasets from SageMaker notebook instances, bypassing size constraints imposed by bucket policies while configuring at-scale access management on shared datasets.

To learn more, refer to Easily Manage Shared Data Sets with Amazon S3 Access Points.


About the authors

Kiran Khambete works as a Senior Technical Account Manager at Amazon Web Services (AWS). As a TAM, Kiran plays the role of technical expert and strategic guide, helping enterprise customers achieve their business goals.

Ankit Soni, with total experience of 14 years, holds the position of Principal Engineer at NatWest Group, where he has served as a Cloud Infrastructure Architect for the past six years.

Kesaraju Sai Sandeep is a Cloud Engineer specializing in Big Data Services at AWS.
