Use Amazon SageMaker Studio with a custom file system in Amazon EFS


Amazon SageMaker Studio is the latest web-based experience for running end-to-end machine learning (ML) workflows. SageMaker Studio offers a suite of integrated development environments (IDEs), which includes JupyterLab, Code Editor, and RStudio. Data scientists and ML engineers can spin up SageMaker Studio private and shared spaces, which are used to manage the storage and resource needs of the JupyterLab and Code Editor applications, enable stopping the applications when not in use to save on compute costs, and resume the work from where they stopped.

The storage resources for SageMaker Studio spaces are Amazon Elastic Block Store (Amazon EBS) volumes, which offer low-latency access to user data like notebooks, sample data, or Python/Conda virtual environments. However, there are several scenarios where a distributed file system shared across private JupyterLab and Code Editor spaces is convenient, which is enabled by configuring an Amazon Elastic File System (Amazon EFS) file system in SageMaker Studio. Amazon EFS provides a scalable, fully managed, elastic NFS file system for AWS compute instances.

Amazon SageMaker supports automatically mounting a folder in an EFS volume for each user in a domain. Using this folder, users can share data between their own private spaces. However, users can't share data with other users in the domain; they only have access to their own folder user-default-efs in the $HOME directory of the SageMaker Studio application.

In this post, we explore three distinct scenarios that demonstrate the versatility of integrating custom Amazon EFS file systems with SageMaker Studio.

For further information on configuring Amazon EFS in SageMaker Studio, refer to Attaching a custom file system to a domain or user profile.

Solution overview

In the first scenario, an AWS infrastructure admin wants to set up an EFS file system that can be shared across the private spaces of a given user profile in SageMaker Studio. This means that each user within the domain will have their own private space on the EFS file system, allowing them to store and access their own data and files. The automation described in this post enables new team members joining the data science team to quickly set up their private space on the EFS file system and access the necessary resources to start contributing to the ongoing project.

The following diagram illustrates this architecture.

First scenario architecture

This scenario offers the following benefits:

  • Individual data storage and analysis – Users can store their personal datasets, models, and other files in their private spaces, allowing them to work on their own projects independently. Segregation is enforced through their user profile.
  • Centralized data management – The administrator can manage the EFS file system centrally, maintaining data security, backups, and access for all users. By setting up an EFS file system with a private space per user, users can effortlessly track and maintain their work.
  • Cross-instance file sharing – Users can access their files from multiple SageMaker Studio spaces, because the EFS file system provides a persistent storage solution.

The second scenario relates to the creation of a single EFS directory that is shared across all the spaces of a given SageMaker Studio domain. This means that all users within the domain can access and use the same shared directory on the EFS file system, allowing for better collaboration and centralized data management (for example, to share common artifacts). This is a more generic use case, because there is no specific segregated folder for each user profile.

The following diagram illustrates this architecture.

Second scenario architecture

This scenario offers the following benefits:

  • Shared project directories – Suppose the data science team is working on a large-scale project that requires collaboration among multiple team members. By setting up a shared EFS directory at the project level, the team can collaborate on the same projects by accessing and working on files in the shared directory. For example, the data science team can use the shared EFS directory to store their Jupyter notebooks, analysis scripts, and other project-related files.
  • Simplified file management – Users don't have to manage their own private file storage, because they can rely on the shared directory for their file-related needs.
  • Improved data governance and security – The shared EFS directory, being centrally managed by the AWS infrastructure admin, can provide improved data governance and security. The admin can implement access controls and other data management policies to maintain the integrity and security of the shared resources.

The third scenario explores the configuration of an EFS file system that can be shared across multiple SageMaker Studio domains within the same VPC. This allows users from different domains to access and work with the same set of files and data, enabling cross-domain collaboration and centralized data management.

The following diagram illustrates this architecture.

Third scenario architecture

This scenario offers the following benefits:

  • Enterprise-level data science collaboration – Imagine a large organization with multiple data science teams working on various projects across different departments or business units. By setting up a shared EFS file system accessible across the organization's SageMaker Studio domains, these teams can collaborate on cross-functional projects, share artifacts, and use a centralized data repository for their work.
  • Shared infrastructure and resources – The EFS file system can be used as a shared resource across multiple SageMaker Studio domains, promoting efficiency and cost-effectiveness.
  • Scalable data storage – As the number of users or domains increases, the EFS file system automatically scales to accommodate the growing storage and access requirements.
  • Data governance – The shared EFS file system, being managed centrally, can be subject to stricter data governance policies, access controls, and compliance requirements. This can help the organization meet regulatory and security standards while still enabling cross-domain collaboration and data sharing.

Prerequisites

This post provides an AWS CloudFormation template to deploy the main resources for the solution. In addition, the solution expects that the AWS account in which the template is deployed already has the required configuration and resources in place.

Refer to Attaching a custom file system to a domain or user profile for additional prerequisites.

Configure an EFS directory shared across private spaces of a given user profile

In this scenario, an administrator wants to provision an EFS file system for all users of a SageMaker Studio domain, creating a private file system directory for each user. We can distinguish two use cases:

  • Create new SageMaker Studio user profiles – A new team member joins a preexisting SageMaker Studio domain and wants to attach a custom EFS file system to their JupyterLab or Code Editor spaces
  • Use preexisting SageMaker Studio user profiles – A team member is already working in a specific SageMaker Studio domain and wants to attach a custom EFS file system to their JupyterLab or Code Editor spaces

The solution presented in this post focuses on the first use case. We discuss how to adapt the solution for preexisting SageMaker Studio domain user profiles later in this post.

The following diagram illustrates the high-level architecture of the solution.

AWS Architecture

In this solution, we use AWS CloudTrail, Amazon EventBridge, and AWS Lambda to automatically create a private EFS directory when a new SageMaker Studio user profile is created. The high-level steps to set up this architecture are as follows:

  1. Create an EventBridge rule that invokes the Lambda function when a new SageMaker user profile is created and logged in CloudTrail (see the sketch after this list).
  2. Create an EFS file system with an access point for the Lambda function and a mount target in every Availability Zone that the SageMaker Studio domain is located in.
  3. Use a Lambda function to create a private EFS directory with the required POSIX permissions for the profile. The function also updates the profile with the new file system configuration.
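For reference, the following is a minimal sketch of how such an EventBridge rule could be defined with the AWS SDK for Python (Boto3). The rule name and Lambda target ARN are hypothetical placeholders; the CloudFormation template described later in this post creates the equivalent resources for you, assuming CloudTrail management events are available in the Region.

import json
import boto3

events_client = boto3.client('events')

# Match CloudTrail records for the SageMaker CreateUserProfile API call
event_pattern = {
    "source": ["aws.sagemaker"],
    "detail-type": ["AWS API Call via CloudTrail"],
    "detail": {
        "eventSource": ["sagemaker.amazonaws.com"],
        "eventName": ["CreateUserProfile"]
    }
}

events_client.put_rule(
    Name='sagemaker-create-user-profile',  # hypothetical rule name
    EventPattern=json.dumps(event_pattern),
    State='ENABLED'
)

# Route matched events to the Lambda function (hypothetical ARN)
events_client.put_targets(
    Rule='sagemaker-create-user-profile',
    Targets=[{
        'Id': 'CreateEfsDirFunction',
        'Arn': 'arn:aws:lambda:<region>:<account-id>:function:CreateEfsDirFunction'
    }]
)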

Deploy the solution using AWS CloudFormation

To use the solution, you can deploy the infrastructure using the following CloudFormation template. This template deploys three main resources in your account: the Amazon EFS resources (file system, access points, mount targets), an EventBridge rule, and a Lambda function.

Refer to Create a stack from the CloudFormation console for additional information. The input parameters for this template are as follows (a programmatic deployment sketch follows the list):

  • SageMakerDomainId – The SageMaker Studio domain ID that will be associated with the EFS file system.
  • SageMakerStudioVpc – The VPC associated with the SageMaker Studio domain.
  • SageMakerStudioSubnetId – One or multiple subnets associated with the SageMaker Studio domain. The template deploys its resources in these subnets.
  • SageMakerStudioSecurityGroupId – The security group associated with the SageMaker Studio domain. The template configures the Lambda function with this security group.
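If you prefer to deploy the stack programmatically rather than through the console, the following sketch shows an equivalent Boto3 call. The stack name, template file name, and parameter values are hypothetical placeholders, and CAPABILITY_IAM is assumed because the template creates IAM resources for the Lambda function.

import boto3

cfn = boto3.client('cloudformation')

# Read the downloaded template (hypothetical local file name)
with open('sagemaker-studio-custom-efs.yaml') as f:
    template_body = f.read()

cfn.create_stack(
    StackName='sagemaker-studio-custom-efs',  # hypothetical stack name
    TemplateBody=template_body,
    Capabilities=['CAPABILITY_IAM'],  # assumed: the template creates IAM roles
    Parameters=[
        {'ParameterKey': 'SageMakerDomainId', 'ParameterValue': 'd-xxxxxxxxxxxx'},
        {'ParameterKey': 'SageMakerStudioVpc', 'ParameterValue': 'vpc-xxxxxxxx'},
        {'ParameterKey': 'SageMakerStudioSubnetId', 'ParameterValue': 'subnet-xxxxxxxx'},
        {'ParameterKey': 'SageMakerStudioSecurityGroupId', 'ParameterValue': 'sg-xxxxxxxx'}
    ]
)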

Amazon EFS resources

After you deploy the template, navigate to the Amazon EFS console and confirm that the EFS file system has been created. The file system has a mount target in every Availability Zone that your SageMaker domain connects to.

Note that each mount target uses the EC2 security group that SageMaker created in your AWS account when you first created the domain, which allows NFS traffic on port 2049. The provided template automatically retrieves this security group when it's first deployed, using a Lambda-backed custom resource.
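The lookup performed by that custom resource could look similar to the following sketch, which filters security groups by the naming convention SageMaker uses; the domain ID is a hypothetical placeholder, and this is an illustration rather than the exact code in the template.

import boto3

ec2 = boto3.client('ec2')

domain_id = 'd-xxxxxxxxxxxx'  # hypothetical SageMaker Studio domain ID

# SageMaker names the inbound NFS security group after the domain ID
response = ec2.describe_security_groups(
    Filters=[{
        'Name': 'group-name',
        'Values': [f'security-group-for-inbound-nfs-{domain_id}']
    }]
)
nfs_security_group_id = response['SecurityGroups'][0]['GroupId']
print(nfs_security_group_id)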

You can also see that the file system has an EFS access point. This access point grants the Lambda function root access on the file system so that it can create the directories for the SageMaker Studio user profiles.

EventBridge rule

The second main resource is an EventBridge rule that is invoked when a new SageMaker Studio user profile is created. Its target is the Lambda function that creates the folder in the EFS file system and updates the profile that has just been created. The input of the Lambda function is the matched event, from which you can get the SageMaker Studio domain ID and the SageMaker user profile name.
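Inside the Lambda handler, the domain ID and user profile name can be read from the CloudTrail request parameters carried by the matched event, roughly as in the following sketch; the exact handler code is created by the CloudFormation template, and the key names below assume the standard CloudTrail record format.

def lambda_handler(event, context):
    # EventBridge delivers the CloudTrail record under the 'detail' key
    request_parameters = event['detail']['requestParameters']
    domain_id = request_parameters['domainId']
    user_profile_name = request_parameters['userProfileName']
    # ... create the EFS directory and update the user profile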

Lambda function

Finally, the template creates a Lambda function that creates a directory in the EFS file system with the required POSIX permissions for the user profile and updates the user profile with the new file system configuration.

At the POSIX permissions level, you can control which users can access the file system and which files or data they can access. The POSIX user and group IDs for SageMaker apps are:

  • UID – The POSIX user ID. The default is 200001. The valid range is a minimum value of 10000 and a maximum value of 4000000.
  • GID – The POSIX group ID. The default is 1001. The valid range is a minimum value of 1001 and a maximum value of 4000000.

The Lambda function is in the same VPC as the EFS file system, and it has the previously created file system and access point attached.

Lambda function configuration

Adapt the solution for preexisting SageMaker Studio domain user profiles

You can reuse the previous solution for scenarios in which the domain already has user profiles. To do so, create an additional Lambda function in Python that lists all the user profiles for the given SageMaker Studio domain and creates a dedicated EFS directory for each user profile.

The Lambda function needs to be in the same VPC as the EFS file system and must have the previously created file system and access point attached. You need to add the efs_id and domain_id values as environment variables for the function.

You can include the following code as part of this new Lambda function and run it manually:

import json
import subprocess
import boto3
import os

sm_client = boto3.client('sagemaker')

def lambda_handler(event, context):

    # Get the EFS file system and domain IDs from the environment variables
    file_system = os.environ['efs_id']
    domain_id = os.environ['domain_id']

    # Get the domain user profiles
    list_user_profiles_response = sm_client.list_user_profiles(
        DomainIdEquals=domain_id
    )
    domain_users = list_user_profiles_response["UserProfiles"]

    # Create a directory for each user profile
    for user in domain_users:

        user_profile_name = user["UserProfileName"]

        # Create the directory and set its POSIX ownership
        repository = f'/mnt/efs/{user_profile_name}'
        subprocess.call(['mkdir', repository])
        subprocess.call(['chown', '200001:1001', repository])

        # Update the SageMaker user profile with the new file system configuration
        response = sm_client.update_user_profile(
            DomainId=domain_id,
            UserProfileName=user_profile_name,
            UserSettings={
                'CustomFileSystemConfigs': [
                    {
                        'EFSFileSystemConfig': {
                            'FileSystemId': file_system,
                            'FileSystemPath': f'/{user_profile_name}'
                        }
                    }
                ]
            }
        )

Configure an EFS directory shared across all spaces of a given domain

In this scenario, an administrator wants to provision an EFS file system for all users of a SageMaker Studio domain, using the same file system directory for all the users.

To achieve this, in addition to the prerequisites described earlier in this post, you must complete the following steps.

Create the EFS file system

The file system needs to be in the same VPC as the SageMaker Studio domain. Refer to Creating EFS file systems for additional information.
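As an illustration, the file system can also be created with Boto3, as in the following sketch; the creation token and tag value are hypothetical.

import boto3

efs = boto3.client('efs')

# Create an encrypted, general purpose EFS file system
response = efs.create_file_system(
    CreationToken='sagemaker-studio-shared-efs',  # hypothetical idempotency token
    PerformanceMode='generalPurpose',
    Encrypted=True,
    Tags=[{'Key': 'Name', 'Value': 'sagemaker-studio-shared-efs'}]
)
file_system_id = response['FileSystemId']
print(file_system_id)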

Add mount targets to the EFS file system

Before SageMaker Studio can access the new EFS file system, the file system must have a mount target in each of the subnets associated with the domain. For more information about assigning mount targets to subnets, see Managing mount targets. You can find the subnets associated with the domain on the SageMaker Studio console under Network. Be sure to create a mount target for each subnet.

Networking used

Additionally, for each mount target, you must add the security group that SageMaker created in your AWS account when you created the SageMaker Studio domain. The security group name has the format security-group-for-inbound-nfs-domain-id.
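A minimal sketch of creating the mount targets with Boto3 follows; the file system ID, subnet IDs, and security group ID are hypothetical placeholders (the security group can be looked up by name as shown earlier in this post).

import boto3

efs = boto3.client('efs')

file_system_id = 'fs-xxxxxxxxxxxxxxxxx'              # hypothetical EFS file system ID
subnet_ids = ['subnet-aaaaaaaa', 'subnet-bbbbbbbb']  # subnets associated with the domain
nfs_security_group_id = 'sg-xxxxxxxx'                # security-group-for-inbound-nfs-<domain-id>

# Create one mount target per subnet, attaching the SageMaker NFS security group
for subnet_id in subnet_ids:
    efs.create_mount_target(
        FileSystemId=file_system_id,
        SubnetId=subnet_id,
        SecurityGroups=[nfs_security_group_id]
    )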

The following screenshot shows an example of an EFS file system with two mount targets for a SageMaker Studio domain associated with two subnets. Note the security group associated with both mount targets.

EFS file system

Create an EFS access point

The Lambda function accesses the EFS file system as root using this access point. See Creating access points for additional information.

EFS access point
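The access point can likewise be created with Boto3, for example as follows; granting POSIX user and group 0 is an assumption that matches the root access described above, and the file system ID and tag value are hypothetical placeholders.

import boto3

efs = boto3.client('efs')

# Access point that lets the Lambda function operate on the file system as root
response = efs.create_access_point(
    FileSystemId='fs-xxxxxxxxxxxxxxxxx',   # hypothetical EFS file system ID
    PosixUser={'Uid': 0, 'Gid': 0},        # root user and group
    RootDirectory={'Path': '/'},           # expose the file system root
    Tags=[{'Key': 'Name', 'Value': 'lambda-root-access-point'}]
)
print(response['AccessPointId'])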

Create a new Lambda function

Define a new Lambda function with the name LambdaManageEFSUsers. This function updates the default space settings of the SageMaker Studio domain, configuring the file system settings to use a specific shared repository path on the EFS file system. This configuration is automatically applied to all spaces within the domain.

The Lambda function is in the same VPC as the EFS file system and has the previously created file system and access point attached. Additionally, you must add efs_id and domain_id as environment variables for the function.

At the POSIX permissions level, you can control which users can access the file system and which files or data they can access. The POSIX user and group IDs for SageMaker apps are:

  • UID – The POSIX user ID. The default is 200001.
  • GID – The POSIX group ID. The default is 1001.

The function updates the default space settings of the SageMaker Studio domain, configuring the EFS file system to be used by all users. See the following code:

import json
import subprocess
import boto3
import os
import logging

logger = logging.getLogger()
logger.setLevel(logging.INFO)
sm_client = boto3.client('sagemaker')

def lambda_handler(event, context):

    # Environment variables
    file_system = os.environ['efs_id']
    domain_id = os.environ['domain_id']

    # EFS directory name
    repository_name = "shared_repository"
    repository = f'/mnt/efs/{repository_name}'

    # Create the new directory and set its POSIX ownership
    try:
        subprocess.call(['mkdir', '-p', repository])
        subprocess.call(['chown', '200001:1001', repository])
    except Exception:
        print("Repository already created")

    # Update the SageMaker domain to enable access to the new directory
    response = sm_client.update_domain(
        DomainId=domain_id,
        DefaultUserSettings={
            'CustomFileSystemConfigs': [
                {
                    'EFSFileSystemConfig': {
                        'FileSystemId': file_system,
                        'FileSystemPath': f'/{repository_name}'
                    }
                }
            ]
        }
    )
    logger.info(f"Updated Studio domain {domain_id} and EFS {file_system}")
    return {
        'statusCode': 200,
        'body': json.dumps(f"Created dir and modified permissions for Studio domain {domain_id}")
    }

The execution role of the Lambda function needs permissions to update the SageMaker Studio domain:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "sagemaker:UpdateDomain"
            ],
            "Resource": "*"
        }
    ]
}

Configure an EFS directory shared across multiple domains in the same VPC

In this scenario, an administrator wants to provision an EFS file system for all users of multiple SageMaker Studio domains, using the same file system directory for all the users. The idea in this case is to assign the same EFS file system to all users of all domains that are within the same VPC. To test the solution, the account should ideally have two SageMaker Studio domains inside the same VPC and subnet.

Create the EFS file system, add mount targets, and create an access point

Complete the steps in the previous section to set up your file system, mount targets, and access point.

Create a new Lambda function

Define a Lambda function called LambdaManageEFSUsers. This function is responsible for automating the configuration of SageMaker Studio domains to use a shared EFS file system within a specific VPC. This can be useful for organizations that want to provide a centralized storage solution for their ML projects across multiple SageMaker Studio domains. See the following code:

import json
import subprocess
import boto3
import os
import logging

logger = logging.getLogger()
logger.setLevel(logging.INFO)

sm_client = boto3.client('sagemaker')

def lambda_handler(event, context):

    # Environment variables
    file_system = os.environ['efs_id']
    env_vpc_id = os.environ['vpc_id']

    # Shared EFS directory
    repository_name = "shared_repository"
    repository = f'/mnt/efs/{repository_name}'
    domains = []

    # List all SageMaker domains in the specified VPC
    response = sm_client.list_domains()
    all_domains = response['Domains']
    for domain in all_domains:
        domain_id = domain["DomainId"]
        data = sm_client.describe_domain(DomainId=domain_id)
        domain_vpc_id = data['VpcId']
        if domain_vpc_id == env_vpc_id:
            domains.append(domain_id)

    # Create the directory and set its POSIX ownership
    try:
        subprocess.call(['mkdir', '-p', repository])
        subprocess.call(['chown', '200001:1001', repository])
    except Exception:
        print("Repository already created")

    # Update every SageMaker domain in the VPC
    if len(domains) > 0:
        for domain_id in domains:
            response = sm_client.update_domain(
                DomainId=domain_id,
                DefaultUserSettings={
                    'CustomFileSystemConfigs': [
                        {
                            'EFSFileSystemConfig': {
                                'FileSystemId': file_system,
                                'FileSystemPath': f'/{repository_name}'
                            }
                        }
                    ]
                }
            )

        logger.info(f"Updated Studio domains {domains} and EFS {file_system}")
        return {
            'statusCode': 200,
            'body': json.dumps(f"Created dir and modified permissions for domains {domains}")
        }

    else:
        return {
            'statusCode': 400,
            'body': json.dumps(f"No SageMaker Studio domains were found in the configured VPC {env_vpc_id}")
        }

The execution role of the Lambda function needs permissions to list, describe, and update the SageMaker Studio domains:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "sagemaker:ListDomains",
                "sagemaker:DescribeDomain",
                "sagemaker:UpdateDomain"
            ],
            "Resource": "*"
        }
    ]
}

Clean up

To clean up the solution you implemented and avoid further costs, delete the CloudFormation stack you deployed in your AWS account. When you delete the stack, you also delete the EFS file system and its storage. For more information, refer to Delete a stack from the CloudFormation console.
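Programmatically, this is a single API call, as in the following sketch; the stack name is the hypothetical one used in the deployment example earlier in this post.

import boto3

cfn = boto3.client('cloudformation')

# Deleting the stack also removes the EFS file system created by the template
cfn.delete_stack(StackName='sagemaker-studio-custom-efs')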

Conclusion

In this post, we explored three scenarios demonstrating the versatility of integrating Amazon EFS with SageMaker Studio. These scenarios highlight how Amazon EFS can provide a scalable, secure, and collaborative data storage solution for data science teams.

The first scenario focused on configuring an EFS directory with private spaces for individual user profiles, allowing users to store and access their own data while the administrator manages the EFS file system centrally.

The second scenario showcased a shared EFS directory across all spaces within a SageMaker Studio domain, enabling better collaboration and centralized data management.

The third scenario explored an EFS file system shared across multiple SageMaker Studio domains, empowering enterprise-level data science collaboration and promoting efficient use of shared resources.

By implementing these Amazon EFS integration scenarios, organizations can unlock the full potential of their data science teams, improve data governance, and enhance the overall efficiency of their data-driven initiatives. The integration of Amazon EFS with SageMaker Studio provides a versatile platform for data science teams to thrive in the evolving landscape of ML and AI.


About the Authors

Irene Arroyo Delgado is an AI/ML and GenAI Specialist Solutions Architect at AWS. She focuses on bringing out the potential of generative AI for each use case and productionizing ML workloads to achieve customers' desired business outcomes by automating end-to-end ML lifecycles. In her free time, Irene enjoys traveling and hiking.

Itziar Molina Fernandez is an AI/ML Consultant in the AWS Professional Services team. In her role, she works with customers building large-scale machine learning platforms and generative AI use cases on AWS. In her free time, she enjoys exploring new places.

Matteo Amadei is a Data Scientist Consultant in the AWS Professional Services team. He uses his expertise in artificial intelligence and advanced analytics to extract valuable insights and drive meaningful business outcomes for customers. He has worked on a wide range of projects spanning NLP, computer vision, and generative AI. He also has experience with building end-to-end MLOps pipelines to productionize analytical models. In his free time, Matteo enjoys traveling and reading.

Giuseppe Angelo Porcelli is a Principal Machine Learning Specialist Solutions Architect for Amazon Web Services. With several years of software engineering and an ML background, he works with customers of any size to understand their business and technical needs and design AI and ML solutions that make the best use of the AWS Cloud and the Amazon Machine Learning stack. He has worked on projects in different domains, including MLOps, computer vision, and NLP, involving a broad set of AWS services. In his free time, Giuseppe enjoys playing football.
