Cost attribution - SageMaker Studio Administration Best Practices
SageMaker AI Studio has built-in capabilities to help administrators track the spend of their individual domains, shared spaces, and users.
Automated tagging
SageMaker AI Studio now automatically tags new SageMaker resources such as training jobs, processing jobs, and kernel apps with their respective sagemaker:domain-arn. At a more granular level, SageMaker AI also tags the resource with the sagemaker:user-profile-arn or sagemaker:space-arn to designate the principal creator of the resource.
SageMaker AI domain EFS volumes are tagged with a key named ManagedByAmazonSageMakerResource with the value of the domain ARN. They do not carry more granular tags, so storage usage cannot be broken down per user from tags alone. Administrators can, however, attach the EFS volume to an EC2 instance for bespoke monitoring.
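As an illustration of how these tags could be read programmatically, the following is a minimal sketch, not an official utility. It assumes tags arrive in the {"Key": ..., "Value": ...} shape returned by the SageMaker ListTags API; the function names attribute_resource and attribute_from_sagemaker are hypothetical.

```python
from typing import Dict, Iterable, Optional

# Tag keys named in the text above.
DOMAIN_TAG = "sagemaker:domain-arn"
USER_PROFILE_TAG = "sagemaker:user-profile-arn"
SPACE_TAG = "sagemaker:space-arn"


def attribute_resource(tags: Iterable[Dict[str, str]]) -> Dict[str, Optional[str]]:
    """Map a list of {"Key": ..., "Value": ...} tags to a domain and a creator."""
    by_key = {tag["Key"]: tag["Value"] for tag in tags}
    # A resource carries either a user-profile ARN or a space ARN as its creator.
    creator = by_key.get(USER_PROFILE_TAG) or by_key.get(SPACE_TAG)
    return {"domain": by_key.get(DOMAIN_TAG), "creator": creator}


def attribute_from_sagemaker(client, resource_arn: str) -> Dict[str, Optional[str]]:
    """Fetch tags through a boto3 SageMaker client and attribute the resource.

    Only the first page of tags is read here; a fuller version would follow
    NextToken until the listing is exhausted.
    """
    response = client.list_tags(ResourceArn=resource_arn)
    return attribute_resource(response.get("Tags", []))
```

With boto3 installed and credentials configured, attribute_from_sagemaker(boto3.client("sagemaker"), arn) would return the domain and creator ARNs for a training job, processing job, or app, or None for either value when the tag is absent.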
Cost monitoring
Automated tags enable administrators to track, report, and monitor their ML spend through out-of-the-box solutions such as AWS Cost Explorer and AWS Budgets, as well as custom solutions built on the data from AWS Cost and Usage Reports (CURs).
To use the attached tags for cost analysis, they must first be activated in the Cost allocation tags section of the AWS Billing console. It can take up to 24 hours for tags to show up in the cost allocation tags panel, so you’ll need to create a SageMaker AI resource before you can activate them.
After you have activated a cost allocation tag, AWS begins tracking your tagged resources, and after 24-48 hours the tags show up as selectable filters in Cost Explorer.
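Activation can also be scripted rather than done in the console. The sketch below assumes a boto3 Cost Explorer client ("ce") and its update_cost_allocation_tags_status operation; the helper names are hypothetical, and the call only succeeds once the tag keys have already appeared in the billing data.

```python
from typing import Dict, List, Sequence

# Tag keys that SageMaker AI Studio attaches automatically (from the text above).
SAGEMAKER_TAG_KEYS = (
    "sagemaker:domain-arn",
    "sagemaker:user-profile-arn",
    "sagemaker:space-arn",
)


def build_activation_entries(tag_keys: Sequence[str]) -> List[Dict[str, str]]:
    """Build the request entries that mark each tag key as Active."""
    return [{"TagKey": key, "Status": "Active"} for key in tag_keys]


def activate_cost_allocation_tags(ce_client, tag_keys: Sequence[str] = SAGEMAKER_TAG_KEYS):
    """Activate the given tag keys and return any per-tag errors reported."""
    response = ce_client.update_cost_allocation_tags_status(
        CostAllocationTagsStatus=build_activation_entries(tag_keys)
    )
    return response.get("Errors", [])
```

A non-empty return value lists the tag keys that could not be activated, for example because they have not yet shown up in the cost allocation tags panel.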
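Once the tags are active, the same breakdown that Cost Explorer shows in the console can be retrieved through its API. The following is a minimal sketch under stated assumptions: it uses the boto3 Cost Explorer get_cost_and_usage operation grouped by a tag key, and the helper names build_cost_request and summarize_costs are hypothetical.

```python
from collections import defaultdict
from typing import Dict


def build_cost_request(start: str, end: str,
                       tag_key: str = "sagemaker:user-profile-arn") -> dict:
    """Build get_cost_and_usage arguments grouped by one cost allocation tag.

    Dates are "YYYY-MM-DD" strings; the end date is exclusive.
    """
    return {
        "TimePeriod": {"Start": start, "End": end},
        "Granularity": "MONTHLY",
        "Metrics": ["UnblendedCost"],
        "GroupBy": [{"Type": "TAG", "Key": tag_key}],
    }


def summarize_costs(response: dict) -> Dict[str, float]:
    """Sum unblended cost per tag value across all returned time periods."""
    totals: Dict[str, float] = defaultdict(float)
    for period in response.get("ResultsByTime", []):
        for group in period.get("Groups", []):
            # Group keys look like "<tag-key>$<tag-value>"; an empty value
            # means the cost came from resources without that tag.
            _, _, value = group["Keys"][0].partition("$")
            amount = float(group["Metrics"]["UnblendedCost"]["Amount"])
            totals[value or "(untagged)"] += amount
    return dict(totals)
```

A call such as summarize_costs(ce.get_cost_and_usage(**build_cost_request("2024-01-01", "2024-02-01"))), where ce is a boto3 "ce" client, would return a mapping from user profile ARN to spend for the period. Passing sagemaker:domain-arn or sagemaker:space-arn as tag_key gives the per-domain or per-space view.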
Cost control
When the first SageMaker AI Studio user is onboarded, SageMaker AI creates an EFS volume for the domain. Storage costs are incurred for this EFS volume as notebooks and data files are stored in the user’s home directory. When the user launches Studio notebooks, they are charged for the compute instances running the notebooks. Refer to Amazon SageMaker AI pricing for a detailed breakdown of costs.
Administrators can control compute costs by specifying the list of instances a user can spin up, using IAM policies as mentioned in the Common guardrails section. In addition, we recommend that customers make use of the SageMaker AI Studio auto shutdown extension to save costs by automatically shutting down idle apps. This server extension periodically polls for running apps per user profile, and shuts down idle apps based on a timeout set by the administrator.
To install this extension for all users in your domain, you can use a lifecycle configuration as described in the Customization section. You can also use the extension checker to ensure that all of your domain’s users have the extension installed.
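The instance-type restriction described earlier in this section can be sketched as an IAM policy document generated in code. This is an illustration only, not the policy from the Common guardrails section: it assumes a Deny statement on sagemaker:CreateApp using the sagemaker:InstanceTypes condition key, the function name and Sid are hypothetical, and the "system" entry is included on the assumption that the default JupyterServer app requires it.

```python
import json
from typing import Dict, List


def build_instance_type_guardrail(allowed_instance_types: List[str]) -> Dict:
    """Build a policy that denies app creation on instance types not listed."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "LimitStudioInstanceTypes",  # hypothetical statement ID
                "Effect": "Deny",
                "Action": ["sagemaker:CreateApp"],
                "Resource": "*",
                "Condition": {
                    # Deny when the requested instance type is not in the list.
                    "ForAnyValue:StringNotLike": {
                        "sagemaker:InstanceTypes": list(allowed_instance_types)
                    }
                },
            }
        ],
    }


if __name__ == "__main__":
    # Example allow-list; adjust to the instance types your teams need.
    policy = build_instance_type_guardrail(["ml.t3.medium", "ml.m5.large", "system"])
    print(json.dumps(policy, indent=2))
```

The printed JSON could then be attached to the execution role used by Studio users. Keeping the allow-list as a function argument makes it straightforward to generate different policies per team.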