SageMaker Studio
As of 2025, the project behind setting up code-server is effectively archived (last release was 2023). For a more modern setup with active development, see the setup for Sagemaker Studio.
This setup can be used for running SageMaker Studio with a simple network infrastructure, security configurations, and minimal IAM. The configuration files provide the essential components needed to deploy a VPC, core Sagemaker resources (e.g., domain, user profile, space, lifecycle configuration), S3 for storage, and ECR repository for docker images.
For a setup that integrates
optuna
for hyperparameter optimization, refer to the Optuna SageMaker Studio documentation.
├── backend.hcl-example # Example configuration for Terraform backend
├── ecr.tf # Elastic Container Registry for custom Docker images
├── iam.tf # IAM roles and policies for SageMaker Studio
├── lifecycle_scripts # Lifecycle scripts for configuring SageMaker Studio
│ └── setup_coder_editor.sh # Script to configure the code editor environment
├── main.tf # Main Terraform configuration file
├── s3.tf # S3 bucket configuration
├── sagemaker.tf # SageMaker Studio configurations
├── secrets_manager.tf # Secrets Manager for sensitive information
├── security_groups.tf # Security groups for SageMaker Studio
├── variables.tf # Variable definitions
├── variables.tfvars-example # Example variable values
└── vpc.tf # VPC configuration
Lifecycle Script
This lifecycle script shows examples for the following:
Installation
- Tools into the
base
conda environment - Python dependency management tools
poetry
,uv
, andpdm
- Extensions for
sagemaker-code-editor
Configuration
- Configurations for
conda
andgit
- Keyboard shortcuts and settings for
sagemaker-code-editor
References
- External library and kernel installation
- Create a lifecycle configuration to install Code Editor extensions
- Code Editor in Amazon SageMaker Studio
Modules
1. vpc.tf
Provisioned Resources:
- VPC: Defines a Virtual Private Cloud (VPC) for network isolation with DNS support enabled.
- Public Subnets: Dynamically creates public subnets across two availability zones.
- Internet Gateway: Enables internet access for public subnets.
- Route Tables: Configures routing for public subnets with default routes to the internet gateway.
Key Design:
- Uses dynamic subnet CIDR allocation based on the VPC CIDR block.
- Automatically selects availability zones with "opt-in-not-required" status.
Dependencies:
- Provides networking for SageMaker Studio and other resources.
- Security groups (from
security_groups.tf
) are tied to this VPC.
2. security_groups.tf
Provisioned Resources:
- SageMaker Security Group: Allows outbound internet access for SageMaker Studio.
Dependencies:
- Used by resources in
sagemaker.tf
for network access control.
3. secrets_manager.tf
Note: As of 2025, code repository integration is only supported for JupyterLab and JupyterServer apps, not the Code Editor in SageMaker Studio. Additionally, private repositories are not yet supported. Therefore, this secret remains unused in the current configuration. Repository must be cloned manually using this secret if needed.
Provisioned Resources:
- Secrets: GitHub Personal Access Token for GitHub repository.
4. sagemaker.tf
Provisioned Resources:
- SageMaker Domain: Creates the Studio domain with IAM authentication and public internet access.
- User Profile: Configures a named user profile with specific execution roles and security settings.
- SageMaker Space: Sets up a "private" space within the domain.
- Lifecycle Configuration: Defines a Code Editor lifecycle configuration using the setup script.
Key Design:
- Docker access is enabled for container development.
- Code Editor app settings with specified instance types.
- EBS storage configuration with default and maximum volume sizes.
- Auto-mounting of EFS home directories.
Dependencies:
- Depends on IAM roles (
iam.tf
), public subnets (vpc.tf
), security groups (security_groups.tf
), and Secrets Manager secrets (secrets_manager.tf
).
References
- Amazon SageMaker AI domain overview
- Amazon SageMaker domain user profiles
- Connect your local Visual Studio Code to SageMaker spaces with remote access
- Amazon SageMaker Studio spaces
- Amazon EFS auto-mounting in Studio
5. s3.tf
Provisioned Resources:
- S3 Bucket: Provides storage for datasets, artifacts, and outputs.
Dependencies:
- The IAM role (
iam.tf
) grants permissions to SageMaker Studio for accessing this bucket.
6. iam.tf
Provisioned Resources:
- SageMaker Execution Role: Creates an IAM role that SageMaker can assume to access AWS resources.
- Custom IAM Policies:
- S3 Policy: Grants full access to the specified S3 bucket.
- Remote Access Policy: Enables SageMaker remote access feature.
- Policy Attachments: Attaches the following policies to the SageMaker execution role:
- S3 Policy: Grants access to the S3 bucket created in
s3.tf
and the external S3 bucket used for Terraform remote state - Remote Access Policy: Grants
sagemaker:StartSession
permission to enable remote access to SageMaker Studio compute instances AmazonSageMakerFullAccess
(AWS managed policy)AmazonEC2ContainerRegistryFullAccess
(AWS managed policy)SecretsManagerReadWrite
(AWS managed policy)
Dependencies:
- SageMaker Studio user assumes this execution role to interact with S3, ECR, Secrets Manager, and other AWS services.
- The role is referenced in the SageMaker domain, user profiles, and for accessing S3 buckets.
7. ecr.tf
Provisioned Resources:
- ECR Repository: Creates an Elastic container registry repository for storing Docker images.
- Lifecycle Policy: Configures automatic cleanup of images with two rules:
- Retains only the most recent images tagged as "latest"
- Expires untagged images after a specified number of days
Dependencies:
- The IAM role in
iam.tf
grants SageMaker permission to pull images from this repository. - Used to store custom training, processing, or inference images.