How to Deploy JupyterHub on Kubernetes with Dynamic AWS EFS Storage

September 22, 2025

Running JupyterHub on Kubernetes is powerful — it allows multiple users to spin up isolated Jupyter notebook servers on demand. But to make this production-ready, you need persistent storage that survives pod restarts and scales with users.

In this guide, we’ll walk through how to configure AWS EFS + CSI driver to dynamically provision per-user home directories for JupyterHub.


1. Why EFS for JupyterHub?

JupyterHub spawns one pod per user. Without persistent storage:

  • Notebooks disappear when pods restart.
  • Users cannot resume their work.

AWS Elastic File System (EFS) solves this:

  • Elastic: Scales automatically, no pre-provisioning required.
  • Shared: Multiple pods can mount the same file system.
  • Dynamic provisioning: Each user gets their own directory via EFS Access Points.

2. Prerequisites

  • A self-managed Kubernetes cluster running on EC2 (not EKS in this case).
  • Helm installed.
  • kubectl access to your cluster.
  • AWS CLI configured (optional if you prefer working in the AWS Console).
  • An EFS file system created in the same VPC, Region, and Security Group as your worker nodes (a CLI sketch follows this list).
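
If you still need to create the file system, a minimal AWS CLI sketch is shown below; the region, subnet, and security group IDs are placeholders, you need one mount target per AZ that hosts worker nodes, and the security group must allow inbound NFS (TCP 2049) from those nodes:

# Create the file system (a Name tag is optional but convenient)
aws efs create-file-system --region us-east-1 \
  --creation-token jupyterhub-efs \
  --tags Key=Name,Value=jupyterhub-efs

# One mount target per worker-node subnet, reusing the nodes' security group
aws efs create-mount-target --file-system-id fs-XXXXXXX \
  --subnet-id subnet-XXXXXXXX --security-groups sg-XXXXXXXX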

3. Install the EFS CSI Driver

Deploy the official EFS CSI driver:

kubectl apply -k "github.com/kubernetes-sigs/aws-efs-csi-driver/deploy/kubernetes/overlays/stable/?ref=v2.1.10" -n kube-system

kubectl -n kube-system get pods -l app=efs-csi-controller
kubectl -n kube-system get ds -l app=efs-csi-node

This installs the efs-csi-controller Deployment plus the efs-csi-node DaemonSet.
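
Once those pods are running, you can also confirm that the driver registered itself with the API server; the CSIDriver object name efs.csi.aws.com comes from the upstream manifests:

kubectl get csidriver efs.csi.aws.com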


4. Attach IAM Permissions

Your EC2 worker nodes need IAM permissions for the driver to create access points dynamically:

{ "Version": "2012-10-17", "Statement": [{ "Effect": "Allow", "Action": [ "elasticfilesystem:DescribeFileSystems", "elasticfilesystem:DescribeMountTargets", "elasticfilesystem:DescribeAccessPoints", "elasticfilesystem:CreateAccessPoint", "elasticfilesystem:DeleteAccessPoint", "elasticfilesystem:TagResource", "elasticfilesystem:UntagResource" ], "Resource": "*" }] }

Attach this policy to the instance profile of your worker nodes.

  • Create a role for EC2 (e.g., K8sWorkerRole) and attach the EFSCSIDriverAccess policy (a CLI sketch follows this list).
  • Ensure an instance profile (e.g., K8sWorkerProfile) with that role is associated with each worker instance.
  • Set IMDS so pods on the nodes can reach instance credentials:
aws ec2 modify-instance-metadata-options \
  --instance-id <i-...> --http-endpoint enabled \
  --http-tokens optional --http-put-response-hop-limit 2
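
If you prefer to script the role and policy setup, a minimal AWS CLI sketch follows; the file names (efs-csi-policy.json for the policy above, ec2-trust.json for a standard EC2 trust policy) and the account ID are placeholders, while the role and profile names match the examples above:

# Create the policy from the JSON shown above
aws iam create-policy --policy-name EFSCSIDriverAccess \
  --policy-document file://efs-csi-policy.json

# Create the worker role and attach the policy
aws iam create-role --role-name K8sWorkerRole \
  --assume-role-policy-document file://ec2-trust.json
aws iam attach-role-policy --role-name K8sWorkerRole \
  --policy-arn arn:aws:iam::<account-id>:policy/EFSCSIDriverAccess

# Wrap the role in an instance profile and associate it with each worker
aws iam create-instance-profile --instance-profile-name K8sWorkerProfile
aws iam add-role-to-instance-profile --instance-profile-name K8sWorkerProfile \
  --role-name K8sWorkerRole
aws ec2 associate-iam-instance-profile --instance-id <i-...> \
  --iam-instance-profile Name=K8sWorkerProfile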

Sanity test from a worker:

aws sts get-caller-identity
aws efs describe-file-systems --region us-east-1


5. Create a StorageClass for Dynamic Provisioning

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: jupyterhub-user-notebook-efs-sc
provisioner: efs.csi.aws.com
parameters:
  provisioningMode: efs-ap
  fileSystemId: fs-XXXXXXX   # your EFS filesystem
  directoryPerms: "750"
  gidRangeStart: "1000"
  gidRangeEnd: "2000"
  basePath: "/dynamic-pvc"
reclaimPolicy: Delete
volumeBindingMode: Immediate
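
Apply the manifest and check that the class is registered with the EFS provisioner; the file name efs-sc.yaml is just an example:

kubectl apply -f efs-sc.yaml
kubectl get storageclass jupyterhub-user-notebook-efs-sc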

What each parameter means (and why you care)

  • provisioningMode: efs-ap → dynamic provisioning using Access Points (APs).
  • fileSystemId → which EFS to use.
  • basePath → where per-PVC dirs live (good housekeeping).
  • directoryPerms → POSIX mode on AP root (must be "700", "750", etc.).
  • gidRangeStart/End → unique GID per AP to isolate users.
  • reclaimPolicy → Retain keeps data when PVCs are deleted; Delete cleans up APs automatically.

These do not limit “how many PVs” you can make. They define how each AP directory is created/owned.
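
Before wiring the class into JupyterHub, it can be worth provisioning a throwaway PVC against it to confirm dynamic provisioning works end to end; the PVC name efs-smoke-test below is arbitrary:

kubectl apply -f - <<'EOF'
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: efs-smoke-test
spec:
  accessModes: [ReadWriteMany]
  storageClassName: jupyterhub-user-notebook-efs-sc
  resources:
    requests:
      storage: 1Gi   # EFS is elastic; the size is only a placeholder
EOF

# With volumeBindingMode: Immediate the claim should bind without a consuming pod
kubectl get pvc efs-smoke-test
kubectl delete pvc efs-smoke-test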


6. Configure JupyterHub Helm Chart (values.yaml)

Key sections:

hub:
  config:
    JupyterHub:
      admin_access: true
      authenticator_class: google
    Authenticator:
      admin_users:
        - your-admin@gmail.com
      allowed_domains:
        - gmail.com
    GoogleOAuthenticator:
      client_id: "<client-id>"
      client_secret: "<client-secret>"
      oauth_callback_url: "https://<your-domain>/hub/oauth_callback"
      scope: ["openid", "email", "profile"]

singleuser:
  storage:
    type: dynamic
    dynamic:
      storageClass: jupyterhub-user-notebook-efs-sc
      pvcNameTemplate: claim-{username}
      volumeNameTemplate: volume-{user_server}
      storageAccessModes: [ReadWriteMany]
    homeMountPath: /home/jovyan
  image:
    name: liveget/notebook
    tag: "0.0.3"
  startTimeout: 600
  cmd: jupyterhub-singleuser
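
A quick way to catch indentation or schema mistakes in values.yaml before deploying is to render the chart locally (the repo is added in the next step):

helm template jupyterhub jupyterhub/jupyterhub -f values.yaml > /dev/null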

7. Deploy JupyterHub

helm repo add jupyterhub https://jupyterhub.github.io/helm-chart/

helm upgrade --install jupyterhub jupyterhub/jupyterhub \
  --namespace jupyterhub --create-namespace \
  -f values.yaml

Check the pods:

kubectl get pods -n jupyterhub
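
If you prefer to wait on the rollout rather than polling pods, the hub and proxy Deployment names below are the chart defaults:

kubectl -n jupyterhub rollout status deployment/hub
kubectl -n jupyterhub rollout status deployment/proxy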

8. Verify Dynamic PVCs

When a user logs in:

  • A new PVC is created from the claim-{username} template (the username is escaped into a DNS-safe form).
  • The EFS CSI driver creates an Access Point in your file system.
  • The PVC is automatically bound to the pod.

Check PVCs:

kubectl get pvc -n jupyterhub
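
You can also cross-check the Kubernetes side against EFS itself; each bound PVC should map to one PV and one access point on the file system (substitute your own file system ID and region):

kubectl get pv
aws efs describe-access-points --file-system-id fs-XXXXXXX --region us-east-1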

9. Common Pitfalls & Fixes

  • ❌ ProvisioningFailed: no credentials → Fix: Attach IAM role with EFS permissions to worker nodes.
  • ❌ Permission denied writing to home → Fix: Align container user (uid/gid) with EFS AP config (gidRangeStart: 1000).
  • ❌ Pod stuck waiting for volume → Fix: Ensure mount targets exist in every AZ where worker nodes run.
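
When one of these shows up, the PVC events and the CSI controller logs usually reveal the root cause; the Deployment and container names below match the upstream driver manifests:

kubectl -n jupyterhub describe pvc <claim-name>
kubectl -n kube-system logs deploy/efs-csi-controller -c csi-provisioner --tail=100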

✅ Conclusion

With this setup:

  • Every user gets an isolated, persistent home directory on EFS.
  • Storage scales automatically — no manual PV management.
  • JupyterHub is ready for multi-user, production workloads.
