EMR on EKS
Create a New Namespace for EMR Job
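A minimal sketch of this step, assuming `kubectl` already points at the target EKS cluster. The helper name `create_emr_namespace` is hypothetical; the argument is meant to be the same value you later pass as `NamespaceName`.

```shell
# Hypothetical helper: create the namespace only if it does not exist yet.
# Usage: create_emr_namespace "$NamespaceName"
create_emr_namespace() {
    kubectl get namespace "$1" >/dev/null 2>&1 \
        || kubectl create namespace "$1"
}
```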
Create IAM Identity Mapping
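One way to do this is with `eksctl`, which maps the `emr-containers` service-linked role to the namespace. This is a sketch under the assumption that `eksctl` is installed; the helper name `create_emr_identity_mapping` is hypothetical.

```shell
# Hypothetical helper: grant EMR on EKS access to the namespace.
# Arguments: <EKS cluster name> <namespace> <region>
# Usage: create_emr_identity_mapping "$EksClusterName" "$NamespaceName" "$REGION"
create_emr_identity_mapping() {
    eksctl create iamidentitymapping \
        --cluster "$1" \
        --namespace "$2" \
        --service-name "emr-containers" \
        --region "$3"
}
```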
Create EMR Virtual Cluster and Job Role
Linux
STACK_NAME="<stack name>"
PROJECT_NAME="<project name>"
REGION="<region code>"
### EMR Configuration - General
VirtualClusterName="" # [REQUIRED] The name of the EMR virtual cluster.
### EMR Configuration - EKS
EksClusterName="" # [REQUIRED] The name of the EKS cluster.
NamespaceName="" # [REQUIRED] The name of the Kubernetes namespace in which EMR jobs run.
### EMR Configuration - Job Role
RoleName="" # [REQUIRED] The name of the job role.
curl -LO https://raw.githubusercontent.com/marcus16-kang/cloudformation-templates/main/emr/emr-container.yaml
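# --stack-name is required by "aws cloudformation deploy".
# --capabilities is added on the assumption that the template creates the named job role (RoleName).
aws cloudformation deploy \
--template-file ./emr-container.yaml \
--stack-name $STACK_NAME \
--parameter-overrides \
ProjectName=$PROJECT_NAME \
VirtualClusterName=$VirtualClusterName \
EksClusterName=$EksClusterName \
NamespaceName=$NamespaceName \
RoleName=$RoleName \
--tags project=$PROJECT_NAME \
--capabilities CAPABILITY_NAMED_IAM \
--disable-rollback \
--region $REGION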
Windows
$STACK_NAME="<stack name>"
$PROJECT_NAME="<project name>"
$REGION="<region code>"
### EMR Configuration - General
$VirtualClusterName="" # [REQUIRED] The name of the EMR virtual cluster.
### EMR Configuration - EKS
$EksClusterName="" # [REQUIRED] The name of the EKS cluster.
$NamespaceName="" # [REQUIRED] The name of the Kubernetes namespace in which EMR jobs run.
### EMR Configuration - Job Role
$RoleName="" # [REQUIRED] The name of the job role.
curl.exe -LO https://raw.githubusercontent.com/marcus16-kang/cloudformation-templates/main/emr/emr-container.yaml
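# --stack-name is required by "aws cloudformation deploy".
# --capabilities is added on the assumption that the template creates the named job role (RoleName).
aws cloudformation deploy `
--template-file ./emr-container.yaml `
--stack-name $STACK_NAME `
--parameter-overrides `
ProjectName=$PROJECT_NAME `
VirtualClusterName=$VirtualClusterName `
EksClusterName=$EksClusterName `
NamespaceName=$NamespaceName `
RoleName=$RoleName `
--tags project=$PROJECT_NAME `
--capabilities CAPABILITY_NAMED_IAM `
--disable-rollback `
--region $REGION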
Tip
If your EMR job needs to access S3, attach AmazonS3FullAccess
or a more narrowly scoped S3 policy to the EMR job role.
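As a sketch, the managed policy can be attached with the AWS CLI once the stack has created the role. The helper name `attach_s3_policy` is hypothetical; swap the policy ARN for a narrower one if full S3 access is more than the job needs.

```shell
# Hypothetical helper: attach the AWS managed AmazonS3FullAccess policy to the job role.
# Usage: attach_s3_policy "$RoleName"
attach_s3_policy() {
    aws iam attach-role-policy \
        --role-name "$1" \
        --policy-arn "arn:aws:iam::aws:policy/AmazonS3FullAccess"
}
```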
Update Job Role Trust Policy
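The AWS CLI provides `aws emr-containers update-role-trust-policy` for this step. The sketch below wraps it in a hypothetical helper, `update_job_role_trust_policy`, and assumes the role, namespace, and cluster from the previous steps already exist.

```shell
# Hypothetical helper: let EMR on EKS service accounts in the namespace assume the job role.
# Arguments: <EKS cluster name> <namespace> <job role name> <region>
# Usage: update_job_role_trust_policy "$EksClusterName" "$NamespaceName" "$RoleName" "$REGION"
update_job_role_trust_policy() {
    aws emr-containers update-role-trust-policy \
        --cluster-name "$1" \
        --namespace "$2" \
        --role-name "$3" \
        --region "$4"
}
```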
Start EMR Job
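A minimal sketch of a job submission, assuming you already know the virtual cluster ID and the job role ARN. The helper name `start_emr_pi_job`, the job name `spark-pi`, and the choice of the Spark Pi example as the entry point are illustrative assumptions, not part of the original instructions.

```shell
# Hypothetical helper: submit the Spark Pi example script as an EMR on EKS job.
# Arguments: <virtual cluster id> <job role ARN> <EMR release label> <region>
start_emr_pi_job() {
    aws emr-containers start-job-run \
        --virtual-cluster-id "$1" \
        --name "spark-pi" \
        --execution-role-arn "$2" \
        --release-label "$3" \
        --job-driver '{"sparkSubmitJobDriver": {"entryPoint": "local:///usr/lib/spark/examples/src/main/python/pi.py", "sparkSubmitParameters": "--conf spark.executor.instances=2"}}' \
        --region "$4"
}
```

The virtual cluster ID can be looked up with `aws emr-containers list-virtual-clusters`, and the release label should be one that is available in your region.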
pod.yaml
Warning
You can't define tolerations on the job's controller pod. Only the driver and executor pods can define tolerations.
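To illustrate where tolerations can go, the sketch below writes a driver pod template to a local file. The file name `driver-pod-template.yaml`, the taint key `dedicated`, and its value `emr` are hypothetical; use whatever taint your node group actually carries.

```shell
# Write a driver pod template that tolerates a hypothetical "dedicated=emr:NoSchedule" taint.
cat > driver-pod-template.yaml <<'EOF'
apiVersion: v1
kind: Pod
spec:
  tolerations:
    - key: "dedicated"      # hypothetical taint key
      operator: "Equal"
      value: "emr"          # hypothetical taint value
      effect: "NoSchedule"
EOF

# Next steps (not run here): upload the file to S3 and reference it in the job's Spark configuration, e.g.
#   aws s3 cp driver-pod-template.yaml s3://<your bucket>/driver-pod-template.yaml
#   --conf spark.kubernetes.driver.podTemplateFile=s3://<your bucket>/driver-pod-template.yaml
```

An executor template works the same way through `spark.kubernetes.executor.podTemplateFile`.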