Commit 61f0879

#98 Feature/bottlerocket nodepool (#107)
* #94 bumped provider versions
* #94 updated lock files
* #94 updated acm, eks and r53 modules versions
* #98 updated default eks cluster version to 1.20
* #98 fixed s3 vpc endpoint
* #98 added bottlerocket self-managed worker group
* #98 added bottlerocket worker group params
* #98 migrated to node groups
* #98 fixed br labels
* #98 changed affinity for node groups in all layer2 services
* #98 renamed node_pool to node_group, updated eks module, added force update parameter
* #98 removed worker groups variables, updated k8s version to 1.21
* #98 moved workers additional policies into variable
* #98 added descriptions for variables and updated Readme
* #98 removed eks version from demo tfvars
* #98 fixed typo
* #98 added vpc endpoint routes to public subnets
* #98 removed unused template
1 parent 69e2e58 commit 61f0879

26 files changed: +299 -175 lines

examples/echoserver-deployment.yaml (+1 -2)

```diff
@@ -44,7 +44,7 @@ metadata:
     kubernetes.io/ingress.class: alb
     alb.ingress.kubernetes.io/scheme: internet-facing
     alb.ingress.kubernetes.io/ssl-policy: ELBSecurityPolicy-TLS-1-2-2017-01
-    alb.ingress.kubernetes.io/certificate-arn: arn:aws:acm:us-east-1:730809894724:certificate/aa3c4d2f-c661-49b8-b312-1f1cf1b7ef51
+    alb.ingress.kubernetes.io/certificate-arn: arn:aws:acm:us-east-1:730809894724:certificate/fa029132-86ab-4342-96e2-8e1fd5c56c29
     alb.ingress.kubernetes.io/listen-ports: '[{"HTTPS":443}]'
     alb.ingress.kubernetes.io/target-type: ip
     external-dns.alpha.kubernetes.io/hostname: echo.maddevs.org
@@ -59,4 +59,3 @@ spec:
       backend:
         serviceName: echoserver
         servicePort: 80
-
```

terraform/layer1-aws/README.md (+13 -6)

```diff
@@ -3,14 +3,14 @@
 | Name | Version |
 |------|---------|
 | terraform | 0.15.1 |
-| aws | 3.38.0 |
-| kubernetes | 2.1.0 |
+| aws | 3.53.0 |
+| kubernetes | 2.4.1 |
 
 ## Providers
 
 | Name | Version |
 |------|---------|
-| aws | 3.38.0 |
+| aws | 3.53.0 |
 
 ## Inputs
 
@@ -25,15 +25,22 @@
 | domain\_name | Main public domain name | `any` | n/a | yes |
 | ecr\_repo\_retention\_count | number of images to store in ECR | `number` | `50` | no |
 | ecr\_repos | List of docker repositories | `list(any)` | <pre>[<br> "demo"<br>]</pre> | no |
-| eks\_cluster\_version | Version of the EKS K8S cluster | `string` | `"1.18"` | no |
+| eks\_cluster\_enabled\_log\_types | A list of the desired control plane logging to enable. For more information, see Amazon EKS Control Plane Logging documentation (https://docs.aws.amazon.com/eks/latest/userguide/control-plane-logs.html). Possible values: api, audit, authenticator, controllerManager, scheduler | `list(string)` | <pre>[<br> "audit"<br>]</pre> | no |
+| eks\_cluster\_encryption\_config\_enable | Enable or not encryption for k8s secrets with aws-kms | `bool` | `false` | no |
+| eks\_cluster\_log\_retention\_in\_days | Number of days to retain log events. Default retention - 90 days. | `number` | `90` | no |
+| eks\_cluster\_version | Version of the EKS K8S cluster | `string` | `"1.21"` | no |
 | eks\_map\_roles | Additional IAM roles to add to the aws-auth configmap. | <pre>list(object({<br> rolearn = string<br> username = string<br> groups = list(string)<br> }))</pre> | `[]` | no |
-| eks\_worker\_groups | EKS Worker groups configuration | `map` | <pre>{<br> "ci": {<br> "asg_desired_capacity": 0,<br> "asg_max_size": 3,<br> "asg_min_size": 0,<br> "override_instance_types": [<br> "t3.medium",<br> "t3a.medium"<br> ],<br> "spot_instance_pools": 2<br> },<br> "ondemand": {<br> "asg_desired_capacity": 1,<br> "asg_max_size": 6,<br> "instance_type": "t3a.medium"<br> },<br> "spot": {<br> "asg_desired_capacity": 1,<br> "asg_max_size": 5,<br> "asg_min_size": 0,<br> "override_instance_types": [<br> "t3.medium",<br> "t3a.medium"<br> ],<br> "spot_instance_pools": 2<br> }<br>}</pre> | no |
+| eks\_workers\_additional\_policies | Additional IAM policy attached to EKS worker nodes | `list(any)` | <pre>[<br> "arn:aws:iam::aws:policy/AmazonSSMManagedInstanceCore"<br>]</pre> | no |
 | eks\_write\_kubeconfig | Flag for eks module to write kubeconfig | `bool` | `false` | no |
 | environment | Env name in case workspace wasn't used | `string` | `"demo"` | no |
-| name | Project name, required to form unique resource names | `any` | n/a | yes |
+| name | Project name, required to create unique resource names | `any` | n/a | yes |
+| node\_group\_ci | Node group configuration | <pre>object({<br> instance_types = list(string)<br> capacity_type = string<br> max_capacity = number<br> min_capacity = number<br> desired_capacity = number<br> force_update_version = bool<br> })</pre> | <pre>{<br> "capacity_type": "SPOT",<br> "desired_capacity": 0,<br> "force_update_version": true,<br> "instance_types": [<br> "t3a.medium",<br> "t3.medium"<br> ],<br> "max_capacity": 5,<br> "min_capacity": 0<br>}</pre> | no |
+| node\_group\_ondemand | Node group configuration | <pre>object({<br> instance_types = list(string)<br> capacity_type = string<br> max_capacity = number<br> min_capacity = number<br> desired_capacity = number<br> force_update_version = bool<br> })</pre> | <pre>{<br> "capacity_type": "ON_DEMAND",<br> "desired_capacity": 1,<br> "force_update_version": true,<br> "instance_types": [<br> "t3a.medium"<br> ],<br> "max_capacity": 5,<br> "min_capacity": 1<br>}</pre> | no |
+| node\_group\_spot | Node group configuration | <pre>object({<br> instance_types = list(string)<br> capacity_type = string<br> max_capacity = number<br> min_capacity = number<br> desired_capacity = number<br> force_update_version = bool<br> })</pre> | <pre>{<br> "capacity_type": "SPOT",<br> "desired_capacity": 1,<br> "force_update_version": true,<br> "instance_types": [<br> "t3a.medium",<br> "t3.medium"<br> ],<br> "max_capacity": 5,<br> "min_capacity": 0<br>}</pre> | no |
 | region | Default infrastructure region | `string` | `"us-east-1"` | no |
 | short\_region | The abbreviated name of the region, required to form unique resource names | `map` | <pre>{<br> "ap-east-1": "ape1",<br> "ap-northeast-1": "apn1",<br> "ap-northeast-2": "apn2",<br> "ap-south-1": "aps1",<br> "ap-southeast-1": "apse1",<br> "ap-southeast-2": "apse2",<br> "ca-central-1": "cac1",<br> "cn-north-1": "cnn1",<br> "cn-northwest-1": "cnnw1",<br> "eu-central-1": "euc1",<br> "eu-north-1": "eun1",<br> "eu-west-1": "euw1",<br> "eu-west-2": "euw2",<br> "eu-west-3": "euw3",<br> "sa-east-1": "sae1",<br> "us-east-1": "use1",<br> "us-east-2": "use2",<br> "us-gov-east-1": "usge1",<br> "us-gov-west-1": "usgw1",<br> "us-west-1": "usw1",<br> "us-west-2": "usw2"<br>}</pre> | no |
 | single\_nat\_gateway | Flag to create single nat gateway for all AZs | `bool` | `true` | no |
+| worker\_group\_bottlerocket | Bottlerocket worker group configuration | <pre>object({<br> instance_types = list(string)<br> capacity_type = string<br> max_capacity = number<br> min_capacity = number<br> desired_capacity = number<br> spot_instance_pools = number<br> })</pre> | <pre>{<br> "capacity_type": "SPOT",<br> "desired_capacity": 0,<br> "instance_types": [<br> "t3a.medium",<br> "t3.medium"<br> ],<br> "max_capacity": 5,<br> "min_capacity": 0,<br> "spot_instance_pools": 2<br>}</pre> | no |
 | zone\_id | R53 zone id for public domain | `any` | `null` | no |
 
 ## Outputs
```

terraform/layer1-aws/aws-acm.tf (+1 -1)

```diff
@@ -5,9 +5,9 @@ module "acm" {
   create_certificate = var.create_acm_certificate
 
   domain_name = local.domain_name
+  zone_id     = local.zone_id
   subject_alternative_names = [
     "*.${local.domain_name}"]
-  zone_id = local.zone_id
 
   tags = local.tags
 }
```
(new file, +8)

```diff
@@ -0,0 +1,8 @@
+data "aws_ami" "bottlerocket_ami" {
+  most_recent = true
+  owners      = ["amazon"]
+  filter {
+    name   = "name"
+    values = ["bottlerocket-aws-k8s-${var.eks_cluster_version}-x86_64-*"]
+  }
+}
```
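The same AMI can also be resolved without a name glob: the Bottlerocket project publishes its latest AMI IDs under public SSM parameters. A hedged HCL sketch, not part of this commit (the parameter path follows the scheme documented by Bottlerocket):

```hcl
# Alternative sketch (not in this commit): look up the latest Bottlerocket AMI
# for the cluster's Kubernetes version via the public SSM parameter.
data "aws_ssm_parameter" "bottlerocket_ami" {
  name = "/aws/service/bottlerocket/aws-k8s-${var.eks_cluster_version}/x86_64/latest/image_id"
}

# The AMI ID is then available as data.aws_ssm_parameter.bottlerocket_ami.value.
```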

terraform/layer1-aws/aws-eks.tf (+103 -44)

```diff
@@ -18,15 +18,15 @@ locals {
     {
       "key"                 = "k8s.io/cluster-autoscaler/${local.name}"
       "propagate_at_launch" = "false"
-      "value"               = "true"
+      "value"               = "owned"
     }
   ]
 }
 
 #tfsec:ignore:aws-vpc-no-public-egress-sgr tfsec:ignore:aws-eks-enable-control-plane-logging tfsec:ignore:aws-eks-encrypt-secrets tfsec:ignore:aws-eks-no-public-cluster-access tfsec:ignore:aws-eks-no-public-cluster-access-to-cidr
 module "eks" {
   source  = "terraform-aws-modules/eks/aws"
-  version = "17.1.0"
+  version = "17.3.0"
 
   cluster_name    = local.name
   cluster_version = var.eks_cluster_version
@@ -50,53 +50,115 @@ module "eks" {
     }
   ] : []
 
-  worker_groups_launch_template = [
-    {
-      name                    = "spot"
-      override_instance_types = var.eks_worker_groups.spot.override_instance_types
-      spot_instance_pools     = var.eks_worker_groups.spot.spot_instance_pools
-      asg_max_size            = var.eks_worker_groups.spot.asg_max_size
-      asg_min_size            = var.eks_worker_groups.spot.asg_min_size
-      asg_desired_capacity    = var.eks_worker_groups.spot.asg_desired_capacity
-      subnets                 = module.vpc.private_subnets
-      kubelet_extra_args      = "--node-labels=node.kubernetes.io/lifecycle=spot"
-      public_ip               = false
-      additional_userdata     = file("${path.module}/templates/eks-x86-nodes-userdata.sh")
+  map_roles        = local.eks_map_roles
+  write_kubeconfig = var.eks_write_kubeconfig
+
+  # Create security group rules to allow communication between pods on workers and pods in managed node groups.
+  # Set this to true if you have AWS-Managed node groups and Self-Managed worker groups.
+  # See https://github.com/terraform-aws-modules/terraform-aws-eks/issues/1089
+  worker_create_cluster_primary_security_group_rules = true
+
+  workers_additional_policies = var.eks_workers_additional_policies
 
-      tags = local.worker_tags
+  node_groups_defaults = {
+    ami_type  = "AL2_x86_64"
+    disk_size = 100
+  }
+
+  node_groups = {
+    spot = {
+      desired_capacity = var.node_group_spot.desired_capacity
+      max_capacity     = var.node_group_spot.max_capacity
+      min_capacity     = var.node_group_spot.min_capacity
+      instance_types   = var.node_group_spot.instance_types
+      capacity_type    = var.node_group_spot.capacity_type
+      subnets          = module.vpc.private_subnets
+
+      force_update_version = var.node_group_spot.force_update_version
+
+      k8s_labels = {
+        Environment = local.env
+        nodegroup   = "spot"
+      }
+      additional_tags = {
+        Name = "${local.name}-spot"
+      }
     },
-    {
-      name                 = "ondemand"
-      instance_type        = var.eks_worker_groups.ondemand.instance_type
-      asg_desired_capacity = var.eks_worker_groups.ondemand.asg_desired_capacity
-      subnets              = module.vpc.private_subnets
-      asg_max_size         = var.eks_worker_groups.ondemand.asg_max_size
-      cpu_credits          = "unlimited"
-      kubelet_extra_args   = "--node-labels=node.kubernetes.io/lifecycle=ondemand"
-      public_ip            = false
-      additional_userdata  = file("${path.module}/templates/eks-x86-nodes-userdata.sh")
-
-      tags = local.worker_tags
+    ondemand = {
+      desired_capacity = var.node_group_ondemand.desired_capacity
+      max_capacity     = var.node_group_ondemand.max_capacity
+      min_capacity     = var.node_group_ondemand.min_capacity
+      instance_types   = var.node_group_ondemand.instance_types
+      capacity_type    = var.node_group_ondemand.capacity_type
+      subnets          = module.vpc.private_subnets
+
+      force_update_version = var.node_group_ondemand.force_update_version
+
+      k8s_labels = {
+        Environment = local.env
+        nodegroup   = "ondemand"
+      }
+      additional_tags = {
+        Name = "${local.name}-ondemand"
+      }
     },
+    ci = {
+      desired_capacity = var.node_group_ci.desired_capacity
+      max_capacity     = var.node_group_ci.max_capacity
+      min_capacity     = var.node_group_ci.min_capacity
+      instance_types   = var.node_group_ci.instance_types
+      capacity_type    = var.node_group_ci.capacity_type
+      subnets          = module.vpc.private_subnets
+
+      force_update_version = var.node_group_ci.force_update_version
+
+      k8s_labels = {
+        Environment = local.env
+        nodegroup   = "ci"
+      }
+      additional_tags = {
+        Name = "${local.name}-ci"
+      }
+      taints = [
+        {
+          key    = "nodegroup"
+          value  = "ci"
+          effect = "NO_SCHEDULE"
+      }]
+    }
+  }
+
+  worker_groups_launch_template = [
     {
-      name                    = "ci"
-      override_instance_types = var.eks_worker_groups.ci.override_instance_types
-      spot_instance_pools     = var.eks_worker_groups.ci.spot_instance_pools
-      asg_max_size            = var.eks_worker_groups.ci.asg_max_size
-      asg_min_size            = var.eks_worker_groups.ci.asg_min_size
-      asg_desired_capacity    = var.eks_worker_groups.ci.asg_desired_capacity
-      subnets                 = module.vpc.public_subnets
-      cpu_credits             = "unlimited"
-      kubelet_extra_args      = "--node-labels=node.kubernetes.io/lifecycle=spot --node-labels=purpose=ci --register-with-taints=purpose=ci:NoSchedule"
-      public_ip               = true
-      additional_userdata     = file("${path.module}/templates/eks-x86-nodes-userdata.sh")
+      name                    = "bottlerocket-spot"
+      ami_id                  = data.aws_ami.bottlerocket_ami.id
+      override_instance_types = var.worker_group_bottlerocket.instance_types
+      spot_instance_pools     = var.worker_group_bottlerocket.spot_instance_pools
+      asg_max_size            = var.worker_group_bottlerocket.max_capacity
+      asg_min_size            = var.worker_group_bottlerocket.min_capacity
+      asg_desired_capacity    = var.worker_group_bottlerocket.desired_capacity
+      subnets                 = module.vpc.private_subnets
+      public_ip               = false
+      userdata_template_file  = "${path.module}/templates/userdata-bottlerocket.tpl"
+      userdata_template_extra_args = {
+        enable_admin_container   = false
+        enable_control_container = true
+      }
+      additional_userdata = <<EOT
+[settings.kubernetes.node-labels]
+"eks.amazonaws.com/capacityType" = "SPOT"
+"nodegroup" = "bottlerocket"
+
+[settings.kubernetes.node-taints]
+"nodegroup" = "bottlerocket:NoSchedule"
+EOT
 
       tags = concat(local.worker_tags, [{
-        "key"                 = "k8s.io/cluster-autoscaler/node-template/label/purpose"
+        "key"                 = "k8s.io/cluster-autoscaler/node-template/label/nodegroup"
         "propagate_at_launch" = "true"
-        "value"               = "ci"
+        "value"               = "bottlerocket"
       }])
-    },
+    }
   ]
 
   fargate_profiles = {
@@ -116,7 +178,4 @@ module "eks" {
     })
   }
 }
-
-  map_roles        = local.eks_map_roles
-  write_kubeconfig = var.eks_write_kubeconfig
 }
```
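The `ci` node group above is labeled `nodegroup=ci` and tainted `nodegroup=ci:NO_SCHEDULE`, so CI workloads must both select the label and tolerate the taint (this is what "changed affinity for node groups in all layer2 services" refers to in the commit message). A hypothetical pod-spec fragment; the pod name and image are illustrative, only the label/taint keys come from the diff:

```yaml
# Hypothetical CI workload targeting the tainted "ci" managed node group.
apiVersion: v1
kind: Pod
metadata:
  name: ci-runner            # illustrative name
spec:
  nodeSelector:
    nodegroup: ci            # matches k8s_labels on the ci node group
  tolerations:
    - key: nodegroup
      operator: Equal
      value: ci
      effect: NoSchedule     # matches the NO_SCHEDULE taint
  containers:
    - name: runner
      image: busybox         # placeholder image
      command: ["sleep", "3600"]
```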

terraform/layer1-aws/aws-vpc-endpoints.tf (+10 -7)

```diff
@@ -5,22 +5,25 @@ data "aws_security_group" "default" {
   vpc_id = module.vpc.vpc_id
 }
 
-module "vpc_endpoints" {
+module "vpc_gateway_endpoints" {
   source  = "terraform-aws-modules/vpc/aws//modules/vpc-endpoints"
   version = "3.2.0"
 
   vpc_id = module.vpc.vpc_id
 
-  security_group_ids = [
-    data.aws_security_group.default.id]
-
   endpoints = {
     s3 = {
-      service = "s3"
+      service      = "s3"
+      service_type = "Gateway"
+      route_table_ids = flatten([
+        module.vpc.intra_route_table_ids,
+        module.vpc.private_route_table_ids,
+        module.vpc.public_route_table_ids
+      ])
       tags = {
-        Name = "s3-vpc-endpoint"
+        Name = "${local.name}-s3"
      }
-    },
+    }
   }
 
   tags = local.tags
```
terraform/layer1-aws/demo.tfvars.example (+7 -22)

```diff
@@ -18,30 +18,15 @@ single_nat_gateway = true
 ##########
 # EKS
 ##########
-eks_cluster_version = "1.19"
-
 eks_cluster_encryption_config_enable = true
 
-eks_worker_groups = {
-  spot = {
-    override_instance_types = ["t3.medium", "t3a.medium"]
-    spot_instance_pools     = 2
-    asg_max_size            = 5
-    asg_min_size            = 0
-    asg_desired_capacity    = 1
-  },
-  ondemand = {
-    instance_type        = "t3a.medium"
-    asg_desired_capacity = 1
-    asg_max_size         = 6
-  },
-  ci = {
-    override_instance_types = ["t3.medium", "t3a.medium"]
-    spot_instance_pools     = 2
-    asg_max_size            = 3
-    asg_min_size            = 0
-    asg_desired_capacity    = 0
-  }
+node_group_ondemand = {
+  instance_types       = ["m5a.medium"]
+  capacity_type        = "ON_DEMAND"
+  max_capacity         = 5
+  min_capacity         = 1
+  desired_capacity     = 1
+  force_update_version = false
 }
 
 eks_write_kubeconfig = false
```
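With the node-group variables typed as objects, a tfvars file only overrides the groups that differ from the defaults shown in the README table above. A hypothetical override of the spot group in the same shape (the values here are illustrative, not from the commit):

```hcl
# Hypothetical tfvars override: same object shape as node_group_spot's default.
node_group_spot = {
  instance_types       = ["t3.small", "t3a.small"] # illustrative sizes
  capacity_type        = "SPOT"
  max_capacity         = 3
  min_capacity         = 0
  desired_capacity     = 1
  force_update_version = false
}
```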

terraform/layer1-aws/templates/eks-x86-nodes-userdata.sh (-6)

This file was deleted.

(new file, +26)

```diff
@@ -0,0 +1,26 @@
+# https://github.com/bottlerocket-os/bottlerocket/blob/develop/README.md#description-of-settings
+${pre_userdata}
+[settings.kubernetes]
+api-server = "${endpoint}"
+cluster-certificate = "${cluster_auth_base64}"
+cluster-name = "${cluster_name}"
+
+${additional_userdata}
+
+# Hardening based on https://github.com/bottlerocket-os/bottlerocket/blob/develop/SECURITY_GUIDANCE.md
+
+# Enable kernel lockdown in "integrity" mode.
+# This prevents modifications to the running kernel, even by privileged users.
+[settings.kernel]
+lockdown = "integrity"
+
+# The admin host container provides SSH access and runs with "superpowers".
+# It is disabled by default, but can be enabled explicitly.
+[settings.host-containers.admin]
+enabled = ${enable_admin_container}
+
+# The control host container provides out-of-band access via SSM.
+# It is enabled by default, and can be disabled if you do not expect to use SSM.
+# This could leave you with no way to access the API and change settings on an existing node!
+[settings.host-containers.control]
+enabled = ${enable_control_container}
```