Add support for ParallelCluster versions 3.9.0 and 3.9.1 (#232)
Add support for rhel9 and rocky9.
Updated some of the ansible playbooks to mirror the rhel8 changes.
Resolves #229
Set SubmitterInstanceTags based on RESEnvironmentName.
Remove SubmitterSecurityGroupIds parameter.
This option added rules to existing security groups. If those groups were used by multiple clusters, the number of security group rules could exceed the maximum allowed.
Now that additional security groups can be attached to the head and compute nodes, the customer should supply their own security groups that meet the slurm cluster requirements, attach them to their login nodes, and configure them as additional security groups for the head and compute nodes.
Resolves #204
Update CallSlurmRestApiLambda from Python 3.8 to 3.9.
Resolves #230
Update CDK version to 2.111.0.
This is the latest version supported by nodejs 16.
I really need to move to nodejs 20, but it isn't supported on Amazon Linux 2 or the RHEL 7 family.
That would require either running in a Docker container or on a newer OS version.
I think that I'm going to change the prerequisites for the OS distribution so that I can stay on the latest tools.
For example, I can't update to Python 3.12 until I do this.
Update DeconfigureRESUsersGroupsJson to pass if the last statement fails.
Fix bug in create_slurm_accounts.py.
Resolves #231
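The security group change described above can be illustrated with a minimal Python sketch that migrates an already-parsed config file. It assumes the config is loaded as nested dicts. The helper names `migrate_security_groups` and `_add_groups` and the ids `sg-11111111` and `sg-87654321` are hypothetical and are not part of the project. The parameter paths `slurm/SlurmCtl/AdditionalSecurityGroups` and `slurm/InstanceConfig/AdditionalSecurityGroups` come from the doc changes in this commit.

```python
from copy import deepcopy


def _add_groups(section, group_ids):
    """Append each security group id to AdditionalSecurityGroups at most once."""
    groups = section.get("AdditionalSecurityGroups") or []
    section["AdditionalSecurityGroups"] = groups
    for group_id in group_ids:
        if group_id not in groups:
            groups.append(group_id)


def migrate_security_groups(config, head_node_sgs, compute_node_sgs):
    """Return a migrated copy of a parsed config (hypothetical helper).

    Drops the removed SubmitterSecurityGroupIds parameter and adds the
    customer-supplied groups to the head node and compute node sections.
    """
    new_config = deepcopy(config)
    slurm = new_config.setdefault("slurm", {})
    # SubmitterSecurityGroupIds was removed by this commit.
    slurm.pop("SubmitterSecurityGroupIds", None)
    targets = (("SlurmCtl", head_node_sgs), ("InstanceConfig", compute_node_sgs))
    for key, group_ids in targets:
        # An empty YAML mapping may load as None, so normalize to a dict.
        section = slurm.get(key) or {}
        slurm[key] = section
        _add_groups(section, group_ids)
    return new_config


# Placeholder ids for illustration only.
old_config = {
    "slurm": {
        "SubmitterSecurityGroupIds": {"res-eda-VDISG": "sg-11111111"},
        "SlurmCtl": {},
    }
}
new_config = migrate_security_groups(old_config, ["sg-12345678"], ["sg-87654321"])
print(new_config)
```

The function copies the input rather than editing it in place, so the original config can still be compared against the migrated one before saving.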
@@ -249,7 +247,7 @@ See the [ParallelCluster docs](https://docs.aws.amazon.com/parallelcluster/lates

 See the [ParallelCluster docs](https://docs.aws.amazon.com/parallelcluster/latest/ug/Image-v3.html#yaml-Image-CustomAmi) for the custom AMI documentation.

-**NOTE**: A CustomAmi must be provided for Rocky8.
+**NOTE**: A CustomAmi must be provided for Rocky8 or Rocky9.
 All other distributions have a default AMI that is provided by ParallelCluster.

 #### Architecture
@@ -491,12 +489,6 @@ Additional security groups that will be added to the head node instance.

 List of Amazon Resource Names (ARNs) of IAM policies for Amazon EC2 that will be added to the head node instance.

-### SubmitterSecurityGroupIds
-
-External security groups that should be able to use the cluster.
-
-Rules will be added to allow it to interact with Slurm.
-
 ### SubmitterInstanceTags

 Tags of instances that can be configured to submit to the cluster.
Before you deploy a cluster you need to create a configuration file.
@@ -108,6 +178,7 @@ Ideally you should version control this file so you can keep track of changes.

 The schema for the config file along with its default values can be found in [source/cdk/config_schema.py](https://github.com/aws-samples/aws-eda-slurm-cluster/blob/main/source/cdk/config_schema.py#L230-L445).
 The schema is defined in python, but the actual config file should be in yaml format.
+See [Configuration File Format](config.md) for documentation on all of the parameters.

 The following are key parameters that you will need to update.
 If you do not have the required parameters in your config file then the installer script will fail unless you specify the `--prompt` option.
@@ -120,7 +191,6 @@ You should save your selections in the config file.
 | [Region](https://github.com/aws-samples/aws-eda-slurm-cluster/blob/main/source/cdk/config_schema.py#L368-L369) | Region where VPC is located | | `$AWS_DEFAULT_REGION`
 | [VpcId](https://github.com/aws-samples/aws-eda-slurm-cluster/blob/main/source/cdk/config_schema.py#L372-L373) | The vpc where the cluster will be deployed. | vpc-* | None
 | [SshKeyPair](https://github.com/aws-samples/aws-eda-slurm-cluster/blob/main/source/cdk/config_schema.py#L370-L371) | EC2 Keypair to use for instances | | None
-| [slurm/SubmitterSecurityGroupIds](https://github.com/aws-samples/aws-eda-slurm-cluster/blob/main/source/cdk/config_schema.py#L480-L485) | Existing security groups that can submit to the cluster. For SOCA this is the ComputeNodeSG* resource. | sg-* | None
 | [ErrorSnsTopicArn](https://github.com/aws-samples/aws-eda-slurm-cluster/blob/main/source/cdk/config_schema.py#L379-L380) | ARN of an SNS topic that will be notified of errors | `arn:aws:sns:{{region}}:{AccountId}:{TopicName}` | None
 | [slurm/InstanceConfig](https://github.com/aws-samples/aws-eda-slurm-cluster/blob/main/source/cdk/config_schema.py#L491-L543) | Configure instance types that the cluster can use and number of nodes. | | See [default_config.yml](https://github.com/aws-samples/aws-eda-slurm-cluster/blob/main/source/resources/config/default_config.yml)

@@ -137,7 +207,9 @@ all nodes must have the same architecture and Base OS.
 | CentOS 7 | x86_64
 | RedHat 7 | x86_64
 | RedHat 8 | x86_64, arm64
+| RedHat 9 | x86_64, arm64
 | Rocky 8 | x86_64, arm64
+| Rocky 9 | x86_64, arm64

 You can exclude instances types by family or specific instance type.
 By default the InstanceConfig excludes older generation instance families.
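As a rough illustration of how the table rows in this hunk and the CustomAmi note could be checked before deployment, here is a minimal Python sketch. It covers only the rows visible in this diff, so the real table may list more distributions. The function `check_base_os` and the AMI id used in the example are hypothetical and not part of the project.

```python
# Rows visible in the hunk above; the full table in the docs may have more.
SUPPORTED_ARCHITECTURES = {
    "CentOS 7": {"x86_64"},
    "RedHat 7": {"x86_64"},
    "RedHat 8": {"x86_64", "arm64"},
    "RedHat 9": {"x86_64", "arm64"},
    "Rocky 8": {"x86_64", "arm64"},
    "Rocky 9": {"x86_64", "arm64"},
}

# Per the NOTE changed in this commit, Rocky has no default AMI.
CUSTOM_AMI_REQUIRED = {"Rocky 8", "Rocky 9"}


def check_base_os(base_os, architecture, custom_ami=None):
    """Return a list of problems; an empty list means no problem was found."""
    problems = []
    architectures = SUPPORTED_ARCHITECTURES.get(base_os)
    if architectures is None:
        problems.append(f"{base_os} is not in the rows shown")
    elif architecture not in architectures:
        problems.append(f"{base_os} does not list {architecture}")
    if base_os in CUSTOM_AMI_REQUIRED and not custom_ami:
        problems.append(f"{base_os} requires a CustomAmi")
    return problems


# Placeholder AMI id for illustration only.
print(check_base_os("Rocky 9", "arm64", custom_ami="ami-0123456789abcdef0"))
print(check_base_os("Rocky 8", "x86_64"))
```

Returning a list of problems instead of raising on the first one lets a caller report every issue in a config file in a single pass.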
docs/res_integration.md (+7 -2)
@@ -11,11 +11,12 @@ The intention is to completely automate the deployment of ParallelCluster and se
 |-----------|-------------|------
 | VpcId | VPC id for the RES cluster | vpc-xxxxxx
 | SubnetId | Subnet in the RES VPC. | subnet-xxxxx
-| SubmitterSecurityGroupIds | The security group names and ids used by RES VDIs. The name will be something like *EnvironmentName*-vdc-dcv-host-security-group | *EnvironmentName*-*VDISG*: sg-xxxxxxxx
 | SubmitterInstanceTags | The tag of VDI instances. | 'res:EnvironmentName': *EnvironmentName*'
 | ExtraMounts | The mount parameters for the /home directory. This is required for access to the home directory. |
 | ExtraMountSecurityGroups | Security groups that give access to the ExtraMounts. These will be added to compute nodes so they can access the file systems.

+You must also create security groups as described in [Security Groups for Login Nodes](deployment-prerequisites.md#security-groups-for-login-nodes) and specify the SlurmHeadNodeSG in the `slurm/SlurmCtl/AdditionalSecurityGroups` parameter and the SlurmComputeNodeSG in the `slurm/InstanceConfig/AdditionalSecurityGroups` parameter.
+
 When you specify **RESEnvironmentName**, a lambda function will run SSM commands to create a cron job on a RES domain joined instance to update the users_groups.json file every hour. Another lambda function will also automatically configure all running VDI hosts to use the cluster.

 The following example shows the configuration parameters for a RES with the EnvironmentName=res-eda.
@@ -51,11 +52,15 @@ slurm:
     Database:
       DatabaseStackName: pcluster-slurm-db-res

-  SlurmCtl: {}
+  SlurmCtl:
+    AdditionalSecurityGroups:
+    - sg-12345678 # SlurmHeadNodeSG

   # Configure typical EDA instance types
   # A partition will be created for each combination of Base OS, Architecture, and Spot
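The commit message says SubmitterInstanceTags is now set from RESEnvironmentName, and the table in this file gives the tag as `res:EnvironmentName`. A minimal Python sketch of that derivation follows. The function names, the flat name-to-value mapping, and the rule that every listed tag must match are all assumptions made for illustration. This is not the project's implementation.

```python
def submitter_instance_tags(res_environment_name):
    """Build the tag filter for VDI instances of one RES environment.

    Assumption: the filter is a flat mapping of tag name to tag value.
    """
    return {"res:EnvironmentName": res_environment_name}


def is_submitter(instance_tags, required_tags):
    """Return True if the instance carries every required tag value.

    Assumption: all required tags must match for an instance to qualify.
    """
    return all(
        instance_tags.get(name) == value
        for name, value in required_tags.items()
    )


required = submitter_instance_tags("res-eda")
vdi_tags = {"res:EnvironmentName": "res-eda", "Name": "vdi-1"}
other_tags = {"res:EnvironmentName": "res-test"}
print(is_submitter(vdi_tags, required), is_submitter(other_tags, required))
```

The environment name `res-eda` matches the example used in this file. The instance tag dictionaries are made-up sample data.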
docs/soca_integration.md (+2 -1)
@@ -11,7 +11,8 @@ Set the following parameters in your config file.
 | Parameter | Description | Value
 |-----------|-------------|------
 | VpcId | VPC id for the SOCA cluster | vpc-xxxxxx
-| SubmitterSecurityGroupIds | The ComputeNode security group name and id | *cluster-id*-*ComputeNodeSG*: sg-xxxxxxxx
+| slurm/SlurmCtl/AdditionalSecurityGroups | Security group ids that give desktop instances access to the head node and that give the head node access to VPC resources such as file systems.
+| slurm/InstanceConfig/AdditionalSecurityGroups | Security group ids that give desktop instances access to the compute nodes and that give compute nodes access to VPC resources such as file systems.
 | ExtraMounts | Add the mount parameters for the /apps and /data directories. This is required for access to the home directory. |