chore(e2e): create multiple VMs in one go through the CLI #2181

Merged
merged 3 commits into from
May 30, 2025

JGAntunes
Member

What this PR does / why we need it:

Based on the following thread:

It seems a lot of the E2E test flakiness and errors we've been getting with CMX can be addressed if, instead of manually creating the network and each node in parallel, we leverage the VM API and create all the required VMs in one go (they will be under the same network by default).
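
As a rough illustration of the "one go" approach, here is a minimal Go sketch that builds the arguments for a single `vm create` call covering every node. The helper name `buildCreateArgs` is hypothetical, and the `--count` flag is an assumption that should be checked against the CLI help; only `vm create` and `--name` appear in this PR's diff.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// buildCreateArgs returns the arguments for one "vm create" invocation that
// requests all nodes at once, instead of one invocation per node.
// NOTE: "--count" is an assumed flag, not confirmed by this PR.
func buildCreateArgs(name string, count int) []string {
	return []string{
		"vm", "create",
		"--name", name,
		"--count", strconv.Itoa(count),
	}
}

func main() {
	// All VMs share the same name, so later operations must rely on node IDs.
	args := buildCreateArgs("ec-test-suite", 3)
	fmt.Println(strings.Join(args, " "))
}
```

Because no `--network` argument is passed, this sketch relies on the behaviour described above, where VMs created together end up under the same network by default.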

Which issue(s) this PR fixes:

NA

Does this PR require a test?

NONE

Does this PR require a release note?

NONE

Does this PR require documentation?

NONE

@JGAntunes JGAntunes requested a review from sgalsaleh May 22, 2025 11:55
@JGAntunes JGAntunes self-assigned this May 22, 2025

args := []string{
"vm", "create",
"--name", nodeName,
"--network", networkID,
"--name", "ec-test-suite",
Member Author

Since we're creating multiple nodes now, they're all going to have the same name. I've replaced the ID in most of the logs in here and also added a couple of initial log lines, post cluster creation, that map the index in the Nodes slice to the node ID and private IP. This is how it looks:

    cluster.go:64: creating 3 nodes
    cluster.go:136: node 0 created with ID b8309fa6 and private IP 10.0.0.247
    cluster.go:136: node 1 created with ID 1a08d35f and private IP 10.0.0.192
    cluster.go:136: node 2 created with ID 5054f8b6 and private IP 10.0.0.229
    cluster.go:56: cluster created with network ID 624bf1704b41342375aa887fcce206124bfbd01da194720556d06976ec9f435f
    restore_test.go:649: 2025-05-22T12:46:48+01:00: deploying minio on node 0
    restore_test.go:655: 2025-05-22T12:49:16+01:00: downloading airgap files
    airgap.go:92: downloaded airgap bundle on node 0 to /assets/ec-release.tgz (2.0 GB) in 1m17.617605375s
    airgap.go:92: downloaded airgap bundle on node 0 to /assets/ec-release-upgrade.tgz (2.1 GB) in 1m21.8185815s
    restore_test.go:668: 2025-05-22T12:50:37+01:00: installing expect package on node 0
    restore_test.go:673: 2025-05-22T12:50:42+01:00: installing expect package on node 2
    restore_test.go:678: 2025-05-22T12:50:46+01:00: airgapping cluster
    cluster.go:236: node 0 is airgapped successfully
    cluster.go:236: node 1 is airgapped successfully
    cluster.go:236: node 2 is airgapped successfully
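
For anyone reading along, the index-to-ID mapping in those first log lines can be sketched roughly as below. This is a minimal sketch, not the code in `cluster.go`: the `Node` type, its field names, and `describeNodes` are hypothetical stand-ins, and only the log format is taken from the output above.

```go
package main

import "fmt"

// Node is a hypothetical stand-in for the test suite's node type.
type Node struct {
	ID        string
	PrivateIP string
}

// describeNodes returns one log line per node, mapping the node's index in
// the slice to its ID and private IP.
func describeNodes(nodes []Node) []string {
	lines := make([]string, 0, len(nodes))
	for i, n := range nodes {
		lines = append(lines, fmt.Sprintf(
			"node %d created with ID %s and private IP %s", i, n.ID, n.PrivateIP))
	}
	return lines
}

func main() {
	nodes := []Node{
		{ID: "b8309fa6", PrivateIP: "10.0.0.247"},
		{ID: "1a08d35f", PrivateIP: "10.0.0.192"},
	}
	for _, line := range describeNodes(nodes) {
		fmt.Println(line)
	}
}
```

With the mapping logged once up front, later lines can refer to "node 0" or "node 2" and still be traced back to a specific VM ID.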

Member

is it ok that all workflow runs will have the same name?

Member Author

I would imagine it's fine, since we do all the operations based on node IDs. It's also not that different from having multiple runs using `node0`, for example.

github-actions bot commented May 22, 2025

This PR has been released (on staging) and is available for download with an embedded-cluster-smoke-test-staging-app license ID.

Online Installer:

curl "https://staging.replicated.app/embedded/embedded-cluster-smoke-test-staging-app/ci/appver-dev-1c9588d" -H "Authorization: $EC_SMOKE_TEST_LICENSE_ID" -o embedded-cluster-smoke-test-staging-app-ci.tgz

Airgap Installer (may take a few minutes before the airgap bundle is built):

curl "https://staging.replicated.app/embedded/embedded-cluster-smoke-test-staging-app/ci-airgap/appver-dev-1c9588d?airgap=true" -H "Authorization: $EC_SMOKE_TEST_LICENSE_ID" -o embedded-cluster-smoke-test-staging-app-ci.tgz

Happy debugging!

sgalsaleh
sgalsaleh previously approved these changes May 22, 2025
@JGAntunes JGAntunes enabled auto-merge (squash) May 22, 2025 13:14
@JGAntunes JGAntunes disabled auto-merge May 22, 2025 17:53
@divolgin
Member

divolgin commented May 22, 2025

Just a heads up, this does not help with the "host unreachable"/"No route to host" error. We should have a fix for that soon though.

@JGAntunes JGAntunes dismissed sgalsaleh’s stale review May 23, 2025 10:40

The merge-base changed after approval.

@JGAntunes JGAntunes force-pushed the chore/cmx-e2e-multiple-hosts branch from ba88a0a to ea640f8 Compare May 23, 2025 10:40
@JGAntunes JGAntunes requested a review from sgalsaleh May 23, 2025 10:50
sgalsaleh
sgalsaleh previously approved these changes May 23, 2025
@JGAntunes JGAntunes force-pushed the chore/cmx-e2e-multiple-hosts branch from ea640f8 to a3634fc Compare May 28, 2025 08:51
@JGAntunes JGAntunes enabled auto-merge (squash) May 28, 2025 08:53
@JGAntunes JGAntunes requested a review from sgalsaleh May 30, 2025 10:12
Member

@emosbaugh emosbaugh left a comment

one question but lgtm


@JGAntunes JGAntunes merged commit 36076c9 into main May 30, 2025
183 of 190 checks passed
@JGAntunes JGAntunes deleted the chore/cmx-e2e-multiple-hosts branch May 30, 2025 16:30
4 participants