Open
Description
Problem Description
When running tests with many workers (e.g., pytest -n 50
), all worker processes start simultaneously. This can cause several issues:
- System overload: Sudden spike in CPU, memory, and I/O usage can make the system unresponsive
- Resource contention: All workers competing for resources at once can actually slow down test execution
- CI/CD failures: Resource-constrained environments may fail when too many processes start at once
- Database/service overload: If tests connect to external services, simultaneous connections can overwhelm them
Proposed Solution
Add a --ramp
option that gradually starts workers over a specified time period, similar to JMeter's ramp-up period for load testing.
Example Usage
# Start 10 workers over 10 seconds (one worker per second)
pytest -n 10 --ramp 10s
# Start 50 workers over 5 minutes
pytest -n 50 --ramp 5m
# Start 100 workers over 1 hour
pytest -n 100 --ramp 1h
How it would work
With --ramp 10s
and -n 10
:
- Worker 0: starts immediately
- Worker 1: starts after 1 second
- Worker 2: starts after 2 seconds
- ... and so on
- Worker 9: starts after 9 seconds
Benefits
- Prevents system overload by distributing the resource usage spike over time
- Better for shared environments where sudden resource spikes affect other users/processes
- Improves test reliability in resource-constrained CI/CD environments
- Useful for load testing scenarios where gradual ramp-up is desired
- Backward compatible - no impact when
--ramp
is not specified
Implementation Considerations
- Support common time formats: seconds (s), minutes (m), hours (h)
- Workers should delay test execution, not just process initialization
- Provide progress feedback during ramp-up period
- Should work with all existing distribution modes (
--dist
)
Use Cases
- Large test suites: Organizations with thousands of tests that need many workers
- Shared CI/CD environments: Where resource usage must be controlled
- Database-heavy tests: Preventing connection pool exhaustion
- Performance testing: Gradually increasing load on the system under test
- Development machines: Preventing system freeze when running tests locally
Related Issues/Context
This feature is similar to:
- JMeter's ramp-up period for thread groups
- Kubernetes' rolling deployments
- Database connection pool gradual scaling
I've been able to bang this out using AI hopefully with enough tests and documentation and have used it against my own suite. Creating this issue as a holder. Hoping to create PR in the near future.