Performance issue with LoadScopeScheduling when a node has a large number of tests

I've noticed the use of `index()` in this [code block](https://github.com/pytest-dev/pytest-xdist/blob/v3.7.0/src/xdist/scheduler/loadscope.py#L274-L280) causes a significant performance issue when a node has a large number of tests. This can be improved by precomputing a mapping from node ID to index using a dictionary and look it up in the loop.

eg.
```
index_map = {nodeid: i for i, nodeid in enumerate(worker_collection)}
nodeids_indexes = [
    index_map[nodeid]
    for nodeid, completed in work_unit.items()
    if not completed
]
```


To reproduce the issue, you can run `pytest -n 2 --dist=loadscope` with the following test code:
```
import pytest

@pytest.mark.parametrize("i", range(100000))
def test_something(i):
    pass
```


Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Performance issue with LoadScopeScheduling when a node has a large number of tests #1212

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Performance issue with LoadScopeScheduling when a node has a large number of tests #1212

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions