Add options for max diffs before restarting process #202

Open
@Mr0grog

Folks at Internet Archive would like to see an option to limit the number of requests/diffs that can be handled by a single child process of the diff server (once we hit that limit, we should kill that process and start a new one). The idea is to have something similar to Gunicorn’s max_requests and maybe max_requests_jitter.
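
For reference, Gunicorn's max_requests_jitter adds a random amount between 0 and the jitter value to each worker's limit, so that workers don't all restart at the same moment. A minimal sketch of that calculation follows; the name jittered_limit and its parameters are hypothetical and not part of the diff server:

```python
import random


def jittered_limit(max_requests, max_jitter=0, rng=random):
    """Return a per-worker limit in [max_requests, max_requests + max_jitter].

    A max_requests of 0 means "never restart", mirroring Gunicorn's default.
    """
    if max_requests < 0 or max_jitter < 0:
        raise ValueError('max_requests and max_jitter must be non-negative')
    if max_requests == 0:
        return 0
    # Each worker draws its own offset, which staggers the restarts.
    return max_requests + rng.randint(0, max_jitter)
```

With max_requests=1000 and max_jitter=50, each worker would restart somewhere between its 1000th and 1050th request.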

The server currently runs all diffs via a ProcessPoolExecutor:

def get_diff_executor(self, reset=False):
    if self.application.terminating:
        raise RuntimeError('Diff executor is being shut down.')
    executor = self.settings.get('diff_executor')
    if reset or not executor:
        if executor:
            try:
                # NOTE: we don't need to await this; we just want to make
                # sure the old executor gets cleaned up.
                shutdown_executor_in_loop(executor)
            except Exception:
                pass
        executor = concurrent.futures.ProcessPoolExecutor(
            DIFFER_PARALLELISM,
            initializer=initialize_diff_worker)
        self.settings['diff_executor'] = executor
    return executor

ProcessPoolExecutor has a max_tasks_per_child option that basically does this, so we might be able to just lean on that. Doing so doesn’t give us a way to do jitter, but that might be fine.
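
A rough sketch of what leaning on max_tasks_per_child could look like. The helper names executor_options and create_diff_executor are hypothetical; only ProcessPoolExecutor and its max_tasks_per_child argument (available in Python 3.11 and later) come from the standard library:

```python
import concurrent.futures


def executor_options(max_tasks_per_child=None):
    """Build the optional keyword arguments for ProcessPoolExecutor."""
    options = {}
    if max_tasks_per_child is not None:
        if max_tasks_per_child <= 0:
            raise ValueError('max_tasks_per_child must be a positive integer')
        # Requires Python 3.11+; older versions reject this keyword argument.
        options['max_tasks_per_child'] = max_tasks_per_child
    return options


def create_diff_executor(parallelism, initializer=None, max_tasks_per_child=None):
    """Create a pool whose workers are replaced after handling N tasks."""
    return concurrent.futures.ProcessPoolExecutor(
        parallelism,
        initializer=initializer,
        **executor_options(max_tasks_per_child))
```

One caveat from the Python documentation: when max_tasks_per_child is set and no mp_context is given, the pool uses the spawn start method by default, and the option is incompatible with fork. The worker initializer and diff functions would need to behave correctly under spawn.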

Most of our config options come in via environment variables, so we should probably use env vars for this, too.
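
One possible shape for reading the setting, as a sketch only. The variable name DIFFER_MAX_TASKS_PER_CHILD is an assumption modeled on DIFFER_PARALLELISM, and treating 0 or an unset variable as "no limit" is a design choice borrowed from Gunicorn's max_requests default, not something the server does today:

```python
import os


def read_max_tasks_per_child(environ=os.environ,
                             name='DIFFER_MAX_TASKS_PER_CHILD'):
    """Return the configured task limit, or None when no limit is set."""
    raw = environ.get(name, '').strip()
    if not raw:
        return None
    value = int(raw)  # Raises ValueError for non-numeric input.
    if value < 0:
        raise ValueError(f'{name} must not be negative (got {value})')
    # Treat 0 as "disabled" so the option can be turned off explicitly.
    return value or None
```

Returning None for the disabled case means the result can be passed straight through as max_tasks_per_child, since None is that argument's default.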

Metadata

Labels: enhancement (New feature or request), server (Specific to the diffing server, rather than diff algorithms)

Status: Prioritized