[Perf] API-server scaleout with many-to-many server-engine comms by njhill · Pull Request #17546 · vllm-project/vllm

added 19 commits

April 4, 2025 17:04

@njhill

Signed-off-by: Nick Hill <nhill@redhat.com>

@njhill

Signed-off-by: Nick Hill <nhill@redhat.com>

@njhill

Signed-off-by: Nick Hill <nhill@redhat.com>

@njhill

Signed-off-by: Nick Hill <nhill@redhat.com>

@njhill

Signed-off-by: Nick Hill <nhill@redhat.com>

@njhill

…-engines

Signed-off-by: Nick Hill <nhill@redhat.com>

Conflicts:

vllm/v1/engine/core_client.py

vllm/v1/utils.py

@njhill

Signed-off-by: Nick Hill <nhill@redhat.com>

@njhill

Signed-off-by: Nick Hill <nhill@redhat.com>

@njhill

Signed-off-by: Nick Hill <nhill@redhat.com>

Conflicts:

vllm/config.py

vllm/engine/arg_utils.py

vllm/v1/engine/core.py

vllm/v1/engine/core_client.py

@njhill

Signed-off-by: Nick Hill <nhill@redhat.com>

Conflicts:

vllm/v1/engine/core.py

vllm/v1/engine/core_client.py

@njhill

Signed-off-by: Nick Hill <nhill@redhat.com>

@njhill

Signed-off-by: Nick Hill <nhill@redhat.com>

@njhill

Signed-off-by: Nick Hill <nhill@redhat.com>

@njhill

…-engines

Signed-off-by: Nick Hill <nhill@redhat.com>

Conflicts:

vllm/config.py

vllm/v1/engine/core.py

@njhill

Signed-off-by: Nick Hill <nhill@redhat.com>

Conflicts:

vllm/v1/engine/core_client.py

vllm/v1/utils.py

@njhill

@njhill

@njhill

…-engines

Signed-off-by: Nick Hill <nhill@redhat.com>

Conflicts:

vllm/v1/engine/core.py

vllm/v1/engine/core_client.py

@njhill

Signed-off-by: Nick Hill <nhill@redhat.com>

@njhill

Avoid the exception; more work is still needed for this to be functional with multiple API server processes.

Signed-off-by: Nick Hill <nhill@redhat.com>

@njhill

Signed-off-by: Nick Hill <nhill@redhat.com>

@njhill

Signed-off-by: Nick Hill <nhill@redhat.com>

@njhill

Signed-off-by: Nick Hill <nhill@redhat.com>

@njhill

@njhill

Signed-off-by: Nick Hill <nhill@redhat.com>

@njhill

@njhill

Signed-off-by: Nick Hill <nhill@redhat.com>

Conflicts:

tests/v1/engine/test_engine_core.py

vllm/v1/engine/core.py

vllm/v1/engine/core_client.py

njhill added a commit to njhill/vllm that referenced this pull request

May 31, 2025

@njhill

Introduced in vllm-project#17546. We should only call mark_process_dead when we're using Prometheus multiprocess mode (that is, with more than one API server).

Signed-off-by: Nick Hill <nhill@redhat.com>
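
The guard described in this follow-up commit can be illustrated with a minimal sketch. This is not the code from the commit: the function name `cleanup_metrics_on_exit`, its parameters, and the extra check for the `PROMETHEUS_MULTIPROC_DIR` environment variable are assumptions made for illustration. The only library call used is `prometheus_client.multiprocess.mark_process_dead`, which is imported lazily so the sketch can be exercised without the package installed.

```python
import os
from typing import Callable, Optional


def cleanup_metrics_on_exit(
    pid: int,
    api_server_count: int,
    mark_dead: Optional[Callable[[int], None]] = None,
) -> bool:
    """Mark a process dead for Prometheus only in multiprocess mode.

    Hypothetical helper: returns True if the process was marked dead,
    False if the call was skipped.
    """
    # Assumption: multiprocess mode means more than one API server AND
    # the PROMETHEUS_MULTIPROC_DIR environment variable being set.
    multiprocess_mode = (
        api_server_count > 1 and "PROMETHEUS_MULTIPROC_DIR" in os.environ
    )
    if not multiprocess_mode:
        # Single API server: there are no per-process metric files to clean up.
        return False

    if mark_dead is None:
        # Imported lazily so the single-server path has no dependency on it.
        from prometheus_client import multiprocess

        mark_dead = multiprocess.mark_process_dead

    mark_dead(pid)
    return True
```

With a single API server the helper returns `False` without touching Prometheus state; the `mark_dead` parameter exists only so the behaviour can be checked with a stub.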

amitm02 pushed a commit to amitm02/vllm that referenced this pull request

Jun 1, 2025

@njhill @amitm02

amitm02 pushed a commit to amitm02/vllm that referenced this pull request

Jun 1, 2025

@njhill @amitm02

0826joyce pushed a commit to 0826joyce/vllm-serving-optimization that referenced this pull request

May 19, 2026

@njhill
