
Conversation

@charlifu
Contributor

@charlifu charlifu commented Dec 2, 2025

Use ROCM_AITER_FA backend for flash attn on rocm.

Signed-off-by: charlifu <charlifu@amd.com>
@mergify mergify bot added the rocm Related to AMD ROCm label Dec 2, 2025
Contributor

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request updates the attention backend for ROCm in the test utilities, changing it from FLASH_ATTN to ROCM_AITER_FA. This aligns with using the Aiter Flash Attention backend on ROCm platforms. My review includes a suggestion to update a related log message for consistency and to prevent confusion during debugging.
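To make the described change concrete, here is a hedged sketch (the function and parameter names are illustrative, not the actual vLLM test-utility code) of the backend selection this PR changes:

```python
# Hypothetical sketch of the test-utility change described above:
# choosing the attention backend name per platform.
def default_test_attn_backend(is_rocm: bool) -> str:
    # Before this PR the ROCm path also used "FLASH_ATTN";
    # the PR switches ROCm to the AITER flash-attention backend.
    return "ROCM_AITER_FA" if is_rocm else "FLASH_ATTN"
```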

charlifu and others added 2 commits December 2, 2025 13:51
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Signed-off-by: Charlie Fu <Charlie.Fu@amd.com>
Signed-off-by: charlifu <charlifu@amd.com>
@divakar-amd
Contributor

Hi @charlifu, can you also update it here:

with monkeypatch.context() as m:
    if "Llama-4-Scout" in model_setup[1] and attn_backend == "FLASH_ATTN":
        # Scout requires default backend selection
        # because vision encoder has head_dim 88 being incompatible
        # with FLASH_ATTN and needs to fall back to Flex Attn
        pass
    else:
        m.setenv("VLLM_MLA_DISABLE", "1")
        m.setenv("VLLM_ATTENTION_BACKEND", attn_backend)
    if attn_backend == "TRITON_ATTN" and not current_platform.is_rocm():
        pytest.skip(
            "TRITON_ATTN does not support "
            "multi-token eagle spec decode on current platform"
        )
    if attn_backend == "FLASH_ATTN" and current_platform.is_rocm():
        if "deepseek" in model_setup[1].lower():
            pytest.skip("FLASH_ATTN for deepseek not supported on ROCm platform")
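As a hedged illustration of what the snippet above does (the helper names below are hypothetical, and a plain dict stands in for pytest's monkeypatch), the env-var setup can be factored like this:

```python
# Hedged sketch, not the actual vLLM test code: mirror how the
# monkeypatch block maps a requested backend to environment variables.
def resolve_attn_backend(requested: str, is_rocm: bool) -> str:
    """Map the requested backend to what the platform actually runs."""
    if is_rocm and requested == "FLASH_ATTN":
        return "ROCM_AITER_FA"  # the backend this PR switches ROCm to
    return requested

def apply_backend_env(requested: str, is_rocm: bool, env: dict) -> None:
    """Populate env vars the way the monkeypatch block does."""
    env["VLLM_MLA_DISABLE"] = "1"
    env["VLLM_ATTENTION_BACKEND"] = resolve_attn_backend(requested, is_rocm)
```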

@DarkLight1337 DarkLight1337 requested a review from tjtanaa December 3, 2025 02:46
@mergify

mergify bot commented Dec 3, 2025

This pull request has merge conflicts that must be resolved before it can be
merged. Please rebase the PR, @charlifu.

https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

@mergify mergify bot added the needs-rebase label Dec 3, 2025
@charlifu charlifu force-pushed the amd_ci/fix_test_max_len branch from 87c1f4d to d38904b Compare December 3, 2025 17:24
Signed-off-by: charlifu <charlifu@amd.com>
@mergify mergify bot removed the needs-rebase label Dec 3, 2025
Signed-off-by: charlifu <charlifu@amd.com>
@tjtanaa
Collaborator

tjtanaa commented Dec 4, 2025

@charlifu after you're done fixing it, can you share the test status of the files you have fixed?

@tjtanaa
Collaborator

tjtanaa commented Dec 4, 2025

On MI300X, I am getting OOM issues on two of the tests. May I know which machine you are validating on? MI325?

Collaborator

@robertgshaw2-redhat robertgshaw2-redhat left a comment


removing approval since @tjtanaa is on point

Signed-off-by: charlifu <charlifu@amd.com>
@charlifu
Contributor Author

charlifu commented Dec 4, 2025

@charlifu after you're done fixing it, can you share the test status of the files you have fixed?

pytest -sv tests/v1/spec_decode/test_max_len.py

9 passed, 4 warnings in 202.83s (0:03:22)

pytest -sv tests/v1/spec_decode/test_eagle.py

51 passed, 4 warnings in 37.66s

@tjtanaa
Collaborator

tjtanaa commented Dec 5, 2025

@charlifu how about the status of tests/v1/e2e/test_spec_decode.py ?

@charlifu
Contributor Author

charlifu commented Dec 5, 2025

@charlifu how about the status of tests/v1/e2e/test_spec_decode.py ?

Still failing, but that's not what this PR is aiming for. Can we get this merged first?

@micah-wil
Contributor

@charlifu how about the status of tests/v1/e2e/test_spec_decode.py ?

It fails for an unrelated reason that is addressed in #29926.

Signed-off-by: charlifu <charlifu@amd.com>
@charlifu
Contributor Author

charlifu commented Dec 6, 2025

Another fix here:

pytest -sv tests/basic_correctness/test_basic_correctness.py

=============== 34 passed, 6 warnings in 480.39s (0:08:00) ==========

Copy link
Collaborator

@tjtanaa tjtanaa left a comment


LGTM. Let's not add any more tests here. Open another PR if there are new tests.

@tjtanaa tjtanaa added the ready ONLY add when PR is ready to merge/full CI is needed label Dec 6, 2025
@tjtanaa tjtanaa enabled auto-merge (squash) December 6, 2025 14:13
@tjtanaa
Collaborator

tjtanaa commented Dec 7, 2025

@micah-wil I saw you merged the bugfix PR into this PR by syncing from main. Are the test cases in tests/v1/e2e/test_spec_decode.py all green? I think this test is yet to be enabled in AMD CI. Can you share the status of tests/v1/e2e/test_spec_decode.py?

@tjtanaa tjtanaa removed the ready ONLY add when PR is ready to merge/full CI is needed label Dec 7, 2025