Conversation

Contributor

@Mercykid-bash Mercykid-bash commented Dec 5, 2025

Signed-off-by: Mercykid-bash <ruanche0218@gmail.com>
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request aims to fix an AttributeError raised when calling the warm-up function for PolicyFlashlb. The current fix resolves the crash by changing the call from a method on the instance to a module-level function. However, this approach is problematic: it warms up a new, separate instance with a hardcoded configuration, not the policy instance that is actually used. The instance in use may therefore still see suboptimal performance or JIT recompilation at runtime. My review suggests a better fix: refactor the warm_up function into a method of the FlashLB class. This aligns with the original code's intent and ensures the correct instance is warmed up with its own configuration.

Comment on lines +32 to +33
warm_up()
return policy_instance

critical

While this change fixes the AttributeError, it introduces a significant logical issue. The warm_up() function being called is a module-level function that creates its own separate FlashLB instance with a hardcoded configuration. Consequently, the policy_instance that was just created is not the one being warmed up.

This approach is misleading and potentially incorrect. The warm-up uses a hardcoded configuration that might differ from the runtime configuration of policy_instance, which could lead to suboptimal performance or JIT recompilation.
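
The concern can be illustrated with a minimal sketch. The FlashLB class below is a hypothetical stand-in (the real class in policy_flashlb.py is not shown in this thread), and the config keys and values are assumed purely for illustration:

```python
# Hypothetical stand-in for FlashLB; the real class and its config
# are not shown in this thread.
class FlashLB:
    def __init__(self, config):
        self.config = config
        self.warmed_up = False


def warm_up():
    # Module-level warm-up: builds its OWN instance from a hardcoded config
    # (the value 8 is an assumed placeholder).
    throwaway = FlashLB(config={"num_experts": 8})
    throwaway.warmed_up = True
    return throwaway


# The instance the caller actually uses, with a different runtime config.
policy_instance = FlashLB(config={"num_experts": 64})
warmed = warm_up()

print(warmed is policy_instance)   # False
print(policy_instance.warmed_up)   # False
```

In this sketch the call succeeds without an AttributeError, yet policy_instance is left untouched, which is the situation described above.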

The original code (policy_instance.warm_up()) suggests the correct intent. The fix should be in vllm_ascend/eplb/core/policy/policy_flashlb.py by making warm_up a method of the FlashLB class. This would ensure the correct instance is warmed up with its own configuration. I recommend reverting the changes in this file and applying the fix in policy_flashlb.py instead.

Suggested change
- warm_up()
- return policy_instance
+ policy_instance.warm_up()
+ return policy_instance
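
A rough sketch of what the recommended refactor might look like, assuming a simplified FlashLB. The _rebalance helper, the create_policy function, and the config layout are hypothetical names introduced here for illustration and are not taken from policy_flashlb.py:

```python
# Hypothetical, simplified FlashLB; only the shape of the fix is shown.
class FlashLB:
    def __init__(self, config):
        self.config = config
        self.warmed_up = False

    def _rebalance(self, num_experts):
        # Placeholder for the real balancing kernels that would be
        # JIT-compiled on first use.
        return list(range(num_experts))

    def warm_up(self):
        # Warm up THIS instance using its own runtime configuration.
        self._rebalance(self.config["num_experts"])
        self.warmed_up = True


def create_policy(config):
    # Hypothetical factory: the instance that is returned is the one warmed up.
    policy_instance = FlashLB(config)
    policy_instance.warm_up()
    return policy_instance
```

With warm_up defined on the class, the original call site (policy_instance.warm_up()) works unchanged and no extra import is needed.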

  from .policy_dynamic_ep import DynamicEplb
  from .policy_dynamic_ep_v2 import DynamicEplbV2
- from .policy_flashlb import FlashLB
+ from .policy_flashlb import FlashLB, warm_up

high

This import of warm_up supports a fix that is logically flawed. The recommended approach is to make warm_up a method of the FlashLB class, which would make this import unnecessary and lead to a more correct and maintainable solution. This would involve reverting this change and modifying policy_flashlb.py.

Suggested change
- from .policy_flashlb import FlashLB, warm_up
+ from .policy_flashlb import FlashLB

github-actions bot commented Dec 5, 2025

👋 Hi! Thank you for contributing to the vLLM Ascend project. The following points will speed up your PR merge:

  • A PR should do only one thing; smaller PRs enable faster reviews.
  • Every PR should include unit tests and end-to-end tests to ensure it works and is not broken by future PRs.
  • Write the commit message by filling in the PR description, to help reviewers and future developers understand the change.

If CI fails, you can run the linting and testing checks locally according to Contributing and Testing.
