BugFix: Resolve PolicyFlashlb warm up function attribute error #4741
base: main
Signed-off-by: Mercykid-bash <ruanche0218@gmail.com>
Code Review
This pull request aims to fix an AttributeError raised when calling the warm-up function for PolicyFlashlb. The current fix resolves the crash by changing the call from a method on an instance to a module-level function. However, this approach is problematic: it warms up a new, separate instance with a hardcoded configuration, not the actual policy instance being used. This could lead to suboptimal performance or JIT recompilation at runtime. My review suggests a better fix: refactor the warm_up function into a method of the FlashLB class. This aligns with the original code's intent and ensures the correct instance is warmed up with its proper configuration.
+ warm_up()
  return policy_instance
While this change fixes the AttributeError, it introduces a significant logical issue. The warm_up() function being called is a module-level function that creates its own separate FlashLB instance with a hardcoded configuration. Consequently, the policy_instance that was just created is not the one being warmed up.
This approach is misleading and potentially incorrect. The warm-up uses a hardcoded configuration that might differ from the runtime configuration of policy_instance, which could lead to suboptimal performance or JIT recompilation.
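The problem can be illustrated with a minimal sketch. The class body, the `config` argument, the `num_experts` key, and the `_compile_kernels` helper are hypothetical stand-ins, not the real contents of policy_flashlb.py; only the shape of the module-level `warm_up()` is taken from the description above.

```python
class FlashLB:
    """Hypothetical stand-in for the real policy class."""

    def __init__(self, config):
        self.config = config
        self.warmed_up = False

    def _compile_kernels(self):
        # Placeholder for whatever expensive first-call work warm-up triggers.
        self.warmed_up = True


def warm_up():
    """Module-level warm-up: builds and warms its own throwaway instance."""
    scratch = FlashLB(config={"num_experts": 8})  # hardcoded (assumed) config
    scratch._compile_kernels()
    return scratch


# The instance that will actually be used at runtime.
policy_instance = FlashLB(config={"num_experts": 64})
scratch = warm_up()

print(policy_instance.warmed_up)  # False: the real instance was never warmed up
```

In this sketch the call succeeds without an AttributeError, yet `policy_instance` is left untouched and the warm-up ran against a different configuration.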
The original code (policy_instance.warm_up()) suggests the correct intent. The fix should be in vllm_ascend/eplb/core/policy/policy_flashlb.py by making warm_up a method of the FlashLB class. This would ensure the correct instance is warmed up with its own configuration. I recommend reverting the changes in this file and applying the fix in policy_flashlb.py instead.
Suggested change:
- warm_up()
- return policy_instance
+ policy_instance.warm_up()
+ return policy_instance
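A rough sketch of what the recommended refactor might look like, with `warm_up` as a method so that the call site above works as written. The constructor signature, the `config` dictionary, the `warmed_up_with` attribute, and the `create_policy` factory are assumptions made for illustration; the real FlashLB class and its factory may differ.

```python
class FlashLB:
    """Hypothetical stand-in for the real policy class."""

    def __init__(self, config):
        self.config = config
        self.warmed_up_with = None

    def warm_up(self):
        # Warm up *this* instance using its own configuration,
        # instead of a hardcoded one on a separate object.
        self.warmed_up_with = dict(self.config)


def create_policy(config):
    """Hypothetical factory mirroring the call site in the suggestion."""
    policy_instance = FlashLB(config)
    policy_instance.warm_up()
    return policy_instance


policy = create_policy({"num_experts": 64})
```

With this shape, the instance returned by the factory is the same one that was warmed up, and no separate `warm_up` import is needed at the call site.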
  from .policy_dynamic_ep import DynamicEplb
  from .policy_dynamic_ep_v2 import DynamicEplbV2
- from .policy_flashlb import FlashLB
+ from .policy_flashlb import FlashLB, warm_up
This import of warm_up supports a fix that is logically flawed. The recommended approach is to make warm_up a method of the FlashLB class, which would make this import unnecessary and lead to a more correct and maintainable solution. This would involve reverting this change and modifying policy_flashlb.py.
Suggested change:
- from .policy_flashlb import FlashLB, warm_up
+ from .policy_flashlb import FlashLB
👋 Hi! Thank you for contributing to the vLLM Ascend project. The following points will speed up your PR merge:
If CI fails, you can run the linting and testing checks locally according to the Contributing and Testing guides.