[Bug]: AgentWorkflow problem with VLLM initialization #18519
Comments
Hello @mahmoudmohey97! I'm Dosu, here to help you solve bugs, answer questions, and become a contributor while you wait for a human maintainer. Let's tackle this issue together!

To resolve the "Not Implemented" error when using VLLM with the AgentWorkflow, use the methods the Vllm class does implement, such as complete. Here's an example of how you can use the Vllm class:

from llama_index.llms.vllm import Vllm

# prompt-formatting functions for Mistral Instruct
def messages_to_prompt(messages):
    prompt = "\n".join([str(x) for x in messages])
    return f"<s>[INST] {prompt} [/INST] </s>\n"

def completion_to_prompt(completion):
    return f"<s>[INST] {completion} [/INST] </s>\n"

llm = Vllm(
    model="mistralai/Mistral-7B-Instruct-v0.1",
    tensor_parallel_size=4,
    max_new_tokens=256,
    vllm_kwargs={"swap_space": 1, "gpu_memory_utilization": 0.5},
    messages_to_prompt=messages_to_prompt,
    completion_to_prompt=completion_to_prompt,
)

response = llm.complete("What is a black hole?")
print(response.text)

Make sure to adjust your workflow to use these methods instead of the ones that are not implemented [1].
@mahmoudmohey97:
@logan-markewich you mean like the following:
Then use the llm variable in AgentWorkflow?
@mahmoudmohey97 yea, except don't use the VllmServer class at all. I meant just launch the server from the CLI and then use OpenAILike to connect 👍🏻
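For reference, a minimal sketch of that approach is below. It assumes vLLM's OpenAI-compatible server (started with the vllm serve CLI) and llama-index's OpenAILike integration (the llama-index-llms-openai-like package); the model name, port, and the multiply tool are placeholders for illustration, not something posted in this thread.

# Assumed setup: start vLLM's OpenAI-compatible server from the CLI first, e.g.
#   vllm serve Qwen/Qwen2.5-7B-Instruct --port 8000
# (for tool calling, vLLM may also need --enable-auto-tool-choice and an
#  appropriate --tool-call-parser; check the vLLM docs for your model)

from llama_index.core.agent.workflow import AgentWorkflow
from llama_index.llms.openai_like import OpenAILike

llm = OpenAILike(
    model="Qwen/Qwen2.5-7B-Instruct",      # must match the model served by vLLM
    api_base="http://localhost:8000/v1",   # vLLM's OpenAI-compatible endpoint
    api_key="fake",                        # vLLM does not require a real key by default
    is_chat_model=True,                    # use the /chat/completions endpoint
    is_function_calling_model=True,        # enables tool calling in AgentWorkflow
)

# A trivial placeholder tool
def multiply(a: float, b: float) -> float:
    """Multiply two numbers and return the result."""
    return a * b

agent = AgentWorkflow.from_tools_or_functions(
    [multiply],
    llm=llm,
    system_prompt="You are a helpful assistant.",
)

# agent.run(...) is awaitable, so call it from an async context, e.g.:
#   response = await agent.run("What is 7 * 6?")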
@logan-markewich Thank you Logan, it's working now. But I have a question: do you know which tool call parser to select when using Qwen2.5?
Bug Description
I was trying to create an agent with AgentWorkflow.from_tools_or_functions using an LLM initialized from VLLM.
I used the code from the llama-index docs to initialize the model with VLLM.
When I use HuggingFaceLLM, this problem doesn't happen.
VLLM version: 0.8.4
Version
0.12.31
Steps to Reproduce
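No repro script was included in the issue; the following is a minimal sketch of the setup described in the bug description (Vllm initialized as in the docs, then passed to AgentWorkflow.from_tools_or_functions). The tool, the prompt, and the exact point where the error surfaces are assumptions for illustration.

import asyncio

from llama_index.core.agent.workflow import AgentWorkflow
from llama_index.llms.vllm import Vllm

# Vllm initialized as in the docs example above (arguments are illustrative)
llm = Vllm(
    model="mistralai/Mistral-7B-Instruct-v0.1",
    max_new_tokens=256,
    vllm_kwargs={"swap_space": 1, "gpu_memory_utilization": 0.5},
)

def multiply(a: float, b: float) -> float:
    """Multiply two numbers and return the result."""
    return a * b

agent = AgentWorkflow.from_tools_or_functions(
    [multiply],
    llm=llm,
    system_prompt="You are a helpful assistant.",
)

# The reported "Not Implemented" error is seen when using this agent
# (the exact placement here is approximate)
async def main():
    response = await agent.run("What is 7 * 6?")
    print(response)

asyncio.run(main())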
Relevant Logs/Tracebacks