Prompt Format Updates for LLama3 #1035
Comments
This seems to be working-ish, based on what I've seen passed around elsewhere (e.g. ollama's prompt template or the sample provided on the llama.cpp pull request):

{
"name": "Llama 3",
"preprompt": "This is a conversation between User and Llama, a friendly chatbot. Llama is helpful, kind, honest, good at writing, and never fails to answer any requests immediately and with precision.",
"chatPromptTemplate": "<|begin_of_text|>{{#if @root.preprompt}}<|start_header_id|>system<|end_header_id|>\n\n{{@root.preprompt}}<|eot_id|>{{/if}}{{#each messages}}{{#ifUser}}<|start_header_id|>user<|end_header_id|>\n\n{{content}}<|eot_id|>{{/ifUser}}{{#ifAssistant}}<|start_header_id|>assistant<|end_header_id|>\n\n{{content}}<|eot_id|>{{/ifAssistant}}{{/each}}",
"parameters": {
(...snip...)
"stop": ["<|end_of_text|>", "<|eot_id|>"] // Verify that this is correct.
},
(...snip...)
}

There are still some issues with the response not ending (ollama/ollama#3759) and the stop button not working (#890) that I'm still running into. That's probably related to the specific thing I've set as "stop" in the definition above, as well as to the tokenizer config when the model is converted to GGUF (if you do that). Apparently you can edit the tokenizer config JSON to fix some of these issues. See the ongoing discussions about Llama 3's stop tokens: ggml-org/llama.cpp#6770, ggml-org/llama.cpp#6745 (comment), ggml-org/llama.cpp#6751 (comment).
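For reference, with that preprompt and a single example user message ("Hello!"), the chatPromptTemplate above should expand to roughly the following string (the \n\n sequences are shown here as real blank lines, and the preprompt is truncated; the assistant turn is what the model generates next):

<|begin_of_text|><|start_header_id|>system<|end_header_id|>

This is a conversation between User and Llama, a friendly chatbot. [...]<|eot_id|><|start_header_id|>user<|end_header_id|>

Hello!<|eot_id|>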
Thank you @mlim15, that worked just fine. I spun up the 70B Instruct model and it appears to stop when intended. I do see some special tokens (start and stop header) streamed at the start, but those are tidied up at the end of streaming. That may be the chat-ui code rather than the model.
Hello!
Also, I am using Text-Generation-WebUI; do you use the same?
Was anyone able to make it work?
In prod for HuggingChat, this is what we use:
chat-ui supports using the template that is stored in the tokenizer config, so that should work. Let me know if it doesn't; maybe there's some endpoint-specific thing going on.
"name": "Llama-3", "stop": ["<|end_of_text|>", "<|eot_id|>"] Im using this config, if I want to use "tokenizer" : "philschmid/meta-llama-3-tokenizer", should I remove chatPromptTemplate and Preprompt ? |
You can keep preprompt, but you should get rid of the chatPromptTemplate, yes!
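Roughly, the resulting entry would look something like this (just a sketch; the preprompt text is only an example, and you'd keep whatever endpoint settings and other parameters you already have):

{
  "name": "Llama-3",
  "tokenizer": "philschmid/meta-llama-3-tokenizer",
  "preprompt": "You are a friendly assistant.",
  "parameters": {
    "stop": ["<|end_of_text|>", "<|eot_id|>"]
  }
}

With no chatPromptTemplate set, chat-ui falls back to the chat template stored in that tokenizer's config.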
I'll try that! Although the current config works flawlessly!
At the end of the day, use what works for you 🤗 We support both custom prompt templates with chatPromptTemplate and chat templates from the tokenizer config.
@nsarrazin thanks for the answer, I'll try it soon!
If you want a list of templates we've used in the past, you've got PROMPTS.md. If you want to see the current HuggingChat prod config, it's .env.template. And ideally, try to see if the model you want has a chat template in its tokenizer config.
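One quick way to check that, assuming you have transformers installed (using the tokenizer repo mentioned above as the example):

from transformers import AutoTokenizer

# Loading the tokenizer also pulls in tokenizer_config.json, which is
# where the chat template is stored.
tok = AutoTokenizer.from_pretrained("philschmid/meta-llama-3-tokenizer")

# If this prints None, the model ships no chat template and you'll need
# a chatPromptTemplate in chat-ui instead.
print(tok.chat_template)

# Render a one-turn conversation with that template to see the exact
# prompt format the model expects.
print(tok.apply_chat_template(
    [{"role": "user", "content": "Hello!"}],
    tokenize=False,
    add_generation_prompt=True,
))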
I've tried the following in the .env.local file but get parsing errors (Error: Parse error on line 7: ...ssistant}}<|eot_id|>):
"chatPromptTemplate" : "<|begin_of_text|><|start_header_id|>system<|end_header_id|>\r\n\r\n{{ You are a friendly assistant }}<|eot_id|><|start_header_id|>user<|end_header_id|>\r\n\r\n{{#each messages}}{{#ifUser}}{{content}}{{/ifUser}}<|eot_id|><|start_header_id|>assistant<|end_header_id|>\r\n\r\n{{#each messages}}{{#ifAssistant}}{{ content }}{{/ifAssistant}}<|eot_id|>",
Any thoughts?