tool-call: support Command R7B (+ return tool_plan "thoughts" in API) #11585

Merged: 7 commits, Feb 2, 2025

@ochafik (Collaborator) commented Feb 2, 2025

  • Command R7B native tool call format:
    • Extracts normal responses vs. tool calls with planning
    • Returns message.tool_plan in the API (if available)
      • Note: may need to revisit this when supporting thinking tokens of R1 and such: will we want a single thinking field? What does DeepSeek's API do?
    • Ensured neither CohereForAI/c4ai-command-r-v01 nor CohereForAI/c4ai-command-r-plus triggers detection (different format)
  • Cleaned up false triggers --> introduced preserved_tokens
  • Cleaned up tests: only test grammars from first trigger
  • Updated README w/ models that work well with this agent tutorial
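
To illustrate the "normal responses vs. tool calls w/ planning" split, here is a minimal C++17 sketch, not the PR's actual parser. The marker strings (<|START_THINKING|>, <|START_ACTION|>, <|START_RESPONSE|> and their END counterparts) are assumed from the Command R7B tool_use template, and ParsedMessage, between, and parse_command_r7b are hypothetical names introduced only for this example. The tool calls are kept as a raw JSON string rather than parsed.

```cpp
#include <optional>
#include <string>

// Hypothetical result type for this sketch (not llama.cpp's actual struct).
struct ParsedMessage {
    std::string content;                    // plain response text, if any
    std::optional<std::string> tool_plan;   // the model's planning "thoughts"
    std::optional<std::string> tool_calls;  // raw JSON array of tool calls
};

// Return the text between `open` and `close`, or nullopt if either is missing.
static std::optional<std::string> between(const std::string & s,
                                          const std::string & open,
                                          const std::string & close) {
    const auto start = s.find(open);
    if (start == std::string::npos) return std::nullopt;
    const auto begin = start + open.size();
    const auto end = s.find(close, begin);
    if (end == std::string::npos) return std::nullopt;
    return s.substr(begin, end - begin);
}

// Marker strings are an assumption based on the Command R7B tool_use template.
ParsedMessage parse_command_r7b(const std::string & output) {
    ParsedMessage msg;
    msg.tool_plan  = between(output, "<|START_THINKING|>", "<|END_THINKING|>");
    msg.tool_calls = between(output, "<|START_ACTION|>", "<|END_ACTION|>");
    if (!msg.tool_calls) {
        // No action block: treat the output as a normal response and fall
        // back to the whole text if the response markers are absent.
        msg.content = between(output, "<|START_RESPONSE|>", "<|END_RESPONSE|>")
                          .value_or(output);
    }
    return msg;
}
```

With this sketch, an output containing a thinking block followed by an action block yields a populated tool_plan and tool_calls with empty content, while a plain <|START_RESPONSE|> output yields only content. That mirrors the idea of surfacing the plan as a separate message.tool_plan field when it is available.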

Note

Needs a template override:

llama-server --jinja -fa -hf bartowski/c4ai-command-r7b-12-2024-GGUF:Q6_K_L \
  --chat-template-file <( python scripts/get_chat_template.py CohereForAI/c4ai-command-r7b-12-2024 tool_use )

cf. #9639

@github-actions bot added the testing, examples, and server labels Feb 2, 2025
Comment on lines 383 to 386
if (ids.size() == 1) {
    LOG_DBG("Preserved token: %d\n", ids[0]);
    params.sampling.preserved_tokens.insert(ids[0]);
}
Member

Here we don't handle ids.size() > 1

@ochafik (Collaborator, Author) replied Feb 2, 2025

Ah, good point. I just added a comment and a debug warning log for now. This should only happen when using a native tool call format with an incompatible model (e.g. the wrong template override).
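
The exchange above can be sketched as follows: a word is preserved only when it tokenizes to exactly one token, and anything else is logged and skipped. This is a simplified illustration rather than the merged code. Token, Tokenizer, and collect_preserved_tokens are hypothetical names, the tokenizer is passed in as a callback so the example stays self-contained, and plain fprintf stands in for the project's logging macros.

```cpp
#include <cstddef>
#include <cstdio>
#include <functional>
#include <set>
#include <string>
#include <vector>

using Token = int;
// Hypothetical tokenizer signature for this sketch: text in, token ids out.
using Tokenizer = std::function<std::vector<Token>(const std::string &)>;

// Insert every word that maps to exactly one token into `preserved`.
// Returns the number of words that were skipped.
std::size_t collect_preserved_tokens(const std::vector<std::string> & words,
                                     const Tokenizer & tokenize,
                                     std::set<Token> & preserved) {
    std::size_t skipped = 0;
    for (const auto & word : words) {
        const std::vector<Token> ids = tokenize(word);
        if (ids.size() == 1) {
            preserved.insert(ids[0]);
        } else {
            // Not a single token: this likely means the chat template does
            // not match the model (e.g. a wrong template override).
            std::fprintf(stderr,
                         "Not preserved (%zu tokens, expected 1): %s\n",
                         ids.size(), word.c_str());
            ++skipped;
        }
    }
    return skipped;
}
```

Returning the skipped count is a choice made for this sketch so a caller can surface one summary warning; the PR itself only describes adding a comment and a debug warning log.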

@ochafik ochafik merged commit bfcce4d into ggml-org:master Feb 2, 2025
45 checks passed
@ochafik ochafik deleted the command-r7b branch February 2, 2025 09:25
tinglou pushed a commit to tinglou/llama.cpp that referenced this pull request Feb 13, 2025
…I) (ggml-org#11585)

* `tool-call`: support Command R7B (w/ tool_plan return)

* `tool-call`: cleaner preservation of tokens + warn when likely bad chat template override

* `tool-call`: test cleanup / handle lazy grammar triggers
orca-zhang pushed a commit to orca-zhang/llama.cpp that referenced this pull request Feb 26, 2025
arthw pushed a commit to arthw/llama.cpp that referenced this pull request Feb 26, 2025