
Support logit_bias outside of server #827

@dschil138

Description


In llama.cpp you can use logit bias to affect how likely specific tokens are, like this:

./main -m models/llama-2-7b.Q4_K_M.gguf -n 100 -p 'this is a prompt' --top-p 0.5 --top-k 3 --logit-bias 15043+1

Which adds 1 to the logit of token 15043, making that token more likely to be sampled.
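To spell out what the flag does: as I understand it, the bias value is simply added to that token's raw logit before sampling. A minimal conceptual sketch in plain NumPy (my own illustration, not llama.cpp internals):

import numpy as np

def apply_logit_bias(logits: np.ndarray, bias: dict) -> np.ndarray:
    # Add each bias value to the corresponding token's logit,
    # e.g. --logit-bias 15043+1 means logits[15043] += 1.0
    biased = logits.copy()
    for token_id, value in bias.items():
        biased[token_id] += value
    return biased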

Logit bias seems to have been mentioned a couple of times in earlier issues, and it even looks like something was implemented, but I haven't been able to find any reference to "logit_bias" in the docs, and I wasn't able to make much sense of bias.py.

When I try to run this

llm = Llama(model_path="./models/llama-2-7b.Q2_K.gguf", logits_all=True)

my_biases = {"29991": -100, "1556": -100}  # assume it's a dict of token id -> bias

output = llm(my_prompt, max_tokens=100, temperature=0.8, logit_bias=my_biases)

It comes back with this error:

TypeError: Llama.__call__() got an unexpected keyword argument 'logit_bias'

Is this an available feature, and I'm just missing how to do it somehow?
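In case it helps to show what I'm after, here is the kind of workaround I'm imagining with the logits_processor hook. This is only a sketch and assumes llama-cpp-python's LogitsProcessorList accepts plain callables taking (input_ids, scores) and returning the modified scores; I may have the names or signature wrong for the current version:

import numpy as np
from llama_cpp import Llama, LogitsProcessorList

llm = Llama(model_path="./models/llama-2-7b.Q2_K.gguf")

my_biases = {29991: -100.0, 1556: -100.0}  # token id -> bias value

def bias_logits(input_ids: np.ndarray, scores: np.ndarray) -> np.ndarray:
    # Add each bias directly to the raw logits for the next token
    for token_id, value in my_biases.items():
        scores[token_id] += value
    return scores

output = llm(
    "this is a prompt",
    max_tokens=100,
    temperature=0.8,
    logits_processor=LogitsProcessorList([bias_logits]),
)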


Labels: enhancement (New feature or request)
