Closed
Description
Mixtral is being added to llama.cpp now: ggml-org/llama.cpp#4406
Using weights downloaded to models/mixtral-8x7b.
These steps work in llama.cpp itself (Mac M2, 32GB):
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
git checkout mixtral
make -j && ./main -m models/mixtral-8x7b/mixtral-8x7b-v0.1.Q4_K_M.gguf -p "Building a website can be done in 10 simple steps:\nStep 1:" -n 400 -e
Trying the same with llama-cpp-python 0.2.22 (through LangChain's LlamaCpp wrapper):
from langchain.callbacks.manager import CallbackManager
from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler
from langchain.llms import LlamaCpp

llm = LlamaCpp(
    model_path="/Users/rlm/Desktop/Code/llama.cpp/models/mixtral-8x7b/mixtral-8x7b-instruct-v0.1.Q2_K.gguf",
    n_gpu_layers=1,
    n_batch=512,
    n_ctx=2048,
    f16_kv=True,
    callback_manager=CallbackManager([StreamingStdOutCallbackHandler()]),
    verbose=True,
)
Error:
error loading model: create_tensor: tensor 'blk.0.ffn_gate.weight' not found
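Since Mixtral support currently lives on the mixtral branch of llama.cpp, a likely reading (an assumption, not a confirmed diagnosis) is that the llama.cpp bundled with llama-cpp-python 0.2.22 predates that branch and so does not recognize the Mixtral tensor layout. A minimal sketch for checking which version is installed before retrying; the helper names (parse_version, installed_version, is_newer_than_failing) are hypothetical, and being newer than 0.2.22 does not by itself guarantee Mixtral support:

```python
from importlib import metadata

# Version from the report above, where loading the model failed.
FAILING_VERSION = (0, 2, 22)


def parse_version(text):
    """Turn '0.2.22' into (0, 2, 22); suffixes such as 'rc1' or '.post1' are ignored."""
    parts = []
    for piece in text.split(".")[:3]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        parts.append(int(digits or 0))
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def installed_version(package="llama-cpp-python"):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def is_newer_than_failing(version_text):
    """True if version_text is strictly newer than the version that failed here."""
    return parse_version(version_text) > FAILING_VERSION


if __name__ == "__main__":
    found = installed_version()
    if found is None:
        print("llama-cpp-python is not installed")
    elif is_newer_than_failing(found):
        print(f"llama-cpp-python {found} is newer than 0.2.22; worth retrying")
    else:
        print(f"llama-cpp-python {found} is 0.2.22 or older; same error is likely")
```

If the installed version is 0.2.22 or older, the same create_tensor error is to be expected until a release that tracks a llama.cpp with Mixtral support is installed.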