Description
TL;DR:
It's likely a one-line fix; see the Fix section at the bottom.
Expected Behavior
`llama_cpp.Llama.__init__` should behave the same when called in a program regardless of whether the program is run via a debugger or not (in this case `debugpy`).
Current Behavior
When running via a debugger (`debugpy`), `llama_cpp.Llama.__init__` raises the following `ValueError` from line 225 of `llama_cpp\llama.py`:

```
ValueError: cannot resize an array that references or is referenced
by another array in this way.
Use the np.resize function or refcheck=False
```
Environment and Context
- CPU: AMD Ryzen 9 5950X
- RAM: 64.0 GB
- OS: Windows 10 Pro (x64)
  - Version: 22H2
  - OS build: 19045.2965
Steps to Reproduce the bug
1. `pip install llama-cpp-python==0.1.57 --force-reinstall --no-cache-dir` installs the version of llama-cpp-python that I know has this problem (it is also the exact command I used to install llama-cpp-python).
2. `pip install debugpy==1.6.4` installs the debugger that VSCode gave me. If you already have VSCode or this debugger, skip this step.
3. `python main.py` should work.
4. `python -m debugpy --listen localhost:5678 main.py` will crash. (Note that `--listen localhost:5678` is not relevant; it is only present to get `debugpy` to run standalone without VSCode.)
main.py
Here is `main.py`:

```python
from llama_cpp import Llama  # type: ignore


def main() -> None:
    print("Launching Application ...")
    llama_model = Llama(model_path="E:\LLaMA\Wizard-Vicuna-7B-Uncensored.ggmlv3.q5_1.bin")  # throws ValueError from numpy.ndarray.resize()
    print("Running ...")
    print(llama_model("Hello!\n"))


if __name__ == "__main__":
    main()
```
Failure Logs
Relevant part of the traceback:

```
  File "main.py", line 14, in <module>
    main()
  File "main.py", line 7, in main
    llama_model = Llama(model_path="E:\LLaMA\Wizard-Vicuna-7B-Uncensored.ggmlv3.q5_1.bin") # throws ValueError from numpy.ndarray.resize()
  File "C:\Users\TarpeyD12\AppData\Local\Programs\Python\Python310\lib\site-packages\llama_cpp\llama.py", line 225, in __init__
    self._candidates_data.resize(3, self._n_vocab)
ValueError: cannot resize an array that references or is referenced
by another array in this way.
Use the np.resize function or refcheck=False
```
Full Crash Output
```
Launching Application ...
llama.cpp: loading model from E:\LLaMA\Wizard-Vicuna-7B-Uncensored.ggmlv3.q5_1.bin
llama_model_load_internal: format     = ggjt v3 (latest)
llama_model_load_internal: n_vocab    = 32000
llama_model_load_internal: n_ctx      = 512
llama_model_load_internal: n_embd     = 4096
llama_model_load_internal: n_mult     = 256
llama_model_load_internal: n_head     = 32
llama_model_load_internal: n_layer    = 32
llama_model_load_internal: n_rot      = 128
llama_model_load_internal: ftype      = 9 (mostly Q5_1)
llama_model_load_internal: n_ff       = 11008
llama_model_load_internal: n_parts    = 1
llama_model_load_internal: model size = 7B
llama_model_load_internal: ggml ctx size =    0.07 MB
llama_model_load_internal: mem required  = 6612.59 MB (+ 1026.00 MB per state)
.
llama_init_from_file: kv self size  =  256.00 MB
AVX = 1 | AVX2 = 1 | AVX512 = 0 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | FMA = 1 | NEON = 0 | ARM_FMA = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 0 | SSE3 = 1 | VSX = 0 |
Traceback (most recent call last):
  File "C:\Users\TarpeyD12\AppData\Local\Programs\Python\Python310\lib\runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "C:\Users\TarpeyD12\AppData\Local\Programs\Python\Python310\lib\runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "C:\Users\TarpeyD12\AppData\Local\Programs\Python\Python310\lib\site-packages\debugpy\__main__.py", line 39, in <module>
    cli.main()
  File "C:\Users\TarpeyD12\AppData\Local\Programs\Python\Python310\lib\site-packages\debugpy\server\cli.py", line 430, in main
    run()
  File "C:\Users\TarpeyD12\AppData\Local\Programs\Python\Python310\lib\site-packages\debugpy\server\cli.py", line 284, in run_file
    runpy.run_path(target, run_name="__main__")
  File "C:\Users\TarpeyD12\AppData\Local\Programs\Python\Python310\lib\site-packages\debugpy\_vendored\pydevd\_pydevd_bundle\pydevd_runpy.py", line 321, in run_path
    return _run_module_code(code, init_globals, run_name,
  File "C:\Users\TarpeyD12\AppData\Local\Programs\Python\Python310\lib\site-packages\debugpy\_vendored\pydevd\_pydevd_bundle\pydevd_runpy.py", line 135, in _run_module_code
    _run_code(code, mod_globals, init_globals,
  File "C:\Users\TarpeyD12\AppData\Local\Programs\Python\Python310\lib\site-packages\debugpy\_vendored\pydevd\_pydevd_bundle\pydevd_runpy.py", line 124, in _run_code
    exec(code, run_globals)
  File "main.py", line 14, in <module>
    main()
  File "main.py", line 7, in main
    llama_model = Llama(model_path="E:\LLaMA\Wizard-Vicuna-7B-Uncensored.ggmlv3.q5_1.bin") # throws ValueError from numpy.ndarray.resize()
  File "C:\Users\TarpeyD12\AppData\Local\Programs\Python\Python310\lib\site-packages\llama_cpp\llama.py", line 225, in __init__
    self._candidates_data.resize(3, self._n_vocab)
ValueError: cannot resize an array that references or is referenced
by another array in this way.
Use the np.resize function or refcheck=False
```
Causes
Looking at the error, it's clear that it is being raised by numpy, and looking at line 225 of `llama_cpp\llama.py` we see:

```python
self._candidates_data.resize(3, self._n_vocab)
```

Since `self._candidates_data` is of type `numpy.ndarray` (it is defined immediately before, on line 219), we should look at the documentation for `numpy.ndarray.resize()`.
It shows the scenario for raising a `ValueError`:

> Raises: ValueError
> If `a` does not own its own data or references or views to it exist, and the data memory must be changed. PyPy only: will always raise if the data memory must be changed, since there is no reliable way to determine if references or views to it exist.

Given that the `ValueError` is always raised under PyPy precisely because extra references cannot be ruled out, it seems likely that something similar is happening here: running under `debugpy` introduces additional references or views to the array, which makes the in-place resize fail.
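The reference check is easy to trigger outside of llama-cpp-python. A minimal standalone sketch (the view here stands in for whatever reference the debugger holds; it is not the actual mechanism inside `debugpy`):

```python
import numpy as np

# ndarray.resize() refuses to resize in place while any other object
# holds a reference or view to the array's data.
a = np.zeros((2, 4))
view = a[0]  # a view into `a`, standing in for a debugger-held reference

try:
    a.resize(3, 4)  # same in-place call pattern as llama.py line 225
except ValueError as e:
    print("raised:", e)

# refcheck=False skips the reference check, as the error message suggests.
a.resize(3, 4, refcheck=False)
print(a.shape)  # (3, 4)
```

Note that after the `refcheck=False` resize, any existing view (like `view` above) may point at reallocated memory, which is the kind of side effect worth weighing before shipping the fix.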
Fix
The most likely fix is to change line 225 of `llama_cpp\llama.py` from:

```python
self._candidates_data.resize(3, self._n_vocab)
```

to:

```python
self._candidates_data.resize(3, self._n_vocab, refcheck=False)
```
I have made this edit to my local installation of llama-cpp-python and it has solved the issue for me, but I am unsure whether it has other side effects that would affect other installations.
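The error message also points to `np.resize` as an alternative. A hypothetical sketch of that approach (`candidates_data` and `n_vocab` below are stand-ins for the attributes in `llama.py`, not the real code):

```python
import numpy as np

candidates_data = np.zeros((1, 8))  # stand-in for self._candidates_data
n_vocab = 8                         # stand-in for self._n_vocab

# np.resize allocates and returns a NEW array, so it never needs to
# check for outstanding references to the old one.
candidates_data = np.resize(candidates_data, (3, n_vocab))
print(candidates_data.shape)  # (3, 8)
```

This sidesteps the reference check entirely, but it is not a drop-in replacement: `np.resize` fills extra space by repeating the existing data rather than zero-filling, and any other code holding the old array would not see the new one.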
Environment info:
Versions:

```
Python           3.10.6
llama-cpp-python 0.1.57
debugpy          1.6.4
fastapi          0.95.0
numpy            1.24.3
sse-starlette    1.6.1
uvicorn          0.21.1
```
Sorry for the wall of text for such a small bug.