Commit 23a2219

Documenting server usage (abetlen#768)

1 parent c21edb6

File tree: 1 file changed (+3, -0 lines)

README.md

Lines changed: 3 additions & 0 deletions
````diff
@@ -164,6 +164,7 @@ To install the server package and get started:
 pip install llama-cpp-python[server]
 python3 -m llama_cpp.server --model models/7B/llama-model.gguf
 ```
+
 Similar to Hardware Acceleration section above, you can also install with GPU (cuBLAS) support like this:
 
 ```bash
@@ -173,6 +174,8 @@ python3 -m llama_cpp.server --model models/7B/llama-model.gguf --n_gpu_layers 35
 
 Navigate to [http://localhost:8000/docs](http://localhost:8000/docs) to see the OpenAPI documentation.
 
+To bind to `0.0.0.0` to enable remote connections, use `python3 -m llama_cpp.server --host 0.0.0.0`.
+Similarly, to change the port (default is 8000), use `--port`.
 
 ## Docker image
 
````
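The three added lines document the server's `--host` and `--port` flags. Combining them with the `--model` flag already shown, a launch that allows remote connections on a custom port might look like this (a usage sketch; the model path is the illustrative one used throughout the README):

```bash
# Bind to all interfaces and serve on port 8080 instead of the default 8000
python3 -m llama_cpp.server --model models/7B/llama-model.gguf --host 0.0.0.0 --port 8080
```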
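Once the server is up, the OpenAPI page at `/docs` lists the available routes. As a quick smoke test, the request below assumes an OpenAI-compatible `/v1/completions` endpoint, which this commit does not itself document:

```bash
# Request a short completion from the locally running server
# (endpoint path assumed; verify against http://localhost:8000/docs)
curl http://localhost:8000/v1/completions \
  -H "Content-Type: application/json" \
  -d '{"prompt": "Q: What is the capital of France? A:", "max_tokens": 16}'
```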