It might be too much to ask for now, given it cuts deep into ggml's internals, but in the long term I believe it's important to support 16-bit precision.
Especially as GPU support gains more and more traction in GGML, the 32-bit requirement is a significant performance burden while providing no benefit for the multiplications themselves.
After all, the multiplications inside the GPU are done in 16 bit, so converting src1 from 32 bit to 16 bit for every calculation costs quite noticeable performance.
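To make the overhead concrete, here is a minimal sketch (not ggml's actual code, just an illustration under the assumption that the GPU GEMM consumes FP16 inputs): if src1 is stored as FP32, a conversion pass like the hypothetical kernel below has to run before every matrix multiplication, touching the whole tensor and a temporary buffer each time.

```cuda
#include <cuda_fp16.h>

// Hypothetical per-matmul conversion pass: src1 arrives in FP32 but the
// GEMM kernel consumes FP16, so every multiplication pays for this extra
// read/write of the entire tensor plus a scratch buffer for the result.
__global__ void convert_f32_to_f16(const float *src, __half *dst, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        dst[i] = __float2half(src[i]);  // extra memory traffic on every call
    }
}
```

If src1 were kept in 16-bit end to end, this kernel and its scratch buffer could be skipped entirely and the GEMM would read the tensor directly, which is where the performance win would come from.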