Pull requests: ggml-org/llama.cpp

Pull requests list

Fix link to available UI settings (labels: examples, server)
#24169 opened Jun 5, 2026 by wariuccio
mtmd : add Apple CoreML backend for vision encoding (labels: examples, python)
#24163 opened Jun 5, 2026 by tc-mb Contributor
[WIP] DeepSeek V4 (labels: ggml, model, python)
#24162 opened Jun 5, 2026 by am17an Contributor Draft
opencl: improve get_rows, cpy, concat and q6_k flat gemv (labels: ggml, OpenCL)
#24160 opened Jun 5, 2026 by lhez Contributor Draft
server : return HTTP 400 on invalid grammar (#24144) (labels: examples, python, server)
#24154 opened Jun 5, 2026 by Anuj-Attri
Sycl --split-mode tensor (labels: ggml, SYCL)
#24152 opened Jun 5, 2026 by Spruill-1
server: add "schema" and validation (labels: examples, server)
#24150 opened Jun 4, 2026 by ngxson Contributor
ui: add ignore-scripts=true to npmrc (labels: examples, server/ui)
#24149 opened Jun 4, 2026 by ngxson Contributor
Fix/server prompt cache no consume on load (labels: examples, python, server)
#24143 opened Jun 4, 2026 by alainnothere
Update quantization readme (labels: examples)
#24133 opened Jun 4, 2026 by pcuenca Contributor
HIP: add gfx1152 and gfx1153 to RDNA3.5 (labels: ggml, Nvidia GPU)
#24129 opened Jun 4, 2026 by harkgill-amd
CUDA: refactor MMQ kernel configuration (labels: ggml, Nvidia GPU)
#24127 opened Jun 4, 2026 by JohannesGaessler Contributor
Add ctx-per-slot argument for unified KV cache (labels: examples, server)
#24124 opened Jun 4, 2026 by bartowski1182 Contributor
vulkan: add v_dot2_f32_f16 support in matrix-matrix multiplication and Flash Attention (labels: ggml, Vulkan)
#24123 opened Jun 4, 2026 by 0cc4m Contributor
rpc: reduce small-message overhead and tune cache probing (labels: examples, ggml, python)
#24122 opened Jun 4, 2026 by Donovoi Draft
test(server): homogeneize slow tests eviction (labels: devops, examples, python, server)
#24114 opened Jun 4, 2026 by dacorvo
test(server): fix flaky completion test (labels: examples, python, server)
#24113 opened Jun 4, 2026 by dacorvo
fix: don't build AMX by default with Apple clang (labels: ggml)
#24094 opened Jun 3, 2026 by banksio