server : vision support via libmtmd by ngxson · Pull Request #12898 · ggml-org/llama.cpp

added 2 commits

April 11, 2025 17:46

gabe-l-hart added a commit to gabe-l-hart/llama.cpp that referenced this pull request

May 9, 2025

origin/master: (39 commits)
server : vision support via libmtmd (ggml-org#12898)
sycl : implementation of reordered Q4_0 MMVQ for Intel GPUs (ggml-org#12858)
metal : optimize MoE for large batches (ggml-org#13388)
CUDA: FA support for Deepseek (Ampere or newer) (ggml-org#13306)
llama : do not crash if there is no CPU backend (ggml-org#13395)
CUDA: fix crash on large batch size for MoE models (ggml-org#13384)
imatrix : Add --parse-special for enabling parsing of special tokens in imatrix calculation (ggml-org#13389)
llama-run: add support for downloading models from ModelScope (ggml-org#13370)
mtmd : fix batch_view for m-rope (ggml-org#13397)
llama : one-off chat template fix for Mistral-Small-2503 (ggml-org#13398)
rpc : add rpc_msg_set_tensor_hash_req (ggml-org#13353)
vulkan: Allow up to 4096 elements for mul_mat_id row_ids (ggml-org#13326)
server : (webui) rename has_multimodal --> modalities (ggml-org#13393)
ci : limit write permission to only the release step + fixes (ggml-org#13392)
mtmd : Expose helper_decode_image_chunk (ggml-org#13366)
server : (webui) fix a very small misalignment (ggml-org#13387)
server : (webui) revamp the input area, plus many small UI improvements (ggml-org#13365)
convert : support rope_scaling type and rope_type (ggml-org#13349)
mtmd : fix the calculation of n_tokens for smolvlm (ggml-org#13381)
context : allow cache-less context for embeddings (ggml-org#13108)
...

This was referenced

May 15, 2025

aropb mentioned this pull request

Jun 20, 2025

Seunghhon pushed a commit to Seunghhon/llama.cpp that referenced this pull request

Apr 26, 2026

server : (experimental) vision support via libmtmd
mtmd : add more api around mtmd_image_tokens
mtmd : add more api around mtmd_image_tokens
mtmd : ability to calc image hash
shared_ptr for mtmd_image_tokens
move hash to user-define ID (fixed)
abstract out the batch management
small fix
refactor logic adding tokens to batch
implement hashing image
use FNV hash, now hash bitmap instead of file data
allow decoding image embedding to be split into batches
rm whitespace
disable some features when mtmd is on
fix --no-mmproj-offload
mtmd_context_params no timings
refactor server_inp to server_tokens
fix the failing test case
init
wip
working version
add mtmd::bitmaps
add test target
rm redundant define
test: mtmd_input_chunks_free
rm outdated comment
fix merging issue
explicitly create mtmd::input_chunks
mtmd_input_chunk_copy
add clone()
improve server_input struct
clip : fix confused naming ffn_up and ffn_down
rm ffn_i/o/g naming
rename n_embd, n_ff
small fix
no check n_ff
fix detokenize
add const to various places
add warning about breaking changes
add c api
helper: use mtmd_image_tokens_get_n_pos
fix ctx_shift
fix name shadowing
more strict condition
support remote image_url
remote image_url log
add CI test
do not log base64
add "has_multimodal" to /props
remove dangling image
speculative: use slot.cache_tokens.insert
Apply suggestions from code review

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

rm can_be_detokenized
on prmpt processing done, assert cache_tokens.size
handle_completions_impl returns void
adapt the new web ui
update docs and hot topics
rm assert
small fix (2)

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
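
The commit list above mentions "use FNV hash, now hash bitmap instead of file data". As a rough illustration of that idea, and not the code from this PR, here is a minimal sketch that assumes the 64-bit FNV-1a variant and a decoded RGB pixel buffer. The names `fnv1a_64` and `bitmap_id` are hypothetical and do not come from the PR.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// 64-bit FNV-1a over a raw byte buffer (standard offset basis and prime).
// Assumption: the PR's exact FNV variant is not stated in the commit list.
inline uint64_t fnv1a_64(const uint8_t * data, size_t len) {
    uint64_t hash = 0xcbf29ce484222325ULL;      // FNV-1a 64-bit offset basis
    const uint64_t prime = 0x100000001b3ULL;    // FNV 64-bit prime
    for (size_t i = 0; i < len; ++i) {
        hash ^= data[i];                        // mix in one byte
        hash *= prime;                          // unsigned wrap-around mod 2^64
    }
    return hash;
}

// Hypothetical helper: derive a string ID from decoded pixels rather than
// from the encoded file bytes, so that identical decoded pixel data yields
// the same ID regardless of how the file was encoded.
inline std::string bitmap_id(const std::vector<uint8_t> & rgb_pixels) {
    return std::to_string(fnv1a_64(rgb_pixels.data(), rgb_pixels.size()));
}
```

An ID like this could serve as a cache key for an image that has already been processed. FNV is not collision-resistant, so it suits cache lookup rather than any security purpose.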

ljubomirj pushed a commit to ljubomirj/llama.cpp that referenced this pull request

May 6, 2026

my-other-github-account pushed a commit to my-other-github-account/llama.cpp that referenced this pull request

May 15, 2026

my-other-github-account pushed a commit to my-other-github-account/llama.cpp that referenced this pull request

May 15, 2026

phibya pushed a commit to ziee-ai/llama.cpp that referenced this pull request

May 29, 2026

fewtarius pushed a commit to fewtarius/CachyLLama that referenced this pull request

May 30, 2026

AlexiAlp pushed a commit to minghaop/llama.cpp that referenced this pull request

Jun 2, 2026

AlexiAlp pushed a commit to minghaop/llama.cpp that referenced this pull request

Jun 2, 2026
