Qwen/Qwen3-VL-4B-Thinking-FP8

#1451
by SkyMind

Let's do https://huggingface.co/Qwen/Qwen3-VL-4B-Thinking instead, as we can't convert an already quantized model into a GGUF.
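
For reference, the usual flow is to convert the full-precision checkpoint to GGUF first and then quantize from that GGUF with llama.cpp's own tooling. A minimal sketch, where all file and directory names are placeholders:

```bash
# Sketch of the standard GGUF workflow; paths/filenames are placeholders.

# Download the full-precision model (not the FP8 variant).
huggingface-cli download Qwen/Qwen3-VL-4B-Thinking --local-dir ./Qwen3-VL-4B-Thinking

# Convert to a full-precision GGUF first...
python llama.cpp/convert_hf_to_gguf.py ./Qwen3-VL-4B-Thinking \
    --outfile qwen3-vl-4b-thinking-f16.gguf --outtype f16

# ...then quantize from that GGUF, e.g. to Q4_K_M.
llama.cpp/build/bin/llama-quantize \
    qwen3-vl-4b-thinking-f16.gguf qwen3-vl-4b-thinking-q4_k_m.gguf Q4_K_M
```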

Unfortunately, Qwen3VLForConditionalGeneration is not currently supported by llama.cpp, and I don't currently see any pull requests working on adding support for it.

https://github.com/ggml-org/llama.cpp/issues/16207
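
As a rough heuristic (not an official procedure): the conversion script registers the HF architecture names it can handle, so an architecture that doesn't appear in it isn't supported yet:

```bash
# Rough support check: convert_hf_to_gguf.py maps HF architecture names
# to GGUF model classes, so an unsupported architecture won't show up.
grep -n "Qwen3VLForConditionalGeneration" llama.cpp/convert_hf_to_gguf.py

# Running the converter on an unsupported model fails early with an
# "architecture not supported" style error instead of producing a GGUF.
```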

Awesome, thanks for linking this. I must have missed it, since work on it started three weeks ago already. I will follow it and do this model as soon as support is merged.

https://github.com/ggml-org/llama.cpp/pull/16780 has been merged.
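
With support merged, conversion should work on a current build. A hedged sketch, assuming a configured llama.cpp checkout and that Qwen3-VL follows the usual llama.cpp vision-model flow where the projector is exported separately (the --mmproj usage here is an assumption based on other VL models; filenames are placeholders):

```bash
# Sketch, assuming a llama.cpp checkout updated past PR #16780
# with a build directory already configured via cmake.
git -C llama.cpp pull
cmake --build llama.cpp/build --config Release

# Vision models ship as two GGUF parts: the language model and the
# multimodal projector. Exporting the projector via --mmproj is an
# assumption based on how other llama.cpp vision models are converted.
python llama.cpp/convert_hf_to_gguf.py ./Qwen3-VL-4B-Thinking \
    --outfile mmproj-qwen3-vl-4b-f16.gguf --mmproj

# Quick smoke test with the multimodal CLI (paths are placeholders).
llama.cpp/build/bin/llama-mtmd-cli \
    -m qwen3-vl-4b-thinking-q4_k_m.gguf \
    --mmproj mmproj-qwen3-vl-4b-f16.gguf \
    --image test.png -p "Describe this image."
```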
