Qwen3-Next-80B-A3B-Thinking GGUF
Make sure you have enough system memory or GPU VRAM for the quantization you choose (see the sizing note after the quant table below).
Use the model in ollama
First, download and install Ollama.
Note: the official Ollama model library does not include Qwen3-Next yet, so you need to do the following.
Command
In the Windows command line, or in a terminal on macOS or Linux, type:
```
ollama run hf.co/John1604/Qwen3-Next-80B-A3B-Thinking-gguf:q3_k_m
```
(q3_k_m is the quantization type; q3_k_s, q4_k_m, etc. can also be used)
```
C:\Users\developer>ollama run hf.co/John1604/Qwen3-Next-80B-A3B-Thinking-gguf:q3_k_m
pulling manifest
...
writing manifest
success
>>> Send a message (/? for help)
```
After you have run ollama run hf.co/John1604/Qwen3-Next-80B-A3B-Thinking-gguf:q3_k_m once, the model appears in the Ollama UI: select hf.co/John1604/Qwen3-Next-80B-A3B-Thinking-gguf:q3_k_m from the model list and run it the same way as any other Ollama-supported model.
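Once pulled, the model can also be queried programmatically through Ollama's local REST API, which listens on port 11434 by default. A minimal sketch; the prompt is only a placeholder:
```shell
# Ask the local Ollama server for a single, non-streamed completion.
curl http://localhost:11434/api/generate -d '{
  "model": "hf.co/John1604/Qwen3-Next-80B-A3B-Thinking-gguf:q3_k_m",
  "prompt": "Why is the sky blue?",
  "stream": false
}'
```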
Use the model in LM Studio
Download and install LM Studio.
Discover models
In LM Studio, click the "Discover" icon; the "Mission Control" popup window will be displayed.
In the "Mission Control" search bar, type "John1604/Qwen3-Next-80B-A3B-Thinking-gguf" and check "GGUF"; the model should be found.
Download a quantized model.
Load the quantized model.
Ask questions.
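Once a quantized model is loaded, LM Studio can also serve it through its built-in local server, which speaks an OpenAI-compatible API (default port 1234). A minimal sketch, assuming the server is enabled in LM Studio's Developer tab; the model identifier below is a placeholder and should be replaced with the exact name LM Studio shows in its model list:
```shell
# Send one chat-completion request to LM Studio's OpenAI-compatible local server.
curl http://localhost:1234/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "john1604/qwen3-next-80b-a3b-thinking-gguf",
    "messages": [{"role": "user", "content": "Hello, what can you do?"}]
  }'
```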
Quantized models
| Type | Bits | Quality | Description |
|---|---|---|---|
| Q2_K | 2-bit | 🟥 Low | Minimal footprint; only for tests |
| Q3_K_S | 3-bit | 🟧 Low | “Small” variant (less accurate) |
| Q3_K_M | 3-bit | 🟧 Low–Med | “Medium” variant |
| Q4_K_S | 4-bit | 🟨 Med | Small, faster, slightly less quality |
| Q4_K_M | 4-bit | 🟩 Med–High | “Medium” — best 4-bit balance |
| Q5_K_S | 5-bit | 🟩 High | Slightly smaller than Q5_K_M |
| Q5_K_M | 5-bit | 🟩🟩 High | Excellent general-purpose quant |
| Q6_K | 6-bit | 🟩🟩🟩 Very High | Almost FP16 quality, larger size |
| Q8_0 | 8-bit | 🟩🟩🟩🟩 Highest | Near-lossless baseline |
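To judge whether a given quant fits your hardware, a rough back-of-the-envelope formula helps (an approximation only; real GGUF files run somewhat larger because K-quants mix bit widths and store scaling factors): size ≈ parameter count × bits per weight ÷ 8. For this 80B-parameter model at 4 bits, that is about 80 × 10⁹ × 4 ÷ 8 ≈ 40 GB, so plan on at least that much free RAM/VRAM plus headroom for the KV cache and context.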
Model tree for John1604/Qwen3-Next-80B-A3B-Thinking-gguf
Base model: Qwen/Qwen3-Next-80B-A3B-Thinking