Performance evaluation for v1.0.0 Model
#12 opened 14 days ago by woodytse · 3 comments
Bloody hell!! Running perfectly on 3x 3090 at 160k context, speeds between 65 tk/s and 30 tk/s (depending on length), my script:
#11 opened about 2 months ago by groxaxo
Did anyone get speculative decoding working?
#10 opened about 2 months ago by amit864 · 4 comments
Successfully Running Qwen3-Next-80B-A3B-Instruct-AWQ-4bit on 3x RTX 3090s
#9 opened 2 months ago by 8055izham · 7 comments
sorta works on vLLM now
#8 opened 3 months ago by MrDragonFox · 15 comments
Recent update throws error: KeyError: 'layers.30.mlp.shared_expert.down_proj.weight'
#7 opened 3 months ago by itsmebcc · 3 comments
gibberish still persists?
#6 opened 3 months ago by Geximus · 5 comments
MTP Accepted throughput always at 0.00 tokens/s
#5 opened 3 months ago by bpozdena · 4 comments
Experiencing excessive response latency.
#4 opened 3 months ago by JunHowie
Does this quantized version support running on machines like V100 and V100S?
#3 opened 3 months ago by ShaoShuoHe
Error when inputting a large number of prompts
#2 opened 3 months ago by dwaynedu
Error when running in vLLM
#1 opened 3 months ago by d8rt8v · 21 comments