---
license: apache-2.0
language:
- en
- zh
base_model:
- Qwen/Qwen3-30B-A3B-Thinking-2507
- Qwen/Qwen3-30B-A3B-Instruct-2507
pipeline_tag: text-generation
tags:
- merge
---

> *This is an auto-thinking-switching model built with model merging and expert-substitution techniques: it answers simple questions directly, gives brief thoughts to moderately difficult ones, and reasons deeply about the hardest ones.*

# *Model Highlights:*

- ***Merge method**: `arcee_fusion`*
- ***Highest precision**: `dtype: float32` + `out_dtype: bfloat16`*
- ***Context length**: `262,144` & `1,010,000`*

# *Parameter Settings*:

## *Auto-Thinking Mode*

> [!NOTE]
> *`Temperature=0.6`, `TopP=0.95`, `TopK=20`, `MinP=0`.*

## *Step 1: Hybrid Instruct Model and Thinking Model*

*Perform an initial merge of the instruction model and the reasoning model.*

```yaml
models:
  - model: Qwen/Qwen3-30B-A3B-Thinking-2507
merge_method: arcee_fusion
base_model: Qwen/Qwen3-30B-A3B-Instruct-2507
dtype: float32
out_dtype: bfloat16
tokenizer_source: base
name: Qwen3-30B-A3B-YOYO-AutoThink-preview
```

## *Step 2: Expert Replacement*

*Inspired by this [paper](https://arxiv.org/abs/2506.14794), we use the regular expression `^model\.layers\.\d+\.mlp\.experts\.\d+\.(down_proj|gate_proj|up_proj)\.weight$` to select experts for replacement: every expert weight in Qwen3-30B-A3B-YOYO-AutoThink-preview that matches the regex is replaced with the corresponding weight from Qwen3-30B-A3B-Thinking-2507.*
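*As an illustrative sketch (not the authors' actual merge script), the Step 2 selection rule can be expressed in a few lines of Python: the regex from the text matches only the per-expert MLP projection weights, leaving the router and attention tensors untouched. The `replace_experts` helper below is a hypothetical name operating on plain state-dict-like mappings.*

```python
import re

# Regex from Step 2: matches per-expert MLP projection weights in every layer.
EXPERT_RE = re.compile(
    r"^model\.layers\.\d+\.mlp\.experts\.\d+\.(down_proj|gate_proj|up_proj)\.weight$"
)

def is_expert_weight(name: str) -> bool:
    """Return True if `name` is an expert tensor that should be swapped."""
    return EXPERT_RE.match(name) is not None

def replace_experts(merged: dict, thinking: dict) -> dict:
    """Copy matching expert weights from the Thinking model into the merged model.

    Both arguments are state-dict-like mappings from tensor name to tensor;
    non-matching tensors (router gate, attention, norms) are kept from `merged`.
    """
    return {
        name: (thinking[name] if is_expert_weight(name) else tensor)
        for name, tensor in merged.items()
    }

# Expert projections match the pattern...
print(is_expert_weight("model.layers.0.mlp.experts.7.up_proj.weight"))   # True
# ...but the shared router gate and attention weights do not.
print(is_expert_weight("model.layers.0.mlp.gate.weight"))                # False
print(is_expert_weight("model.layers.0.self_attn.q_proj.weight"))        # False
```

*In practice the same filtering can be done on safetensors shards without loading the whole model; the sketch above only demonstrates which tensor names the regex selects.*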