Article Tensor Parallelism (TP) in Transformers: 5 Minutes to Understand 5 days ago • 53
Article Introducing swift-huggingface: The Complete Swift Client for Hugging Face 4 days ago • 21
Qwen2.5-Coder Collection Code-specific model series based on Qwen2.5 • 40 items • Updated Jul 21 • 348
Post 2266 NEW: @mistralai released a fantastic family of multimodal models, Ministral 3. You can fine-tune them for free on Colab using TRL ⚡️, supporting both SFT and GRPO.
Link to the notebooks:
- SFT: https://colab.research.google.com/github/huggingface/trl/blob/main/examples/notebooks/sft_ministral3_vl.ipynb
- GRPO: https://colab.research.google.com/github/huggingface/trl/blob/main/examples/notebooks/grpo_ministral3_vl.ipynb
- TRL and more examples: https://huggingface.co/docs/trl/index
Qwen3 Collection Qwen's new Qwen3 models. In Unsloth Dynamic 2.0, GGUF, 4-bit and 16-bit Safetensor formats. Includes 128K Context Length variants. • 79 items • Updated 6 days ago • 240
Qwen3-VL Collection Qwen's new multimodal vision models in GGUF, safetensor, and dynamic Unsloth formats. • 56 items • Updated 6 days ago • 17
Ministral 3 - Additional Checkpoints Collection Different formats and Quantized versions of our Ministral 3 family; 14B/8B/3B Instruct/Reasoning GGUF, 3B Instruct ONNX and 14B/8B/3B Instruct BF16. • 13 items • Updated 6 days ago • 12