Collections

Collections including paper arxiv:2511.06221

Tiny Model, Big Logic: Diversity-Driven Optimization Elicits Large-Model Reasoning Ability in VibeThinker-1.5B

Paper • 2511.06221 • Published about 1 month ago • 128

Thinking with Video: Video Generation as a Promising Multimodal Reasoning Paradigm

Paper • 2511.04570 • Published Nov 6 • 208
Tiny Model, Big Logic: Diversity-Driven Optimization Elicits Large-Model Reasoning Ability in VibeThinker-1.5B

Paper • 2511.06221 • Published about 1 month ago • 128

Reinforcement Learning for Reasoning in Large Language Models with One Training Example

Paper • 2504.20571 • Published Apr 29 • 98
One RL to See Them All: Visual Triple Unified Reinforcement Learning

Paper • 2505.18129 • Published May 23 • 60
Reinforcement Learning for Reasoning in Small LLMs: What Works and What Doesn't

Paper • 2503.16219 • Published Mar 20 • 52
Performance Trade-offs of Optimizing Small Language Models for E-Commerce

Paper • 2510.21970 • Published Oct 24 • 2

Think in Games: Learning to Reason in Games via Reinforcement Learning with Large Language Models

Paper • 2508.21365 • Published Aug 29 • 29
Tiny Model, Big Logic: Diversity-Driven Optimization Elicits Large-Model Reasoning Ability in VibeThinker-1.5B

Paper • 2511.06221 • Published about 1 month ago • 128

Tiny Model, Big Logic: Diversity-Driven Optimization Elicits Large-Model Reasoning Ability in VibeThinker-1.5B

Paper • 2511.06221 • Published about 1 month ago • 128
Large Language Models for Scientific Idea Generation: A Creativity-Centered Survey

Paper • 2511.07448 • Published Nov 5 • 2
Agent0: Unleashing Self-Evolving Agents from Zero Data via Tool-Integrated Reasoning

Paper • 2511.16043 • Published 20 days ago • 105

HuggingFaceTB/SmolLM3-3B

Text Generation • 3B • Updated Sep 10 • 90.8k • 834
HuggingFaceFW/fineweb

Viewer • Updated Jul 11 • 52.5B • 202k • 2.48k
Tiny Model, Big Logic: Diversity-Driven Optimization Elicits Large-Model Reasoning Ability in VibeThinker-1.5B

Paper • 2511.06221 • Published about 1 month ago • 128
p-e-w/Llama-3.1-8B-Instruct-heretic

Text Generation • 8B • Updated 24 days ago • 941 • 6

Rank-GRPO: Training LLM-based Conversational Recommender Systems with Reinforcement Learning

Paper • 2510.20150 • Published Oct 23 • 4
Tiny Model, Big Logic: Diversity-Driven Optimization Elicits Large-Model Reasoning Ability in VibeThinker-1.5B

Paper • 2511.06221 • Published about 1 month ago • 128
We-Math 2.0: A Versatile MathBook System for Incentivizing Visual Mathematical Reasoning

Paper • 2508.10433 • Published Aug 14 • 144

ARE: Scaling Up Agent Environments and Evaluations

Paper • 2509.17158 • Published Sep 21 • 35
ARTDECO: Towards Efficient and High-Fidelity On-the-Fly 3D Reconstruction with Structured Scene Representation

Paper • 2510.08551 • Published Oct 9 • 32
Why Low-Precision Transformer Training Fails: An Analysis on Flash Attention

Paper • 2510.04212 • Published Oct 5 • 23
ERA: Transforming VLMs into Embodied Agents via Embodied Prior Learning and Online Reinforcement Learning

Paper • 2510.12693 • Published Oct 14 • 26

microsoft/bitnet-b1.58-2B-4T

Text Generation • 0.8B • Updated May 1 • 8.01k • 1.22k
M1: Towards Scalable Test-Time Compute with Mamba Reasoning Models

Paper • 2504.10449 • Published Apr 14 • 15
nvidia/Llama-3.1-Nemotron-8B-UltraLong-2M-Instruct

Text Generation • 8B • Updated Apr 17 • 141 • 15
ReTool: Reinforcement Learning for Strategic Tool Use in LLMs

Paper • 2504.11536 • Published Apr 15 • 63
