Phi-4 Technical Report

Marah I Abdin; Jyoti Aneja; Harkirat Behl; Sébastien Bubeck; Ronen Eldan; Suriya Gunasekar; Michael Harrison; Russell J. Hewett; Mojan Javaheripi; Piero Kauffmann; James R. Lee; Yin Tat Lee; Yuanzhi Li; Weishung Liu; Caio CT Mendes; Anh Nguyen; Eric Price; Gustavo de Rosa; Olli Saarikivi; Adil Salim; Shital Shah; Xin Wang; Rachel Ward; Yue Wu; Dingli Yu; Cyril Zhang; Yi Zhang

MSR-TR-2024-57 | December 2024

Published by Microsoft

We present phi-4, a 14-billion-parameter language model developed with a training recipe that is centrally focused on data quality. Unlike most language models, where pre-training is based primarily on organic data sources such as web content or code, phi-4 strategically incorporates synthetic data throughout the training process. While previous models in the Phi family largely distill the capabilities of a teacher model (specifically GPT-4), phi-4 substantially surpasses its teacher model on STEM-focused QA capabilities, giving evidence that our data-generation and post-training techniques go beyond distillation. Despite minimal changes to the phi-3 architecture, phi-4 achieves strong performance relative to its size, especially on reasoning-focused benchmarks, due to improved data, training curriculum, and innovations in the post-training scheme.

Publication Downloads

Phi-4

June 18, 2025

Phi-4-multimodal and Phi-4-mini, the newest models in Microsoft’s Phi family of small language models (SLMs), are now available. These models are designed to empower developers with advanced AI capabilities. Phi-4-multimodal, with its ability to process speech, vision, and text simultaneously, opens new possibilities for creating innovative and context-aware applications. Phi-4-mini, on the other hand, excels in text-based tasks, providing high accuracy and scalability in a compact form.
