Paper page - Textbooks Are All You Need II: phi-1.5 technical report

Abstract

A new 1.3 billion parameter Transformer-based language model, phi-1.5, matches the performance of much larger models on common sense reasoning and complex reasoning tasks despite being trained without web data.

We continue the investigation into the power of smaller Transformer-based language models as initiated by TinyStories -- a 10 million parameter model that can produce coherent English -- and the follow-up work on phi-1, a 1.3 billion parameter model with Python coding performance close to the state-of-the-art. The latter work proposed to use existing Large Language Models (LLMs) to generate "textbook quality" data as a way to enhance the learning process compared to traditional web data. We follow the "Textbooks Are All You Need" approach, focusing this time on common sense reasoning in natural language, and create a new 1.3 billion parameter model named phi-1.5, with performance on natural language tasks comparable to models 5x larger, and surpassing most non-frontier LLMs on more complex reasoning tasks such as grade-school mathematics and basic coding. More generally, phi-1.5 exhibits many of the traits of much larger LLMs, both good -- such as the ability to "think step by step" or perform some rudimentary in-context learning -- and bad, including hallucinations and the potential for toxic and biased generations. Encouragingly, we are seeing improvement on that front thanks to the absence of web data. We open-source phi-1.5 to promote further research on these urgent topics.
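The abstract notes that phi-1.5, as a plain base model, can "think step by step". As a hypothetical illustration (the prompt format below is an assumption, not taken from the paper), eliciting this behavior from a base model is typically a matter of prompt construction:

```python
# Hypothetical illustration: a "think step by step" prompt wrapper for a
# base (non-chat) language model such as phi-1.5. The exact prompt format
# is an assumption, not specified in the paper.

def step_by_step_prompt(question: str) -> str:
    """Wrap a question in a chain-of-thought cue for a base LM."""
    return (
        f"Question: {question}\n"
        "Let's think step by step.\n"
        "Answer:"
    )

# The resulting string would be passed to the model's tokenizer/generate
# call, e.g. via the transformers library with the open-sourced checkpoint.
prompt = step_by_step_prompt("If I have 3 apples and eat one, how many remain?")
print(prompt)
```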


Get this paper in your agent:

hf papers read 2309.05463


Models citing this paper 22

microsoft/phi-1_5 Text Generation • 1B • Updated Nov 24, 2025 • 69.8k • 1.36k

Open-Orca/oo-phi-1_5 Text Generation • Updated Nov 22, 2023 • 628 • 32

TKDKid1000/phi-1_5-GGUF Text Generation • 1B • Updated Dec 19, 2023 • 1.86k • 8

fzmnm/TinyStoriesAdv_92M Text Generation • Updated Aug 1, 2024 • 14 • 3

Browse 22 models citing this paper

Datasets citing this paper 11

HuggingFaceTB/cosmopedia Viewer • Updated Aug 12, 2024 • 31.1M • 20.4k • 690

Maple222/llmtcl Preview • Updated Nov 26, 2025 • 829

nampdn-ai/tiny-textbooks Viewer • Updated Jul 3, 2024 • 420k • 574 • 173

fzmnm/TinyStoriesAdv-zh Preview • Updated Aug 21, 2024 • 363 • 10

Browse 11 datasets citing this paper

Spaces citing this paper 240

Collections including this paper 42

Browse 42 collections that include this paper