Textbooks Are All You Need

Published on Jun 20, 2023 · Submitted by AK on Jun 21, 2023

#1 Paper of the day

Authors:
Abstract

A new compact Transformer-based large language model for code, phi-1, achieves high accuracy on coding benchmarks despite having fewer parameters than competing models.

We introduce phi-1, a new large language model for code, with significantly smaller size than competing models: phi-1 is a Transformer-based model with 1.3B parameters, trained for 4 days on 8 A100s, using a selection of "textbook quality" data from the web (6B tokens) and synthetically generated textbooks and exercises with GPT-3.5 (1B tokens). Despite this small scale, phi-1 attains pass@1 accuracy 50.6% on HumanEval and 55.5% on MBPP. It also displays surprising emergent properties compared to phi-1-base, our model before our finetuning stage on a dataset of coding exercises, and phi-1-small, a smaller model with 350M parameters trained with the same pipeline as phi-1 that still achieves 45% on HumanEval.
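The HumanEval and MBPP numbers above are pass@1 rates. In practice, pass@k is usually computed with the unbiased estimator introduced alongside HumanEval: sample n completions per problem, count the c that pass the unit tests, and estimate the probability that at least one of k drawn samples passes. A minimal sketch (the function name and example counts are illustrative, not from the paper):

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimator: n samples per problem,
    c of which pass the unit tests, k samples drawn."""
    if n - c < k:
        # Fewer failures than draws: at least one success is guaranteed.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)

# With a single sample per problem, pass@1 reduces to the raw pass rate.
print(pass_at_k(1, 1, 1))   # a problem solved on its only sample
print(pass_at_k(10, 3, 1))  # 3 of 10 samples pass -> estimate 0.3
```

The per-problem estimates are then averaged over the benchmark to give the reported score.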


Models citing this paper 9

microsoft/phi-1 Text Generation • 1B • Updated Nov 24, 2025 • 8.97k • 220

kenhktsui/llm-data-textbook-quality-fasttext-classifier-v2 Text Classification • Updated Jun 26, 2025 • 230 • 28

professorf/phi-1-gguf Text Generation • 1B • Updated Aug 27, 2024 • 16 • 1

michaelfeil/ct2fast-phi-1 Text Generation • Updated Nov 30, 2023 • 11


Datasets citing this paper 17

HuggingFaceTB/cosmopedia Viewer • Updated Aug 12, 2024 • 31.1M • 20.4k • 690

nampdn-ai/tiny-codes Viewer • Updated Sep 30, 2023 • 1.63M • 1.9k • 288

maywell/korean_textbooks Viewer • Updated Jan 10, 2024 • 4.42M • 1.34k • 124

goendalf666/sales-conversations Viewer • Updated Oct 4, 2023 • 3.41k • 294 • 43


Spaces citing this paper 77

Collections including this paper 40
