SWE-smith

Scaling Data for Software Engineering Agents

April 30, 2025

Creating training data for software engineering agents is difficult. Until now.

Introducing SWE-smith: Generate 100s to 1000s of task instances for any GitHub repository.

We've generated 50k+ task instances for 128 popular GitHub repositories, then trained our own LM for SWE-agent.

The result? SWE-agent-LM-32B achieves 40% pass@1 on SWE-bench Verified.

Now, we've open-sourced everything, and we're excited to see what you build with it!

Check out the tutorial below to generate 100 task instances for _any_ GitHub repository in 10 minutes.

Click here for an extended discussion.

🔥 Excited about SWE-smith? Build with us!

> Create new bug generation techniques.

> Expand to non-Python repositories.

> Train better SWE-agents!

Authors

John Yang, Kilian Lieret, Carlos E. Jimenez, Alexander Wettig, Kabir Khandpur, Yanzhe Zhang, Binyuan Hui, Ofir Press, Ludwig Schmidt, Diyi Yang

Affiliations

Stanford University, Stanford SALT Lab, Princeton Language & Intelligence, Alibaba Qwen

Citation

@misc{yang2025swesmith,
  title={SWE-smith: Scaling Data for Software Engineering Agents},
  author={John Yang and Kilian Lieret and Carlos E. Jimenez and Alexander Wettig and Kabir Khandpur and Yanzhe Zhang and Binyuan Hui and Ofir Press and Ludwig Schmidt and Diyi Yang},
  year={2025},
  eprint={2504.21798},
  archivePrefix={arXiv},
  primaryClass={cs.SE},
  url={https://arxiv.org/abs/2504.21798},
}