Hazy Research

Posts

The Great American AI Race

Mar 24, 2025 · Chris Ré

BASED ✌️: our one year retrospective

Mar 24, 2025 · Simran Arora

ThunderKittens Now on Blackwells!

Mar 15, 2025 · Benjamin Spector, Aaryan Singhal, Dan Fu, Chris Ré

ThunderMLA: FlashMLA, Faster and Fused-er!

Mar 4, 2025 · Benjamin Spector, Aaryan Singhal, Dan Fu, Chris Ré

Minions: the rise of small, on-device LMs

Feb 24, 2025 · Sabri Eyuboglu*, Dan Biderman*, and Avanika Narayan*.

Hazy Alumni Life Updates (2024)

Dec 11, 2024 · Christopher Ré

Smoothie: a label-free approach for inference-time LLM routing

Dec 10, 2024 · Neel Guha, Mayee F. Chen, Trevor Chow, Ishan S. Khare, Christopher Ré

ThunderMittens For Your ThunderKittens

Nov 28, 2024 · Conner Takehana, Aaryan Singhal

ThunderKittens: Bringing fp8 to theaters near you

Nov 27, 2024 · Simran Arora

An Unserious Person’s Take on Axiomatic Knowledge in the Era of Foundation Models

Nov 18, 2024 · Chris Ré

Easier, Better, Faster, Cuter

Oct 29, 2024 · Benjamin Spector, Simran Arora, Aaryan Singhal, Daniel Y. Fu, Chris Ré

LoLCATs Blog Part 2: How to Linearize LLMs for Me and You

Oct 14, 2024 · Michael Zhang, Simran Arora

Linearizing LLMs with LoLCATs

Oct 14, 2024 · Michael Zhang, Simran Arora

Just read twice: closing the recall gap for recurrent language models

Jul 7, 2024 · Simran Arora

Efficient language models as arithmetic circuits

Jun 22, 2024 · Simran Arora, Sabri Eyuboglu, Atri Rudra

Announcing LoCoV1 and the Latest M2-BERT Models

May 20, 2024 · Jon Saad-Falcon, Dan Fu, Simran Arora

ECLAIR: A Treat for the Enterprise

May 18, 2024 · Avanika Narayan*, Michael Wornow*, Chris Ré.

GPUs Go Brrr

May 12, 2024 · Benjamin Spector, Aaryan Singhal, Simran Arora, Chris Ré

ThunderKittens: A Simple Embedded DSL for AI kernels

May 12, 2024 · Benjamin Spector, Aaryan Singhal, Simran Arora, Chris Ré

Learning from DNA: a grand challenge in biology

Mar 14, 2024 · Eric Nguyen, Michael Poli

Based: Simple linear attention language models balance the recall-throughput tradeoff

Mar 3, 2024 · Sabri Eyuboglu*, Simran Arora*, Michael Zhang*

Long-Context Retrieval Models with Monarch Mixer

Jan 11, 2024 · Jon Saad-Falcon, Dan Fu, Simran Arora

Zoology (Blogpost 2): Simple, Input-Dependent, and Sub-Quadratic Sequence Mixers

Dec 11, 2023 · Simran Arora*, Michael Zhang*, Sabri Eyuboglu*, Chris Ré

Zoology (Blogpost 1): Measuring and Improving Recall in Efficient Language Models

Dec 11, 2023 · Sabri Eyuboglu*, Simran Arora*, Michael Zhang.

Zoology (Blogpost 0): Overview

Dec 11, 2023 · Simran Arora*, Michael Zhang*, Sabri Eyuboglu*.

Monarchs and Butterflies: Towards Sub-Quadratic Scaling in Model Dimension

Dec 11, 2023 · Dan Fu, with work from past & present students of Hazy Research.

Long Convolutions for GPT-like Models: Polynomials, Fast Fourier Transforms and Causality

Dec 11, 2023 · Chris Ré, Dan Fu.

FlashFFTConv: Efficient Convolutions for Long Sequences with Tensor Cores

Nov 13, 2023 · Dan Fu*, Hermann Kumbong*, Eric Nguyen, Chris Ré

A Paradigm Shift in ML Validation: Evaluating Workflows, Not Tasks

Aug 21, 2023 · Arjun Desai

Embroid: Correcting and Improving LLM Predictions Without Labels

Aug 12, 2023 · Neel Guha*, Mayee F. Chen*, Kush Bhatia*, Azalia Mirhoseini, Chris Ré.

Monarch Mixer: Revisiting BERT, Without Attention or MLPs

Jul 25, 2023 · Dan Fu*, Simran Arora*, Chris Ré

FlashAttention-2: Faster Attention with Better Parallelism and Work Partitioning

Jul 17, 2023 · Tri Dao

HyenaDNA: learning from DNA with 1 Million token context

Jun 29, 2023 · Eric Nguyen*, Michael Poli*, Marjan Faizi*

Why is in-context learning lower quality than fine-tuning? And…what if it wasn't?

Jun 13, 2023 · Kush Bhatia*, Avanika Narayan*, Chris De Sa, Chris Ré.

The Safari of Deep Signal Processing: Hyena and Beyond

Jun 8, 2023 · Michael Poli, Stefano Massaroli, Simran Arora, Dan Fu, Stefano Ermon, Chris Ré.

The Eroding Technical Moat of AI and the Power of Open Source

May 5, 2023 · Chris Ré.

Understanding the Ingredients in ChatGPT is Simpler Than You Think

Apr 20, 2023 · Chris Ré.

In The ChatGPT Era, Your Data is More Valuable Than Ever

Apr 20, 2023 · Chris Ré.

Ask Me Anything: Leveraging Foundation Models for Private & Personalized Systems

Apr 18, 2023 · Simran Arora, Chris Ré.

Batch computing and the coming age of AI systems

Apr 12, 2023 · Sabri Eyuboglu, Brandon Yang, Chris Ré.

From Deep to Long Learning?

Mar 28, 2023 · Dan Fu, Michael Poli, Chris Ré.

Is AI Rare or Everywhere?

Mar 23, 2023 · Chris Ré.

First-Mile vs. Last-Mile AI Systems in the Era of Foundation Models

Mar 15, 2023 · Chris Ré.

Hyena Hierarchy: Towards Larger Convolutional Language Models

Mar 7, 2023 · Michael Poli*, Stefano Massaroli*, Eric Nguyen*, Dan Fu, Tri Dao, Stephen A. Baccus, Yoshua Bengio, Stefano Ermon, and Chris Ré.

Meerkat and the Path to Foundation Models as a Reliable Software Abstraction

Mar 1, 2023 · Karan Goel*, Sabri Eyuboglu*, Arjun Desai*, James Zou, Chris Ré.

Simple Long Convolutions for Sequence Modeling

Feb 15, 2023 · Dan Fu, Elliot Epstein, Eric Nguyen, Armin Thomas, Michael Zhang, Tri Dao, Atri Rudra, and Chris Ré.

AI's Linux Moment: An Open-Source AI Model Love Note

Jan 30, 2023 · Chris Ré.

H3: Language Modeling with State Space Models and (Almost) No Attention

Jan 23, 2023 · Dan Fu, Tri Dao, Khaled Saab, Armin Thomas, Atri Rudra, and Chris Ré.

Data Wrangling with Foundation Models

Jan 13, 2023 · Avanika Narayan, Ines Chami, Laurel Orr, and Chris Ré.

FlashAttention: Fast Transformer Training with Long Sequences

Jan 13, 2023 · Tri Dao

How Foundation Models Changed our Work

Nov 16, 2022 · Chris Ré

Fast Stable Diffusion with FlashAttention + Diffusers

Oct 13, 2022 · Dan Fu, Tri Dao, and Chris Ré.

Foundation Models are Entering their Data-Centric Era

Oct 11, 2022 · Chris Ré and Simran Arora

Simplifying S4

Jun 21, 2022 · Chris Ré, Dan Fu, Karan Goel, Khaled Saab.

Can Longer Sequences Help Take the Next Leap in AI?

Jun 9, 2022 · Chris Ré, Tri Dao, Dan Fu, Karan Goel

TABi: Type-Aware Bi-Encoders for Open-Domain Entity Retrieval

Apr 19, 2022 · Megan Leszczynski, Dan Fu, Mayee Chen, and Chris Ré.

Improving Transfer and Robustness in Supervised Contrastive Learning

Apr 19, 2022 · Mayee Chen, Dan Fu, Avanika Narayan, Michael Zhang, Zhao Song, Kayvon Fatahalian, and Chris Ré.

Advances in Understanding, Improving, and Applying Contrastive Learning

Apr 19, 2022 · Dan Fu, Mayee Chen, Megan Leszczynski, and Chris Ré.

An Introduction to Slice Discovery with Domino

Apr 2, 2022 · Sabri Eyuboglu, Maya Varma, Khaled Saab, Jared Dunnmon, James Zou and Chris Ré.

Pixelated Butterfly: Simple and Efficient Sparse Training for Neural Network Models

Jan 17, 2022 · Beidi Chen, Tri Dao and Chris Ré.

Structured State Spaces: Combining Continuous-Time, Recurrent, and Convolutional Models

Jan 14, 2022 · Albert Gu, Karan Goel, Khaled Saab, and Chris Ré

Structured State Spaces for Sequence Modeling (S4)

Jan 14, 2022 · Albert Gu, Karan Goel, Khaled Saab, and Chris Ré

What can we accomplish without changing the architecture? A thought experiment in incorporating knowledge through data!

Oct 14, 2021 · Simran Arora, Sen Wu, Enci Liu, and Chris Ré

What Data Centric AI is Not

Sep 26, 2021 · Chris Ré and Simran Arora

The Road to Software 2.0 or Data-Centric AI

Jun 20, 2021 · Chris Ré

HiPPO: Recurrent Memory with Optimal Polynomial Projections

Dec 5, 2020 · Albert Gu*, Tri Dao*, Stefano Ermon, Atri Rudra, and Chris Ré

Bootleg: Chasing the Tail with Self-Supervised Named Entity Disambiguation

Nov 10, 2020 · Laurel Orr, Megan Leszczynski, Simran Arora, Neel Guha, Xiao Ling, Sen Wu, and Chris Ré

The Coming Wave of ML Systems

Oct 13, 2020 · Chris Ré, Piero Molino, Dan Fu, Karan Goel, Fiodar Kazhamakia, and Matei Zaharia

Addressing Hidden Stratification: Fine-Grained Robustness in Coarse-Grained Classification Problems

Jul 1, 2020 · Nimit Sohoni, Jared Dunnmon, Geoffrey Angus, Albert Gu, and Chris Ré

Ivy: Instrumental Variable Synthesis for Causal Inference

Apr 14, 2020 · Charles Kuang, Frederic Sala, Nimit Sohoni, James Priest, and Christopher Ré

Weak Supervision for Science and Medicine: A Year in Review

Mar 2, 2020 · Jared Dunnmon and Chris Ré. Referencing work by other members of Hazy Research.

When Multi-Task Learning Works -- And When It Doesn’t

Mar 1, 2020 · Sen Wu, Hongyang Zhang and Chris Ré.

Software 2.0 and Data Programming: Lessons Learned, and What’s Next

Feb 28, 2020 · Dan Fu, Laurel Orr, and students of HazyResearch

Towards Interactive Weak Supervision with FlyingSquid

Feb 28, 2020 · Dan Fu, Mayee Chen, Fred Sala, Sarah Hooper, Kayvon Fatahalian, and Chris Ré

Automating the Art of Data Augmentation, Part IV: New Direction

Feb 26, 2020 · Karan Goel, Albert Gu, Sharon Li and Chris Ré

Automating the Art of Data Augmentation, Part III: Theory

Feb 26, 2020 · Edited by Hongyang Zhang, Sharon Li and Chris Ré. Referencing work by many other members of Hazy Research.

Automating the Art of Data Augmentation, Part II: Practical Methods

Feb 26, 2020 · Sharon Li and Chris Ré.

Automating the Art of Data Augmentation, Part I: Overview

Feb 26, 2020 · Series edited by Sharon Li and Chris Ré. Referencing work by many other members of Hazy Research.

Into the Wild: Machine Learning In Non-Euclidean Spaces

Oct 10, 2019 · Fred Sala, Ines Chami, Adva Wolf, Albert Gu, Beliz Gunel and Chris Ré

Why Train What You Can Code? Rekall: A Compositional Approach to Video Analysis

Oct 9, 2019 · Dan Fu, Chris Ré, Kayvon Fatahalian

Powerful Abstractions for Programming Your Training Data

Jun 15, 2019 · Sen Wu, Vincent S. Chen, Braden Hancock, Alex Ratner, Chris Ré, and other members of Hazy Lab

Butterflies Are All You Need: A Universal Building Block for Structured Linear Maps

Jun 13, 2019 · Tri Dao, Albert Gu, Matthew Eichhorn, Megan Leszczynski, Nimit Sohoni, Amit Blonder, Atri Rudra, and Chris Ré

Learning Dependency Structures in Weak Supervision

Jun 12, 2019 · Fred Sala, Paroma Varma, Chris Ré

Massive Multi-Task Learning with Snorkel MeTaL: Bringing More Supervision to Bear

Mar 22, 2019 · Braden Hancock, Clara McCreery, Ines Chami, Vincent S. Chen, Sen Wu, Jared Dunnmon, Paroma Varma, Max Lam, and Chris Ré

Debugging Machine Learning - Reflections from DAWN Retreat

Sep 27, 2018 · Paroma Varma, Chris Ré, and other members of DAWN

Fonduer: Knowledge Base Construction from Richly Formatted Data

Mar 16, 2017 · Sen Wu, Luke Hsiao, and Chris Ré