Hyena — NVIDIA NeMo Framework User Guide

Introduction to Hyena and Evo 2

Introduction

The Hyena architecture is a significant advance in neural network design, realized as a family of convolutional multi-hybrid architectures. As described in the Hyena paper, these architectures deliver substantial efficiency gains through co-designed convolution operators and hardware-aware algorithms, enabling faster training and inference than traditional Transformers. At the 40-billion-parameter scale, Hyena-based models train 1.2 to 2.9 times faster than optimized Transformers, and the StripedHyena 2 architecture achieves a two-fold throughput improvement over linear attention and state-space models on H100 GPUs.

Evo 2 is a powerful transformer-hyena hybrid architecture designed for biological sequence modeling. Trained on 9.3 trillion DNA base pairs spanning all domains of life, Evo 2 features an unprecedented 1 million token context window with single-nucleotide resolution. Available in 1B, 7B, and 40B parameter versions, it can accurately predict functional impacts of genetic variation without task-specific fine-tuning, autonomously learning biological features including exon-intron boundaries, transcription factor binding sites, and protein structural elements. The model also enables controllable generation of genomic sequences and epigenomic structure through inference-time search.

Hyena-Based Models

Available Models

The Hyena architecture is utilized in various models, with Evo 2 being a prominent example. Evo 2 is available in 1B, 7B, and 40B parameter configurations, each with a corresponding NeMo recipe (hyena_1b, hyena_7b, and hyena_40b).

Training Recipes

We provide pre-defined recipes for pre-training and fine-tuning Hyena-based models using NeMo 2.0 and NeMo-Run. Each recipe configures a run.Partial for one of the nemo.collections.llm API functions introduced in NeMo 2.0. The recipes are hosted in the recipes folder (for example, hyena_1b.py).

Pre-Training:

from nemo.collections import llm

# For 1B model
pretrain_1b = llm.hyena_1b.pretrain_recipe(
    name="hyena_1b_pretraining",
    dir="/path/to/checkpoints",
    num_nodes=1,
    num_gpus_per_node=8,
    tensor_parallel_size=1,
    global_batch_size=8,
    micro_batch_size=1,
    vocab_file="/path/to/vocab.json",
)

# For 7B model
pretrain_7b = llm.hyena_7b.pretrain_recipe(
    name="hyena_7b_pretraining",
    dir="/path/to/checkpoints",
    num_nodes=1,
    num_gpus_per_node=8,
    tensor_parallelism=8,
    vocab_file="/path/to/vocab.json",
)

# For 40B model
pretrain_40b = llm.hyena_40b.pretrain_recipe(
    name="hyena_40b_pretraining",
    dir="/path/to/checkpoints",
    num_nodes=1,
    num_gpus_per_node=8,
    tensor_parallelism=8,
    vocab_file="/path/to/vocab.json",
)

# Configure and assign your dataloader
dataloader = a_function_that_configures_your_custom_dataset(
    gbs=8,  # Adjust as needed for your model
    mbs=1,  # Adjust as needed for your model
    seq_length=pretrain_1b.model.config.seq_length,  # Use the appropriate model
)
pretrain_1b.data = dataloader  # Assign to whichever model you're using

Fine-Tuning:

from nemo.collections import llm

# For 1B model
finetune_1b = llm.hyena_1b.finetune_recipe(
    resume_path="/path/to/nemo/checkpoint",
    name="hyena_1b_finetuning",
    dir="/path/to/checkpoints",
    num_nodes=1,
    num_gpus_per_node=8,
    tensor_parallel_size=1,
    global_batch_size=8,
    micro_batch_size=1,
    vocab_file="/path/to/vocab.json",
)

# For 7B model
finetune_7b = llm.hyena_7b.finetune_recipe(
    resume_path="/path/to/nemo/checkpoint",
    name="hyena_7b_finetuning",
    dir="/path/to/checkpoints",
    num_nodes=1,
    num_gpus_per_node=8,
    tensor_parallelism=8,
    vocab_file="/path/to/vocab.json",
)

# For 40B model
finetune_40b = llm.hyena_40b.finetune_recipe(
    resume_path="/path/to/nemo/checkpoint",
    name="hyena_40b_finetuning",
    dir="/path/to/checkpoints",
    num_nodes=1,
    num_gpus_per_node=8,
    tensor_parallelism=8,
    vocab_file="/path/to/vocab.json",
)

# Configure and assign your dataloader
dataloader = a_function_that_configures_your_custom_dataset(
    gbs=8,  # Adjust as needed for your model
    mbs=1,  # Adjust as needed for your model
    seq_length=finetune_1b.model.config.seq_length,  # Use the appropriate model
)
finetune_1b.data = dataloader  # Assign to whichever model you're using

Note

For pre-training and fine-tuning, the recipes use placeholder datamodules for the data argument. You are expected to replace these with your custom dataset.
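For example, the placeholder dataloader function used above could be implemented along the following lines. This is a minimal sketch only: it assumes your corpus has already been preprocessed into Megatron-style indexed datasets, and the llm.PreTrainingDataModule arguments shown are illustrative and should be checked against your installed NeMo version (Evo 2 workflows in BioNeMo use their own genomic data modules).

import nemo_run as run
from nemo.collections import llm

def a_function_that_configures_your_custom_dataset(gbs, mbs, seq_length):
    # Hypothetical example: wrap a preprocessed, Megatron-style indexed
    # dataset in NeMo 2.0's PreTrainingDataModule. Adjust the paths,
    # split ratios, and tokenizer settings for your own data.
    return run.Config(
        llm.PreTrainingDataModule,
        paths=["/path/to/preprocessed/dataset_text_document"],
        seq_length=seq_length,
        global_batch_size=gbs,
        micro_batch_size=mbs,
        split="98,1,1",
    )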

Note

The configuration in the recipes is done using the NeMo-Run run.Config and run.Partial configuration objects. Please review the NeMo-Run documentation to learn more about its configuration and execution system.
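Because each recipe is a run.Partial, individual fields can be overridden before execution. The attribute paths below follow the common NeMo 2.0 recipe layout (trainer and optimizer sub-configs) and are meant as a hedged illustration; confirm the exact attribute names against the recipe you are using.

# Illustrative overrides on a recipe's run.Partial (attribute names
# assume the standard NeMo 2.0 recipe layout; verify for your version).
pretrain_1b.trainer.max_steps = 100000
pretrain_1b.trainer.val_check_interval = 1000
pretrain_1b.optim.config.lr = 3e-4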

Running the Training:

Once you have your final configuration ready, you can execute it on any of the NeMo-Run supported executors:

import nemo_run as run

# For pre-training - choose the appropriate model
run.run(pretrain_1b, executor=run.LocalExecutor())   # For 1B model
# or
run.run(pretrain_7b, executor=run.LocalExecutor())   # For 7B model
# or
run.run(pretrain_40b, executor=run.LocalExecutor())  # For 40B model

# For fine-tuning - choose the appropriate model
run.run(finetune_1b, executor=run.LocalExecutor())   # For 1B model
# or
run.run(finetune_7b, executor=run.LocalExecutor())   # For 7B model
# or
run.run(finetune_40b, executor=run.LocalExecutor())  # For 40B model

Alternatively, you can run it directly in the same Python process:

# Choose the appropriate model
run.run(pretrain_1b, direct=True)  # For 1B pre-training
# or
run.run(finetune_7b, direct=True)  # For 7B fine-tuning
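Beyond run.LocalExecutor and direct execution, NeMo-Run also supports cluster executors for multi-node jobs. The sketch below uses NeMo-Run's Slurm executor with placeholder account, partition, host, and container values; consult the NeMo-Run documentation for the full and current set of executor options.

import nemo_run as run

# Hedged sketch of a Slurm launch (all values are placeholders).
executor = run.SlurmExecutor(
    account="your_account",
    partition="your_partition",
    nodes=1,
    ntasks_per_node=8,
    time="04:00:00",
    container_image="nvcr.io/nvidia/nemo:latest",
    tunnel=run.SSHTunnel(host="login-node", user="username", job_dir="/path/to/remote/jobs"),
)

run.run(pretrain_7b, executor=executor)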

BioNeMo Integration with Evo 2

NVIDIA’s BioNeMo Framework provides specialized support for Evo 2 models in genomics and biological applications. BioNeMo adapts the Hyena architecture specifically for biological sequence modeling tasks.

The BioNeMo Evo 2 documentation provides comprehensive details on working with Evo 2 within the framework.

For users interested in applying Evo 2 to their own biological data, BioNeMo provides a fine-tuning tutorial that walks through the workflow end to end.

The BioNeMo implementation achieves comparable or better accuracy than the original models, with the BioNeMo Evo 2 7B model reaching an AUROC of 0.87 on BRCA1 variant effect prediction tasks.