mims-harvard/TxAgent-T1-Llama-3.1-8B · Hugging Face

TxAgent-T1 Model

Model Information

TxAgent-T1-Llama-3.1-8B is instruction-tuned on the TxAgent-Instruct dataset, a diverse synthetic training dataset of multi-step reasoning traces and large-scale function calls anchored in biomedical knowledge. The model is fine-tuned from Llama-3.1-8B-Instruct.

Introduction

Precision therapeutics require multimodal adaptive models that generate personalized treatment recommendations. We introduce TxAgent, an AI agent that leverages multi-step reasoning and real-time biomedical knowledge retrieval across a toolbox of 211 tools to analyze drug interactions, contraindications, and patient-specific treatment strategies.

TxAgent evaluates how drugs interact at molecular, pharmacokinetic, and clinical levels, identifies contraindications based on patient comorbidities and concurrent medications, and tailors treatment strategies to individual patient characteristics, including age, genetic factors, and disease progression.
TxAgent retrieves and synthesizes evidence from multiple biomedical sources, assesses interactions between drugs and patient conditions, and refines treatment recommendations through iterative reasoning. It selects tools based on task objectives and executes structured function calls to solve therapeutic tasks that require clinical reasoning and cross-source validation.
The ToolUniverse consolidates 211 tools from trusted sources, including all US FDA-approved drugs since 1939 and validated clinical insights from Open Targets.
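
The idea of "executing structured function calls" can be illustrated with a minimal sketch. Everything here is hypothetical: the registry, the `lookup_interactions` tool, the JSON call layout, and the toy data are assumptions made for illustration and are not the ToolUniverse or TxAgent API.

```python
import json
from typing import Any, Callable, Dict

# Hypothetical registry mapping tool names to Python callables.
# This is NOT the ToolUniverse API, only an illustration of the pattern.
TOOLS: Dict[str, Callable[..., Any]] = {}


def register(name: str):
    """Decorator that adds a function to the hypothetical tool registry."""
    def decorator(fn: Callable[..., Any]) -> Callable[..., Any]:
        TOOLS[name] = fn
        return fn
    return decorator


@register("lookup_interactions")
def lookup_interactions(drug_name: str) -> dict:
    # Placeholder table for illustration only; not real pharmacology.
    toy_table = {"drug_a": ["drug_b"]}
    return {"drug": drug_name, "interacts_with": toy_table.get(drug_name, [])}


def execute_call(call_json: str) -> dict:
    """Parse a JSON function call and dispatch it to the matching tool."""
    call = json.loads(call_json)
    name = call["name"]
    args = call.get("arguments", {})
    if name not in TOOLS:
        return {"error": f"unknown tool: {name}"}
    return {"result": TOOLS[name](**args)}


call = json.dumps({"name": "lookup_interactions",
                   "arguments": {"drug_name": "drug_a"}})
print(execute_call(call))
```

In an agent loop, the model would emit a call like the JSON string above, the runtime would execute it, and the returned result would be fed back into the next reasoning step.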

TxAgent outperforms leading LLMs, tool-use models, and reasoning agents across five new benchmarks: DrugPC, BrandPC, GenericPC, TreatmentPC, and DescriptionPC, covering 3,168 drug reasoning tasks and 456 personalized treatment scenarios.

It achieves 92.1% accuracy in open-ended drug reasoning tasks, surpassing GPT-4o by up to 25.8% and outperforming DeepSeek-R1 (671B) in structured multi-step reasoning.
TxAgent generalizes across drug name variants and descriptions, maintaining a variance of < 0.01 between brand, generic, and description-based drug references, exceeding existing tool-use LLMs by over 55%.
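
As a rough sketch of what a variance across reference types measures, the snippet below computes the population variance of three accuracy scores. The numbers are placeholders chosen for illustration, not reported results, and the card does not state whether population or sample variance (or which accuracy scale) was used, so both choices are assumptions.

```python
from statistics import pvariance
from typing import Dict

# Placeholder accuracies on a 0-1 scale. Illustrative values only,
# NOT results reported for TxAgent.
accuracy_by_reference: Dict[str, float] = {
    "brand": 0.90,
    "generic": 0.92,
    "description": 0.91,
}


def reference_variance(scores: Dict[str, float]) -> float:
    """Population variance of accuracy across drug-reference types."""
    return pvariance(list(scores.values()))


variance = reference_variance(accuracy_by_reference)
print(f"variance across reference types: {variance:.6f}")
```

A small variance indicates that accuracy barely changes when the same drug is referred to by brand name, generic name, or description.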

By integrating multi-step inference, real-time knowledge grounding, and tool-assisted decision-making, TxAgent ensures that treatment recommendations align with established clinical guidelines and real-world evidence, reducing the risk of adverse events and improving therapeutic decision-making.

Model Training and Data

To generate the TxAgent-Instruct dataset, we construct three datasets (a tooling dataset, a comprehensive therapeutic question dataset, and a reasoning trace dataset) using auxiliary agent systems. The tooling dataset consists of augmented versions of the 211 tools from ToolUniverse, where each tool's description is randomly rephrased to increase variability. This enables TxAgent to learn how to use new tools rather than simply memorizing those in ToolUniverse.

The comprehensive therapeutic question dataset includes 85,340 therapeutic questions and functional instructions designed to train TxAgent's abilities; these are generated by the QuestionGen agent system. The reasoning trace dataset comprises 85,340 detailed reasoning traces for answering therapeutic questions. Together, these traces encompass 177,626 reasoning steps and 281,695 function calls, all generated by the TraceGen agent system. By processing the data from these three datasets, we construct the TxAgent-Instruct dataset, which comprises 378,027 instruction-tuning samples.

The agent systems generate training data grounded in biomedical knowledge and therapeutic scenarios by randomly sampling drug and disease information from verified sources. Drug information is sourced from FDA drug labeling documents, while the disease list is derived from PrimeKG. Associations among drugs, diseases, phenotypes, and targets are compiled from Open Targets. To avoid any leakage of evaluation data into the training set, we exclude all drugs approved after 2023 from the training set and use drugs approved in 2024 as the source of evaluation data.
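
The approval-year split described above can be sketched as follows. This is a minimal illustration, assuming each drug record carries an approval year; the `Drug` class, the function name, and the example records are hypothetical and not taken from the TxAgent codebase.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Drug:
    # Hypothetical record type; real drug labels carry many more fields.
    name: str
    approval_year: int


def split_by_approval_year(
    drugs: List[Drug], train_cutoff: int = 2023, eval_year: int = 2024
) -> Tuple[List[Drug], List[Drug]]:
    """Keep drugs approved up to the cutoff for training and
    drugs approved in the evaluation year for evaluation."""
    train = [d for d in drugs if d.approval_year <= train_cutoff]
    evaluation = [d for d in drugs if d.approval_year == eval_year]
    return train, evaluation


# Made-up records for illustration only.
drugs = [Drug("drug_a", 1998), Drug("drug_b", 2023), Drug("drug_c", 2024)]
train, evaluation = split_by_approval_year(drugs)
print([d.name for d in train], [d.name for d in evaluation])
```

Because the two sets are separated by approval year, no drug can appear in both, which is what prevents evaluation drugs from leaking into training.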

Please refer to our project page for more details: Project Page.

How to use

Install ToolUniverse:

# Install from source code:
git clone https://github.com/mims-harvard/ToolUniverse.git
cd ToolUniverse
python -m pip install . --no-cache-dir

# Install from PyPI:
pip install tooluniverse

Install TxAgent:

# Install from source code:
git clone https://github.com/mims-harvard/TxAgent.git
cd TxAgent
python -m pip install . --no-cache-dir

# Install from PyPI:
pip install txagent
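
After installing, it can be useful to confirm that both packages are visible to the current Python environment. The sketch below uses only the standard library and the distribution names from the `pip install` commands above; the helper name `installed_version` is an assumption introduced for illustration.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


# Distribution names taken from the pip commands above.
for dist in ("tooluniverse", "txagent"):
    print(dist, installed_version(dist) or "not installed")
```

If either line prints "not installed", the package was likely installed into a different environment than the one running the script.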

Run the example with run_example.py.

Run the Gradio demo with run_txagent_app.py.

Citation

@misc{gao2025txagent,
      title={TxAgent: An AI Agent for Therapeutic Reasoning Across a Universe of Tools}, 
      author={Shanghua Gao and Richard Zhu and Zhenglun Kong and Ayush Noori and Xiaorui Su and Curtis Ginder and Theodoros Tsiligkaridis and Marinka Zitnik},
      year={2025},
      eprint={2503.10970},
      archivePrefix={arXiv},
      primaryClass={cs.AI},
      url={https://arxiv.org/abs/2503.10970}, 
}

Contact

If you have any questions or suggestions, please email Shanghua Gao and Marinka Zitnik.