Synthesizer Quickstart#
Welcome to the Synthesizer quickstart guide! Synthesizer, or ΨΦ, combines Retrieval-Augmented Generation (RAG) with large language models (LLMs) from providers such as OpenAI, Anthropic, HuggingFace, and vLLM.
This guide will introduce you to:
- Using the RAG provider interface.
- Evaluating your RAG pipeline.
Let’s get started!
Setting Up Your Environment#
Before you start, ensure you’ve installed Synthesizer:
pip install sciphi-synthesizer
For additional details, refer to the installation guide.
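Before running the commands in this guide, it can help to confirm that the package imports and that an API key is present. The following is a minimal sketch, not part of Synthesizer itself: the helper name `check_environment` is hypothetical, and it assumes only what this guide shows, namely that the package imports as `synthesizer` and that the key is read from the `SCIPHI_API_KEY` environment variable.

```python
import importlib.util
import os


def check_environment(module_name="synthesizer", key_name="SCIPHI_API_KEY"):
    """Report whether a module is importable and an environment variable is set.

    Hypothetical helper for illustration; it does not call any Synthesizer API.
    """
    return {
        # find_spec returns None when a top-level module cannot be located.
        "package_installed": importlib.util.find_spec(module_name) is not None,
        # An empty string counts as "not set".
        "api_key_set": bool(os.environ.get(key_name)),
    }


if __name__ == "__main__":
    print(check_environment())
```

If either value is False, install the package or export the key before continuing.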
Using Synthesizer#
- Generate synthetic question answer pairs
export SCIPHI_API_KEY=MY_SCIPHI_API_KEY
python -m synthesizer.scripts.data_augmenter run --dataset="wiki_qa"
tail augmented_output/config_name_eq_answer_question__dataset_name_eq_wiki_qa.jsonl
{ "formatted_prompt": "... ### Question:\nwhat country did wine originate in\n\n### Input:\n1. URL: https://en.wikipedia.org/wiki/History%20of%20wine (Score: 0.85)\nTitle:History of wine....",
{ "completion": Wine originated in the South Caucasus, which is now part of modern-day Armenia ...
- Evaluate RAG pipeline performance
export SCIPHI_API_KEY=MY_SCIPHI_API_KEY
python -m synthesizer.scripts.rag_harness --rag_provider="agent-search" --llm_provider_name="sciphi" --n_samples=25
...
INFO:main:Now generating completions...
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 100/100 [00:29<00:00, 3.40it/s]
INFO:main:Final Accuracy=0.42
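The augmented output shown above is a `.jsonl` file, so it can be inspected from Python as well as with `tail`. This is a minimal sketch that assumes each line of the file is one complete JSON object and that records carry the `formatted_prompt` and `completion` keys seen in the abbreviated sample; the helper name `load_jsonl` is hypothetical.

```python
import json
from pathlib import Path


def load_jsonl(path):
    """Parse a JSON Lines file into a list, one object per non-empty line."""
    records = []
    with Path(path).open(encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:  # skip blank lines
                records.append(json.loads(line))
    return records


if __name__ == "__main__":
    # Path taken from the tail command in this guide.
    output = Path(
        "augmented_output/config_name_eq_answer_question__dataset_name_eq_wiki_qa.jsonl"
    )
    if output.exists():
        for record in load_jsonl(output)[:3]:
            # .get avoids a KeyError if the assumed keys are absent.
            print(record.get("formatted_prompt"))
            print(record.get("completion"))
```

Using `.get` keeps the loop from failing if the real records are shaped differently from the sample.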
Note
This is a basic introduction to Synthesizer. Check back later for more detailed documentation covering advanced features and customization options.
Developing with Synthesizer#
Here’s how you can use Synthesizer to quickly set up and run RAG-augmented generation without diving deep into intricate configurations:
# Requires a valid SCIPHI_API_KEY in env ...
# query, raw_prompt, and the rag_* / llm_* settings used below must be defined by the caller.

# Imports
from synthesizer.core import LLMProviderName, RAGProviderName
from synthesizer.interface import (
    LLMInterfaceManager,
    RAGInterfaceManager,
)
from synthesizer.llm import GenerationConfig

# RAG Provider Settings
rag_interface = RAGInterfaceManager.get_interface_from_args(
    RAGProviderName("agent-search"),
    limit_hierarchical_url_results=rag_limit_hierarchical_url_results,
    limit_final_pagerank_results=rag_limit_final_pagerank_results,
)
rag_context = rag_interface.get_rag_context(query)

# LLM Provider Settings
llm_interface = LLMInterfaceManager.get_interface_from_args(
    LLMProviderName("openai"),
)

generation_config = GenerationConfig(
    model_name=llm_model_name,
    max_tokens_to_sample=llm_max_tokens_to_sample,
    temperature=llm_temperature,
    top_p=llm_top_p,
    # other generation params here ...
)

formatted_prompt = raw_prompt.format(rag_context=rag_context)
completion = llm_interface.get_completion(
    formatted_prompt, generation_config
)
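The flow above has three steps: retrieve context, fill the prompt template, request a completion. To try that flow without an API key, the sketch below reproduces it with stand-in functions. Everything named here is an assumption for illustration: `RAW_PROMPT` is a hypothetical template loosely modeled on the sample output earlier in this guide (the real `raw_prompt` is not shown), and `fake_retrieve` and `fake_complete` merely stand in for `rag_interface.get_rag_context` and `llm_interface.get_completion`.

```python
from typing import Callable

# Hypothetical template: the guide does not show the real raw_prompt.
RAW_PROMPT = (
    "### Question:\n"
    "what country did wine originate in\n\n"
    "### Input:\n"
    "{rag_context}\n\n"
    "### Response:\n"
)


def answer_with_rag(
    query: str,
    raw_prompt: str,
    retrieve: Callable[[str], str],
    complete: Callable[[str], str],
) -> str:
    """Retrieve context, fill the prompt template, then request a completion."""
    rag_context = retrieve(query)
    formatted_prompt = raw_prompt.format(rag_context=rag_context)
    return complete(formatted_prompt)


# Stand-in for rag_interface.get_rag_context (no network access).
def fake_retrieve(query: str) -> str:
    return f"1. Title: placeholder result for '{query}'"


# Stand-in for llm_interface.get_completion (no model call).
def fake_complete(prompt: str) -> str:
    return f"[stub completion for a {len(prompt)}-character prompt]"


if __name__ == "__main__":
    print(
        answer_with_rag(
            "what country did wine originate in",
            RAW_PROMPT,
            fake_retrieve,
            fake_complete,
        )
    )
```

Passing the retriever and the completer as callables keeps the prompt-assembly logic testable offline. Note that `str.format` treats braces in the template as placeholders, so any literal braces in a template need to be doubled (`{{` and `}}`).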