Vector search
Search GenAIWiki
Query the full knowledge graph. Results are ranked by semantic similarity across all six libraries.
Search results for “retrieval augmented generation RAG patterns”
Tutorials (16 results)
Graph RAG for Entity-Heavy Domains
Explore Graph Retrieval-Augmented Generation (RAG) for domains with complex entities. Prerequisites include knowledge of graph databases and RAG techniques.
Graph RAG for Entity-Heavy Domains: A Practical Guide
This tutorial delves into Graph RAG (Retrieval-Augmented Generation) techniques for entity-rich domains such as the legal and healthcare sectors. Prerequisites include an understanding of RAG and graph database concepts.
Golden-Set Design for RAG Faithfulness
Understand how to design a golden set for evaluating the faithfulness of Retrieval-Augmented Generation (RAG) models. Prerequisites include familiarity with RAG systems and evaluation metrics.
Optimizing Golden-Set Design for RAG in Healthcare Applications
This tutorial covers the design of golden sets for ensuring RAG (Retrieval-Augmented Generation) faithfulness in healthcare applications. It requires an understanding of RAG principles and access to domain-specific datasets.
Golden-Set Design for RAG Faithfulness in Healthcare Applications
This tutorial focuses on designing golden sets for retrieval-augmented generation (RAG) systems in healthcare, ensuring the generated responses are faithful and reliable. Prerequisites include understanding RAG systems and familiarity with healthcare data.
Golden-Set Design for RAG Faithfulness in Financial Services
This tutorial discusses the design of golden sets to ensure the faithfulness of retrieval-augmented generation (RAG) systems in financial services. Prerequisites include experience with RAG systems and access to financial datasets.
Implementing Cost Controls in RAG: Batching vs Streaming Tokens for E-commerce
This tutorial provides a comprehensive guide on implementing cost controls in retrieval-augmented generation (RAG) systems, focusing on the balance between batching and streaming tokens in e-commerce applications. It covers the implications of each approach on performance and cost. Prerequisites include familiarity with RAG systems and token management.
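The batching-versus-streaming trade-off the tutorial above describes can be sketched with a toy cost model. All prices and overheads below are made-up placeholders for illustration, not real provider rates; real cost controls would plug in actual per-token pricing and measured request overhead.

```python
# Illustrative only: a toy cost model contrasting batched and streamed
# completions. Prices and overheads are hypothetical placeholders.

def batched_cost(requests: int, tokens_per_request: int,
                 price_per_1k: float = 0.01, batch_size: int = 10) -> float:
    """Batching amortizes per-call overhead across grouped requests."""
    calls = -(-requests // batch_size)  # ceiling division
    overhead_per_call = 0.002           # hypothetical fixed cost per API call
    token_cost = requests * tokens_per_request / 1000 * price_per_1k
    return calls * overhead_per_call + token_cost

def streamed_cost(requests: int, tokens_per_request: int,
                  price_per_1k: float = 0.01) -> float:
    """Streaming pays the per-call overhead on every single request."""
    overhead_per_call = 0.002
    token_cost = requests * tokens_per_request / 1000 * price_per_1k
    return requests * overhead_per_call + token_cost

print(batched_cost(100, 500))   # fewer calls, so less total overhead
print(streamed_cost(100, 500))  # same token spend, more per-call overhead
```

Under this model the token cost is identical in both modes; batching only wins on amortized per-call overhead, which is why it suits bulk offline jobs while streaming suits interactive latency.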
Comparing Structured Outputs and JSON Mode for RAG in E-commerce
This tutorial examines the trade-offs between structured outputs and JSON mode in RAG systems tailored for e-commerce applications. It requires a basic understanding of RAG and JSON data formats.
Implementing Cost Controls in RAG: Batching vs Streaming Tokens in Financial Services
This tutorial explores the cost implications of batching versus streaming token usage in RAG systems for financial services. It requires familiarity with RAG tokenization and financial data processing.
Structured Outputs vs JSON Mode Tradeoffs in Financial Services
This tutorial explores the trade-offs between structured outputs and JSON mode in retrieval-augmented generation (RAG) systems specifically for financial services applications. It highlights how structured outputs can improve data integrity and ease of processing but may limit flexibility compared to JSON mode. Prerequisites include a basic understanding of RAG systems and their applications in finance.
Ensuring PII Handling in RAG Pipelines for Legal Firms
This tutorial focuses on best practices for handling Personally Identifiable Information (PII) in RAG pipelines within legal firms. It requires knowledge of legal compliance and data protection standards.
Ensuring PII Handling in RAG Pipelines for Healthcare Applications
This tutorial outlines best practices for handling Personally Identifiable Information (PII) in retrieval-augmented generation (RAG) pipelines within healthcare settings. It emphasizes the importance of compliance and security measures. Prerequisites include knowledge of healthcare data regulations and RAG systems.
Evaluating Tool-Calling Reliability Under Load in IT Support
This tutorial provides a framework for assessing the reliability of tool-calling in RAG systems under high load conditions, specifically for IT support applications. It requires knowledge of system performance metrics and load testing methodologies.
Agent Memory: Scratchpad vs Vector Store
This tutorial compares scratchpad memory and vector store memory in AI agents, focusing on their use cases and performance characteristics. Prerequisites include a basic understanding of AI memory architectures.
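The scratchpad/vector-store distinction above can be shown in miniature. The class and method names here are invented for illustration (they come from no particular framework), and the bag-of-words "embedding" is a stand-in for a real embedding model.

```python
# Toy contrast between two agent memory styles. All names are hypothetical.
import math
from collections import Counter

class Scratchpad:
    """Append-only working memory: cheap writes, linear full replay."""
    def __init__(self):
        self.notes = []
    def write(self, note: str):
        self.notes.append(note)
    def read_all(self) -> str:
        return "\n".join(self.notes)

def _embed(text: str) -> Counter:
    # Toy bag-of-words "embedding"; real systems use a learned model.
    return Counter(text.lower().split())

def _cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

class VectorMemory:
    """Similarity-addressed memory: retrieve only the relevant notes."""
    def __init__(self):
        self.items = []
    def write(self, note: str):
        self.items.append((note, _embed(note)))
    def recall(self, query: str, k: int = 1):
        qv = _embed(query)
        ranked = sorted(self.items,
                        key=lambda item: _cosine(qv, item[1]), reverse=True)
        return [note for note, _ in ranked[:k]]

pad = Scratchpad()
mem = VectorMemory()
for note in ["user prefers JSON output", "invoice 42 is overdue"]:
    pad.write(note)
    mem.write(note)

print(pad.read_all())                     # replays everything, in order
print(mem.recall("what output format?"))  # returns only the relevant note
```

The scratchpad keeps the full ordered trace (good for short reasoning chains), while the vector store trades ordering for selective recall as the history grows past the context window.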
Pgvector Index Tuning (HNSW vs IVF)
Learn how to tune pgvector indexes using HNSW and IVF algorithms for optimal performance. Prerequisites include familiarity with PostgreSQL and vector databases.
Metadata Filters and ACL-aware Retrieval
Explore how to implement metadata filters and Access Control List (ACL)-aware retrieval in your applications. Prerequisites include knowledge of metadata management and ACL concepts.
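The ACL-aware pattern above boils down to filtering candidates by the caller's permissions *before* ranking, so unauthorized documents never surface. A minimal sketch, with an invented document schema and group names, and simple term overlap standing in for vector similarity:

```python
# Minimal ACL-aware retrieval sketch. Schema and group names are invented.
DOCS = [
    {"id": 1, "text": "Q3 revenue forecast", "allowed_groups": {"finance"}},
    {"id": 2, "text": "Public API changelog", "allowed_groups": {"everyone"}},
    {"id": 3, "text": "Incident postmortem",  "allowed_groups": {"sre", "eng"}},
]

def acl_filtered_search(query: str, user_groups: set) -> list:
    # Stage 1: hard ACL filter -- never rank what the user cannot see.
    visible = [d for d in DOCS
               if d["allowed_groups"] & (user_groups | {"everyone"})]
    # Stage 2: toy relevance score (term overlap); a real system would run
    # vector similarity with the same pre-filter applied as metadata.
    qt = set(query.lower().split())
    return sorted(visible,
                  key=lambda d: len(qt & set(d["text"].lower().split())),
                  reverse=True)

results = acl_filtered_search("revenue forecast", {"eng"})
print([d["id"] for d in results])  # the finance-only doc (id 1) never appears
```

Applying the filter pre-ranking, rather than post-ranking, also keeps top-k semantics correct: you never waste result slots on documents that would be stripped out later.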
Tools (10 results)
Chroma
Chroma is an open-source embedding database designed for storing and searching embeddings efficiently, aimed at low-latency retrieval in lightweight and embedded RAG workloads.
Supabase Vector
Postgres-based platform with pgvector support, managed database operations, and integrated auth/storage features for building retrieval-enabled full-stack applications.
Qdrant
Vector database focused on high-performance similarity search with strong payload filtering, hybrid retrieval features, and both open-source and managed cloud options.
FAISS
FAISS (Facebook AI Similarity Search) is a library for efficient similarity search and clustering of dense vectors. It scales nearest-neighbor search to millions of vectors through both exact and approximate index types.
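What FAISS accelerates can be shown in miniature: exact nearest-neighbor search by brute-force L2 distance. FAISS's index structures (flat, IVF, HNSW) replace this linear scan with optimized and approximate alternatives; the sketch below is a plain-Python illustration, not FAISS's API.

```python
# Brute-force exact nearest-neighbor search: the baseline FAISS optimizes.
import math

def l2(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def knn(database, query, k=2):
    """Return (index, distance) pairs for the k closest vectors."""
    dists = sorted(enumerate(l2(v, query) for v in database),
                   key=lambda t: t[1])
    return dists[:k]

vectors = [[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]]
print(knn(vectors, [0.9, 1.1]))  # nearest vector first
```

The brute-force scan is O(n·d) per query; approximate indexes such as IVF (cluster-then-probe) and HNSW (graph traversal) trade a little recall for large speedups at scale.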
Redis Vector
Redis Vector Search extends Redis with vector similarity queries alongside familiar key, JSON, and search capabilities. It is useful when you already run Redis for caching or feature serving and want co-located embeddings with low-latency hybrid retrieval, without adding a separate database cluster.
Pinecone
Managed vector database for semantic search and RAG systems with metadata filtering, namespaces, and cloud-hosted reliability for production retrieval workloads.
LlamaIndex
Data framework for LLM applications focused on ingestion pipelines, indexing, retrieval, and query orchestration over private and enterprise content sources.
Weaviate
Open source vector database with hybrid search, metadata filtering, and flexible deployment options across self-hosted clusters and managed cloud environments.
AutoGen
AutoGen is a Microsoft Research–driven framework for building multi-agent conversations and tool-using agents with flexible conversation patterns. It is aimed at both experimentation and production agents that coordinate LLMs, humans, and tools in complex flows.
Groq
GroqCloud offers very low-latency, high-throughput LLM inference using Groq’s LPU-style hardware, with OpenAI-compatible APIs for select open and partner models aimed at interactive and batch production workloads.
Glossary (5 results)
generative-adversarial-networks
A class of machine learning frameworks that generate new data samples via adversarial training.
generative-models
Models that can generate new data instances similar to the training data.
synthetic-data-generation
The process of creating artificial data that mimics real-world data for training machine learning models.
graph-attention-network
A neural network architecture that employs attention mechanisms to process graph-structured data.
adaptive-learning
A method where the system optimizes its learning process based on user interactions and performance.
Models (2 results)
Mixtral
Mixtral is a sparse mixture-of-experts open-weight language model with a 32,768-token context window, delivering high-quality content creation and response generation.
GPT-4 Turbo
GPT-4 Turbo is optimized for speed and cost efficiency, providing rapid text generation with a 128k-token context window. It is designed for applications requiring fast responses without sacrificing quality.
Comparisons (2 results)
Gemini 1.5 Pro vs GPT-4o
Google’s long-context Gemini 1.5 Pro versus OpenAI’s GPT-4o: choose between a very large context window (Gemini) and a ubiquitous API with a broad tool ecosystem (GPT-4o) for RAG and assistants.
Chroma vs Milvus
Chroma optimizes developer ergonomics for embedded and lightweight RAG; Milvus targets large-scale distributed vector search. Choose based on corpus size, team ops skills, and whether you need a cluster-scale engine from day one.