Vector search
Search GenAIWiki
Query the full knowledge graph. Results are ranked by semantic similarity across all six libraries.
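Ranking by semantic similarity usually means cosine similarity between a query embedding and each document embedding. A minimal pure-Python sketch of that ranking step (the toy 3-d vectors and titles below are illustrative, not real GenAIWiki embeddings):

```python
import math

def cosine(a, b):
    # cosine similarity: dot(a, b) / (|a| * |b|)
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def rank(query_vec, docs):
    # docs: list of (title, embedding); most similar first
    return sorted(docs, key=lambda d: cosine(query_vec, d[1]), reverse=True)

# toy embeddings standing in for a real embedding model
docs = [
    ("Cold-Start Embeddings", [0.9, 0.1, 0.0]),
    ("Hybrid Search",         [0.2, 0.8, 0.1]),
]
query = [1.0, 0.0, 0.0]
print(rank(query, docs)[0][0])  # → Cold-Start Embeddings
```

In production the embeddings come from a model and the sort is replaced by an approximate nearest-neighbor index, but the scoring function is the same.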
Search results for “local embeddings”
Tutorials (12)
Cold-Start Embeddings for New Tenants
Learn how to implement cold-start embeddings to improve the onboarding experience for new tenants in multi-tenant applications. Prerequisites include basic understanding of embeddings and tenant management.
Best match
Cold-Start Embeddings for New Tenants in SaaS Applications
This tutorial covers strategies for implementing cold-start embeddings for new tenants in SaaS applications, focusing on leveraging existing data and models to generate initial embeddings. Prerequisites include familiarity with machine learning concepts and access to a dataset for training.
Best match
Embedding Drift Monitoring in Production
Learn how to implement embedding drift monitoring in production systems to ensure model reliability. Prerequisites include familiarity with machine learning models and data pipelines.
Embedding Drift Monitoring in Production for E-commerce
This tutorial covers how to implement embedding drift monitoring in production systems specifically for e-commerce applications. It focuses on detecting shifts in user behavior and product interactions that can affect recommendation systems. Prerequisites include familiarity with machine learning models and data pipelines.
Embedding Drift Monitoring in Production for Healthcare Applications
This tutorial covers the implementation of embedding drift monitoring in production systems for healthcare applications, ensuring model accuracy over time. Prerequisites include knowledge of machine learning models and monitoring techniques.
Embedding Drift Monitoring in Financial Services
Monitoring embedding drift is crucial for financial services to ensure model accuracy over time. Prerequisites include a data pipeline that captures embeddings and a monitoring framework.
Hybrid Search: BM25 + Dense Re-Ranking
This tutorial explores the integration of BM25 and dense re-ranking techniques to enhance search accuracy. Prerequisites include familiarity with information retrieval concepts and basic machine learning.
Embedding Drift Monitoring in Production for Financial Services
This tutorial focuses on techniques for monitoring embedding drift in production environments specifically tailored for financial services. Prerequisites include understanding of machine learning embeddings and production systems.
Cross-Encoder Re-Rankers at Scale for E-commerce Personalization
This tutorial covers the implementation of cross-encoder re-rankers to improve product recommendations in e-commerce platforms. Prerequisites include familiarity with machine learning concepts and access to a dataset of product interactions.
Cross-Encoder Re-Rankers at Scale
Understand how to implement cross-encoder re-rankers for large-scale information retrieval systems. Prerequisites include knowledge of ranking algorithms and machine learning.
Cross-Encoder Re-Rankers at Scale for Content Recommendation
This tutorial focuses on implementing cross-encoder re-rankers for large-scale content recommendation systems, emphasizing their performance and scalability. Prerequisites include experience with machine learning and recommendation systems.
Hybrid Search: BM25 + Dense Re-Ranking for Academic Research
This tutorial explores the integration of BM25 and dense re-ranking for enhancing academic search engines. Familiarity with information retrieval concepts is required.
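The hybrid-search tutorials above combine a lexical first stage with a dense second stage. A minimal sketch of that two-stage pattern, with a toy term-count scorer standing in for BM25 and cosine similarity as the dense score (field names and the `alpha` blend are assumptions, not the tutorials' actual code):

```python
import math

def keyword_score(query_terms, doc_terms):
    # toy first-stage lexical score standing in for BM25:
    # how often the query terms appear in the document
    return sum(doc_terms.count(t) for t in query_terms)

def dense_score(q_vec, d_vec):
    # cosine similarity between query and document embeddings
    dot = sum(x * y for x, y in zip(q_vec, d_vec))
    nq = math.sqrt(sum(x * x for x in q_vec))
    nd = math.sqrt(sum(x * x for x in d_vec))
    return dot / (nq * nd)

def hybrid_rank(query_terms, q_vec, docs, alpha=0.5, k=10):
    # stage 1: lexical shortlist of the top-k candidates
    shortlist = sorted(
        docs, key=lambda d: keyword_score(query_terms, d["terms"]), reverse=True
    )[:k]
    # stage 2: blend lexical and dense scores for the final order
    def blended(d):
        return (alpha * keyword_score(query_terms, d["terms"])
                + (1 - alpha) * dense_score(q_vec, d["vec"]))
    return sorted(shortlist, key=blended, reverse=True)

docs = [
    {"title": "drift guide",  "terms": ["embedding", "drift"], "vec": [1.0, 0.0]},
    {"title": "bm25 primer",  "terms": ["bm25", "ranking"],    "vec": [0.0, 1.0]},
]
top = hybrid_rank(["embedding"], [1.0, 0.0], docs)[0]
print(top["title"])  # → drift guide
```

Real systems would swap the toy scorer for an actual BM25 implementation and the cosine loop for a vector index, but the shortlist-then-blend structure is the same.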
Glossary (9)
graph-embedding
A technique for transforming graph-structured data into a continuous vector space while preserving its properties.
Best match
autoencoder
An autoencoder is a type of neural network used for unsupervised learning of efficient representations.
Best match
convolutional-encoder
A neural network component that applies convolutional operations to extract features from input data.
variational-autoencoder
A generative model that learns to represent data in a latent space using variational inference.
graph-convolutional-network
A type of neural network designed to process data structured as graphs.
multi-modal-learning
An approach that integrates multiple types of data modalities to improve model performance.
distributed-learning
A machine learning paradigm where the training data is distributed across multiple devices or nodes.
energy-based-model
A probabilistic model that associates a scalar energy value with each configuration of variables to model distributions.
modalities
Different forms or types of data used in machine learning, such as text, images, or audio.
Tools (12)
Redis Vector
Redis Vector Search extends Redis with vector similarity queries alongside familiar key, JSON, and search capabilities—useful when you already run Redis for caching or features and want co-located embeddings with low-latency hybrid retrieval without adding a separate database cluster.
Best match
Chroma
Chroma is an open-source embedding database designed for managing and searching embeddings efficiently. It provides robust performance with sub-100ms latency for retrieval tasks.
Best match
Hugging Face
Hub for open models, datasets, and Spaces demos, plus Inference Endpoints, Transformers, and enterprise features for teams that train, fine-tune, or serve open-weight and partner models at scale.
Weaviate
Open source vector database with hybrid search, metadata filtering, and flexible deployment options across self-hosted clusters and managed cloud environments.
Ollama
Local model runtime for running and serving open LLMs on developer machines and private infrastructure, with simple pull/run workflows and API access.
OpenAI Playground
Provider of widely used frontier model APIs for text, vision, and audio, with strong developer tooling and broad ecosystem adoption across production AI applications.
Pinecone
Managed vector database for semantic search and RAG systems with metadata filtering, namespaces, and cloud-hosted reliability for production retrieval workloads.
Vertex AI
Google Cloud Vertex AI is a managed platform for training, tuning, and serving models—including Gemini and partner models—with IAM integration, VPC-SC, and data residency options for enterprises that already standardize on Google Cloud for analytics and data lakes.
Qdrant
Vector database focused on high-performance similarity search with strong payload filtering, hybrid retrieval features, and both open-source and managed cloud options.
Fireworks AI
Fireworks AI offers fast, serverless inference APIs for leading open and proprietary models with a focus on low-latency chat and batch workloads, plus deployment options for teams standardizing on a single inference surface for production assistants and eval harnesses.
Milvus
An open-source vector database designed for high-performance similarity search and analysis of large-scale vector data. It handles millions of vectors efficiently with a query latency of under 100ms for similarity searches.
FAISS
FAISS (Facebook AI Similarity Search) is a library for efficient similarity search and clustering of dense vectors. It allows for millions of items to be searched with latency typically under 100ms for nearest neighbor searches.
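At its simplest, the nearest-neighbor search that libraries like FAISS accelerate is an exhaustive L2 scan. A pure-Python sketch of that exact search (what `faiss.IndexFlatL2` computes, only vectorized in C++; the toy 2-d vectors are illustrative):

```python
def l2_nearest(query, vectors, k=1):
    # exact k-nearest-neighbor search by squared L2 distance
    dists = [
        (sum((q - v) ** 2 for q, v in zip(query, vec)), i)
        for i, vec in enumerate(vectors)
    ]
    dists.sort()  # smallest distance first
    return [i for _, i in dists[:k]]

vectors = [[0.0, 0.0], [1.0, 1.0], [0.2, 0.1]]
print(l2_nearest([0.1, 0.1], vectors, k=2))  # → [2, 0]
```

The sub-100ms figures quoted above come from replacing this linear scan with approximate index structures (inverted files, HNSW graphs), trading a little recall for large speedups at millions of vectors.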
Models (3)
Gemini Flash
Gemini Flash focuses on fast inference with a 4k token limit, ideal for applications requiring quick responses while maintaining decent accuracy in language tasks.
Best match
LLaMA 3 8B
LLaMA 3 8B is a compact model with 8 billion parameters, designed for efficient text generation and understanding with a context window of 8k tokens.
Best match
LLaMA 3 70B
LLaMA 3 70B features 70 billion parameters and a context window of 32k tokens, optimized for high-performance text generation and understanding across diverse tasks.