Grounded search
Search GenAIWiki
Search across models, tools, comparisons, tutorials, and glossary entries — with sources shown.
GenAIWiki
·Grounded AI answer — wiki index sources only
All matches for “local LLM”, grouped by content type.
Fireworks AI
Fireworks AI offers fast, serverless inference APIs for leading open and proprietary models with a focus on low-latency chat and batch workloads, plus deployment options for teams standardizing on a single inference surface for production assistants and eval harnesses.
Strong match
Groq
GroqCloud offers very low-latency, high-throughput LLM inference using Groq’s LPU-style hardware, with OpenAI-compatible APIs for select open and partner models aimed at interactive and batch production workloads.
Strong match
Ollama
Local model runtime for running and serving open LLMs on developer machines and private infrastructure, with simple pull/run workflows and API access.
Together AI
Inference platform for open-source and frontier model APIs with broad model catalog coverage, cost controls, and production endpoints for text and multimodal workloads.
LangChain
Application framework for orchestrating LLM workflows, tool calling, retrieval, and agents across multiple providers in Python and TypeScript ecosystems.
LlamaIndex
Data framework for LLM applications focused on ingestion pipelines, indexing, retrieval, and query orchestration over private and enterprise content sources.
Experiment Design for A/B LLM - Advanced
In-depth guide for designing A/B tests specifically for large language models.
Strong match
Technical Workshop Lesson Plan
An organized lesson plan template for conducting technical workshops on LLMs and their applications.
Strong match
Dataset Card Draft for LLM Training (Advanced)
An advanced template for creating detailed dataset cards focusing on comprehensive metadata for LLM training datasets.
Dataset Card Draft for LLM Training
Specific guidelines for creating dataset cards for LLM training datasets.
Experiment Design for A/B LLM
A structured approach to designing experiments for A/B testing in language models.
Dataset Card Draft
Observability: Traces for LLM + Tool Spans
Implementing observability practices to trace interactions between large language models (LLMs) and external tools. Prerequisites include knowledge of observability tools and LLM architectures.
Strong match
Metadata Filters and ACL-Aware Retrieval in Legal Document Management
This tutorial outlines the implementation of metadata filters and Access Control List (ACL)-aware retrieval systems in legal document management applications. Prerequisites include knowledge of legal data structures and basic programming skills.
Strong match
Enhancing Observability with Traces for LLM and Tool Spans in Data Pipelines
This tutorial focuses on enhancing observability in data pipelines that utilize large language models (LLMs) by implementing tracing for both LLM and tool spans. Prerequisites include familiarity with observability concepts and experience with LLMs.
Multimodal Prompts for Document QA in Legal Settings
Using multimodal prompts can improve document question answering (QA) in legal contexts. Prerequisites include access to relevant legal documents and a model capable of processing multimodal inputs.
Llama 3.1 405B Instruct
Meta’s largest open-weights instruct checkpoint in the Llama 3.1 family, aimed at strong reasoning and coding quality with a permissive license for research and customization. It is typically served on dedicated GPU clusters or via partners (cloud inference, on-prem) rather than a single vendor API.
Strong match
Gemini 1.5 Pro
Google DeepMind Gemini 1.5 Pro targets long-context multimodal workloads—large effective context for retrieval-heavy document pipelines, plus image, audio, and video inputs on supported surfaces. It is often paired with Vertex AI or the Gemini API for enterprise workloads on GCP.
Strong match
Claude 3.5 Sonnet
Anthropic’s balanced Sonnet-tier model tuned for long-context reasoning, careful instruction following, and strong performance on coding and analysis workloads. It is a common enterprise choice on the Anthropic API and on AWS Bedrock when teams need large context for RAG and document review.
DeepSeek-V3
DeepSeek-V3 is a large-scale language model family noted for strong coding and math performance under open or research-friendly terms (verify the exact license for your deployment). Teams adopt it for cost-sensitive research, self-hosted inference, or comparison against frontier APIs.
distributed-learning
A machine learning paradigm where the training data is distributed across multiple devices or nodes.
Strong match
quantum-machine-learning
An interdisciplinary approach merging quantum computing with machine learning techniques.
Strong match
few-shot-learning
A machine learning paradigm that trains models with very few labeled examples.
A standardized template for documenting dataset characteristics, usage, and limitations for LLM training.
Mistral Large 2
Mistral’s frontier-class multilingual model emphasizing JSON adherence, agent-friendly behavior, and competitive reasoning within the Mistral API ecosystem. European teams often evaluate it for GDPR-adjacent deployment patterns alongside US-hosted alternatives.