Vector search
Search GenAIWiki
Query the full knowledge graph. Results are ranked by semantic similarity across all six libraries.
Search results for “local LLM”
Prompts (12)
Experiment Design for A/B LLM - Advanced
In-depth guide for designing A/B tests specifically for large language models.
Technical Workshop Lesson Plan
An organized lesson plan template for conducting technical workshops on LLMs and their applications.
Dataset Card Draft for LLM Training (Advanced)
An advanced template for creating detailed dataset cards focusing on comprehensive metadata for LLM training datasets.
Dataset Card Draft for LLM Training
Specific guidelines for creating dataset cards for LLM training datasets.
Experiment Design for A/B LLM
A structured approach to designing experiments for A/B testing in language models.
Dataset Card Draft
A standardized template for documenting dataset characteristics, usage, and limitations for LLM training.
A/B Testing Experiment Design
A structured template to design A/B tests for LLM applications, ensuring consistency in experiment setup.
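The statistics behind such an A/B template can be sketched as a two-proportion z-test on graded success rates of two prompt variants. This is a minimal illustration under assumed conditions (independent binary gradings, large samples), not the template itself; the function name and all counts are hypothetical.

```python
import math

def ab_z_test(successes_a, n_a, successes_b, n_b):
    """Two-proportion z-test comparing the success rates of two variants.

    Returns the z statistic; |z| > 1.96 suggests a significant difference
    at the 5% level (two-sided). Assumes independent pass/fail gradings.
    """
    p_a = successes_a / n_a
    p_b = successes_b / n_b
    # Pooled success rate under the null hypothesis of equal rates
    p_pool = (successes_a + successes_b) / (n_a + n_b)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    return (p_a - p_b) / se

# Illustrative numbers: variant A graded correct 180/200, variant B 150/200
z = ab_z_test(180, 200, 150, 200)
```

With these made-up counts the test crosses the 1.96 threshold, so the variants' success rates would be judged significantly different.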
ML Interview Evaluation Framework
A structured rubric for evaluating candidates in machine learning roles.
Localization Glossary Review (Duplicate)
A duplicate of the previous glossary review entry, focusing on terminology consistency.
HR Policy Q&A Framework with Citations
A framework for generating HR policy-related questions and answers with references to legal statutes or company guidelines.
ML Role Interview Rubric
A structured rubric designed for evaluating candidates in machine learning roles.
Contract Clause Extraction
Extract and summarize key clauses from legal contracts for easier review.
Tools (7)
Ollama
Local model runtime for running and serving open LLMs on developer machines and private infrastructure, with simple pull/run workflows and API access.
LangChain
Application framework for orchestrating LLM workflows, tool calling, retrieval, and agents across multiple providers in Python and TypeScript ecosystems.
LlamaIndex
Data framework for LLM applications focused on ingestion pipelines, indexing, retrieval, and query orchestration over private and enterprise content sources.
OpenAI Playground
Browser-based console for experimenting with OpenAI's widely used frontier models for text, vision, and audio, backed by strong developer tooling and broad ecosystem adoption across production AI applications.
LangGraph
LangGraph is a library for building stateful, cyclic agent and workflow graphs on top of LangChain, suited to multi-step tools, human-in-the-loop approvals, and durable execution patterns that go beyond linear chains.
Groq
GroqCloud offers very low-latency, high-throughput LLM inference on Groq's LPU hardware, with OpenAI-compatible APIs for select open and partner models aimed at interactive and batch production workloads.
Hugging Face
Hub for open models, datasets, and Spaces demos, plus Inference Endpoints, Transformers, and enterprise features for teams that train, fine-tune, or serve open-weight and partner models at scale.
Tutorials (9)
Observability: Traces for LLM + Tool Spans
Implementing observability practices to trace interactions between large language models (LLMs) and external tools. Prerequisites include knowledge of observability tools and LLM architectures.
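As a minimal sketch of the span idea (not the tutorial's actual code), a toy tracer can record nested spans so a tool call is attributed to the LLM call that triggered it; production setups would typically use OpenTelemetry instead. All class and span names here are hypothetical.

```python
import time
from contextlib import contextmanager

class Tracer:
    """Toy tracer: records nested spans for LLM calls and tool calls."""

    def __init__(self):
        self.spans = []   # completed spans, appended on exit
        self._stack = []  # currently open spans, innermost last

    @contextmanager
    def span(self, name, kind):
        record = {
            "name": name,
            "kind": kind,  # e.g. "llm" or "tool"
            "parent": self._stack[-1]["name"] if self._stack else None,
            "start": time.monotonic(),
        }
        self._stack.append(record)
        try:
            yield record
        finally:
            record["end"] = time.monotonic()
            self._stack.pop()
            self.spans.append(record)

tracer = Tracer()
with tracer.span("answer_question", "llm"):
    with tracer.span("search_web", "tool"):
        pass  # the real tool call would happen here
```

Because spans are appended on exit, the inner tool span lands in `tracer.spans` first, with its `parent` field linking it back to the enclosing LLM span.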
Metadata Filters and ACL-Aware Retrieval in Legal Document Management
This tutorial outlines the implementation of metadata filters and Access Control List (ACL)-aware retrieval systems in legal document management applications. Prerequisites include knowledge of legal data structures and basic programming skills.
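The core check can be sketched as a pre-retrieval filter: drop any document whose ACL shares no group with the user, then apply the metadata filters. A hypothetical, minimal version (the field names `allowed_groups` and `meta` are assumptions, not the tutorial's schema):

```python
def acl_filter(docs, user_groups, filters):
    """Keep only documents the user may read and that match all metadata filters."""
    visible = []
    for doc in docs:
        # ACL check: the user must hold at least one group on the doc's ACL
        if not set(doc["allowed_groups"]) & set(user_groups):
            continue
        # Metadata check: every requested key must match exactly
        if all(doc["meta"].get(k) == v for k, v in filters.items()):
            visible.append(doc)
    return visible

docs = [
    {"id": 1, "meta": {"type": "contract"}, "allowed_groups": ["legal"]},
    {"id": 2, "meta": {"type": "contract"}, "allowed_groups": ["hr"]},
    {"id": 3, "meta": {"type": "memo"}, "allowed_groups": ["legal"]},
]
visible = acl_filter(docs, user_groups=["legal"], filters={"type": "contract"})
```

Filtering before (or inside) the vector search, rather than after ranking, keeps unauthorized text out of the candidate set entirely.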
Enhancing Observability with Traces for LLM and Tool Spans in Data Pipelines
This tutorial focuses on enhancing observability in data pipelines that utilize large language models (LLMs) by implementing tracing for both LLM and tool spans. Prerequisites include familiarity with observability concepts and experience with LLMs.
Multimodal Prompts for Document QA in Legal Settings
Using multimodal prompts can improve document question answering (QA) in legal contexts. Prerequisites include access to relevant legal documents and a model capable of processing multimodal inputs.
Chunking Strategies for Legal PDFs: Improving Document Retrieval
This tutorial focuses on optimizing chunking strategies for legal documents to enhance retrieval accuracy. Prerequisites include familiarity with document processing and retrieval systems.
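One common baseline such a tutorial might start from is a fixed-size window with overlap, so clause boundaries are not cut cleanly in half. A minimal sketch under that assumption (word-based windows; real pipelines usually chunk by tokens or by document structure):

```python
def chunk_words(text, size=120, overlap=20):
    """Split text into overlapping word-window chunks.

    Overlap duplicates a little text between neighbors so a clause split
    at a boundary still appears whole in at least one chunk.
    """
    assert size > overlap, "window must be larger than the overlap"
    words = text.split()
    step = size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # last window already covers the tail
    return chunks

sample = " ".join(f"w{i}" for i in range(300))
chunks = chunk_words(sample, size=120, overlap=20)
```

Here 300 words with a 120-word window and 20-word overlap yield three chunks, each sharing its first 20 words with the end of the previous one.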
SLI/SLO for Generative Endpoints
Establishing Service Level Indicators (SLIs) and Service Level Objectives (SLOs) for generative endpoints is crucial for maintaining quality and reliability. This tutorial outlines how to define and implement SLIs/SLOs effectively.
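The arithmetic behind an SLO is small enough to sketch directly: the SLO target implies an error budget for the window, and burn is measured against it. A hypothetical example (the function name and numbers are illustrative, not from the tutorial):

```python
def error_budget_remaining(slo_target, total_requests, bad_requests):
    """Fraction of the window's error budget still unspent.

    slo_target: e.g. 0.99 means at most 1% of requests may violate the
    SLI (fail outright, or exceed the latency threshold).
    """
    budget = (1 - slo_target) * total_requests  # allowed bad requests
    return (budget - bad_requests) / budget

# SLI: generation completes under the latency threshold; SLO: 99% of requests
remaining = error_budget_remaining(0.99, total_requests=100_000, bad_requests=250)
```

With a 99% target over 100k requests the budget is 1,000 bad requests; 250 spent leaves 75% of the budget, which a team might alert on if it is burning faster than the window elapses.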
Canary Prompts for Regression Detection
Utilizing canary prompts to detect regressions in language models. Prerequisites include familiarity with regression testing and LLM evaluation metrics.
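The mechanism can be sketched as a fixed prompt set with pinned expected answers, re-run after every model or prompt change. A minimal hypothetical version, with a stub standing in for the real LLM call (all names and prompts below are assumptions for the demo):

```python
def check_canaries(model, canaries):
    """Run fixed canary prompts and flag answers that drift from the pinned ones.

    `model` is any callable prompt -> answer; `canaries` maps prompt -> expected.
    Exact match after normalization is the crudest check; real suites often
    use graded or semantic comparisons instead.
    """
    failures = []
    for prompt, expected in canaries.items():
        answer = model(prompt)
        if answer.strip().lower() != expected.strip().lower():
            failures.append((prompt, expected, answer))
    return failures

def stub_model(prompt):
    # Stub standing in for a real LLM call; deliberately wrong on one canary
    return {"capital of France?": "Paris", "2 + 2 = ?": "5"}[prompt]

failures = check_canaries(stub_model, {"capital of France?": "Paris",
                                       "2 + 2 = ?": "4"})
```

A non-empty `failures` list after a deployment is the regression signal: the model's behavior on a previously pinned case has drifted.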
Chunking Strategies for Legal/Medical PDFs
Learn effective chunking strategies for processing legal and medical PDFs to enhance information retrieval. Prerequisites include familiarity with PDF processing and natural language processing concepts.
Ensuring PII Handling in RAG Pipelines for Legal Firms
This tutorial focuses on best practices for handling Personally Identifiable Information (PII) in RAG pipelines within legal firms. It requires knowledge of legal compliance and data protection standards.
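One building block such a pipeline needs is redaction before text is indexed or placed in a prompt. A deliberately minimal sketch with two hypothetical regex patterns; production systems would use vetted PII detectors rather than hand-rolled expressions:

```python
import re

# Hypothetical minimal patterns; far from exhaustive for real compliance work.
PII_PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}

def redact(text):
    """Replace detected PII with typed placeholders before indexing/prompting."""
    for label, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text

clean = redact("Contact jane.doe@example.com, SSN 123-45-6789.")
```

Typed placeholders (rather than blanks) keep the redacted text readable for retrieval and let downstream audits count what was removed per category.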
Models (5)
LLaMA 3 70B
LLaMA 3 70B features 70 billion parameters and a context window of 8k tokens, optimized for high-performance text generation and understanding across diverse tasks.
LLaMA 3 8B
LLaMA 3 8B is a compact model with 8 billion parameters, designed for efficient text generation and understanding with a context window of 8k tokens.
Mistral Large
Mistral Large supports a context window of up to 32k tokens, targeting enterprise-level applications and complex document understanding.
Claude 3 Opus
Claude 3 Opus enhances AI's conversational abilities with a broader understanding of context and intent, featuring a context window of 200k tokens for improved engagement in dialogues.
Gemini Flash
Gemini Flash focuses on fast, cost-efficient inference with a large context window, ideal for applications requiring quick responses while maintaining decent accuracy in language tasks.
Glossary (3)
distributed-learning
A machine learning paradigm where the training data is distributed across multiple devices or nodes.
quantum-machine-learning
An interdisciplinary approach merging quantum computing with machine learning techniques.
few-shot-learning
A machine learning paradigm that trains models with very few labeled examples.