Vector search
Search GenAIWiki
Query the full knowledge graph. Results are ranked by semantic similarity across all six libraries.
Search results for “local LLM”
Prompts (12)
Experiment Design for A/B LLM - Advanced
In-depth guide for designing A/B tests specifically for large language models.
Technical Workshop Lesson Plan
An organized lesson plan template for conducting technical workshops on LLMs and their applications.
Dataset Card Draft for LLM Training (Advanced)
An advanced template for creating detailed dataset cards focusing on comprehensive metadata for LLM training datasets.
Dataset Card Draft for LLM Training
Specific guidelines for creating dataset cards for LLM training datasets.
Experiment Design for A/B LLM
A structured approach to designing experiments for A/B testing in language models.
Dataset Card Draft
A standardized template for documenting dataset characteristics, usage, and limitations for LLM training.
A/B Testing Experiment Design
A structured template to design A/B tests for LLM applications, ensuring consistency in experiment setup.
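The statistics behind such an A/B template can be sketched as a two-proportion z-test on graded success rates of two prompt variants. This is a minimal illustration under assumed conditions (independent binary gradings, large samples), not the template itself; the function name and all counts are hypothetical.

```python
import math

def ab_z_test(successes_a, n_a, successes_b, n_b):
    """Two-proportion z-test comparing the success rates of two variants.

    Returns the z statistic; |z| > 1.96 suggests a significant difference
    at the 5% level (two-sided). Assumes independent pass/fail gradings.
    """
    p_a = successes_a / n_a
    p_b = successes_b / n_b
    # Pooled success rate under the null hypothesis of equal rates
    p_pool = (successes_a + successes_b) / (n_a + n_b)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    return (p_a - p_b) / se

# Illustrative numbers: variant A graded correct 180/200, variant B 150/200
z = ab_z_test(180, 200, 150, 200)
```

With these made-up counts the test crosses the 1.96 threshold, so the variants' success rates would be judged significantly different.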
ML Interview Evaluation Framework
A structured rubric for evaluating candidates in machine learning roles.
Localization Glossary Review (Duplicate)
A duplicate of the previous glossary review entry, focusing on terminology consistency.
HR Policy Q&A Framework with Citations
A framework for generating HR policy-related questions and answers with references to legal statutes or company guidelines.
ML Role Interview Rubric
A structured rubric designed for evaluating candidates in machine learning roles.
Contract Clause Extraction
Extract and summarize key clauses from legal contracts for easier review.
Tools (7)
Ollama
Local model runtime for running and serving open LLMs on developer machines and private infrastructure, with simple pull/run workflows and API access.
LangChain
Application framework for orchestrating LLM workflows, tool calling, retrieval, and agents across multiple providers in Python and TypeScript ecosystems.
LlamaIndex
Data framework for LLM applications focused on ingestion pipelines, indexing, retrieval, and query orchestration over private and enterprise content sources.
OpenAI Playground
Browser-based console for experimenting with OpenAI's widely used frontier models for text, vision, and audio, backed by strong developer tooling and broad ecosystem adoption across production AI applications.
LangGraph
LangGraph is a library for building stateful, cyclic agent and workflow graphs on top of LangChain, suited to multi-step tools, human-in-the-loop approvals, and durable execution patterns that go beyond linear chains.
Groq
GroqCloud offers very low-latency, high-throughput LLM inference on Groq's LPU hardware, with OpenAI-compatible APIs for select open and partner models aimed at interactive and batch production workloads.
Hugging Face
Hub for open models, datasets, and Spaces demos, plus Inference Endpoints, Transformers, and enterprise features for teams that train, fine-tune, or serve open-weight and partner models at scale.
Tutorials (9)
Observability: Traces for LLM + Tool Spans
Implementing observability practices to trace interactions between large language models (LLMs) and external tools. Prerequisites include knowledge of observability tools and LLM architectures.
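As a minimal sketch of the span idea (not the tutorial's actual code), a toy tracer can record nested spans so a tool call is attributed to the LLM call that triggered it; production setups would typically use OpenTelemetry instead. All class and span names here are hypothetical.

```python
import time
from contextlib import contextmanager

class Tracer:
    """Toy tracer: records nested spans for LLM calls and tool calls."""

    def __init__(self):
        self.spans = []   # completed spans, appended on exit
        self._stack = []  # currently open spans, innermost last

    @contextmanager
    def span(self, name, kind):
        record = {
            "name": name,
            "kind": kind,  # e.g. "llm" or "tool"
            "parent": self._stack[-1]["name"] if self._stack else None,
            "start": time.monotonic(),
        }
        self._stack.append(record)
        try:
            yield record
        finally:
            record["end"] = time.monotonic()
            self._stack.pop()
            self.spans.append(record)

tracer = Tracer()
with tracer.span("answer_question", "llm"):
    with tracer.span("search_web", "tool"):
        pass  # the real tool call would happen here
```

Because spans are appended on exit, the inner tool span lands in `tracer.spans` first, with its `parent` field linking it back to the enclosing LLM span.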
Metadata Filters and ACL-Aware Retrieval in Legal Document Management
This tutorial outlines the implementation of metadata filters and Access Control List (ACL)-aware retrieval systems in legal document management applications. Prerequisites include knowledge of legal data structures and basic programming skills.
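The core check can be sketched as a pre-retrieval filter: drop any document whose ACL shares no group with the user, then apply the metadata filters. A hypothetical, minimal version (the field names `allowed_groups` and `meta` are assumptions, not the tutorial's schema):

```python
def acl_filter(docs, user_groups, filters):
    """Keep only documents the user may read and that match all metadata filters."""
    visible = []
    for doc in docs:
        # ACL check: the user must hold at least one group on the doc's ACL
        if not set(doc["allowed_groups"]) & set(user_groups):
            continue
        # Metadata check: every requested key must match exactly
        if all(doc["meta"].get(k) == v for k, v in filters.items()):
            visible.append(doc)
    return visible

docs = [
    {"id": 1, "meta": {"type": "contract"}, "allowed_groups": ["legal"]},
    {"id": 2, "meta": {"type": "contract"}, "allowed_groups": ["hr"]},
    {"id": 3, "meta": {"type": "memo"}, "allowed_groups": ["legal"]},
]
visible = acl_filter(docs, user_groups=["legal"], filters={"type": "contract"})
```

Filtering before (or inside) the vector search, rather than after ranking, keeps unauthorized text out of the candidate set entirely.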
Enhancing Observability with Traces for LLM and Tool Spans in Data Pipelines
This tutorial focuses on enhancing observability in data pipelines that utilize large language models (LLMs) by implementing tracing for both LLM and tool spans. Prerequisites include familiarity with observability concepts and experience with LLMs.
Multimodal Prompts for Document QA in Legal Settings
Using multimodal prompts can improve document question answering (QA) in legal contexts. Prerequisites include access to relevant legal documents and a model capable of processing multimodal inputs.
Chunking Strategies for Legal PDFs: Improving Document Retrieval
This tutorial focuses on optimizing chunking strategies for legal documents to enhance retrieval accuracy. Prerequisites include familiarity with document processing and retrieval systems.
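One common baseline such a tutorial might start from is a fixed-size window with overlap, so clause boundaries are not cut cleanly in half. A minimal sketch under that assumption (word-based windows; real pipelines usually chunk by tokens or by document structure):

```python
def chunk_words(text, size=120, overlap=20):
    """Split text into overlapping word-window chunks.

    Overlap duplicates a little text between neighbors so a clause split
    at a boundary still appears whole in at least one chunk.
    """
    assert size > overlap, "window must be larger than the overlap"
    words = text.split()
    step = size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # last window already covers the tail
    return chunks

sample = " ".join(f"w{i}" for i in range(300))
chunks = chunk_words(sample, size=120, overlap=20)
```

Here 300 words with a 120-word window and 20-word overlap yield three chunks, each sharing its first 20 words with the end of the previous one.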
SLI/SLO for Generative Endpoints
Establishing Service Level Indicators (SLIs) and Service Level Objectives (SLOs) for generative endpoints is crucial for maintaining quality and reliability. This tutorial outlines how to define and implement SLIs/SLOs effectively.
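The arithmetic behind an SLO is small enough to sketch directly: the SLO target implies an error budget for the window, and burn is measured against it. A hypothetical example (the function name and numbers are illustrative, not from the tutorial):

```python
def error_budget_remaining(slo_target, total_requests, bad_requests):
    """Fraction of the window's error budget still unspent.

    slo_target: e.g. 0.99 means at most 1% of requests may violate the
    SLI (fail outright, or exceed the latency threshold).
    """
    budget = (1 - slo_target) * total_requests  # allowed bad requests
    return (budget - bad_requests) / budget

# SLI: generation completes under the latency threshold; SLO: 99% of requests
remaining = error_budget_remaining(0.99, total_requests=100_000, bad_requests=250)
```

With a 99% target over 100k requests the budget is 1,000 bad requests; 250 spent leaves 75% of the budget, which a team might alert on if it is burning faster than the window elapses.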
Canary Prompts for Regression Detection
Utilizing canary prompts to detect regressions in language models. Prerequisites include familiarity with regression testing and LLM evaluation metrics.
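The mechanism can be sketched as a fixed prompt set with pinned expected answers, re-run after every model or prompt change. A minimal hypothetical version, with a stub standing in for the real LLM call (all names and prompts below are assumptions for the demo):

```python
def check_canaries(model, canaries):
    """Run fixed canary prompts and flag answers that drift from the pinned ones.

    `model` is any callable prompt -> answer; `canaries` maps prompt -> expected.
    Exact match after normalization is the crudest check; real suites often
    use graded or semantic comparisons instead.
    """
    failures = []
    for prompt, expected in canaries.items():
        answer = model(prompt)
        if answer.strip().lower() != expected.strip().lower():
            failures.append((prompt, expected, answer))
    return failures

def stub_model(prompt):
    # Stub standing in for a real LLM call; deliberately wrong on one canary
    return {"capital of France?": "Paris", "2 + 2 = ?": "5"}[prompt]

failures = check_canaries(stub_model, {"capital of France?": "Paris",
                                       "2 + 2 = ?": "4"})
```

A non-empty `failures` list after a deployment is the regression signal: the model's behavior on a previously pinned case has drifted.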
Chunking Strategies for Legal/Medical PDFs
Learn effective chunking strategies for processing legal and medical PDFs to enhance information retrieval. Prerequisites include familiarity with PDF processing and natural language processing concepts.
Ensuring PII Handling in RAG Pipelines for Legal Firms
This tutorial focuses on best practices for handling Personally Identifiable Information (PII) in RAG pipelines within legal firms. It requires knowledge of legal compliance and data protection standards.
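One building block such a pipeline needs is redaction before text is indexed or placed in a prompt. A deliberately minimal sketch with two hypothetical regex patterns; production systems would use vetted PII detectors rather than hand-rolled expressions:

```python
import re

# Hypothetical minimal patterns; far from exhaustive for real compliance work.
PII_PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}

def redact(text):
    """Replace detected PII with typed placeholders before indexing/prompting."""
    for label, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text

clean = redact("Contact jane.doe@example.com, SSN 123-45-6789.")
```

Typed placeholders (rather than blanks) keep the redacted text readable for retrieval and let downstream audits count what was removed per category.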
Models (5)
LLaMA 3 70B
LLaMA 3 70B features 70 billion parameters and a context window of 8k tokens, optimized for high-performance text generation and understanding across diverse tasks.
LLaMA 3 8B
LLaMA 3 8B is a compact model with 8 billion parameters, designed for efficient text generation and understanding with a context window of 8k tokens.
Mistral Large
Mistral Large supports a context window of up to 32k tokens, targeting enterprise-level applications and complex document understanding.
Claude 3 Opus
Claude 3 Opus enhances AI's conversational abilities with a broader understanding of context and intent, featuring a context window of 200k tokens for improved engagement in dialogues.
Gemini Flash
Gemini Flash focuses on fast, cost-efficient inference with a large context window, ideal for applications requiring quick responses while maintaining decent accuracy in language tasks.
Glossary (3)
distributed-learning
A machine learning paradigm where the training data is distributed across multiple devices or nodes.
quantum-machine-learning
An interdisciplinary approach merging quantum computing with machine learning techniques.
few-shot-learning
A machine learning paradigm that trains models with very few labeled examples.