Meta
MetaVision-2024
multimodal · Release Aug 12, 2024 · commercial
A multimodal model that accepts text and image inputs and produces text and image outputs, supporting image-text alignment, visual question answering, and content generation.
Modalities
What goes in and what comes out.
Inputs
text, image
Outputs
text, image
Capabilities
image-text alignment, visual question answering, content generation
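The modality and capability listings above can be made concrete with a small sketch. This is a hypothetical illustration, not a real MetaVision-2024 API: the names `Request` and `validate_request` are invented here to show how the card's supported input/output sets (text and image, both ways) constrain a request such as a visual-question-answering call.

```python
from dataclasses import dataclass

# Supported modalities as listed on this card (assumption: the card's
# "Inputs"/"Outputs" lists are exhaustive).
SUPPORTED_INPUTS = {"text", "image"}
SUPPORTED_OUTPUTS = {"text", "image"}

@dataclass
class Request:
    """Hypothetical request shape: which modalities go in and come out."""
    inputs: set
    outputs: set

def validate_request(req: Request) -> bool:
    """Return True if every requested modality appears on the card."""
    return req.inputs <= SUPPORTED_INPUTS and req.outputs <= SUPPORTED_OUTPUTS

# A VQA-style request: image plus a text question in, text answer out.
vqa = Request(inputs={"image", "text"}, outputs={"text"})
print(validate_request(vqa))  # True

# Audio is not listed on the card, so this request is rejected.
bad = Request(inputs={"audio"}, outputs={"text"})
print(validate_request(bad))  # False
```

A check like this is useful at the edge of a serving layer, where an unsupported modality should fail fast rather than reach the model.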
Benchmarks snapshot
Structured JSON for reproducible comparisons.
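The snapshot for this model is currently empty. As a hedged sketch only, a populated entry might take a shape like the following; the benchmark names are illustrative examples of common vision-language evaluations, the field names are assumptions about the wiki's schema, and the `null` values are placeholders, not reported results:

```json
{
  "VQAv2": { "metric": "accuracy", "value": null, "split": "test-dev" },
  "COCO-Captions": { "metric": "CIDEr", "value": null, "split": "test" }
}
```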
{}
Related on GenAIWiki
Same provider, tooling that cites the model, or prompts tuned for it.
Meta
Llama 3 70B
Llama 3 70B has 70 billion parameters and an 8k-token context window, optimized for high-performance text generation and understanding across diverse tasks.
Meta
Llama 3 8B
Llama 3 8B is a compact model with 8 billion parameters and an 8k-token context window, designed for efficient text generation and understanding.
Meta
Llama 3.1 405B Instruct
Large open-weights instruct model, competitive on reasoning and coding benchmarks, with a permissive license that allows customization.