Optimization
Prompt caching
Prompt caching reuses a previously processed prompt prefix so repeated requests with the same context can be faster and cheaper.
Expanded definition
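Prompt caching is useful when many requests share the same long system prompt, tool definitions, examples, documents, or conversation prefix. The provider caches the processed prefix, and later calls that begin with the same prefix resume from that cached state rather than reprocessing the entire repeated context. It is especially valuable for agent workflows with stable tool definitions, long reference documents, or repeated task instructions. Because a cache hit requires the start of the prompt to match what was cached, stable content should come before content that varies between requests. Teams still need to understand cache lifetime, invalidation, prefix ordering, the provider's privacy policy, and first-request latency, since the first request does the full processing and gets no speedup.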
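The prefix-matching and lifetime behavior described here can be illustrated with a small in-memory model. This is a minimal sketch, not any provider's implementation: the class `ToyPrefixCache`, its `lookup` method, the 300-second lifetime, and the choice to restart the lifetime on every lookup are all assumptions made for illustration. Real providers differ in how they match prefixes, how long entries live, and what invalidates them.

```python
import hashlib
import time


class ToyPrefixCache:
    """Hypothetical in-memory stand-in for a provider-side prompt cache."""

    def __init__(self, ttl_seconds=300.0, clock=time.monotonic):
        self.ttl_seconds = ttl_seconds  # assumed lifetime, not a real provider value
        self.clock = clock
        self._expiry = {}  # prefix key -> time at which the entry expires

    @staticmethod
    def _key(prefix_blocks):
        """Hash the ordered blocks; any change in content or order changes the key."""
        digest = hashlib.sha256()
        for block in prefix_blocks:
            data = block.encode("utf-8")
            # Length-prefix each block so block boundaries are part of the key.
            digest.update(len(data).to_bytes(8, "big"))
            digest.update(data)
        return digest.hexdigest()

    def lookup(self, prefix_blocks):
        """Return "hit" if this exact prefix is cached and unexpired, else "miss".

        Assumption: every lookup (re)starts the entry's lifetime.
        """
        key = self._key(prefix_blocks)
        now = self.clock()
        is_hit = key in self._expiry and self._expiry[key] > now
        self._expiry[key] = now + self.ttl_seconds
        return "hit" if is_hit else "miss"


# Stable content goes first; the per-request question is not part of the prefix.
stable_prefix = [
    "SYSTEM: You are a support agent.",
    "TOOLS: search_orders, issue_refund",
    "DOC: refund policy text",
]

cache = ToyPrefixCache()
first = cache.lookup(stable_prefix)   # "miss": the first request populates the cache
second = cache.lookup(stable_prefix)  # "hit": identical prefix, still within lifetime
reordered = cache.lookup(
    [stable_prefix[1], stable_prefix[0], stable_prefix[2]]
)                                     # "miss": same blocks, different order
```

In this toy model, reordering two blocks is enough to lose the cache hit, which is why prefix ordering and invalidation appear in the list of things teams need to understand.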