Knowledge Graph
AI Glossary
137 terms across 8 clusters. Every concept mapped to the decisions it informs.
Adaptive thinking is how Claude automatically decides how much reasoning effort to apply to a given question. Simple questions get fast, direct answers. Complex or ambiguous questions trigger more deliberate internal reasoning before a response. You don't have to configure this — Claude adjusts on its own. The practical result is that you get appropriate depth without having to manually switch modes or prompt for step-by-step reasoning.
The mechanisms by which an agent retains and retrieves information across steps or sessions — short-term (in-context) and long-term (external storage).
The Agent SDK is Anthropic's framework for building autonomous AI agents that use Claude as their brain. It handles the mechanics of an agent loop — giving Claude tools, tracking what it's done, deciding what to do next — so you can focus on what the agent should accomplish. Use it when you're building something where Claude needs to complete multi-step tasks on its own, not just answer a question.
A workflow where an AI agent handles multi-step tasks autonomously — replacing or augmenting human-driven processes with sequences of AI decisions and actions.
The organizational process of integrating AI into workflows and culture — where most AI initiatives actually fail, not in the technology.
An AI system that can take sequences of actions toward a goal — using tools, making decisions, and operating with varying degrees of autonomy.
Using AI to enhance human capabilities rather than replace human roles — the framing that consistently drives higher adoption rates and better outcomes than automation-first messaging.
Systematic skew in AI outputs that unfairly advantages or disadvantages particular groups — often inherited from training data and amplified by model behavior.
A centralized team that sets AI standards, shares best practices, and coordinates deployment across an organization — effective at scale, often premature for early-stage adoption.
Ensuring AI systems meet legal, regulatory, and industry-specific requirements — increasingly complex as AI regulations expand globally.
The property of AI systems treating individuals and groups equitably — a multidimensional concept with competing mathematical definitions that cannot all be satisfied simultaneously.
Organizational policies, processes, and structures for managing AI development and use — who decides what AI can do, how decisions are reviewed, and how accountability is assigned.
The technical work of connecting AI capabilities into existing systems, databases, and workflows — where good prompts meet real data and production constraints.
The baseline understanding of AI capabilities and limitations needed for effective use — the gap between AI literacy and actual deployment is where most enterprise value is lost.
A contained test of an AI solution before full deployment — valuable for proving ROI but frequently misstructured in ways that don't predict production success.
The return on investment from AI initiatives — notoriously difficult to measure because most ROI frameworks weren't designed for probabilistic, continuously-improving systems.
The field and practice of ensuring AI systems behave as intended without causing harm — spanning technical alignment research to deployment-level guardrails.
Anthropic's framework for classifying model risk levels and corresponding safety commitments — a structured approach to responsible capability scaling.
A deliberate plan for how an organization will use AI to create competitive advantage — distinguishing tactical tool adoption from structural capability building.
The structured process of identifying where AI creates genuine business value versus where it creates complexity — the highest-leverage early activity in any AI program.
The challenge of ensuring AI systems pursue the goals humans actually want rather than proxy goals that diverge at scale — a core research problem and practical deployment concern.
The AI safety company founded in 2021 that builds Claude — notable for Constitutional AI research, mechanistic interpretability work, and publishing the AI Safety Level (ASL) framework.
Anthropic's developer platform for testing prompts, managing API keys, running evaluations, and accessing usage analytics.
Anthropic's program for businesses building on Claude — providing technical support, go-to-market resources, and early access to new capabilities.
The programmatic interface for sending requests to AI models — enabling developers to integrate AI capabilities into applications without running their own models.
Artifacts are self-contained outputs Claude creates in a separate panel alongside your conversation — websites, interactive tools, code files, documents, data visualisations. Instead of just getting text you copy-paste, you get a working thing you can preview, edit, and download. Useful for anything that's more than a message: a calculator, a form, a landing page, a spreadsheet template.
The core innovation in modern LLMs — allowing the model to weigh the relevance of every word against every other word in context, rather than reading sequentially.
Processing multiple AI requests together rather than individually — trading latency for cost efficiency, ideal for non-real-time workloads.
A standardized test used to compare model capabilities — useful for general capability comparison but often misaligned with real-world task performance.
The decision between building custom AI solutions versus using off-the-shelf products — rarely binary in practice, usually a spectrum of customization on top of foundation models.
A prompting technique that instructs the model to 'think step by step' before answering — improving accuracy on complex reasoning tasks by externalizing the reasoning process.
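The technique is just prompt construction. Here is a minimal sketch; the exact wrapper text and the `cot_prompt` helper name are illustrative, not a fixed API:

```python
def cot_prompt(question: str) -> str:
    """Wrap a question with a chain-of-thought instruction.

    The phrasing is illustrative; any cue that asks the model to
    externalize its intermediate steps works the same way.
    """
    return (
        f"{question}\n\n"
        "Think step by step. Write out your reasoning, then give the "
        "final answer on a line starting with 'Answer:'."
    )

prompt = cot_prompt("A train leaves at 3pm and arrives at 5:30pm. How long is the trip?")
```

The instruction to end with an `Answer:` line makes the final answer easy to parse out of the reasoning trace.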
The organizational work of helping people adapt to AI-changed workflows — the most consistently underestimated cost in AI deployment projects.
An executive role responsible for organizational AI strategy and governance — emerging rapidly as AI moves from IT experiment to strategic priority.
Splitting documents into segments before indexing — a seemingly simple decision with major impact on retrieval quality. Chunk too large: low precision. Too small: lost context.
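The simplest version is fixed-size chunks with overlap, sketched below. Character-based splitting is a baseline assumption here; production systems often split on sentences or tokens instead:

```python
def chunk_text(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into fixed-size character chunks with overlap.

    The overlap preserves context that would otherwise be cut at a
    chunk boundary, at the cost of some duplicated storage.
    """
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += chunk_size - overlap
    return chunks

chunks = chunk_text("x" * 1200, chunk_size=500, overlap=50)
```

Tuning `chunk_size` and `overlap` against retrieval quality on your own documents is usually worth more than any single clever default.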
Anthropic's AI assistant and API — built with Constitutional AI for safety, known for long context, strong reasoning, and instruction-following. The focus of this knowledge hub.
Claude Code is a command-line tool that brings Claude directly into your development workflow. You run it in your terminal, point it at a codebase, and Claude can read files, write code, run commands, and navigate your project — all in conversation. It's Claude as a pair programmer that can actually see and touch your work, not just read what you paste in.
Claude is available across several pricing tiers: Free (basic access to Sonnet, limited usage), Pro ($20/month — higher limits, all models, Cowork, Dispatch), Max ($100/month — very high usage limits), Team Standard ($25/seat/month — admin tools, shared Projects, 5-seat minimum), Team Premium ($125/seat/month — includes Claude Code), and Enterprise (custom pricing — SSO/SCIM, HIPAA-ready, 500K context, full compliance controls). The right plan depends on how many people need access and what level of admin control you need.
Anthropic's consumer and business web interface for Claude — distinct from the API, offering Projects, file uploads, and team collaboration features.
Computer Use is the underlying capability that lets Claude see and interact with a computer screen — clicking buttons, typing text, navigating applications, reading what is displayed. It is the technology behind Cowork and Dispatch. For developers building with the API, Computer Use means you can give Claude a virtual desktop and have it operate software the same way a human would — useful for automating workflows in applications that don't have APIs.
Connectors link Claude.ai to external services — Google Drive, Dropbox, Jira, GitHub, Notion, and others. Once you connect a service, Claude can read documents from it, search it, or take actions within it during your conversation. It's how you give Claude access to your actual work, not just what you paste in.
Anthropic's approach to training Claude using a set of written principles (a 'constitution') to guide model behavior — reducing reliance on human labeling for harmlessness.
Automated or human-assisted filtering of AI inputs and outputs to detect and block harmful content — a deployment requirement in consumer-facing applications.
The context window is how much text Claude can hold in its attention at once — your conversation history, any documents you've shared, system instructions, and Claude's own responses. Once you hit the limit, older content gets pushed out. Claude 3.7 Sonnet has a 200,000-token context window, which is roughly 150,000 words. Long contexts let Claude analyse full documents or maintain long conversations — but they also cost more and can sometimes cause Claude to lose focus on earlier material.
Strategies for reducing AI infrastructure costs — spanning model selection, prompt efficiency, caching, batching, and routing — the unsexy work that determines whether AI projects are viable.
Cowork is Claude working directly on your computer — controlling your mouse, keyboard, browser, and applications to complete tasks alongside you. It can navigate websites, fill in forms, move files, use desktop apps, and work through multi-step workflows on your actual screen. Available on macOS and Windows for all paid Claude plans. It turns Claude from something you chat with into something that does the work with you.
An AI-native code editor built on VS Code — widely considered the most productive AI coding tool for professional developers.
Deep Research is Claude spending extended time crawling the web, reading multiple sources, and producing a comprehensive, cited research report on a topic. Unlike a quick web search that returns a few results, Deep Research follows leads across dozens of pages, cross-references sources, and synthesises everything into a structured document with citations. Use it when you need thorough research — competitive analysis, market landscape, technical evaluations — not just a quick answer.
Dispatch lets you assign tasks to Claude from your phone or the web, and Claude picks them up on your desktop to execute. You write what you need done — "prepare the weekly report", "research these three companies" — and Claude works through it on your computer while you do something else. The task and its progress persist across devices. It is the async version of Cowork: instead of working alongside you in real time, Claude works independently and reports back.
The pipeline that parses, chunks, embeds, and stores documents so they can be retrieved later — the build phase of any RAG system.
A numerical representation of text (or other data) in high-dimensional space — where similar meanings cluster together. The foundation of semantic search.
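"Similar meanings cluster together" is measured with cosine similarity between vectors. A minimal sketch with toy 3-dimensional vectors (real embeddings have hundreds or thousands of dimensions):

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two embedding vectors.

    1.0 means identical direction (same meaning), 0.0 unrelated,
    -1.0 opposite. Only direction matters, not vector length.
    """
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

sim = cosine_similarity([0.2, 0.9, 0.1], [0.25, 0.85, 0.05])
```

Semantic search is this comparison run at scale: embed the query, then rank stored vectors by similarity to it.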
Stored records of specific past interactions or events — enabling agents to recall what happened in previous sessions rather than just general knowledge.
Systematic tests for measuring model performance on specific tasks — the AI equivalent of unit tests, and the most underdeveloped practice in enterprise AI adoption.
The ability to describe why an AI system produced a particular output in human-understandable terms — required in regulated industries and high-stakes decisions.
Extended thinking is a mode where Claude works through a problem step-by-step before giving its final answer. You see its reasoning process — including the parts where it changes its mind or catches an error. It takes longer and costs more tokens, but produces meaningfully better results on hard problems: complex analysis, multi-step planning, tricky reasoning tasks. Turn it on when accuracy matters more than speed.
Providing example input-output pairs in the prompt to demonstrate the desired behavior — often significantly improving performance without any fine-tuning.
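Assembling a few-shot prompt is mechanical: examples first, then the real query in the same format. The `Input:`/`Output:` labels below are arbitrary conventions, not anything the model requires:

```python
def few_shot_prompt(examples: list[tuple[str, str]], query: str) -> str:
    """Assemble a few-shot prompt from (input, output) example pairs.

    What matters is that the examples demonstrate the exact output
    format you want back; the model continues the pattern.
    """
    parts = [f"Input: {inp}\nOutput: {out}" for inp, out in examples]
    parts.append(f"Input: {query}\nOutput:")  # model completes this one
    return "\n\n".join(parts)

prompt = few_shot_prompt(
    [("The food was amazing", "positive"), ("Never coming back", "negative")],
    "Service was fast and friendly",
)
```

Ending the prompt right after `Output:` invites the model to complete the pattern with just the label.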
Continuing to train a pre-trained model on a smaller domain-specific dataset to improve performance on targeted tasks.
A large model trained on broad data at scale that can be adapted to a wide range of downstream tasks without being retrained from scratch.
Microsoft's AI coding assistant integrated into GitHub — the first mass-market AI developer tool and still the most widely deployed in enterprise engineering teams.
Google DeepMind's LLM family — notable for deep integration with Google Workspace and its multimodal capabilities.
OpenAI's flagship model family — the most widely deployed LLM in enterprise contexts and the benchmark most organizations use when evaluating alternatives.
Connecting model outputs to verified external sources or real-time data to reduce hallucination and increase factual reliability.
Technical and policy constraints that limit AI system behavior — preventing harmful outputs, enforcing scope, and maintaining compliance in production.
Hallucination is when an AI model states something confidently that isn't true. It might invent a citation, get a date wrong, describe a product feature that doesn't exist, or fill in gaps in its knowledge with plausible-sounding guesses. It happens because language models are trained to produce fluent text — not to verify facts before speaking. Knowing when to trust Claude's output versus when to ground it in real sources is one of the most important skills for operators.
The central hub for open-source AI models, datasets, and tooling — effectively the GitHub of the AI ecosystem.
A design pattern where humans review or approve AI decisions at critical points — balancing automation speed with oversight for high-stakes actions.
Combining keyword-based (BM25) and semantic (embedding) search — typically outperforming either method alone, especially for technical or domain-specific content.
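One common way to combine the two result lists is reciprocal rank fusion (RRF), sketched here over ranked document IDs; the `k = 60` constant is the value conventionally used in the RRF literature:

```python
def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse ranked result lists (e.g. from BM25 and embedding search).

    Each document earns 1 / (k + rank) per list it appears in, so
    documents that both methods rank highly float to the top.
    """
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

keyword_hits = ["doc_a", "doc_b", "doc_c"]   # BM25 ranking
semantic_hits = ["doc_b", "doc_d", "doc_a"]  # embedding ranking
fused = reciprocal_rank_fusion([keyword_hits, semantic_hits])
```

RRF needs only ranks, not raw scores, which sidesteps the problem that BM25 scores and cosine similarities live on incompatible scales.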
A fine-tuning approach where models are trained on (instruction, response) pairs — the technique that transformed base LLMs into useful assistants.
Understanding the internal computations of neural networks — a research frontier at Anthropic aimed at understanding what's actually happening inside models.
An open-source framework for building LLM-powered applications — popular for RAG and agent workflows, though often criticized for adding complexity without proportional value.
A neural network trained on massive text datasets to predict and generate language — the core engine behind tools like Claude, GPT-4, and Gemini.
The time between sending a request and receiving a response — a critical UX factor that shapes whether AI features feel fluid or frustrating.
An open-source data framework optimized for connecting LLMs to external data — particularly strong for document indexing and retrieval pipeline construction.
Operational practices for deploying and managing LLM-based applications in production — encompassing monitoring, evaluation, versioning, and cost management.
Using a model's extended context window to process large documents directly — an alternative to RAG when document sets are manageable and latency is acceptable.
Persistent storage of information across agent sessions — enabling continuity, personalization, and compound learning over time.
A parameter-efficient fine-tuning technique that trains only small adapter matrices rather than all model weights — dramatically reducing compute and storage requirements.
Managed Agents are cloud-hosted AI agents that Anthropic runs for you. Instead of building and hosting your own agent infrastructure, you define what the agent should do and Anthropic handles execution — secure sandboxing, long-running sessions, tool access, and scaling. Think of it as Claude-as-a-worker: you assign a task, it runs in the cloud, and you get the result. Currently in public beta at $0.08/session-hour plus token costs.
Memory is Claude remembering things about you across conversations. Your preferences, your role, your projects, decisions you have made — Claude retains this context and applies it automatically in future chats. You don't need to re-explain who you are or what you're working on every time. Available on all Claude tiers including Free. You can view and manage what Claude remembers in your settings.
Meta's open-weight LLM family — the foundation for most open-source fine-tuning work and self-hosted AI deployments.
Using an LLM to write or improve prompts — automating prompt engineering at scale.
An architecture where different subsets of model parameters ('experts') activate for different inputs — enabling very large models without proportionally large compute costs.
MCP (Model Context Protocol) is an open standard that defines how AI models connect to external data sources and tools. Instead of every developer building their own custom integration, MCP gives a universal interface: if a tool supports MCP, any MCP-compatible AI can use it. Anthropic created and open-sourced it. In practice, MCP is what makes it possible for Claude to connect to your file system, your database, your APIs — in a structured, secure way.
Training a smaller 'student' model to replicate the behavior of a larger 'teacher' model — producing smaller, faster models that retain most of the original's capability.
The process of running a trained model to generate outputs from new inputs — what happens every time you send a message to an AI.
Directing requests to different models based on complexity, cost, or capability — using cheaper models for simple tasks and reserving powerful models for complex ones.
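The routing decision itself can be very simple. A sketch using a length-and-keyword heuristic; the model names, markers, and threshold are illustrative placeholders (real routers often use a small classifier or the task type instead):

```python
def route_model(prompt: str) -> str:
    """Pick a model tier from a cheap heuristic on the request."""
    hard_markers = ("analyze", "compare", "multi-step", "prove")
    if len(prompt) > 2000 or any(m in prompt.lower() for m in hard_markers):
        return "large-model"   # capable, expensive
    return "small-model"       # fast, cheap

tier = route_model("Compare these two contracts and analyze the risk clauses.")
```

Even a crude router like this can cut costs substantially when most traffic is simple, because only the hard minority of requests pays the large-model price.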
Managing multiple versions of models and prompts in production — critical when model updates change behavior and you need rollback capability.
An architecture where multiple AI agents collaborate — each handling specialized tasks — coordinated by an orchestrator to accomplish complex goals.
A model that can process and generate multiple types of data — typically text and images, sometimes also audio, video, or code.
Visibility into what AI systems are doing in production — tracking inputs, outputs, costs, latency, and errors to detect drift and diagnose failures.
AI models released with public weights that can be run, modified, and deployed without API dependencies — a strategic alternative to proprietary APIs for cost, privacy, or control reasons.
The AI company behind GPT and ChatGPT — the organization that catalyzed mainstream AI adoption with ChatGPT's November 2022 launch.
The coordination layer that routes tasks to appropriate agents or tools — deciding what runs when, in what order, and how outputs feed into subsequent steps.
The numerical values inside a model that encode learned knowledge — a 70 billion parameter model has 70 billion tunable values. More parameters ≠ always better.
An AI-powered search engine that retrieves and synthesizes real-time web content — demonstrating RAG at consumer scale.
The process by which an agent determines the sequence of steps needed to achieve a goal — ranging from simple linear plans to dynamic replanning based on observations.
The initial training phase where a model learns general language patterns from vast datasets — before any task-specific adaptation.
Projects are persistent workspaces inside Claude.ai. Each project gets its own custom instructions, its own set of documents and files, and its own conversation history. Think of it as a dedicated assistant for a specific purpose — your marketing project brief, your company's support agent, your engineering runbook. Everything stays in one place and carries forward across sessions.
The input text you send to a model — the primary interface through which humans communicate intent to AI systems.
Prompt caching saves a portion of your prompt — typically a large system prompt or document — so Claude doesn't need to re-process it on every request. The first call includes the full context; subsequent calls reference the cached version. The practical effect: faster responses and lower API costs when you're repeatedly sending the same long context, which is common in production applications.
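In Anthropic's Messages API, caching is requested by marking a content block with a `cache_control` field. The request-body sketch below shows that shape; treat the field names as an assumption to verify against the current API docs, and the model name is a placeholder:

```python
LONG_REFERENCE_DOC = "(your large, stable context goes here)"

# Sketch of a Messages API request body with a cacheable system block.
request_body = {
    "model": "claude-model-placeholder",  # placeholder, not a real model ID
    "max_tokens": 1024,
    "system": [
        {
            "type": "text",
            "text": LONG_REFERENCE_DOC,
            "cache_control": {"type": "ephemeral"},  # cache this block
        }
    ],
    "messages": [
        # Only this part changes between calls; the cached system
        # block is reused instead of being re-processed each time.
        {"role": "user", "content": "Summarize section 3."}
    ],
}
```

The pattern to notice: stable content (the big document) sits in the cached block, while the per-request question stays in `messages`.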
Connecting multiple prompts sequentially where each output feeds the next — enabling complex workflows without full agent autonomy.
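The plumbing is a loop that formats each template with the previous output. In this sketch, `call_model` is a stub that echoes its prompt so the chain runs without a network call; in practice it would be a real API request:

```python
def chain(steps: list[str], initial_input: str, call_model=None) -> str:
    """Run prompt templates sequentially, feeding each output forward."""
    if call_model is None:
        # Stub standing in for a real model call.
        call_model = lambda prompt: f"[model output for: {prompt}]"
    result = initial_input
    for template in steps:
        result = call_model(template.format(input=result))
    return result

steps = [
    "Summarize the following text: {input}",
    "Translate this summary to French: {input}",
]
final = chain(steps, "Quarterly revenue grew 12%...")
```

Each step stays small and testable on its own, which is the main practical advantage over one giant prompt.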
Organizational processes for managing, versioning, reviewing, and controlling prompts used in production — the missing layer in most enterprise AI deployments.
An attack where malicious content in the environment (documents, web pages, tool outputs) hijacks an agent's instructions — a critical security concern for production agents.
Systematically improving prompts based on evaluation results — treating prompts as code that needs testing and iteration, not just writing.
A reusable prompt structure with placeholders for variable inputs — the foundation of any systematic AI workflow.
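A template is a string with named placeholders plus a render step that fails loudly when a value is missing, so a half-filled prompt never reaches the model. The template text and `render` helper below are illustrative:

```python
import string

TEMPLATE = (
    "You are reviewing a {document_type}.\n"
    "Focus on: {focus_areas}\n\n"
    "Document:\n{document}"
)

def render(template: str, **values: str) -> str:
    """Fill a prompt template, raising if any placeholder is unfilled."""
    fields = {f for _, f, _, _ in string.Formatter().parse(template) if f}
    missing = fields - values.keys()
    if missing:
        raise KeyError(f"missing template values: {sorted(missing)}")
    return template.format(**values)

prompt = render(
    TEMPLATE,
    document_type="contract",
    focus_areas="liability, termination",
    document="(full contract text here)",
)
```

Treating templates as checked artifacts rather than ad-hoc strings is the first step toward the prompt governance described elsewhere in this glossary.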
RAG (Retrieval-Augmented Generation) is an architecture where Claude retrieves relevant information from an external source before generating a response. Instead of relying purely on training data, Claude searches your documents, database, or knowledge base to find the right context, then uses that to answer. The result: Claude can answer questions about your specific products, policies, or data — things it couldn't have known from training alone.
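End to end, RAG is retrieve-then-prompt. The sketch below uses word overlap as a stand-in for embedding search so it runs self-contained; the document store and function names are illustrative:

```python
def retrieve(query: str, documents: dict[str, str], top_k: int = 2) -> list[str]:
    """Rank documents by word overlap with the query (a toy stand-in
    for embedding-based semantic search) and return the top_k."""
    q_words = set(query.lower().split())
    scored = sorted(
        documents.items(),
        key=lambda kv: len(q_words & set(kv[1].lower().split())),
        reverse=True,
    )
    return [text for _, text in scored[:top_k]]

def build_rag_prompt(query: str, documents: dict[str, str]) -> str:
    """Retrieve context, then instruct the model to answer from it."""
    context = "\n".join(retrieve(query, documents))
    return (
        f"Answer using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {query}"
    )

docs = {
    "returns": "Our return policy allows returns within 30 days.",
    "shipping": "Shipping takes 3-5 business days.",
    "privacy": "We never sell customer data.",
}
prompt = build_rag_prompt("what is the return policy", docs)
```

The "answer using only the context" instruction is the grounding step: it pushes the model toward the retrieved facts instead of its training-data priors.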
Constraints on how many API requests can be made per unit time — a practical ceiling that affects system design for high-volume production deployments.
A prompting framework that interleaves reasoning traces with tool-use actions — letting agents think about what to do, do it, observe results, and reason further.
Deliberately attempting to elicit harmful, incorrect, or unintended behavior from an AI system — an adversarial evaluation practice borrowed from cybersecurity.
A second-pass step that reorders retrieved documents by relevance before passing them to the model — significantly improving RAG precision at modest cost.
A framework for developing and deploying AI in ways that are ethical, accountable, transparent, and fair — increasingly a regulatory and stakeholder expectation.
A training technique where human raters score model outputs and those preferences are used to steer model behavior toward more helpful and less harmful responses.
Assigning a persona or role to the model in the system prompt — 'You are an expert compliance officer' — to shape tone, depth, and framing of responses.
Search that finds results based on meaning rather than keyword matching — using embeddings to retrieve conceptually related content even when exact terms differ.
Skills are pre-built capabilities you can enable in Claude.ai — things like searching the web, running code, generating images, or reading files. Think of them as apps you turn on inside Claude. Each skill gives Claude access to a specific tool it can use during your conversation. You pick which ones to enable, and Claude will use them automatically when they're relevant.
Sending model output token-by-token as it's generated rather than waiting for the complete response — dramatically improving perceived latency for users.
Configuring a model to return data in a specific format (JSON, XML, etc.) rather than free text — essential for integrating AI into downstream systems.
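The downstream half of structured output is validation: parse the response and check it has the shape you asked for before anything else consumes it. A minimal sketch with illustrative field names:

```python
import json

EXTRACTION_PROMPT = (
    "Extract the invoice fields and respond with only a JSON object "
    'with keys "vendor", "amount", and "due_date".'
)

def parse_model_json(raw: str) -> dict:
    """Validate that a model response matches the requested JSON shape.

    Models occasionally wrap JSON in prose or code fences; a production
    parser would strip those first. This sketch assumes a clean reply.
    """
    data = json.loads(raw)
    missing = {"vendor", "amount", "due_date"} - data.keys()
    if missing:
        raise ValueError(f"response missing keys: {sorted(missing)}")
    return data

# A response a model might plausibly return for EXTRACTION_PROMPT:
record = parse_model_json('{"vendor": "Acme", "amount": 1200.0, "due_date": "2025-07-01"}')
```

Failing fast on a malformed response is what keeps a probabilistic model from silently corrupting a deterministic downstream system.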
A specialized agent operating under an orchestrator — focused on a narrow task domain rather than end-to-end goal completion.
A system prompt is the set of instructions you give Claude before the conversation starts — telling it who it is, what it's for, what it should and shouldn't do, and how to behave. Users see the conversation; they typically don't see the system prompt. It's the configuration layer between you as an operator and Claude as a product. Writing a good system prompt is one of the highest-leverage things you can do when deploying Claude.
Breaking a complex goal into smaller subtasks that can be assigned to agents or tools — a core capability of effective orchestration systems.
A parameter controlling response randomness — low temperature (0) produces deterministic outputs; high temperature (1+) increases creativity and variability.
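Mechanically, temperature divides the model's logits before the softmax: low values sharpen the distribution toward the top token, high values flatten it. A self-contained sketch over toy logits:

```python
import math

def apply_temperature(logits: list[float], temperature: float) -> list[float]:
    """Convert logits to a probability distribution at a given temperature.

    T < 1 sharpens the distribution; T > 1 flattens it. As T approaches
    0, the most likely token dominates completely (deterministic output).
    """
    scaled = [l / max(temperature, 1e-6) for l in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

cool = apply_temperature([2.0, 1.0, 0.5], temperature=0.2)
warm = apply_temperature([2.0, 1.0, 0.5], temperature=2.0)
```

Same logits, very different behavior: at `temperature=0.2` the top token takes nearly all the probability mass, while at `2.0` the alternatives stay live.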
The volume of requests a system can handle per unit time — the capacity dimension of AI infrastructure, as opposed to latency's speed dimension.
A token is the basic unit Claude uses to read and write text — roughly three-quarters of a word. "Hello, how are you?" is about 6 tokens. Everything you send to Claude (your message, system prompt, uploaded documents) and everything Claude sends back counts as tokens. Tokens determine both your usage limits and your API costs. Understanding tokens helps you manage costs, stay within limits, and structure your prompts efficiently.
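For budgeting, the three-quarters rule above is often enough. A rough estimator, with the caveat baked in that real counts depend on the tokenizer:

```python
def estimate_tokens(text: str) -> int:
    """Rough token estimate from the ~0.75 words-per-token rule of thumb.

    Real counts vary by tokenizer and language; use the API's reported
    usage figures when you need billing-accurate numbers.
    """
    words = len(text.split())
    return max(1, round(words / 0.75))

est = estimate_tokens("Hello, how are you?")
```

Good for back-of-envelope cost planning; not a substitute for the actual token counts the API returns.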
The process of converting raw text into tokens — the numerical units a model actually reads. Different models use different tokenization schemes.
Tool use lets Claude call external functions and APIs during a conversation. You define the tools Claude has access to — search, database lookup, sending an email, running a calculation — and Claude decides when to call them, what inputs to send, and how to use the results in its response. It's the mechanism that turns Claude from a text generator into something that can take actions and interact with real systems.
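A tool has two halves: a definition the model reads (name, description, input schema) and a handler your code runs when the model calls it. The sketch below uses the name/description/JSON-schema shape common to LLM tool APIs; the `get_order_status` tool and its fields are hypothetical:

```python
import json

# Definition the model sees; it decides from the description when to call it.
TOOLS = [{
    "name": "get_order_status",
    "description": "Look up the shipping status of an order by its ID.",
    "input_schema": {
        "type": "object",
        "properties": {"order_id": {"type": "string"}},
        "required": ["order_id"],
    },
}]

def handle_tool_call(name: str, arguments: dict) -> str:
    """Execute the tool the model requested and return a result string
    to feed back into the conversation."""
    if name == "get_order_status":
        # A real implementation would query a database or API here.
        return json.dumps({"order_id": arguments["order_id"], "status": "shipped"})
    raise ValueError(f"unknown tool: {name}")

result = handle_tool_call("get_order_status", {"order_id": "A-1001"})
```

The model never executes anything itself: it emits a call request, your code runs the handler, and the result goes back in as context for the next turn.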
A sampling strategy that restricts the model to choosing from the most probable tokens until their cumulative probability reaches p — balances diversity and coherence.
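The filtering step can be sketched directly: sort tokens by probability, keep the smallest set reaching the threshold, renormalize, and sample only from what remains. The toy distribution below is illustrative:

```python
def top_p_filter(probs: dict[str, float], p: float = 0.9) -> dict[str, float]:
    """Keep the smallest set of tokens whose cumulative probability
    reaches p, then renormalize; sampling happens over this set."""
    cumulative = 0.0
    kept: dict[str, float] = {}
    for token, prob in sorted(probs.items(), key=lambda kv: kv[1], reverse=True):
        kept[token] = prob
        cumulative += prob
        if cumulative >= p:
            break
    total = sum(kept.values())
    return {t: pr / total for t, pr in kept.items()}

dist = top_p_filter({"the": 0.5, "a": 0.3, "cat": 0.15, "xylophone": 0.05}, p=0.9)
```

Unlike a fixed top-k cutoff, the kept set grows or shrinks with the model's confidence: a peaked distribution keeps few tokens, a flat one keeps many.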
The full cost of an AI initiative including API costs, engineering time, evaluation overhead, change management, and ongoing maintenance — typically 3-5x the initial estimate.
The neural network architecture introduced in 2017 that underlies virtually all modern LLMs — characterized by attention mechanisms and parallel processing.
A database optimized for storing and querying high-dimensional embeddings — enabling fast similarity search across millions of documents.
Using AI tools to write code through natural language and iteration rather than line-by-line authorship — increasingly viable for prototyping but requiring engineering judgment to manage quality and security.
The specific numerical values of model parameters after training — the actual 'knowledge' stored in the model.
Using AI to execute or assist with repetitive process steps — the most common entry point for business AI adoption and frequently mistaken for AI strategy.
The information currently available to an agent within its active context — analogous to human short-term memory, limited by context window size.