AI Codex

Knowledge Graph

AI Glossary

146 terms across 8 clusters. Every concept mapped to the decisions it informs.

146 terms
A
Foundation
Adaptive Thinking Claude

Adaptive thinking is how Claude automatically decides how much reasoning effort to apply to a given question. Simple questions get fast, direct answers. Complex or ambiguous questions trigger more deliberate internal reasoning before a response. You don't have to configure this — Claude adjusts on its own. The practical result is that you get appropriate depth without having to manually switch modes or prompt for step-by-step reasoning.

Agents
Agent Memory Claude

How an AI agent keeps track of information across the steps of a task, or across multiple conversations. Short-term memory is what's currently in the active context — the agent can see it directly. Long-term memory is stored externally (in a database or file) and retrieved when needed. Without effective memory, agents repeat work, contradict themselves, or lose track of what they've already done. Good memory management is a core challenge in building reliable agents.

Tools
Agent SDK Claude

The Agent SDK is Anthropic's framework for building autonomous AI agents that use Claude as their brain. It handles the mechanics of an agent loop — giving Claude tools, tracking what it's done, deciding what to do next — so you can focus on what the agent should accomplish. Use it when you're building something where Claude needs to complete multi-step tasks on its own, not just answer a question.

Agents
Agentic Workflow Claude

A process where an AI agent handles a multi-step task autonomously from start to finish — rather than waiting for a human to direct each individual step. For example: you give Claude a goal ("research these 10 companies and compile a summary"), and it figures out the steps, executes them, and delivers the result. The more 'agentic' the workflow, the more the AI is doing independently. Higher autonomy means faster results but also more things that can go wrong without human oversight.

Business
AI Adoption

The process of getting people in an organization to actually use AI tools — not just installing them. Most AI adoption failures aren't technical failures: the tools work, but people don't change their habits, don't trust the outputs, or weren't involved in the decision. Successful adoption requires genuine usefulness (not just novelty), training, support, and often visible backing from leadership. The hardest part of enterprise AI is almost always adoption, not the technology.

Agents
AI Agent Claude

An AI system set up to take a sequence of actions to complete a goal — not just answer a single question. Instead of just responding to your message, an agent can search the web, read documents, write code, send requests to other systems, and keep working through multiple steps until it finishes the task. Claude can act as an agent when given the right tools. The key difference from a regular chatbot: agents do, not just say.

Business
AI Augmentation

Using AI to enhance what people can do rather than replace them entirely. Instead of "AI does the work," it's "AI does the parts humans find slow or tedious so humans can focus on the parts that need judgment." Research shows that framing AI as augmentation rather than replacement consistently drives higher adoption rates, better employee buy-in, and better outcomes — because people are more willing to use tools they don't feel threatened by.

Evaluation
AI Bias

Systematic unfairness in AI outputs that advantages some groups and disadvantages others — often because the training data reflected historical biases, which the model absorbed and sometimes amplified. Examples: an AI that gives worse resume feedback to names that sound female, or recommends different loan amounts based on neighbourhood demographics. Bias is hard to eliminate entirely because training data reflects the world, which is not itself unbiased. Knowing your model's failure modes matters before deploying it in high-stakes decisions.

Business
AI Center of Excellence

A centralized team that sets AI standards, shares best practices, builds shared infrastructure, and coordinates AI deployment across an organization. Effective at scale when multiple business units are running independent AI initiatives that risk duplicating effort or creating inconsistent standards. Often premature for early-stage adoption — a small, capable AI team doing real projects generates more value than a large CoE that mainly produces guidance documents.

Evaluation
AI Compliance

Meeting the legal and regulatory requirements that apply to how your organization uses AI — data protection laws (GDPR, CCPA), industry-specific rules (HIPAA for healthcare, financial regulations), and emerging AI-specific legislation. Compliance requirements vary significantly by industry and geography, and are expanding rapidly as governments catch up to AI adoption. For most teams, the starting questions are: what data goes into Claude, where does it go, and are users informed?

Evaluation
AI Fairness

The goal of making AI treat people and groups equitably — not systematically producing better outcomes for some people than others. Harder than it sounds: there are multiple mathematically conflicting definitions of 'fair,' and optimizing for one often violates another. An AI that is equally accurate for all demographic groups may still produce very different outcomes for them. Fairness in AI is an active area of research with no settled answers — but 'did we test this across diverse groups?' is always the right question.

Evaluation
AI Governance

The organizational structures, policies, and processes that determine how AI is used within a company — who approves new AI tools, what data can be used, how outputs are reviewed, who's accountable when something goes wrong. Without governance, AI use spreads unpredictably: different teams adopt different tools, nobody knows what data is being shared with vendors, and there's no process for incidents. Governance doesn't have to be bureaucratic — but someone needs to own these questions.

Business
AI Integration

The technical work of connecting AI to your existing systems — feeding it data from your databases, having it write back to your CRM, triggering it from your existing workflows. Most of the difficulty in AI integration isn't the AI itself; it's data plumbing, authentication, error handling, and making outputs reliable enough to trust in automated pipelines. This is where 'it works in the demo' turns into 'it works in production.

Business
AI Literacy

A baseline understanding of what AI can and cannot do — enough to use it effectively, evaluate its outputs critically, and make good decisions about where to apply it. AI literacy doesn't mean knowing how models are trained. It means knowing: when to trust Claude's answer versus verify it, how to prompt clearly, what kinds of tasks Claude is bad at, and when AI is actually the right tool for the job. The gap between AI literacy and actual deployment is where most enterprise AI value is lost.

Business
AI Pilot

A contained test of an AI solution before committing to full deployment — a small-scale trial to validate whether it actually works in your context. Valuable when done well; frequently misstructured in practice. Common mistakes: picking an easy use case that doesn't predict production performance, measuring the wrong things, not involving real end users, or running it too short to see the problems. A well-designed pilot tells you whether to proceed, not just whether the demo worked.

Business
AI ROI

The return on investment from AI initiatives — what you actually get back, measured against what you spent. Harder to calculate than traditional IT ROI because AI produces probabilistic, variable outputs rather than deterministic ones, and the value is often in time saved or quality improved rather than direct revenue. Most companies undercount the real cost (engineering time, change management, ongoing maintenance) and overcount the benefit (what would people have done without the tool?). Realistic ROI measurement starts before the pilot, not after.

Evaluation
AI Safety Claude

The field and practice of ensuring AI systems do what they're supposed to do, without causing unintended harm — at every scale from today's applications to future, more powerful systems. At the deployment level: guardrails, safety testing, content filtering. At the research level: alignment, interpretability, and understanding how AI models make decisions. Anthropic was founded specifically to work on AI safety and publishes extensively on the topic.

Tools
AI Safety Level Claude

Anthropic's internal framework for classifying how capable and potentially risky an AI model is — and what safety commitments apply at each level. ASL-1: minimal risk. ASL-2: current Claude models (meaningful capabilities, manageable risks). ASL-3+: future models with capabilities that would require significantly stronger safety measures before deployment. The ASL framework is Anthropic's public commitment to pause or restrict deployment if a model reaches certain capability thresholds without corresponding safety measures.

Business
AI Strategy

A deliberate plan for how an organization will use AI to achieve its goals — as opposed to ad hoc tool adoption without a unifying direction. A real AI strategy answers: what business problems are we solving, which capabilities do we need, how do we build for the long term, and how does AI fit with our existing systems and culture? Without strategy, companies end up with dozens of disconnected AI experiments and no compounding value.

Business
AI Use Case Discovery

The process of identifying which problems in your organization are actually good fits for AI — and which ones aren't. Not every manual process should be automated. Good AI use cases tend to have: clear inputs and outputs, large volume of repetitive work, tolerance for occasional errors, and available data or context for Claude to work with. Poor candidates tend to require deep relationship judgment, physical presence, or extremely high accuracy with no room for error.

Evaluation
Alignment Claude

The challenge of making sure AI systems actually pursue the goals you want — not a slightly different goal that seemed equivalent during training but diverges in practice. An AI trained to maximize user engagement might learn to be addictive rather than helpful. An AI trained to answer questions confidently might learn to hallucinate rather than admit uncertainty. Alignment is about closing the gap between what you want and what the AI actually optimizes for — a hard problem that gets harder as AI becomes more capable.

Tools
Anthropic Claude

The AI safety company that builds Claude — founded in 2021 by former OpenAI researchers, including Dario and Daniela Amodei. Anthropic's core thesis is that the most important thing a company can do right now is build AI safely and work to understand it deeply. They publish their safety research publicly, maintain a strict safety framework (ASL levels), and operate as a public benefit corporation. They are one of the leading AI labs globally alongside OpenAI and Google DeepMind.

Tools
Anthropic Console Claude

Anthropic's developer platform at console.anthropic.com — the control panel for teams building with the Claude API. Use it to test prompts in the Workbench, manage API keys, view usage and costs, set spending limits, run evaluations, and access model documentation. The starting point for any technical Claude integration.

Tools
Anthropic Partner Program Claude

Anthropic's program for businesses building products and services on top of Claude — providing technical support, go-to-market resources, early access to new capabilities, and potential co-marketing. Relevant for companies making Claude a core part of their product offering, not just using it internally. Tiers exist based on usage and strategic fit. The program reflects Anthropic's recognition that a strong partner ecosystem matters for adoption.

Infrastructure
API Claude

A way for software to talk to other software — specifically, a set of rules for how your application can send requests to Claude and receive responses back. Instead of a human typing into Claude.ai, your code sends a message to Anthropic's API and gets Claude's response back as data it can use. APIs are how Claude gets embedded in other products: your company's app, a custom internal tool, or an automated workflow.

Tools
Artifacts Claude

Artifacts are self-contained outputs Claude creates in a dedicated side panel — not buried in chat. They can be documents (Word, PDF, Markdown), spreadsheets, presentations, working code, full HTML pages, SVG graphics, diagrams, or interactive React components with real logic. You can preview, copy, download, or iterate on them without leaving the conversation. The shareable version: publish an artifact publicly and anyone with the link can view and interact with it — no Claude account needed. They can also "remix" it, opening it in their own Claude to build on. On Team and Enterprise plans, sharing stays org-internal behind authentication.

Foundation
Attention Mechanism

The core reason modern AI handles long, complex text without losing track of what's important. Without it, AI would process text left-to-right, easily forgetting earlier context. With it, every word in a passage can weigh its relevance to every other word. It's why Claude can read a 100-page document and still accurately answer a question about something on page 3 — it hasn't forgotten it. The "attention" is which parts of the context the model focuses on when generating each word of its response.

B
Infrastructure
Batch Processing Claude

Processing a large number of AI requests together, on a schedule, rather than in real time. Instead of asking Claude to analyze each customer email the moment it arrives, you collect all the emails from the day and process them overnight. Batch processing is slower but significantly cheaper — Anthropic offers a 50% discount for batch API calls. The right approach when real-time responses aren't needed.

Evaluation
Benchmark

A standardized test used to compare AI models against each other on specific capabilities — math, reasoning, coding, reading comprehension. Benchmarks like MMLU or HumanEval give you an apples-to-apples comparison across models. Useful for getting a general sense of capability, but often misleading for real-world use: a model that scores highest on a benchmark isn't always the best for your specific task. Your own evals matter more than general benchmarks.

Business
Build vs. Buy

The decision between building a custom AI solution from scratch versus using an existing product. In practice it's rarely this binary — most real decisions are about how much to customize on top of an existing foundation model. Build gives you more control and differentiation; buy gets you there faster. The right answer depends on how central AI is to your competitive advantage, your engineering resources, and how specific your requirements are. Most companies should start with buy and build only where they genuinely need differentiation.

C
Agents
Chain-of-Thought Prompting

A technique where you ask Claude to 'think step by step' before giving its final answer — rather than jumping straight to a conclusion. This dramatically improves accuracy on complex problems like math, logic, or multi-step reasoning. When Claude shows its work, you also get to see where its reasoning might be flawed. The practical version: adding "think through this step by step" to your prompt often significantly improves the quality of the response.

Business
Change Management

The organizational work of helping people adapt when AI changes their workflows — making sure the transition is smooth rather than disruptive. When a company deploys an AI tool, the technical setup is usually the easy part. The hard part is: getting people to change habits, addressing the fear of job loss, retraining for new responsibilities, and maintaining quality during the transition. Change management is the most consistently underestimated cost in AI deployments.

Business
Chief AI Officer

An executive responsible for an organization's overall AI strategy, governance, and adoption. The role emerged rapidly as AI moved from IT experiments to strategic priority. What a CAIO actually does varies enormously: some are focused on external product strategy, others on internal capability building, others primarily on risk and governance. In smaller companies, these responsibilities often sit with a CTO or CDO rather than a dedicated role.

Retrieval
Chunking

Splitting documents into smaller pieces before feeding them to an AI system — so they can be searched, retrieved, and processed more efficiently. Too large a chunk: the AI gets too much irrelevant content with the relevant bit. Too small: the AI loses the surrounding context that makes the relevant bit understandable. Getting chunking right is one of the most practical engineering decisions in building a document-based AI system.

Tools
Claude Claude

Anthropic's AI assistant — available at Claude.ai and through the API. Claude is known for long context (it can read entire books or codebases), strong reasoning, careful honesty about what it knows and doesn't know, and being built with safety as a priority from the ground up. It's the focus of this knowledge hub. Different versions of Claude exist for different needs: lighter models for fast, simple tasks, and more powerful models for complex work.

Tools
Claude Code Claude

Claude Code is a command-line tool that brings Claude directly into your development workflow. You run it in your terminal, point it at a codebase, and Claude can read files, write code, run commands, and navigate your project — all in conversation. It's Claude as a pair programmer that can actually see and touch your work, not just read what you paste in.

Tools
Claude Code Skills Claude

Claude Code Skills are markdown files (SKILL.md) that teach Claude Code how to handle a specific type of task — your PR review format, your commit message style, your security checklist. Unlike CLAUDE.md which loads into every conversation, Skills load on-demand: only the name and description load at startup, the full instructions only load when a request matches. This keeps context efficient — you can have many skills without bloating every session. Skills live in four locations with a strict priority order: enterprise (managed settings) overrides personal (~/.claude/skills/) overrides project (.claude/skills/) overrides plugins. A critical gotcha: sub-agents do not inherit skills automatically — they start with a clean context, and you must explicitly list which skills they should load in the agent.md file. The allowed-tools field lets you restrict what Claude can do when a skill is active, useful for read-only or security-sensitive workflows. Not the same as Claude.ai Skills, which are workflow packages for productivity tasks.

Business
Claude Plans Claude

Claude is available across several pricing tiers: Free (basic access to Sonnet, limited usage), Pro ($20/month — higher limits, all models, Cowork, Dispatch), Max ($100/month — very high usage limits), Team Standard ($25/seat/month — admin tools, shared Projects, 5-seat minimum), Team Premium ($125/seat/month — includes Claude Code), and Enterprise (custom pricing — SSO/SCIM, HIPAA-ready, 500K context, full compliance controls). The right plan depends on how many people need access and what level of admin control you need.

Tools
Claude.ai Claude

Anthropic's consumer and business web interface for Claude at claude.ai — where most people access Claude without writing code. Features include: Projects (persistent workspaces with custom instructions and files), Connectors (links to Google Drive, Dropbox, Notion, etc.), Skills (built-in capabilities like web search and code execution), and Cowork (Claude controlling your computer to do tasks). Distinct from the Claude API, which developers use to build their own applications.

Tools
CLAUDE.md Claude

CLAUDE.md is a Markdown file you add to your project root that Claude Code reads automatically at the start of every session. Its contents are appended to your prompt — so whatever is in there, Claude already knows before you type a single thing. Think of it as an onboarding doc for your codebase: your tech stack, your dev commands, your code conventions, and any corrections you've had to make more than once. There are two levels: project-level (root of your project, checked into version control so your whole team benefits) and user-level (your personal config folder, applies across all projects, for preferences that are yours alone). A few practical things most people miss: (1) Don't start with one. Use Claude Code for a while first and notice where you keep course-correcting. When you know what to include, run /init and Claude will generate one from your session history. Keeps it focused instead of bloated. (2) If you catch yourself correcting Claude repeatedly — say, "use server actions, not API routes" — tell Claude to save that rule to CLAUDE.md. It won't rediscover it the next session. (3) CLAUDE.md is for instructions. Hooks are for enforcement. If it needs to happen every time without fail, put it in a hook — not here. Things in CLAUDE.md usually happen. Things in hooks always happen.

Agents
Computer Use Claude

Computer Use is the underlying capability that lets Claude see and interact with a computer screen — clicking buttons, typing text, navigating applications, reading what is displayed. It is the technology behind Cowork and Dispatch. For developers building with the API, Computer Use means you can give Claude a virtual desktop and have it operate software the same way a human would — useful for automating workflows in applications that don't have APIs.

Tools
Connectors Claude

Connectors link Claude.ai to external services — Google Drive, Dropbox, Jira, GitHub, Notion, and others. Once you connect a service, Claude can read documents from it, search it, or take actions within it during your conversation. It's how you give Claude access to your actual work, not just what you paste in.

Foundation
Constitutional AI Claude

Anthropic's method for training Claude to be helpful, harmless, and honest. Instead of having people manually label every AI response as good or bad, Anthropic gave the model a set of written principles — a 'constitution' — and trained it to evaluate and improve its own responses according to those principles. It's a big reason why Claude is generally willing to explain its reasoning, push back on harmful requests, and behave consistently rather than unpredictably.

Evaluation
Content Moderation

Filtering AI inputs and outputs to detect and block harmful content — hate speech, graphic violence, personal information, or anything that violates platform rules. In consumer AI products, this happens automatically on every request. For companies building on the Claude API, you're responsible for moderation appropriate to your use case and user base. Anthropic provides default safety filtering; you can add additional layers on top.

Foundation
Context Window

The context window is how much text Claude can hold in its attention at once — your conversation history, any documents you've shared, system instructions, and Claude's own responses. Once you hit the limit, older content gets pushed out. Claude 3.7 Sonnet has a 200,000-token context window, which is roughly 150,000 words. Long contexts let Claude analyse full documents or maintain long conversations — but they also cost more and can sometimes cause Claude to lose focus on earlier material.

Infrastructure
Cost Optimization Claude

The practical work of making your AI API usage cost less — through better model selection, smarter prompts, caching repeated content, batching requests, and routing simple tasks to cheaper models. Many teams discover that 80% of their API requests are simple enough for a smaller, cheaper model — and only 20% genuinely need the most capable (and expensive) one. Getting this right is often the difference between an AI feature that's financially viable and one that's not.

Tools
Cowork Claude

Cowork is the agentic work mode in the Claude desktop app — the tab between Chat and Code. Where Chat is a back-and-forth conversation, Cowork is delegation: you describe an outcome, Claude plans the steps, works through them, and delivers a finished file to your drive. It reads and writes real files in a folder you point it at, connects to your tools (Slack, Drive, Gmail, Calendar), and keeps working on longer tasks while you step away. Use Cowork when the task needs your files, your connected tools, or a real output file. Use Chat when you're thinking something through or everything fits in a paste. A few practical things to know: Cowork uses more of your plan allocation than Chat — match your model to the task (Sonnet is the right default, Opus only for genuinely complex multi-step work). Sleeping your computer is fine during a long task; quitting the app pauses it. Conversation history is stored locally on your machine, which matters for compliance-sensitive workloads. Projects in Cowork are local to your desktop with memory scoped to that project — different from Claude.ai Projects, which sync in the cloud.

Tools
Cowork Plugins Claude

Cowork Plugins are role-specific bundles that give Cowork domain expertise for a specific job function. Each plugin packages together skills (step-by-step workflows for common tasks), connectors (access to the tools that role uses), and subagents (parallel workers for tasks that have many independent pieces). There are open-source plugins for most knowledge-work roles — sales, marketing, product, finance, legal, ops, CS, data — available on GitHub at github.com/anthropics/knowledge-work-plugins. Install in one click from Cowork's Customize area. Once installed, the plugin is a folder on your machine: every file is readable plain text, every skill is editable, and you can add new skills or modify existing ones with no build step. Plugins compose with scheduled tasks: a skill encodes what to do, a scheduled task decides when. The result is recurring work that runs on its own. Not the same as Claude Code plugins, which distribute Claude Code Skills for developer workflows.

Tools
Cursor

An AI-native code editor built on top of VS Code — designed from the ground up for AI-assisted development. Claude and other models are integrated deeply: it can read your entire codebase, understand what you're building, and make multi-file edits from a single instruction. Widely considered the most productive AI coding tool for professional developers. Relevant if you have engineers; not a tool end users interact with.

D
Tools
Deep Research Claude

Deep Research (called "Research" in the product) turns Claude into a systematic investigator rather than a conversational assistant. Instead of one search, it runs many — each building on what the previous one found — then synthesises everything into a structured, cited report. Extended thinking activates automatically, so Claude plans its approach before searching. Most reports complete in 5 to 15 minutes; complex ones can run up to 45. Not a feature to kick off five minutes before a meeting. Pro, Max, Team, and Enterprise plans only — not available on Free. Two modes worth knowing: the default uses the web; if you disable web search and keep connectors on, it runs the same deep investigation against your internal tools only — useful for "what has our team actually decided about X" questions. When to use it over simpler tools: when you need citations from multiple sources, not just a quick answer.

Tools
Dispatch Claude

Dispatch lets you assign tasks to Claude from your phone or the web, and Claude picks them up on your desktop to execute. You write what you need done — "prepare the weekly report", "research these three companies" — and Claude works through it on your computer while you do something else. The task and its progress persist across devices. It is the async version of Cowork: instead of working alongside you in real time, Claude works independently and reports back.

Retrieval
Document Indexing

Processing and storing documents so they can be searched and retrieved later. When you build an AI that can answer questions about your internal documents, the indexing step is where you read all those documents, split them up, convert them to a searchable format, and store them. It's the setup phase. Every time a user asks a question, the system searches the index to find the relevant pieces before Claude generates an answer.

E
Foundation
Embedding

A way of turning words, sentences, or whole documents into numbers that capture their meaning — so that similar ideas end up mathematically 'close together.' This is what makes AI search smarter than keyword matching. When you search for "how to cancel my subscription," an embedding-based search also finds results about "unsubscribing" or "stopping a plan" because their meanings are similar, even if the exact words differ. You don't work with embeddings directly — they power the smart search happening behind the scenes.

Tools
Enterprise Search Claude

Enterprise Search is an org-wide search project that lives in every team member's Claude sidebar. An admin connects your organisation's tools once — Slack, Google Drive, SharePoint, email — and then anyone in the org can ask questions that pull from all of them simultaneously. The key difference from regular connectors: it's configured centrally with retrieval-optimised instructions, and each person authenticates individually, so Claude only surfaces what that person has permission to see. Two people asking the same question may get different answers — that's intentional. What it surfaces well: documented knowledge spread across tools. What it can't surface: decisions that were never written down. Team and Enterprise plans only. An admin must complete setup before anyone else can use it.

Retrieval
Episodic Memory

Stored records of specific past interactions — what happened in previous conversations or task runs — that an agent can retrieve and reference. Without episodic memory, an agent has no recall of "last Tuesday you told me to always format reports as PDFs." With it, agents can adapt their behavior based on what they've learned about you or your team over time. Technically, this is external storage that the agent reads, not a feature built into the model.

Evaluation
Evals Claude

Systematic tests for measuring how well Claude performs on your specific tasks — the AI equivalent of unit tests in software development. Instead of just "trying it out and seeing if it seems right," evals give you a measurable score: "Claude answered 87 out of 100 test cases correctly." They let you compare models, catch regressions when you change prompts, and build confidence before deploying changes. Most teams skip evals early on — and regret it when something silently breaks in production.

Evaluation
Explainability

The ability to explain, in plain English, why an AI produced a particular output. Important in regulated industries (banking, healthcare, legal) and anywhere decisions affect people's lives — if someone is denied a loan, they're entitled to know why. Claude can often explain its reasoning in natural language, but the underlying neural network computation is not itself interpretable in the same way a decision tree would be. Explainability usually means prompting Claude to show its reasoning alongside its answer.

Foundation
Extended Thinking Claude

Extended thinking is a mode where Claude works through a problem step-by-step before giving its final answer. You see its reasoning process — including the parts where it changes its mind or catches an error. It takes longer and costs more tokens, but produces meaningfully better results on hard problems: complex analysis, multi-step planning, tricky reasoning tasks. Turn it on when accuracy matters more than speed.

G
Tools
GitHub Copilot

Microsoft's AI coding assistant integrated directly into GitHub and popular code editors — the first mass-market AI developer tool, launched in 2021, and still the most widely deployed in enterprise engineering teams. Suggests code completions as you type and can generate functions from comments. Less capable than Cursor for complex multi-file edits but has the advantage of deep IDE integration and enterprise procurement familiarity.

Tools
Google Gemini

Google DeepMind's AI model family — deeply integrated with Google Workspace (Gmail, Docs, Sheets, Slides) and Google Search. Available in multiple sizes: Gemini Ultra for complex tasks, Gemini Pro for most use cases, Gemini Flash for speed and cost efficiency. Strong multimodal capabilities — it can work with text, images, audio, and video. For organizations already invested in Google's ecosystem, Gemini often has the most natural integration points.

Tools
GPT-4

OpenAI's flagship model family — the most widely deployed AI models in enterprise contexts and the benchmark most organizations compare against when evaluating alternatives. GPT-4o (multimodal), GPT-4 Turbo, and related variants all fall under this family. Strong across coding, reasoning, and language tasks. Available via OpenAI's ChatGPT product and API, as well as Microsoft's Copilot integrations across Office 365.

Foundation
Grounding

Connecting Claude's responses to verified, real-world sources or data rather than letting it rely purely on what it learned during training. A grounded AI cites where its information comes from, references actual documents, or queries live data before responding — reducing the chance of hallucination. RAG (Retrieval-Augmented Generation) is the main technical approach to grounding. Grounding matters most when accuracy is critical and Claude's training data might be incomplete or outdated.

Evaluation
Guardrails Claude

Rules and filters built into or around an AI system to constrain what it can do — preventing harmful outputs, keeping it on-topic, enforcing compliance requirements, or protecting user privacy. Examples: blocking Claude from discussing competitor products, requiring a legal disclaimer on financial questions, filtering out personally identifiable information before logging. Guardrails are what you build when "Claude should use good judgment" isn't specific enough for a production system.

H
Foundation
Hallucination

Hallucination is when an AI model states something confidently that isn't true. It might invent a citation, get a date wrong, describe a product feature that doesn't exist, or fill in gaps in its knowledge with plausible-sounding guesses. It happens because language models are trained to produce fluent text — not to verify facts before speaking. Knowing when to trust Claude's output versus when to ground it in real sources is one of the most important skills for operators.

Tools
Hooks Claude

Hooks are shell commands that run at specific points in Claude Code's lifecycle — before a tool call, after a file edit, when you submit a prompt, or when Claude finishes a response. The key distinction: everything else in Claude Code is probabilistic (Claude will usually follow instructions). Hooks are deterministic — they always run, no exceptions. You can tell Claude in CLAUDE.md to run Prettier after every file edit. Most of the time it will. A hook makes it happen every single time. There are five events: PreToolUse (before a tool runs), PostToolUse (after a tool completes), UserPromptSubmit (when you submit a prompt, before Claude processes it), Stop (when Claude finishes responding), and Notification (when Claude sends a notification). For PreToolUse hooks, the exit code controls behavior: 0 = proceed normally, 2 = block the action (your stderr message gets fed back to Claude as context so it knows why and can adjust), anything else = non-blocking error shown to you but not blocking. This is how teams enforce hard rules: block writes to production config directories, block bash commands containing rm -rf, block direct commits to main. Configure hooks with /hooks inside a Claude Code session, or edit .claude/settings.json directly. Project-level hooks live in .claude/settings.json and should be checked into version control — your whole team gets them automatically. Use CLAUDE_PROJECT_DIR in hook commands to reference scripts in your project so they work from any working directory.

Tools
Hugging Face

The central hub for the open-source AI ecosystem — effectively the GitHub of AI. On Hugging Face, you can find and download thousands of pre-trained models, datasets, and demos. It's where most open-source fine-tuned models are published and where the research community shares work. If you're building with open-source AI or evaluating alternatives to proprietary APIs, Hugging Face is where you start looking.

Agents
Human-in-the-Loop Claude

When a person steps in to check, approve, or correct AI output at key points — rather than letting AI run the whole process automatically. For example: Claude drafts a customer email, but a real person reviews it before sending. Or Claude flags a refund request as unusual, and a manager makes the final call. It's how you get the speed of AI without giving up human judgment on the decisions that matter.

Retrieval
Hybrid Search

Combining two types of search — keyword matching and meaning-based (semantic) search — to get better results than either alone. Keyword search is fast and precise for exact matches. Semantic search is better at finding results when you paraphrase or use different words. Hybrid search uses both, then combines the rankings. Most production AI search systems use hybrid search because real user queries don't always match the exact words in documents.

L
Tools
LangChain

An open-source framework for building applications that use language models — particularly RAG pipelines and AI agents. Popular partly because it provides abstractions for common patterns (retrieval, tool use, chains of prompts) that would otherwise require significant custom code. Criticized for adding complexity and making debugging harder than just writing the code directly. Useful as a starting point for understanding patterns; often replaced with custom implementations as teams mature.

Foundation
Large Language Model

The technology behind AI tools like Claude, ChatGPT, and Gemini. An LLM is a type of AI trained on enormous amounts of text — books, websites, articles, code — until it learned to read, write, summarize, answer questions, and hold conversations. The 'large' refers to the scale of training and the model's internal complexity, not something you interact with directly. When people say 'AI,' they usually mean an LLM.

Infrastructure
Latency

The time between sending a request to Claude and getting a response back. Low latency makes AI feel fast and responsive. High latency makes users feel like they're waiting. Latency depends on the model (larger models are slower), the length of the response, and network conditions. Streaming — showing Claude's response as it's being generated rather than waiting for the full answer — is the most common way to make high-latency responses feel faster.

Tools
LlamaIndex

An open-source data framework specifically focused on connecting AI models to your documents and data — strong for building document indexing, retrieval, and question-answering pipelines. Complements LangChain (which is more general) with deeper specialization in the data ingestion and retrieval side. Good choice if your primary use case is 'Claude that knows about our internal documents' and you want a framework rather than building from scratch.

Infrastructure
LLMOps

The operational practices for keeping AI applications running reliably in production — monitoring, evaluation, versioning, cost tracking, and incident response. Similar to DevOps for regular software, but with AI-specific challenges: model behavior can change when models are updated, prompts can drift over time, and outputs are harder to test than deterministic code. LLMOps is what separates teams that ship AI and keep it working from teams that ship it and then scramble when it breaks.

Retrieval
Long Context Claude

Using a model's ability to process very large amounts of text in one go, instead of breaking it into chunks and searching. Claude can handle up to 200,000 tokens (roughly 150,000 words) in a single context window. For some use cases — analysing a full legal contract, reading an entire codebase, processing a transcript — it's simpler and more accurate to just give Claude the whole document than to build a search system around it.

Retrieval
Long-term Memory Claude

Persistent storage of information that an AI agent can retrieve across multiple sessions — not just within one conversation. Without it, every conversation starts from scratch. With it, an agent can remember your preferences, past decisions, ongoing projects, and context from weeks ago. Technically implemented as external storage (a database) that the agent reads from and writes to, not as anything built into the model itself.

Foundation
LoRA

A technique for customizing an AI model without retraining the whole thing from scratch — which would be prohibitively expensive. LoRA adds small trainable layers on top of the existing model and only updates those. The result is a customized version of the model that is cheaper and faster to create. Relevant for developers building specialized AI tools; not something end users interact with directly.

M
Agents
Managed Agents Claude

Managed Agents are cloud-hosted AI agents that Anthropic runs for you. Instead of building and hosting your own agent infrastructure, you define what the agent should do and Anthropic handles execution — secure sandboxing, long-running sessions, tool access, and scaling. Think of it as Claude-as-a-worker: you assign a task, it runs in the cloud, and you get the result. Currently in public beta at $0.08/session-hour plus token costs.

Foundation
Memory Claude

Memory is Claude remembering things about you across conversations. Your preferences, your role, your projects, decisions you have made — Claude retains this context and applies it automatically in future chats. You don't need to re-explain who you are or what you're working on every time. Available on all Claude tiers including Free. You can view and manage what Claude remembers in your settings.

Tools
Meta LLaMA

Meta's family of open-weight AI models — freely downloadable and runnable on your own hardware without API dependency. LLaMA models are the foundation for most of the open-source AI ecosystem: thousands of fine-tuned variants, community extensions, and self-hosted deployments are built on top of them. The strategic value of LLaMA is control: you own the model, your data never leaves your servers, and there are no per-token costs. The tradeoff: you're responsible for running and maintaining the infrastructure.

Prompt
Meta-prompting

Using Claude to write or improve your prompts for you. Instead of figuring out the perfect prompt yourself, you describe what you're trying to accomplish and ask Claude to generate a well-structured prompt for that task. You can then test and refine it. A practical shortcut when you know what you want the output to look like but aren't sure how to phrase the instructions.

Foundation
Mixture of Experts

An AI architecture where only part of the model activates for any given request — different specialized 'sub-models' (experts) handle different types of tasks. It lets you build very large, capable models without needing proportionally large computing power for every query, because you're only using a relevant slice of the model at any given time. It's an efficiency technique that makes some of the largest AI models practical to run.

Tools
Model Context Protocol

MCP (Model Context Protocol) is an open standard that defines how AI models connect to external data sources and tools. Instead of every developer building their own custom integration, MCP gives a universal interface: if a tool supports MCP, any MCP-compatible AI can use it. Anthropic created and open-sourced it. In practice, MCP is what makes it possible for Claude to connect to your file system, your database, your APIs — in a structured, secure way.

Foundation
Model Distillation

Training a smaller, faster AI model to behave like a larger, slower one — by having the small model learn from the big model's outputs. The result is a 'student' model that is much cheaper to run but retains most of the quality of the 'teacher.' This is how companies make AI fast enough and cheap enough for production use at scale, without sacrificing all the capability of their best models.

Foundation
Model Inference

The process of running a trained AI model to get a response — what happens every single time you send a message. 'Inference' is just the technical word for 'using the model.' Training is the expensive, one-time process of teaching the model. Inference is the ongoing, per-request process of getting answers out of it. When you're paying API costs, you're paying for inference.

Infrastructure
Model Routing

Automatically sending different types of requests to different AI models based on complexity and cost. Simple questions go to a smaller, cheaper model. Hard or nuanced questions go to a more powerful one. This cuts costs significantly without sacrificing quality on the things that matter. Like a staffing manager who routes simple tickets to junior support agents and escalates complex ones to senior staff — except automated and happening on every request.

Infrastructure
Model Versioning

Keeping track of which version of a model and which version of your prompts are running in production — so you can test changes safely and roll back if something breaks. This matters because AI model updates change behavior: something that worked well on Claude 3 might behave differently on Claude 3.5. Similarly, changing a system prompt is effectively a code change that should go through the same review process.

Agents
Multi-agent System Claude

A setup where multiple AI agents work together on a complex task — each handling a specific part, coordinated by an overall system. For example: one agent researches the topic, another drafts the report, another checks it for errors. Using multiple specialized agents often produces better results than asking one agent to do everything. It's how companies automate complex workflows that have too many moving parts for a single AI to handle well.

Foundation
Multimodal Model Claude

An AI that can work with more than just text — typically text and images together, sometimes audio or video too. Claude is multimodal: you can paste an image into your conversation and Claude will describe it, answer questions about it, or analyze what's in it. Multimodal models are why AI can now read screenshots, describe photos, analyze charts, and interpret diagrams — not just process typed words.

O
Infrastructure
Observability

The ability to see what your AI system is doing in production — what inputs it received, what it responded, how much it cost, how long it took, and where it went wrong. Without observability, debugging production AI is guesswork. With it, you can spot when Claude's responses drift in quality, catch expensive prompt mistakes early, and diagnose specific failures. Usually implemented through logging, dashboards, and AI-specific monitoring tools.

Tools
Open Source AI

AI models released with public weights that can be downloaded, modified, and run without depending on a vendor's API. 'Open source' in AI is contested — some models release weights but not training data or code, so it's a spectrum. The main alternatives to open-source AI are proprietary API models like Claude or GPT-4. Open-source gives you control, privacy, and no per-token costs. Proprietary APIs give you better performance, easier maintenance, and ongoing improvements without effort. Most organizations use both.

Tools
OpenAI

The AI company behind ChatGPT, GPT-4, and the DALL·E image generation models. Founded in 2015, it catalyzed mainstream AI adoption with ChatGPT's launch in November 2022 — reaching 100 million users in two months, faster than any consumer product in history. Heavily backed by Microsoft, which has integrated OpenAI models across its product suite. OpenAI and Anthropic are the two primary competitors in the frontier AI model market.

Agents
Orchestration Claude

The coordination layer that decides what to do next in an AI workflow — which agent runs, which tool gets called, in what order, and how outputs from one step feed into the next. Think of it like a project manager for AI tasks: it doesn't do the work itself, but makes sure the right work happens in the right sequence. When AI pipelines fail, it's often the orchestration logic that needs fixing.

P
Foundation
Parameters

The internal settings inside an AI model that define what it knows and how it responds — set during training and fixed afterward. When you hear "70 billion parameters," that just means the model has 70 billion tunable numerical values it learned during training. More parameters generally means more capacity, but it's not the only factor in quality. You don't control these — they're baked into the model before you ever use it.

Tools
Perplexity AI

An AI-powered search engine that retrieves current web content and synthesizes it into a cited answer — rather than just returning a list of links. Useful when you need up-to-date information that a language model's training data might not include. Demonstrates RAG (Retrieval-Augmented Generation) at consumer scale: it's fetching real sources and using them to ground the AI's response, which is why it gives citations and is more reliable for current events than a standalone AI assistant.

Tools
Plan Mode Claude

Plan Mode is a read-only execution mode in Claude Code — it analyzes your codebase and produces a plan of action without editing any files. Press Shift+Tab to cycle through modes (approval → auto-accept → plan mode). In plan mode, Claude reads files, runs web searches, and asks clarifying questions, then returns a structured plan you can review before a single line of code changes. You can approve the plan, ask it to revise specific parts, or ask questions before committing. Why it matters: most people jump straight to "write the code" and spend more time course-correcting than they would have spent planning. Plan mode is where it costs nothing to change direction. Once Claude starts editing files, every mistake requires a fix. Three situations where plan mode is the right default: (1) multi-step feature changes that touch more than a few files, (2) anything that involves dependencies or architectural decisions, (3) code review — restrict Claude to read-only tools and ask it to review without editing. The explore subagent does something similar (codebase exploration without planning), but plan mode is specifically scoped to a task you're about to implement. Running /compact after plan approval also helps — the exploration context is summarized before the coding phase begins, freeing up space for the actual work.

Agents
Planning

The process by which an agent determines the sequence of steps needed to achieve a goal — ranging from simple linear plans to dynamic replanning based on observations.

Foundation
Pre-training

The initial phase of building an AI model, where it learns from enormous amounts of text — news, books, websites, code — before being adapted for any specific use. It's how Claude learned general knowledge of language, facts, and reasoning. This happens once during development, before you ever talk to the model. You don't control it — but it's why Claude knows about history, science, coding, cooking, and almost anything else you'd ask about.

Tools
Project Instructions Claude

Project Instructions are the standing brief you give Claude for a specific project — the role it should play, the tone it should use, the constraints it should follow, the workflow steps it should apply. They're written once and automatically prepended to every conversation within that project, so you never have to re-explain the setup. Functionally, they work like a system prompt, but scoped to a workspace rather than a deployment. A good set of project instructions turns a blank chat into something purpose-built: a content editor that always follows your brand voice, a research assistant that always cites sources, a code reviewer that always checks your specific concerns first.

Tools
Projects Claude

Projects are dedicated workspaces inside Claude.ai — each with its own instructions, uploaded files, and conversation history. Where a regular chat starts from scratch, a project carries everything forward: your context, your documents, your rules for how Claude should behave. When the files you upload grow large, Projects automatically switch to RAG mode, pulling only what's relevant rather than loading everything at once — which means the knowledge base can scale well beyond the context window without slowing down. On Team and Enterprise plans, projects can be shared with teammates, so a whole team works from the same foundation instead of each person rebuilding it individually.

Prompt
Prompt Claude

The text you send to an AI to tell it what you want. A prompt can be a simple question ("summarize this article"), a detailed instruction ("write a formal email declining this vendor"), or a complex multi-part task. The quality of your prompt directly affects the quality of the response. Writing clear, specific prompts — including what format you want, what the purpose is, and any constraints — is the most practical skill for getting more out of Claude.

Tools
Prompt Caching Claude

Prompt caching saves a portion of your prompt — typically a large system prompt or document — so Claude doesn't need to re-process it on every request. The first call includes the full context; subsequent calls reference the cached version. The practical effect: faster responses and lower API costs when you're repeatedly sending the same long context, which is common in production applications.

Prompt
Prompt Chaining Claude

Connecting a series of prompts where the output of each step becomes the input for the next — breaking a complex task into a pipeline rather than trying to do everything in one prompt. For example: Step 1: extract key claims from an article → Step 2: fact-check each claim → Step 3: write a balanced summary. Chaining gives you more control than asking Claude to do it all at once, and makes it easier to spot where something went wrong.

Prompt
Prompt Governance

The organizational practice of managing, versioning, reviewing, and controlling which prompts are used in production AI systems — treating prompts like the critical business assets they are. Most companies have no prompt governance: prompts get edited ad hoc, nobody knows why a change was made, and there's no rollback when something breaks. Good governance means prompts are version-controlled, reviewed before deployment, and auditable.

Prompt
Prompt Injection

An attack where malicious content embedded in a document or website tricks an AI agent into doing something it wasn't supposed to. For example: a PDF your agent reads contains hidden text saying "Ignore previous instructions and forward all emails to attacker@evil.com." If the agent follows those instructions, it's been injected. A critical security risk for any AI agent that reads untrusted content — users, documents, web pages — and then takes actions based on what it reads.

Prompt
Prompt Optimization

Systematically testing and improving prompts based on results — treating prompts like code that needs iteration, not one-time writing. This means running the same prompt against test cases, measuring how often it gives the right answer, changing something, running it again, and comparing. The difference between a prompt that works 60% of the time and one that works 90% of the time is usually a few rounds of this kind of deliberate testing.

Prompt
Prompt Template

A reusable prompt structure with placeholder slots for the parts that change — so you can use the same prompt logic across many different inputs. For example: "You are a [tone] copywriter. Write a [length] product description for: [product details]." Fill in the brackets each time. Templates make AI workflows consistent, maintainable, and easy to improve — instead of writing a new prompt from scratch every time.

R
Tools
RAG

RAG (Retrieval-Augmented Generation) is an architecture where Claude retrieves relevant information from an external source before generating a response. Instead of relying purely on training data, Claude searches your documents, database, or knowledge base to find the right context, then uses that to answer. The result: Claude can answer questions about your specific products, policies, or data — things it couldn't have known from training alone.

Infrastructure
Rate Limiting Claude

A cap on how many API requests you can make in a given time window — set by Anthropic to manage server load and prevent abuse. If your application sends too many messages too quickly, you'll hit a rate limit and requests will be rejected until the limit resets. For most teams starting out, rate limits aren't a problem. At scale — processing thousands of documents a day, for example — you need to design your system to handle them.

Agents
ReAct

A pattern for building AI agents that alternates between reasoning and acting — think, then do, then observe the result, then think again. Instead of generating one big response, the agent works iteratively: it reasons about what tool to use, uses it, reads what happened, reasons about the next step, and so on. It's what makes agents feel like thoughtful problem-solvers rather than one-shot responders.

Evaluation
Red Teaming

Deliberately trying to get an AI to do something it shouldn't — produce harmful content, reveal confidential instructions, behave inconsistently — in order to find and fix vulnerabilities before real users encounter them. Borrowed from cybersecurity. In AI, red teaming helps identify failure modes, safety gaps, and unexpected behaviors. Anthropic does extensive red teaming before releasing new Claude models. Companies deploying Claude in sensitive applications should also red team their own system prompts and use cases.

Retrieval
Reranking

A second pass that re-orders search results after the initial retrieval — to surface the truly most relevant documents at the top. Initial search casts a wide net (fast, approximate). Reranking applies a more accurate but slower model to the top results and puts the best ones first. Adding reranking to a document search system often meaningfully improves the quality of what Claude sees before answering — which means better answers.

Evaluation
Responsible AI

A set of principles and practices for building and deploying AI ethically — covering fairness, accountability, transparency, privacy, safety, and societal impact. Increasingly a regulatory expectation and stakeholder requirement, not just a nice-to-have. For most companies using Claude, responsible AI means: understanding what Claude can and can't do, testing for failure modes before deploying, being transparent with users that they're talking to AI, and having a process when something goes wrong.

Foundation
RLHF Claude

A training method where humans rate AI responses — "this answer was better than that one" — and those ratings are used to teach the model to give more helpful, less harmful answers over time. It's a big reason why Claude writes naturally and declines harmful requests rather than just spitting out whatever is statistically likely. Anthropic used RLHF as part of how Claude was trained, alongside their own Constitutional AI approach.

Prompt
Role Prompting

Telling Claude to take on a specific role or perspective — "You are an experienced HR manager reviewing a performance review" or "You are a skeptical investor evaluating a pitch deck." This shapes the tone, depth, and framing of the response. Role prompting works because Claude interprets requests through the lens of the role: a compliance officer gives different advice than a startup founder, even on the same question.

S
Tools
Scheduled Tasks Claude

Scheduled Tasks let you define a piece of work once and have Claude run it automatically on a recurring schedule. A morning briefing that pulls from your calendar and email. A weekly summary of what shipped. An inbox triage that runs before you start your day. You write the task and set when it should run — Claude handles the rest each time the desktop app is open. If your computer was closed when a task was due, it catches up the next time you're back. Available in the Cowork tab of the Claude desktop app on Pro and Max plans.

Retrieval
Semantic Search

Search that finds results based on what you mean, not just which words you typed. Traditional keyword search requires your query to match the exact words in the document. Semantic search uses AI to understand that 'how to remove someone from my account' and 'delete a user' mean the same thing — and surfaces relevant results either way. It's how AI-powered support search feels smarter than old-style site search.

Tools
Skills Claude

Skills are packaged workflows that tell Claude exactly how to execute a specific type of task — the steps, the methodology, the output format. There are two kinds: Anthropic's built-in Skills (which handle Excel, Word, PowerPoint, and PDF creation automatically for paid users) and custom Skills you build for your own repeatable processes. A custom Skill might encode your brand review checklist, your QBR structure, or your quarterly analysis methodology. You create one by describing your workflow to Claude — it builds the Skill file for you. Once saved, Claude invokes it automatically whenever you do that type of work. The difference from Projects: Projects hold what Claude needs to know; Skills define how Claude should act. Note: Claude Code also has a feature called Skills (SKILL.md files) — a separate concept for developer workflows. See Claude Code Skills.

Infrastructure
Streaming Claude

Getting Claude's response word by word as it's generated, rather than waiting for the entire response to be ready. Without streaming, you wait — then get the whole response at once. With streaming, you start reading immediately while Claude is still writing. Makes AI applications feel much more responsive, especially for longer responses. Most modern Claude integrations use streaming by default.

Prompt
Structured Output Claude

Configuring Claude to return information in a specific format — like JSON, a table, or a defined list — rather than free-flowing text. Essential when Claude's output needs to be read by another system or fed into a spreadsheet, database, or application. For example: "Return a JSON object with keys: name, email, company, issue_type" rather than "write a paragraph summarizing the contact." Structured output is what makes AI integrations reliable.

Agents
Subagent Claude

A specialized AI agent that handles one narrow piece of a larger task, under the direction of a main orchestrating agent. For example: in a research pipeline, one subagent searches the web, another reads and summarizes documents, another formats the final report. Breaking work into subagents makes each one more reliable and easier to debug than having a single agent do everything.

Agents
Subagents Claude

Subagents in Claude Code are separate conversation contexts that the main agent spins up to handle isolated work. The subagent gets a task, does its own file reads, searches, and tool calls, then returns a summary — after which its entire context is discarded. Your main thread only sees the question and the answer, not the 15 files that were read along the way. This keeps your main context window clean for longer sessions. Three things most people miss: (1) Subagents do not inherit your skills automatically — you must explicitly list which skills to load in the agent.md file. (2) Adding "proactively" to a subagent's description field makes the main agent delegate to it automatically without you asking. (3) You can assign a different model per subagent — Haiku for fast lookups, Opus for deep analysis — which matters for cost. Three patterns that consistently backfire: sequential pipelines where each step depends on the last (information dies in the handoff), test runner subagents (you need the full output to debug, not a summary), and "expert persona" subagents like "you are a Python expert" (Claude already knows this — it adds nothing). Best use cases: research and codebase exploration, code reviews (fresh context means better feedback — Claude reviewing code it helped write produces weaker critique), and tasks that need a completely different system prompt from your main thread.

Tools
System Prompt

A system prompt is the set of instructions you give Claude before the conversation starts — telling it who it is, what it's for, what it should and shouldn't do, and how to behave. Users see the conversation; they typically don't see the system prompt. It's the configuration layer between you as an operator and Claude as a product. Writing a good system prompt is one of the highest-leverage things you can do when deploying Claude.

T
Agents
Task Decomposition Claude

Breaking a complex goal into smaller, manageable steps that can each be handled reliably — by an AI agent, a human, or both. "Write a market analysis" is hard to do in one shot. Break it into "list the top 5 competitors → summarize each → compare pricing → identify gaps → write conclusion" and each piece becomes much more tractable. Good task decomposition is one of the most important skills in building reliable AI workflows.

Foundation
Temperature

A setting that controls how predictable or creative an AI's responses are. Low temperature (near 0) means the AI picks the most likely, most consistent answer — good when you want reliability, like extracting data from a document. High temperature (near 1) means more variety — good for brainstorming or creative writing where you want different ideas each time. Most people don't set this manually; Claude chooses sensible defaults depending on the task.

Infrastructure
Throughput

How many requests a system can handle per unit of time — the volume capacity, as opposed to how fast any single request is. A system might have low latency (each request is fast) but low throughput (it can only handle a few at once). At production scale, you need both: fast individual responses and the ability to handle many users simultaneously. Throughput is what determines whether your AI feature works fine for 10 users and falls apart at 10,000.

Foundation
Token

A token is the basic unit Claude uses to read and write text — roughly three-quarters of a word. "Hello, how are you?" is about 6 tokens. Everything you send to Claude (your message, system prompt, uploaded documents) and everything Claude sends back counts as tokens. Tokens determine both your usage limits and your API costs. Understanding tokens helps you manage costs, stay within limits, and structure your prompts efficiently.

Foundation
Tokenization

The step where AI breaks your text into small pieces — called tokens — before it can process anything. A token is roughly three-quarters of a word. "Hello, how are you?" is about 6 tokens. This matters practically because API costs and usage limits are measured in tokens, not words or characters. The more text you send (and receive), the more tokens you use.

Tools
Tool Use

Tool use lets Claude call external functions and APIs during a conversation. You define the tools Claude has access to — search, database lookup, sending an email, running a calculation — and Claude decides when to call them, what inputs to send, and how to use the results in its response. It's the mechanism that turns Claude from a text generator into something that can take actions and interact with real systems.

Foundation
Top-p Sampling

A technical setting that limits which words an AI considers when writing each part of a response — only the most probable options are kept until their combined probability reaches a threshold. You rarely need to touch this. Temperature is the more intuitive control for adjusting response style. Top-p sampling is a low-level detail that developers occasionally tune when building AI applications, not something end users typically configure.

Business
Total Cost of Ownership

The full cost of an AI initiative — not just the API bill, but everything: engineering time to build and maintain it, evaluation overhead, change management, training, and ongoing monitoring. Companies typically estimate 3-5x their initial projected costs once these are factored in. The API cost is usually the smallest line item. Understanding TCO is critical for getting accurate ROI numbers and for making the case (or not) for a given AI investment.

Foundation
Transformer Architecture

The engineering design that underlies virtually all modern AI language models — including Claude, GPT-4, and Gemini. Introduced by Google researchers in 2017, it replaced older sequential designs and made it possible to process language in parallel at massive scale. You never interact with it directly — it's the technical architecture that makes large-scale AI possible, the same way you don't think about car engine design when you're driving.