
Live API vs. ETL: Choosing the Right Data Access Pattern for AI Agents

In brief

Not all data needs to be in a warehouse. Not all data can wait for an ETL cycle. The right pattern depends on freshness requirements, query shape, and whether the operation writes or reads — and in practice, you use both through the same internal MCP server.


Every AI agent data question eventually hits the same fork: should this come from a live API call or from the data warehouse?

The naive answer is "live is always better because it's fresher." The expensive answer is "warehouse for everything because it's cheaper." Neither is right. The actual answer depends on what you're asking, how quickly the data changes, and whether you need to write anything back.

Here's how to make the call.


The decision framework

Question                                         Live API    ETL / Warehouse
Does freshness matter (< 1 hour)?                ✓
Is it a single record lookup?                    Either      Either
Is it an aggregate or bulk query?                            ✓
Do you need to join across systems?                          ✓
High-volume (millions of rows)?                              ✓
Write / update (not just read)?                  ✓
Can you tolerate a few hours of lag?                         ✓
Does the data change multiple times per hour?    ✓

The logic behind each row:

Freshness < 1 hour. An ETL cycle typically runs every few hours, or nightly. If the answer to "did this customer's payment clear?" changes faster than your ETL runs, you need a live API call. If yesterday's revenue numbers are accurate enough, the warehouse is fine.

Single record lookup. Both patterns work for looking up one customer. The live API is fresher; the warehouse is cheaper. The tiebreaker is usually freshness requirements.

Aggregate or bulk query. "What's our average deal size by sales rep this quarter?" requires scanning potentially thousands of deal records. A live API call that touches HubSpot's endpoint for each deal doesn't scale. The warehouse is built for this.

Cross-system joins. "Which customers have an open support ticket and a renewal coming up this quarter?" touches Intercom and HubSpot. Your live APIs don't know about each other — you'd need to make two calls and manually join the results. The warehouse has both datasets and can join them in SQL.

Write operations. If Claude is updating a record — marking a task complete, logging a note, changing a deal stage — that has to go through the live API. Warehouses are (almost always) read-only.
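
To make that last row concrete: a write tool is defined like any read tool, but its implementation can only ever point at the live API. A minimal sketch, where log_deal_note, crmClient, and appendNote are illustrative stand-ins rather than any specific vendor SDK:

// Tool definition (what the client sees)
{
  name: 'log_deal_note',
  description: 'Appends a note to a deal record. Use when the user asks to log or record something against a deal.',
  // ... schema
}

// Tool implementation: writes always route to the live API
async function log_deal_note({ deal_id, note }) {
  // crmClient is a hypothetical wrapper around the CRM's REST API;
  // the warehouse is never on the write path
  return await crmClient.appendNote(deal_id, note)
}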


What this looks like in practice

Take a customer success manager using Claude as an assistant. Here are some of the questions they ask in a day:

Query                                                Pattern    Why
"What's the renewal status for Acme Corp?"           Live API   Payment status changes fast
"Which accounts have low product usage this month?"  Warehouse  Aggregate across multiple tables
"Who did I talk to at Acme last week?"               Warehouse  Historical, logged at ETL time
"Log a note on the Acme deal"                        Live API   Write operation
"What's our average renewal rate by industry?"       Warehouse  Aggregate, single query
"Did the payment for Acme come through?"             Live API   Real-time status
"Which customers renewed in Q1 vs. Q2 last year?"    Warehouse  Historical aggregate

The same agent, the same session, uses both patterns. The caller doesn't need to know which one — the internal MCP server routes based on the tool definition.


Building the routing layer

The beauty of going through an internal MCP server is that the client (Claude) doesn't care which path a tool takes. The tool is named and described; the implementation details stay hidden.

// Tool definition — client sees this
{
  name: 'get_customer_renewal_status',
  description: 'Returns the current renewal status for a customer, including payment status and contract end date. Use for real-time renewal questions.',
  // ... schema
}

// Tool implementation — routes to live API
async function get_customer_renewal_status({ customer_id }) {
  // This goes live — renewal status changes hourly
  return await chargeoverClient.getSubscription(customer_id)
}

// ────────────────────────────────────────────────

// Another tool definition
{
  name: 'get_renewal_cohort_analysis',
  description: 'Returns renewal rates by cohort, industry, and plan type. Use for trend analysis and reporting.',
  // ... schema
}

// Tool implementation — routes to warehouse
async function get_renewal_cohort_analysis({ quarter, group_by }) {
  // This goes to warehouse — aggregate query over historical data
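  // NOTE: group_by is interpolated directly into the SQL, so it must be
  // validated against an allowlist of column names before this point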
  return await warehouse.query(`
    SELECT
      ${group_by},
      COUNT(*) as total,
      SUM(CASE WHEN renewed THEN 1 ELSE 0 END) as renewed,
      ROUND(AVG(CASE WHEN renewed THEN 1.0 ELSE 0.0 END) * 100, 1) as renewal_rate
    FROM renewals
    WHERE quarter = ?
    GROUP BY ${group_by}
  `, [quarter])
}

Same server, two different backends, clean abstraction for Claude.


The hybrid pattern: warehouse with live fallback

The most practical architecture for many teams is a hybrid: the warehouse is the primary data source, with live API fallback for fields where freshness matters.

async function get_customer_profile({ customer_id }) {
  // Start with warehouse — has most of what we need.
  // query() returns an array of rows; this lookup yields exactly one.
  const [baseData] = await warehouse.query(`
    SELECT
      name, industry, arr, deal_stage, health_score,
      last_payment_date, renewal_date, open_tickets
    FROM customer_summary
    WHERE customer_id = ?
  `, [customer_id])

  // Augment with live payment status — this changes fast
  const paymentStatus = await chargeoverClient.getCurrentStatus(customer_id)

  return {
    ...baseData,
    payment_status: paymentStatus.status,  // Live
    payment_as_of: new Date().toISOString(),
  }
}

The result: 90% of the profile from a fast warehouse query, the high-freshness field from a targeted live call. Best of both patterns, minimal token cost.
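
One refinement: the warehouse row and the live payment status don't depend on each other, so the two round-trips can overlap. The same body with the calls issued in parallel (a sketch, same assumed clients as above):

const [rows, paymentStatus] = await Promise.all([
  warehouse.query(`
    SELECT
      name, industry, arr, deal_stage, health_score,
      last_payment_date, renewal_date, open_tickets
    FROM customer_summary
    WHERE customer_id = ?
  `, [customer_id]),
  chargeoverClient.getCurrentStatus(customer_id),
])
const baseData = rows[0]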


ETL pipeline considerations for AI workloads

If you're building ETL for AI agent consumption (not just BI dashboards), a few things matter that don't matter for traditional analytics:

Flatten your schemas. BI analysts write complex SQL. AI agents work better with pre-joined, flat tables. Build dedicated customer summary views that already have all the relevant fields joined in.

Add summary fields. Instead of a raw ticket count, store open_high_priority_tickets, days_since_last_activity, onboarding_completion_pct. These are the fields Claude will actually use.
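
A sketch of both ideas together, assuming Postgres-flavored SQL and hypothetical customers, deals, tickets, and activities tables (the execute method on the warehouse client is also a stand-in):

await warehouse.execute(`
  CREATE OR REPLACE VIEW customer_summary AS
  SELECT
    c.customer_id,
    c.name,
    c.industry,
    c.arr,
    d.deal_stage,
    -- summary fields the agent will actually use
    COUNT(t.ticket_id) FILTER (
      WHERE t.priority = 'high' AND t.status = 'open'
    ) AS open_high_priority_tickets,
    DATE_PART('day', NOW() - MAX(a.occurred_at)) AS days_since_last_activity
  FROM customers c
  LEFT JOIN deals d USING (customer_id)
  LEFT JOIN tickets t USING (customer_id)
  LEFT JOIN activities a USING (customer_id)
  GROUP BY c.customer_id, c.name, c.industry, c.arr, d.deal_stage
`)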

Version your ETL output. When your ETL schema changes, old cached values can confuse Claude if it's reading a mix of old and new formats. Version your output tables and have the MCP layer reference the current version explicitly.
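
A minimal sketch of that lookup, assuming a hypothetical etl_metadata table that the ETL job updates after each successful run:

async function currentTable(dataset) {
  // Returns e.g. 'customer_summary_v3'; tools build queries against
  // this name instead of hardcoding a table
  const rows = await warehouse.query(`
    SELECT table_name FROM etl_metadata
    WHERE dataset = ?
    ORDER BY version DESC
    LIMIT 1
  `, [dataset])
  return rows[0].table_name
}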

Log ETL run times. The MCP server should know when the last ETL completed. It can include that in context: "data as of 3:42 AM today." This lets Claude accurately caveat time-sensitive answers.
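
One way to build that in, assuming a hypothetical etl_runs table with one row per completed run:

async function withFreshness(data) {
  const rows = await warehouse.query(`
    SELECT MAX(completed_at) AS completed_at FROM etl_runs
  `)
  // Attach this to every warehouse-backed tool response so Claude
  // can caveat answers with "based on data as of ..."
  return { data, data_as_of: rows[0].completed_at }
}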

Monitor for ETL lag. If your ETL is supposed to run at 3 AM and it actually ran at 7 AM (because it failed and retried), the agent is working with data four hours staler than expected. Alerting on ETL lag is as important as alerting on API downtime.
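
The lag check can reuse the same run log. A sketch, with the six-hour threshold and the alerting client as stand-ins for whatever you already operate:

const rows = await warehouse.query(`
  SELECT MAX(completed_at) AS completed_at FROM etl_runs
`)
const lagHours = (Date.now() - new Date(rows[0].completed_at)) / 3_600_000
if (lagHours > 6) {
  // Run on a schedule, next to your API-downtime alerts
  await alerting.notify(`ETL lag: last run completed ${lagHours.toFixed(1)}h ago`)
}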


Common mistakes

Using live API for everything. Works fine with one user and one service. Breaks fast when you have 50 users, 10 connected services, and agents that make multiple calls per session. Token costs and API rate limits both become problems.

Using warehouse for everything. Your ETL runs at 3 AM. A user asks "did Acme's payment clear this morning?" Your warehouse says no, because the 9 AM payment landed hours after the last ETL run. The user thinks there's a billing problem. There isn't — the data is just stale. Warehouse-for-everything breaks on freshness-sensitive queries.

Not communicating data freshness. Claude should know (and tell the user) when data was last refreshed. "Based on data as of this morning" is a materially different claim than "as of right now." Build this into your tool responses.

Writing back through warehouse. This seems obvious but it still happens: someone tries to close a deal by writing a row to the warehouse. The warehouse is read-only (or should be). All write operations go through live APIs.


The right mental model

Think of your data access layer like a caching strategy in software engineering:

  • Hot cache (live API): The freshest data, highest cost, needed for real-time operations
  • Warm cache (internal MCP with shaped responses): Pre-fetched, cheaper, good for most queries
  • Cold cache (warehouse): Batch processed, cheapest, great for aggregates and history

Most data access patterns are warm cache or cold cache. Live API is the exception, not the default. Building with that mental model — and routing explicitly based on freshness requirements — is what separates AI infrastructure that scales from AI infrastructure that gets expensive and slow.
