The 5 portfolio projects that actually signal FDE readiness
In brief
A GitHub full of side projects tells an FDE hiring manager that you can code. What they actually want to see is evidence that you can build in the real world — against legacy systems, ambiguous requirements, and non-technical stakeholders. These five projects show exactly that.
Most developer portfolio advice is the same: build three projects, put them on GitHub, write a README. That advice works for standard software engineering roles. FDE hiring is different.
FDE hiring managers are evaluating something specific: can you build production AI systems inside a real enterprise environment, against undocumented APIs, incomplete requirements, and non-technical stakeholders? A clean tutorial-complete project doesn't answer that question. It answers a different question: can you follow instructions?
These five projects are designed to answer the right question.
What FDE interviewers actually look for
Before the projects themselves, it's worth understanding the evaluation criteria. When an FDE interviewer reviews your portfolio, they're looking for:
Production thinking — Error handling, observability, access control, graceful degradation. Does the project handle failure modes, or does it just work when everything goes right?
Architecture decision-making — Not just "here's what I built." Why did you build it this way? What did you consider? What did you reject? A project without architecture reasoning tells them you built it, not that you know how to design systems.
Communication for a non-technical audience — Can a non-engineer understand what you built and why? FDEs spend significant time with stakeholders who don't code. Your README and documentation are a proxy for this skill.
Evidence of real-world constraints — Did you deal with anything messy? Authentication issues, undocumented API behavior, data inconsistency, rate limits? Projects with no friction suggest you've only worked in clean environments.
Project 1: An MCP server that connects 3+ real data sources
Why this works:
Anthropic's FDE job listing explicitly lists MCP server development as a deliverable. This is not a generic "show you know APIs" project. MCP servers are how FDEs connect Claude to enterprise data — the internal CRM, the data warehouse, the ticketing system. Building one that connects multiple sources demonstrates you understand data access patterns, authentication, and routing, which is the core technical architecture of almost every enterprise AI deployment.
What to build:
An MCP server that connects at least 3 real data sources. Not toy examples. Good combinations:
- GitHub (via API) + Notion (via API) + a CRM API (HubSpot has a free tier)
- Slack (via webhook) + Linear or Jira + a database you control
- Any combination that involves a structured data source, an unstructured knowledge source, and an action-capable source
The repo should have a clear README that explains:
- What the server does
- Why you chose these three data sources (what use case does this serve?)
- How authentication works for each source
- What happens when one source is unavailable
What to include:
An access control layer. Even a simple one — not every tool should be available to every query. Showing that you thought about who can access what signals production thinking that most junior developers skip.
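Even the simplest version of such a layer can be a plain allowlist checked before any tool is dispatched. A minimal sketch, assuming hypothetical role and tool names (nothing here comes from a real MCP SDK or deployment):

```python
# Hypothetical per-role tool allowlist for an MCP-style server.
# Role names and tool names are illustrative only.
ALLOWED_TOOLS = {
    "support_agent": {"search_tickets", "read_kb_article"},
    "sales_agent": {"search_crm", "read_kb_article"},
    "admin": {"search_tickets", "read_kb_article", "search_crm", "update_crm"},
}

def authorize(role: str, tool: str) -> bool:
    """Return True only if the role's allowlist includes the tool."""
    return tool in ALLOWED_TOOLS.get(role, set())

# A support agent can read knowledge-base articles but cannot write to the CRM.
assert authorize("support_agent", "read_kb_article")
assert not authorize("support_agent", "update_crm")
```

Twenty lines like this won't impress anyone on their own, but documenting why each role maps to each tool set is exactly the kind of reasoning interviewers look for.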
Error handling for each source individually. If the GitHub API is rate limited, the server should degrade gracefully, not crash. Document what happens in each failure case.
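The per-source wrapping can be sketched independently of any MCP SDK. Here the failing GitHub call and the `SourceUnavailable` error are stand-ins; the point is that one degraded backend returns a labeled empty result instead of crashing the server:

```python
# Sketch: wrap each data source call so one failing backend degrades
# that source, not the whole server. All names here are hypothetical.
import logging

class SourceUnavailable(Exception):
    """Raised internally when a backend cannot serve the request."""

def fetch_github_issues(query: str) -> list[str]:
    # Placeholder for a real GitHub API call that may be rate limited.
    raise SourceUnavailable("GitHub API rate limit exceeded")

def safe_query(source_name, fetch, query):
    """Call one source; on failure, return an empty, clearly labeled result."""
    try:
        return {"source": source_name, "ok": True, "results": fetch(query)}
    except SourceUnavailable as exc:
        logging.warning("%s unavailable: %s", source_name, exc)
        return {"source": source_name, "ok": False, "results": []}

result = safe_query("github", fetch_github_issues, "open bugs")
# The server keeps running and reports the degraded source to the caller.
```

The `ok` flag matters: the model (or the user) should be told a source was skipped, not silently handed partial results.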
What not to do:
Don't wrap three toy APIs that return static data. The point is real APIs with real authentication, real rate limits, real inconsistent behavior. The friction is the point.
Project 2: An evaluation framework for a real use case
Why this works:
Production AI agents fail in ways that are genuinely hard to detect without a formal eval suite. Most junior developers skip evals entirely and rely on "it works when I test it." FDE hiring managers know this is how production systems accumulate silent failures. An eval framework shows you think about reliability and regression prevention — the thing that separates "built it" from "shipped it and maintained it."
What to build:
Choose a specific agent workflow — not "is the model good?" but something like: "An agent that answers questions about a company's HR policies from a document corpus" or "An agent that routes customer support tickets to the right team." Then build an eval suite for that specific workflow.
The suite should include:
- At least 20 test cases
- Cases that cover edge cases and failure modes, not just easy examples
- A way to run the suite and see results in aggregate (pass rate, failure patterns)
- Documented failure modes you found during development
What to include:
Show the failure modes you found. An eval suite that shows 100% pass rate on the first version looks like you wrote easy tests. An eval suite that shows 73% pass rate, followed by the fixes you made, followed by 94% pass rate, shows you actually found real problems.
Document your scoring methodology. How did you decide what "correct" means for this task? This is a design decision with real consequences.
What not to do:
Don't build a generic "LLM quality evaluator." That's a product, not a portfolio project. Build an eval for one specific task, done well, and explain why the cases you chose are the right cases.
Project 3: An enterprise integration with real constraints
Why this works:
FDEs work inside legacy enterprise environments. This is not optional context — it's the core of the job. A project that shows you've dealt with undocumented APIs, authentication headaches, data inconsistencies, or compliance-adjacent requirements signals that you've encountered real-world engineering, not just tutorial engineering.
What to build:
An integration with a real enterprise tool. Options that work well:
- Salesforce (free developer edition available) — their API is well-documented but their data model is complex
- A legacy database — set up a PostgreSQL or MySQL instance with a realistic messy schema and build an AI layer on top of it
- An ERP simulator — some open-source ERP tools exist for exactly this purpose
- Any internal tool at a company where you've worked or interned — even a small integration counts
What to document:
This project's value is in the documentation, not just the code. You need to write about:
- What broke or was unclear when you started
- How you figured out the undocumented behavior
- What you'd do differently
- What a non-technical stakeholder would need to understand about how this integration works
If you've built this at an actual company (internship, job, freelance), document the real constraints and challenges. Anonymize the company if needed — the details are what matter, not the name.
What makes this project genuinely valuable:
It's evidence of adaptability. Anyone can build against a well-documented REST API that returns consistent JSON. Not everyone has dealt with an API that sometimes returns XML and sometimes JSON, or a database where the "customer_id" field means different things in different tables. Those are the conditions FDEs work in every day.
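Handling that kind of inconsistency usually means a normalization layer at the boundary. A sketch of one, assuming a hypothetical endpoint and field names, using only the standard library:

```python
# Sketch: normalize an endpoint that unpredictably returns JSON or XML.
# The "customer" payload shape is illustrative, not from a real API.
import json
import xml.etree.ElementTree as ET

def parse_customer(raw: str) -> dict:
    """Return {'customer_id', 'name'} regardless of the wire format."""
    text = raw.strip()
    if text.startswith("<"):
        root = ET.fromstring(text)
        return {"customer_id": root.findtext("id"), "name": root.findtext("name")}
    data = json.loads(text)
    return {"customer_id": str(data["id"]), "name": data["name"]}

assert parse_customer('{"id": 42, "name": "Acme"}') == \
       {"customer_id": "42", "name": "Acme"}
assert parse_customer("<customer><id>42</id><name>Acme</name></customer>") == \
       {"customer_id": "42", "name": "Acme"}
```

Documenting when and why the API switches formats — and how you discovered it — is worth more in the README than the parsing code itself.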
Project 4: A written architecture decision record (ADR)
Why this works:
This one is not code. It's a document. But it might be the most differentiating project on this list.
FDEs are constantly explaining architecture decisions to non-technical executives and stakeholders. The ability to write a clear, honest ADR — explaining what you chose, what you rejected, what the tradeoffs were, and what you'd revisit — is a direct signal of communication skill that most technical portfolios don't demonstrate at all.
What to write:
Pick one architecture decision from one of your other portfolio projects. Write a 1–2 page document covering:
- The situation — what were you building, what decision did you need to make
- The options you considered — at least 3, with their honest tradeoffs
- The decision — what you chose and why
- What you'd revisit — with more time, more information, or different constraints, what would you do differently?
What good looks like:
A good ADR doesn't pretend the decision was obvious. It shows that you genuinely evaluated alternatives and made a defensible choice. The "what I'd revisit" section is particularly valuable — it shows intellectual honesty and the ability to think past the decision you already made.
What not to do:
Don't write an ADR justifying a decision after the fact with only positive framing. The reader should feel like they're watching you think, not reading your defense of a conclusion you'd already reached.
Where to put it:
A linked document from your project README. When the interviewer asks "what was the hardest architecture decision in this project?" your answer should start with "I actually wrote an ADR for that" and then link them to it.
Project 5: A post-mortem on something that broke in production
Why this works:
The FDE interview almost always includes a variation of: "Tell me about something you shipped that broke. What happened?" Having a written post-mortem shows three things at once: that you've shipped things to production (not just side projects), that you think in systems (not just "my code was buggy"), and that you're honest about failure in a way that signals trustworthiness.
What to write:
500–1,000 words covering:
- What broke — describe the failure specifically, not vaguely
- Why it broke — the root cause, not just the symptom
- How you diagnosed it — what was your process for figuring out what happened
- What you changed — the fix, but also the process or architectural change you made to prevent it from recurring
- What you'd do differently — the honest reflection
What if nothing has broken in production?
Then you haven't shipped enough to production. This is a signal. Build something real, deploy it somewhere, and accept that it will break in ways you didn't anticipate. If you've only ever built on localhost, the post-mortem isn't available to you yet — and that's a gap worth closing before you interview.
If you've had real production failures at work that you can describe (appropriately anonymized), those count. A post-mortem from an internship where you broke a staging environment is still a post-mortem.
Where to put it:
A linked document from the relevant project's README. Or a public post on your personal site or GitHub. The medium matters less than the existence and quality.
How to present the portfolio
Repository structure:
Each project should have:
- A README that opens with what the project does and why it exists (not what technologies it uses)
- A section on architecture decisions (link to the ADR if you wrote one)
- A section on known limitations and what you'd improve
- Installation/running instructions that actually work
In interviews:
When asked to walk through a project, don't start with the technology stack. Start with: "The problem I was solving was..." Then the constraints. Then the decisions. The technology is context, not the story.
If you've written ADRs and post-mortems for your projects, reference them during the walkthrough. Most candidates don't have them. Having them is a signal.
Red flags to avoid
Tutorial-complete projects — If the project is clearly from a tutorial (the dataset is MNIST, the API is a weather API, the structure is exactly like a guide you followed), it signals you haven't moved beyond directed learning. FDE hiring requires independent judgment.
No error handling — Check every project. Does it fail silently? Does it throw uncaught exceptions? Does it have any logging? Error handling is the single clearest signal of whether you've thought about production.
No tests, no evals — At minimum, you should be able to explain what you would test and how. Better: actually have tests.
No documentation of what you'd do differently — Every project should have a "limitations" or "what I'd improve" section. If everything looks perfect, it signals you either don't know enough to see the problems, or you're not being honest. Neither is reassuring to an FDE interviewer.
Projects that can't be run — If the interviewer tries to run your project and it fails, that's a significant negative signal. Test your own setup instructions on a clean machine before interviewing.
The five projects above are not a checklist to complete mechanically. Each one is designed to answer a specific question an FDE interviewer is asking. The MCP server answers: can you build the enterprise data connectivity layer? The eval framework answers: do you think about reliability, not just capability? The enterprise integration answers: have you dealt with real-world constraints? The ADR answers: can you communicate architecture decisions to a non-technical audience? The post-mortem answers: have you shipped to production and learned from what broke?
If you can answer all five questions, you're ready for the FDE interview.