For engineers who build infrastructure software — and are wondering what AI actually changes about the work.
A back-merge that changed how I think about software delivery.
We had just cut a release branch. A customer had been waiting multiple months for a fix. Not a complicated fix — a few days of actual work to sort it out thoroughly. But it had to travel through the full machinery: requirements sign-off, code review, CI, QA on the release branch, release coordination, the ship. Multiple months.
Then the back-merge broke some existing behaviors in the software.
Someone on a different stack had made a perfectly reasonable change a few weeks earlier. Well-written, well-tested. But it assumed a specific behavior in our layer — the same behavior the fix had just altered. Nobody caught it until the back-merge. Engineers who had already moved on to the next sprint got pulled back. A different customer, waiting for a different fix, waited longer.
Nobody did anything wrong. The process worked exactly the way it was supposed to.
That was the problem.
I have spent two decades in infrastructure software: protocols, SDN, data center networking, cloud. The work I remember most fondly happened inside HP, in something called the Lighthouse Program. The premise was simple: find customers who wanted to be early adopters of OpenFlow and SDN, embed a small engineering team with them, and build directly against their real production constraints. No release train. No bundled changes nobody asked for. You knew at the end of each week whether what you built actually worked for the people using it.
That felt nothing like the release cycle I just described. The difference was ownership. We understood the customer’s needs directly — not filtered through a PM’s interpretation of a call six weeks earlier.
It is the closest thing I have experienced to what people now call a Forward Deployed Engineer model. Nobody called it that at the time. The reason it stayed a programme rather than becoming a delivery model is the same reason the traditional release process exists: the economics. You cannot staff every customer that way, and managing separate customer branches across a large shared codebase creates its own category of problems.
That calculation is what AI is beginning to change.
TODAY: SHARED RELEASE
─────────────────────────────────────────────────────────────────────────
              ┌──────────────────────────────────────────┐
              │          Shared release branch           │
              │   bug fixes · new features · security    │
              │      library updates · deprecations      │
              └───────────────────┬──────────────────────┘
                                  │
                                  ▼
                      ┌───────────────────────┐
                      │     Release team      │
                      │  manages risk across  │
                      │     all customers     │
                      └────┬────────┬─────────┘
                           │        │
                   ┌───────┘        └───────┐
                   ▼                        ▼
              ┌──────────┐             ┌──────────┐
              │ Cust. A  │    · · ·    │ Cust. N  │
              │  full    │             │  full    │
              │  bundle  │             │  bundle  │
              └──────────┘             └──────────┘
                   │    back-merge: fixes made on release branch
                   └──► travel back to Main after ship
                        (integration conflicts arise here)

Cycle time: months
Every customer receives every change — wanted or not
FDE MODEL: PER-CUSTOMER
─────────────────────────────────────────────────────────────────────────
            ┌──────────────────────────────────────────┐
            │               Main branch                │
            │  architectural changes · security fixes  │
            └────────┬────────────────────┬────────────┘
                     │ one-way flow down  │
               ┌─────┘                    └──────┐
               │                                 │
               ▼                                 ▼
   ┌────────────────────────┐        ┌────────────────────────┐
   │     Cust. A branch     │        │     Cust. N branch     │
   │   customer-specific    │  ···   │   customer-specific    │
   │  bug fixes + features  │        │  bug fixes + features  │
   │  tailored to Cust. A   │        │  tailored to Cust. N   │
   │     FDE ×2 + agent     │        │     FDE ×2 + agent     │
   └──────────┬─────────────┘        └──────────┬─────────────┘
              │ direct deploy                   │ direct deploy
              ▼                                 ▼
   ┌──────────────────────┐          ┌──────────────────────┐
   │  Cust. A production  │          │  Cust. N production  │
   └──────────────────────┘          └──────────────────────┘

FDE tags select changes as MAIN-CANDIDATE
  └──► Architect team reviews
         └──► Accepted: absorbed into Main for all customers
              Rejected: stays customer-specific

CVE: fix lands on Main first → auto-ported to all customer
     branches → FDE reviews and approves per branch

Cycle time: weeks per customer
Each customer gets exactly what they asked for
What a Forward Deployed Engineer Actually Is
The Lighthouse Program gave me a sense of this before I had a name for it. What made it work was not the org structure or the programme mandate. It was that the engineers in that team could not hide behind process. When something you built did not work for the customer, you knew the same week. There was no release cycle to buffer the feedback. No PM to translate it. The accountability was direct, and that directness changed how carefully you thought before you built.
That is what an FDE carries into every customer engagement. Palantir coined the term and the standard they set is the right one: not someone who builds whatever the customer asks, but someone confident enough to push back when the customer is asking for the wrong thing. Building the wrong thing politely for six months is an easy failure mode in this role.
For infrastructure software, the job is harder than how Palantir originally defined it. Their FDEs build on Palantir’s shared platform — the codebase stays singular. What I am describing is an engineer who owns a customer-specific branch of the actual product codebase. Not a configuration layer on top. The code itself, branched, evolving, deployed in that customer’s environment. When something breaks in production for that customer, you are the person who fixes it. There is no release team behind you. That accountability does not rotate to someone else at the end of a sprint.
What makes this newly viable is not a change in what the role demands. It is AI tooling that compresses per-engineer workload enough to make the economics work across more than a handful of customers.
What AI Actually Changes — Without the Hype
Working with AI coding agents every day on a real codebase gives you a calibrated sense of what these tools actually do. On well-scoped, clearly specified tasks (generating tests, implementing a clearly described change, writing documentation, triaging a CI failure, porting a security patch), the productivity gain is real and consistent. On ambiguous architectural problems in mature infrastructure code, it is close to zero. The tools are genuinely useful for the first category and not yet useful for the second.
That is enough to shift the headcount math on per-customer delivery. If a small team with AI assistance covers ground on those implementation tasks that previously needed a much larger team, a dedicated FDE pairing per high-value customer starts to become viable. Not across your entire customer base. But for the tier of accounts whose contract value justifies the investment — and who have been waiting on bundled releases for specific things they actually needed — the economics are beginning to move.
TASK       FDE OWNS                              AGENT HANDLES
─────────────────────────────────────────────────────────────────────────────
Monitoring Read production telemetry         ──► Surfaces anomalies in logs;
           Know what looks wrong in this         flags unusual patterns
           customer's specific environment

Bug fix    Diagnose root cause               ◄── Generates candidate fix
           Root cause or symptom?                (60–70% of well-defined bugs)
           Match this customer's config?

After fix  Verify against branch context     ──► Generates regression tests
           What would break that the             and documentation from the
           agent cannot know?                    change just made

Feature    Scope with customer: what exactly ──► Implements the agreed scope;
request    is needed, what does done look        generates tests; writes
           like, what must not change?           API and user-facing docs
           FDE defines acceptance criteria.

On a CVE   Review the ported patch           ◄── Auto-ports fix from Main;
           Apply cleanly to this branch?         opens PR for FDE review
           Any customer-specific conflicts?
─────────────────────────────────────────────────────────────────────────────
──► FDE delegates a task to the agent
◄── Agent returns output; FDE reviews before it ships

The FDE does less implementing and more judging. That is the point.
The Mental Shift That Actually Matters
There is a real difference between using AI tools and working in a genuinely AI-native way. It took me a while to feel it rather than just describe it.
Using AI tools means accepting autocomplete suggestions when they look reasonable. Modest gains at the keyboard.
Working AI-natively means you have changed how you decompose problems before you touch the keyboard.
When I am building the Network Ghost Agent — my agentic network forensics tool — I do not start a new piece of work thinking about how to write the code. I start thinking about how to specify the problem precisely enough that the agent can handle the implementation while I stay focused on the calls that require knowing what I know.
TRADITIONAL ENGINEER                    AI-NATIVE FDE
───────────────────────────────────────────────────────────────────────────────
Requirement arrives                     Requirement arrives
    │                                       │
    ▼                                       ▼
[Human] High-level design               [Human] Understand the real problem.
        Map system interactions,                What does this customer
        component dependencies                  specifically need?
                                                What must not break?
    │                                       │
    ▼                                       ▼
[Human] Low-level design brief          [Human + Agent] Decompose and specify.
        Detailed implementation plan            Draft precisely enough that
        Review with team / architect            the agent can act on it.
                                                Human corrects for customer
                                                context; agent fills structure.
    │                                       │
    ▼                                       ▼
[Human] Implements the change           [Agent] Implements. Writes tests.
        Writes tests after                      Writes documentation.
    │                                       │
    ▼                                       ▼
[Human] Code review                     [Human] Reviews agent output.
        Errors caught late in cycle             "What can only I verify here?"
    │                                       │
    ▼                                       ▼
On the release branch,                  Sanity + targeted regression
full test suite runs:                   on this branch only:
  · Stress tests                          · Only FDE-approved changes
  · Performance tests                       have landed here
  · QA validation                         · No bundled changes from
  · Solution tests                          other customers or features
  · Full regression suite                 · Risk surface is known
Risk is high: many changes,                 and contained
many customers affected,                  · Test scope is proportional
unknown interactions.                       to what actually changed
    │                                       │
    ▼                                       ▼
Ships on release schedule               Ships to this customer
months · all customers together         weeks · this customer only
───────────────────────────────────────────────────────────────────────────────
[Human] = engineer owns this step
[Agent] = AI coding agent handles this
[Human + Agent] = collaborative: human steers, agent executes

The failure mode I run into consistently is not the agent being unable to produce code. It is the agent being confidently, plausibly wrong. Asking for hedged language does not help: a confidently wrong answer passes straight through that filter, because the agent is not uncertain and so has nothing to flag. You catch it by building structural verification into your workflow, making the agent prove its answer rather than just state it.
The same discipline applies when using an agent to implement a fix on a customer’s branch. The question is not “did it generate code?” It is “did it generate code that is correct for this customer’s specific context?” The agent cannot fully know that context. You can. That gap is where your judgment lives.
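One way to make "prove it rather than state it" concrete is a fail-before, pass-after gate: the agent's regression test must fail on the unpatched branch and pass on the patched one before the fix is even considered for review. The sketch below is a minimal illustration under stated assumptions. `verify_fix`, `Verdict`, and the `run_test(patched)` hook are hypothetical names, and the timeout functions are toy stand-ins for a customer branch, not anything from a real tool.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Verdict:
    accepted: bool
    reason: str


def verify_fix(run_test: Callable[[bool], bool]) -> Verdict:
    """Accept a fix only if its test fails before the patch and passes after.

    run_test(patched) is a hypothetical hook: it runs the regression test
    against the unpatched (False) or patched (True) build and returns True
    when the test passes.
    """
    if run_test(False):
        return Verdict(False, "test passes without the patch, so it does not reproduce the bug")
    if not run_test(True):
        return Verdict(False, "test still fails with the patch applied")
    return Verdict(True, "test fails before the patch and passes after")


# Toy stand-ins for the branch before and after the agent's patch.
def clamp_timeout_buggy(seconds: int) -> int:
    return seconds  # bug: negative timeouts are passed through


def clamp_timeout_patched(seconds: int) -> int:
    return max(seconds, 0)


def regression_test(patched: bool) -> bool:
    clamp = clamp_timeout_patched if patched else clamp_timeout_buggy
    return clamp(-5) == 0


print(verify_fix(regression_test))
```

A gate like this only shows that the patch changes the behavior the test describes. Whether the test describes the right behavior for this customer is still the FDE's call.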
What changes when you work this way is not just speed. It is where your attention goes. In the traditional model, a significant portion of a senior engineer’s week goes to coordination: aligning with other teams, managing merge dependencies, attending reviews that exist because your change touches something someone else owns. In the FDE model that overhead largely disappears, because you own the full scope for one customer. What fills the space is not more implementation. It is more thinking. More time spent actually understanding what the customer needs before anything is built. More time in the codebase understanding second-order effects. More time reviewing what the agent produced and asking whether it is right for this specific environment. Engineers who have spent years wanting more time to think and less time coordinating tend to find this shift disorienting at first and then hard to give up.
Who Can Do This
This skews toward senior engineers at the start — not primarily because of technical skill, but because of something more specific: making consequential decisions without a safety net.
In a traditional R&D org, the release process is itself a quality filter. A mistake gets caught in code review, then QA, then the release gate. An FDE’s decisions reach the customer’s production system more directly. When something breaks, you diagnose and fix it on your own, under customer pressure. That requires having seen enough systems fail to recognise failure early. Judgment accumulates with years of practice in environments where the consequences stick.
That said — the AI tools are compressing this. An engineer who learns to work with coding agents not as autocomplete but as implementation partners — who develops precision in specification and rigor in verification — can cover ground well beyond what their years of experience would traditionally predict. Start with senior engineers. Then watch what actually happens.
What Has to Exist for This to Work
Security stays centralized. When a CVE comes in, the fix goes to the main codebase first and an automated pipeline ports it to every affected customer branch for FDE review. One engineer discovering a vulnerability and patching it in isolation is how customers end up unevenly protected.
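The ordering rule in that pipeline (Main first, then every affected branch, each gated on FDE review) can be sketched as a small planning function. This is an illustrative sketch only: `CustomerBranch`, `PortRequest`, and `plan_cve_ports` are hypothetical names, the component sets are made up, and a real pipeline would open pull requests rather than return a list.

```python
from dataclasses import dataclass
from typing import FrozenSet, List


@dataclass(frozen=True)
class CustomerBranch:
    name: str
    fde: str                    # engineer who must approve changes to this branch
    components: FrozenSet[str]  # components this branch actually ships


@dataclass(frozen=True)
class PortRequest:
    branch: str
    reviewer: str
    commit: str
    status: str = "awaiting-fde-review"  # nothing merges without FDE approval


def plan_cve_ports(commit: str, component: str, on_main: bool,
                   branches: List[CustomerBranch]) -> List[PortRequest]:
    """Plan one port per affected customer branch, after the fix is on Main."""
    if not on_main:
        raise ValueError("CVE fix must land on Main before it is ported")
    return [
        PortRequest(branch=b.name, reviewer=b.fde, commit=commit)
        for b in branches
        if component in b.components
    ]


branches = [
    CustomerBranch("cust-a", "alice", frozenset({"openflow", "bgp"})),
    CustomerBranch("cust-n", "nikhil", frozenset({"bgp"})),
]
ports = plan_cve_ports("abc1234", "openflow", True, branches)
for port in ports:
    print(port)
```

Refusing to plan ports until the fix is on Main is the point of the design: it makes the isolated, single-branch patch impossible by construction.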
Every customer branch needs a living record of how it has diverged from Main, and why. An AI agent can maintain this; the FDE reviews it. Without it, all the context lives in one person’s head — which is fine until that person leaves.
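As a rough picture of what such a record might hold, here is a minimal sketch. The field names, the `BranchRecord` class, and the sample entries are all assumptions made for illustration. The one design choice worth noting is that an entry with no stated reason is treated as a review item for the FDE rather than silently accepted.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class Divergence:
    commit: str
    summary: str
    reason: str                   # why this branch differs from Main
    main_candidate: bool = False  # tagged MAIN-CANDIDATE for architect review


@dataclass
class BranchRecord:
    customer: str
    entries: List[Divergence] = field(default_factory=list)

    def add(self, entry: Divergence) -> None:
        self.entries.append(entry)

    def missing_reason(self) -> List[Divergence]:
        """Entries the FDE still has to explain before the record is trusted."""
        return [e for e in self.entries if not e.reason.strip()]

    def render(self) -> str:
        lines = [f"Divergence from Main: {self.customer}"]
        for e in self.entries:
            tag = " [MAIN-CANDIDATE]" if e.main_candidate else ""
            why = e.reason.strip() or "REASON MISSING"
            lines.append(f"- {e.commit} {e.summary}{tag}: {why}")
        return "\n".join(lines)


record = BranchRecord("cust-a")
record.add(Divergence("1a2b3c4", "Raise flow-table limit",
                      "Customer runs larger fabrics than the default allows", True))
record.add(Divergence("5d6e7f8", "Disable legacy probe", ""))
print(record.render())
```

An agent can append entries like these from commit history. The `missing_reason` check is where the human review happens, because the "why" is the part only the FDE can supply.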
Someone needs to decide which upstream architectural changes the FDE must take. Not a committee. A tag on commits: required, recommended, or optional. The FDE needs to know what they can defer without a meeting to find out.
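A tag like that is easy to read mechanically. The sketch below assumes the tag travels as a commit-message trailer named `Upstream-Adoption`. That trailer name, the `Adoption` enum, and the `triage` helper are hypothetical conventions invented for this example, not an existing standard.

```python
import re
from enum import Enum
from typing import Dict, List, Optional


class Adoption(Enum):
    REQUIRED = "required"
    RECOMMENDED = "recommended"
    OPTIONAL = "optional"


# Hypothetical trailer convention: "Upstream-Adoption: required"
_TRAILER = re.compile(r"^Upstream-Adoption:[ \t]*(\w+)[ \t]*$",
                      re.MULTILINE | re.IGNORECASE)


def adoption_of(message: str) -> Optional[Adoption]:
    """Return the adoption level tagged in a commit message, or None if untagged."""
    match = _TRAILER.search(message)
    if match is None:
        return None
    return Adoption(match.group(1).lower())  # raises ValueError on an unknown level


def triage(commits: Dict[str, str]) -> Dict[str, List[str]]:
    """Split upstream commits into what the FDE must take, may defer, or must query."""
    buckets: Dict[str, List[str]] = {"must_take": [], "can_defer": [], "untagged": []}
    for sha, message in commits.items():
        level = adoption_of(message)
        if level is None:
            buckets["untagged"].append(sha)
        elif level is Adoption.REQUIRED:
            buckets["must_take"].append(sha)
        else:
            buckets["can_defer"].append(sha)
    return buckets


upstream = {
    "a1": "Rework flow-table locking\n\nUpstream-Adoption: required\n",
    "b2": "Faster log rotation\n\nUpstream-Adoption: optional\n",
    "c3": "Tidy build scripts\n",
}
print(triage(upstream))
```

Untagged commits get their own bucket on purpose: a missing tag is a question for whoever owns the tagging, not permission to defer.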
What I Think Comes Next
The coordination overhead that accumulates in late-career infrastructure engineering — the back-merges, the release syncs, the regression triages that pull you back from work you already finished — exists for real reasons. It is risk management for shared codebases under real customer obligations. It is rational.
What is changing is the constraint. AI coding agents make it possible for a small, focused team to cover the implementation ground that previously required a much larger one. The Lighthouse Program worked because embedded ownership works. What it lacked was tooling that could make it viable beyond a handful of customers.
That tooling is arriving. The answer to whether it works at scale will come from engineers willing to try it on a real customer, with real accountability, and see what actually happens.
I am one of them. If you have been a builder your whole career and are wondering whether the FDE path is one you could walk — I think it is worth finding out.
