B One Consulting

Build vs Buy. When to develop your AI agent and when to license a platform.

The build-or-buy question, framed that way, is the wrong question. The right question is which capabilities you need to own to control your operating model, and which you can rent from the market without losing optionality. Most production stacks worth running are a blend, and the work is to draw the line in the right place for your context.

Why "build vs buy" is a false binary.

The build vs buy conversation usually arrives in a polarised form. One camp argues for buying a platform because the market is moving fast and internal builds are slow. The other argues for building because vendors are charging premium prices for capabilities the team could put together in a quarter. Both arguments contain truth and both miss the structural point.

Every production AI system we have helped build, supervised or audited has been a blend. The model itself was rented from a frontier provider. The orchestration framework was open source with extensions. The vector store was a managed service. The evaluation suite, the audit trail and the prompt library were built in-house because they encoded knowledge specific to the client. The question we put to leadership teams is not whether to build or buy. It is which layers belong in which column, and what the cost of being wrong on that allocation looks like.

In our experience the dispute about build vs buy is almost always a proxy for a different set of questions. Who controls the rate of change. Who owns the differentiation. Who carries the risk when the underlying technology shifts under the system. Those are the conversations worth having, and the framework below tends to move them along.

What you should own.

The domain ontology.

By domain ontology we mean the formal representation of the entities, relationships and rules that define your business. What a customer is, what a transaction is, what an exception looks like. This is not something a vendor can sell you in a generic form. Owning it is what allows any AI capability you deploy to be grounded in your reality rather than in a generalised one. The teams that get this right tend to spend the first weeks of any AI project on the ontology before they touch a model. The teams that skip it tend to discover that every output requires substantial cleanup.

The evaluation pipeline.

The set of test cases that defines what good looks like for your agents is core intellectual property. It encodes operator knowledge, edge cases, regulatory expectations and tone of voice in a form the team can execute against on every change. Renting an evaluation framework is reasonable. Renting the cases themselves makes no sense. We see clients confuse the two often enough that it has become one of the first things we look at in an audit.
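One way to keep the cases separate from the framework is to hold them as plain data that any runner can load. The sketch below is illustrative only: `EvalCase`, `dump_cases`, `load_cases`, `run_cases` and the stand-in agent are hypothetical names, and the substring check is a deliberately simple placeholder for whatever scoring your rented framework provides.

```python
import json
from dataclasses import asdict, dataclass
from typing import Callable


@dataclass(frozen=True)
class EvalCase:
    """One owned test case: an input plus the phrases a good answer must contain."""
    case_id: str
    prompt: str
    must_contain: tuple[str, ...] = ()


def dump_cases(cases: list[EvalCase]) -> str:
    """Serialise cases to framework-neutral JSON so they can move between tools."""
    return json.dumps([asdict(c) for c in cases], indent=2)


def load_cases(text: str) -> list[EvalCase]:
    """Rebuild cases from the JSON produced by dump_cases."""
    return [
        EvalCase(d["case_id"], d["prompt"], tuple(d["must_contain"]))
        for d in json.loads(text)
    ]


def run_cases(agent: Callable[[str], str], cases: list[EvalCase]) -> dict[str, bool]:
    """Run every case against any callable agent and report pass/fail per case."""
    results = {}
    for case in cases:
        output = agent(case.prompt).lower()
        results[case.case_id] = all(s.lower() in output for s in case.must_contain)
    return results


# Hypothetical cases and agent, for illustration only.
CASES = [
    EvalCase("refund-window", "Can I get a refund after 60 days?", ("30 days",)),
    EvalCase("tone-apology", "My order arrived broken.", ("sorry",)),
]


def stub_agent(prompt: str) -> str:
    if "refund" in prompt.lower():
        return "Refunds are available within 30 days of purchase."
    return "Thanks for letting us know."


results = run_cases(stub_agent, load_cases(dump_cases(CASES)))
```

Because the cases round-trip through plain JSON, the runner can be swapped for a commercial framework without touching the cases themselves.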

The audit trail.

The record of what the system did, why, and on whose authority is something you cannot afford to have stored only in a vendor's database under a vendor's retention policy. Own the trail. Mirror it into infrastructure you control. The regulatory and operational reasons are well rehearsed and the cost of doing it well is modest. The cost of discovering you cannot answer a regulator on time because the vendor's export schedule is monthly is significant.
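A minimal sketch of a trail you hold yourself, assuming each record captures what was done, why, and on whose authority. The field names and the `append_record` and `verify_trail` helpers are hypothetical. The hash chaining is an optional extra added here for tamper evidence, not something the text above prescribes.

```python
import hashlib
import json


def _digest(body: dict) -> str:
    """Stable SHA-256 over a record body (keys sorted so the hash is reproducible)."""
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode("utf-8")).hexdigest()


def append_record(trail: list, action: str, reason: str, authority: str, timestamp: str) -> dict:
    """Append a record that links to the previous one through its hash."""
    prev_hash = trail[-1]["hash"] if trail else "0" * 64
    body = {
        "timestamp": timestamp,
        "action": action,
        "reason": reason,
        "authority": authority,
        "prev_hash": prev_hash,
    }
    record = {**body, "hash": _digest(body)}
    trail.append(record)
    return record


def verify_trail(trail: list) -> bool:
    """Return True only if every record is unmodified and correctly chained."""
    prev_hash = "0" * 64
    for record in trail:
        body = {k: v for k, v in record.items() if k != "hash"}
        if body["prev_hash"] != prev_hash or _digest(body) != record["hash"]:
            return False
        prev_hash = record["hash"]
    return True


# Hypothetical entries, for illustration only.
trail = []
append_record(trail, "refund_issued", "policy R-12 matched", "agent:support-bot", "2025-01-01T09:00:00Z")
append_record(trail, "refund_approved", "amount above threshold", "user:ops-lead", "2025-01-01T09:05:00Z")
```

In practice the records would be written to storage you control, for example one JSON object per line, alongside whatever the vendor keeps.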

The prompt and policy library.

The prompts that define how your agents reason, the policies that govern what they can do, and the tone they apply are differentiation. They are also the place where the most valuable iteration happens during the first year of operation. Keeping these in a vendor-specific format that cannot be migrated tends to look fine until the day you want to move, at which point the migration cost can erase the speed gain that motivated the original purchase.

What you should rent.

Model access.

The frontier model itself is the clearest example of a layer to rent. The pace of improvement on foundation models is well beyond what any internal team can match, and the economics of training a competitive model do not work for almost any enterprise we have advised. Use the best model the market offers, design for swappability, and update the evaluation suite to detect regressions when you change provider.
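Designing for a provider swap can be as simple as treating every provider as the same kind of callable and diffing evaluation results between the two. This is a sketch under stated assumptions: the `Provider` alias, `find_regressions` and both stub providers are hypothetical, and a real check would call the vendors' own clients behind the same interface.

```python
from typing import Callable

# Assumption: every provider is wrapped as a function from prompt to text.
Provider = Callable[[str], str]
# A case is (case_id, prompt, substring expected in a good answer).
Case = tuple[str, str, str]


def passes(provider: Provider, case: Case) -> bool:
    """True if the provider's answer contains the expected substring."""
    _, prompt, expected = case
    return expected.lower() in provider(prompt).lower()


def find_regressions(current: Provider, candidate: Provider, cases: list[Case]) -> list[str]:
    """Ids of cases that pass on the current provider but fail on the candidate."""
    return [
        case[0]
        for case in cases
        if passes(current, case) and not passes(candidate, case)
    ]


# Stub providers standing in for real model clients, for illustration only.
def current_provider(prompt: str) -> str:
    answers = {"What is 2 + 2?": "The answer is 4.", "Capital of France?": "Paris."}
    return answers.get(prompt, "")


def candidate_provider(prompt: str) -> str:
    answers = {"What is 2 + 2?": "The answer is 5.", "Capital of France?": "The capital is Paris."}
    return answers.get(prompt, "")


CASES = [
    ("arithmetic", "What is 2 + 2?", "4"),
    ("geography", "Capital of France?", "Paris"),
]

regressions = find_regressions(current_provider, candidate_provider, CASES)
```

An empty list is the signal that the swap is safe on the cases you own. Anything else names the cases to investigate before changing provider.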

Orchestration plumbing.

The frameworks that wire models, tools, retrieval and memory together are mature enough that building your own is rarely defensible. Open source options are credible, commercial options are well supported, and the engineering capital saved by renting this layer is better spent on the evaluation suite and the audit trail. The teams we work with who built their own orchestration usually regret it within a year as the open source ecosystem catches up and overtakes them.

Observability tooling.

Tracing, latency monitoring, cost tracking and drift detection for AI systems are a fast-moving category with credible specialist vendors. Building this in-house is expensive and tends to produce a tool that lags the market. Rent it, integrate it deeply, and own the dashboards that surface the signals to your operators.

Vector storage and retrieval.

Managed vector databases and retrieval services have matured enough that the operational case for self-hosting is narrow. Where it remains, it is usually for regulatory or sovereignty reasons rather than for performance. For most clients we work with, renting this layer is the right call, provided the index can be exported and rebuilt on a different provider without losing the underlying corpus.

The lock-in that clients regret is rarely the lock-in they negotiated. It is the lock-in they did not, because the deal was framed as build vs buy when it was really about optionality.

The decision matrix.

When the conversation needs to be settled quickly, the matrix we use scores each capability on two axes. Criticality, meaning how much your operating model depends on this capability working precisely the way you need it to. Differentiation, meaning how much owning this capability separates you from competitors who could otherwise buy the same outputs from the same vendors.

High criticality and high differentiation puts the capability in the build column. The domain ontology and the evaluation suite usually land here. Low criticality and low differentiation puts it firmly in the rent column. Vector storage and observability tooling usually land here. The interesting cases are the two mixed cells: high criticality and low differentiation, or low criticality and high differentiation. These are the cases where most disputes happen and where most procurement mistakes are made. The matrix forces the room to choose explicitly rather than drift.
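The two-axis scoring can be sketched as a small function. The 1 to 5 scale, the threshold of 3, the `classify` name and the example scores are all assumptions made for illustration. The text above defines only the axes and the resulting columns.

```python
def classify(criticality: int, differentiation: int, threshold: int = 3) -> str:
    """Place a capability in a column from two scores on an assumed 1-5 scale.

    A score strictly above the threshold counts as high.
    """
    for score in (criticality, differentiation):
        if not 1 <= score <= 5:
            raise ValueError("scores must be between 1 and 5")
    high_criticality = criticality > threshold
    high_differentiation = differentiation > threshold
    if high_criticality and high_differentiation:
        return "build"
    if not high_criticality and not high_differentiation:
        return "rent"
    # Mixed cells: the matrix does not decide, it forces an explicit choice.
    return "decide explicitly"


# Hypothetical scores as (criticality, differentiation), for illustration only.
scores = {
    "domain ontology": (5, 5),
    "vector storage": (2, 1),
    "hypothetical mixed capability": (5, 2),
}

placements = {name: classify(*pair) for name, pair in scores.items()}
```

Returning a third label for the mixed cells, rather than forcing them into a column, mirrors the point that these are the cases to argue out in the room.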

Buying speed without losing optionality.

The strongest reason to buy is speed. A platform that takes you from zero to a working capability in weeks is genuinely valuable, and we have no quarrel with clients who choose to buy aggressively to get into production. The conversation we insist on having is about what happens at month eighteen. Can the prompts be exported. Can the evaluation cases be migrated. Can the audit trail be retained when the contract ends. Can the model provider be changed without a full rebuild.

If the answer to all four questions is yes, buying is a sound strategy. If the answer to any of them is no, the buyer is taking on a different kind of risk than they probably realise. The discount the vendor offered to win the deal tends to be a small fraction of the migration cost the client pays later. We have walked clients through that calculation more than once and the conclusion has been consistent.
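The four questions lend themselves to a checklist that can be filled in per vendor before signing. A minimal sketch, assuming one boolean per question. The `ExitReadiness` class, its field names and the example vendor are hypothetical.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class ExitReadiness:
    """One yes/no answer per month-eighteen question."""
    prompts_exportable: bool
    eval_cases_migratable: bool
    audit_trail_retained: bool
    model_provider_swappable: bool


def exit_gaps(readiness: ExitReadiness) -> list[str]:
    """Names of the questions answered 'no', in the order they were asked."""
    return [f.name for f in fields(readiness) if not getattr(readiness, f.name)]


# Hypothetical vendor assessment, for illustration only.
vendor = ExitReadiness(
    prompts_exportable=True,
    eval_cases_migratable=True,
    audit_trail_retained=False,
    model_provider_swappable=True,
)

gaps = exit_gaps(vendor)
```

An empty list corresponds to four yes answers. Any entry names the risk that needs pricing or negotiating before the contract is signed.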

If your team is approaching a build vs buy decision and the conversation is starting to polarise, the question worth opening with is not which side is right. It is which capabilities, exactly, sit on which side of the line, and what optionality you are buying or selling in each direction. The Tech Factory and Consulting teams in our Paris, Dubai, Singapore and Bali offices are reachable from the brief form below.

Frequently asked questions.

When does build win the argument?

When the capability is both critical to the operating model and a source of differentiation. The domain ontology, the evaluation pipeline, the prompt library and the audit trail almost always fall here for clients we work with.

When does buy win the argument?

When the market moves faster than any internal team can credibly match, and when the capability is undifferentiated infrastructure. Foundation models, orchestration frameworks, observability tooling and managed vector stores typically belong here.

How do we avoid vendor lock-in?

Design for exit before signing. Insist on export formats for prompts, evaluation cases and audit trails. Keep the model layer swappable with a fallback that has been exercised, not just documented. The lock-in clients regret is usually the one they did not negotiate when they had leverage.

How do we keep build cost honest?

Track the total cost, including operations, on-call rotation and the eventual rewrite. Most internal builds we audit looked cheap at the proof-of-concept stage and had overtaken the cost of a commercial alternative by the second year. The discipline is to estimate three-year cost, not first-year cost.
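The three-year comparison is simple cumulative arithmetic. In the sketch below, every figure is invented purely to show the shape of the calculation, and `crossover_year` is a hypothetical helper. Substitute your own yearly estimates, including operations and on-call.

```python
from itertools import accumulate
from typing import Optional


def crossover_year(build_costs: list[int], buy_costs: list[int]) -> Optional[int]:
    """First year (1-indexed) in which cumulative build cost exceeds cumulative buy cost.

    Returns None if build never overtakes buy within the horizon.
    """
    if len(build_costs) != len(buy_costs):
        raise ValueError("both cost series must cover the same number of years")
    running = zip(accumulate(build_costs), accumulate(buy_costs))
    for year, (build_total, buy_total) in enumerate(running, start=1):
        if build_total > buy_total:
            return year
    return None


# Invented yearly figures, for illustration only.
build = [100_000, 450_000, 400_000]  # cheap proof of concept, then operations and rewrite
buy = [250_000, 250_000, 250_000]    # flat licence and integration cost

year = crossover_year(build, buy)
three_year_totals = (sum(build), sum(buy))
```

With these invented numbers the build is cheaper in year one, overtakes the commercial option in year two, and ends the three-year horizon 200,000 more expensive.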

How does this change for regulated industries?

The bias shifts towards owning the audit trail, the prompt library and the evaluation suite, and sometimes the orchestration layer too. The model and observability tooling can still be rented, provided the vendor is contractually bound to meet the data residency and audit requirements.

What about open-source orchestration frameworks?

They are credible for most use cases and the right choice when the team has the engineering capability to extend them. The mistake to avoid is treating open source as free. The operational cost is real, and the comparison with a commercial framework should include it.

Brief us on your bet.

Tell us about the engine you need first: strategy or Tech Factory. We answer within one working day.
