Crashing an AI Promo Event: What to Ask Before Buying Into an AI Agent Platform

AI · Opinion · Evals · Hating

November 9, 2025

8 min read

Chronology

  1. Signed up for the Dust x Paatch "Agentic AI" event. Showed up in person.

  2. Sat through a long-winded pitch session that completely glossed over important questions about vendor lock-in, data sovereignty, and other concerns that matter for enterprise agentic AI systems.

  3. Read Dust's docs. Realized they don't answer these questions there either. Started questioning if this was anything beyond a UI wrapper with integrations—stuff you can build in-house in a week (I know because I did).

  4. Realized that some of the people around me might not have much hands-on exposure to agents, and might not be asking the crucial questions a closed-source agentic AI platform deserves.

  5. Had an idea: use their own platform against them. All agents created during the event are shared within the attendee group, so anyone can find mine if they want to :). [Image: creating the Dust Buster agent during the talk]

  6. Prompted Claude to draft a system prompt for the Dust Buster agent: "a comprehensive and critical technology researcher specializing in helping people make informed decisions about AI agent platforms".

  7. Gave it a system prompt loaded with every uncomfortable question companies should ask before committing to ANY vendor.

  8. Published it to the shared hackathon workspace so attendees could find it. [Image: the Dust Buster agent live on their platform, ready to be discovered]

  9. Left the premises... SIKE. Sat down, enjoyed pizza, drank Coke. [Image: moments before publishing to the shared workspace]

  10. They noticed and approached, attempting to answer some of my questions. No success, low-confidence responses. Not good.

  11. You know that when you hear "Oh, that's interesting," it means the solutions architect or forward-deployed engineer has never given it a thought. [Image: mission accomplished, another company saved from vendor lock-in]

  12. Went home early, disappointed.

The agent asked the questions their sales deck ignored. Here's the condensed version of what actually matters.


The Questions That Matter

Evaluation:

  • "How do you know if your prompt changes actually improve anything, or are you just guessing?"
  • "How does Dust help you determine if your new prompt is actually better than your old one, or are you just guessing?"
  • "What evaluation framework does Dust provide to measure agent performance objectively?"
  • "Can you run A/B tests on different prompt versions in Dust, or are you doing trial-and-error in production?"
  • "Does Dust have version control on prompt edits, how can you monitor and prevent model regression?"

If a vendor can't clearly explain how you'll measure success and prevent regressions, they're out!
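For reference, the bar here isn't high. A minimal eval harness is just a fixed test set plus a score you can compare across prompt versions. Here's a sketch; `run_agent` and the keyword check are hypothetical placeholders, not anything Dust provides:

```python
# Minimal prompt-eval harness sketch. `run_agent` and the keyword
# grading criterion are hypothetical placeholders; swap in your own
# agent call and scoring logic (real evals often use LLM judges).
from dataclasses import dataclass

@dataclass
class Case:
    input: str
    expected_keyword: str  # naive pass/fail criterion for the sketch

def run_agent(system_prompt: str, user_input: str) -> str:
    """Placeholder: call your agent with the given prompt here."""
    raise NotImplementedError

def score(prompt: str, cases: list[Case]) -> float:
    hits = sum(
        case.expected_keyword.lower() in run_agent(prompt, case.input).lower()
        for case in cases
    )
    return hits / len(cases)

def compare(prompt_old: str, prompt_new: str, cases: list[Case]) -> None:
    old, new = score(prompt_old, cases), score(prompt_new, cases)
    verdict = "ship it" if new > old else "regression, keep the old one"
    print(f"old: {old:.0%}  new: {new:.0%}  -> {verdict}")
```

If a platform can't give you at least this, you're doing trial-and-error in production.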

Vendor Lock-In:

  • Can you export everything if they 5x the price?
  • What's your exit strategy if they get acquired or shut down?

Data Sovereignty:

  • Where exactly is your data stored and who has access?
  • Has your security team reviewed their SOC 2 reports and DPAs?
  • Do they use your data to train models?

Cost Reality:

  • What are you paying over 1, 3, or 5 years? They didn't mention cost once during the entire event... worrying.

Technical Control:

  • Can you modify agents beyond their UI constraints?
  • Are you locked into their LLM provider choices?
  • Can you run this on-premise or in your own cloud?

Strategic:

  • Is this faster than 1-2 engineers building exactly what you need in 4-6 weeks?
  • Are you building a competitive advantage, or renting someone else's?

What to Use Instead

This is where it gets boring. If you don't care about agent SDKs, dev kits, and the like, it's a good time to leave. Thank you for reading up to this point!

If you're evaluating AI agent platforms, look for the simplest solution to your problem first, and only increase complexity when necessary.

Here are your options:

The Big Three:

Claude Agent SDK:

Good in that it offers:

  • Agentic search - big boy term for letting the model run shell commands like grep and tail
  • Semantic search - big boy term for chunking data, embedding it into a vector DB, and querying by vector similarity; usually faster but less accurate
  • Subagents - like Python processes: used for parallelization, each with its own isolated context window (memory). They send only the relevant results back to the orchestrator agent.
  • Context maintenance - probably the most difficult thing to implement yourself. When agents run for long periods the context grows, as you may have seen in Cursor or another AI IDE where the context percentage climbs as a chat gets longer. The Claude Agent SDK automatically summarizes previous messages as the context limit approaches so the agent doesn't run out of room. Pretty useful (see the sketch after this list).
  • Tools - both prebuilt and custom tools that typically involve creating and exposing an MCP server to your agent through the Claude Code interface.
  • Fully customisable/extendable - SDK that you can just install and run on your own infrastructure. Full control over everything.
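To give a feel for that context-maintenance point, here's a rough sketch of the compaction trick if you had to roll it yourself. The `summarize` call is a placeholder for an LLM request and the token estimate is a crude heuristic; this is not the SDK's actual implementation:

```python
# Sketch of context compaction: when the conversation nears the
# context limit, fold the oldest turns into a single summary message.
def estimate_tokens(messages: list[dict]) -> int:
    # Crude heuristic (~4 characters per token); real SDKs count properly.
    return sum(len(m["content"]) for m in messages) // 4

def summarize(messages: list[dict]) -> str:
    """Placeholder: ask a model to compress these turns into a summary."""
    raise NotImplementedError

def compact(messages: list[dict], limit: int = 100_000,
            keep_recent: int = 10) -> list[dict]:
    if estimate_tokens(messages) < limit or len(messages) <= keep_recent:
        return messages
    old, recent = messages[:-keep_recent], messages[-keep_recent:]
    summary = {
        "role": "user",
        "content": f"[Summary of earlier conversation]\n{summarize(old)}",
    }
    return [summary, *recent]
```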

Bad in that it's:

  • Not optimised for Python: the Python SDK is a thin wrapper around the Claude Code npm package.
  • So all core logic (agent loops, tool execution, prompt handling) is optimised for Node.
  • No native tool interface, only MCP servers (a minimal MCP server sketch follows this list).
  • No health checks.
  • You have to manage your own memory and infrastructure if you want to self-host (obviously).
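To be fair, the MCP-only tool interface isn't a huge burden. A custom tool server is a few lines with the official Python MCP SDK; the `word_count` tool here is a toy example:

```python
# Minimal custom-tool MCP server using the official `mcp` Python SDK
# (pip install mcp). An agent connects to this server and can call
# `word_count` as a tool.
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("my-tools")

@mcp.tool()
def word_count(text: str) -> int:
    """Count the words in a piece of text."""
    return len(text.split())

if __name__ == "__main__":
    mcp.run()  # serves over stdio by default
```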

More reading: Early stage agents: Anthropic Guide

Google Agent Development Kit (ADK): The most "enterprise-ready" of the bunch.

  • Model agnostic and deployment agnostic.
  • Prebuilt tooling options + custom + easily integrate third party.
  • Easy to containerise and deploy agents on your own, with a managed Vertex AI Agent Engine option available too.
  • Easily build multi-agent systems (orchestrators, sub-agents).
  • Comes with built-in observability, evaluation frameworks, and state management.
  • Offers a visual developer UI for debugging agent traces—something the others sorely lack.
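For a sense of the developer experience, this is roughly what an ADK agent definition looks like, going from the quickstart. Plain Python functions get wrapped as tools, and the model name here is illustrative:

```python
# Rough shape of an ADK agent (pip install google-adk). Plain Python
# functions are registered as tools; `adk web` serves the visual
# debug UI mentioned above.
from google.adk.agents import Agent

def lookup_order(order_id: str) -> dict:
    """Toy tool: return the status of an order."""
    return {"order_id": order_id, "status": "shipped"}

root_agent = Agent(
    name="support_agent",
    model="gemini-2.0-flash",  # illustrative; check current model names
    instruction="Help users with order questions. Use tools when needed.",
    tools=[lookup_order],
)
```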

OpenAI AgentKit: Product-first kit. Focus is to help you build a product and ship it within the OpenAI ecosystem.

  • Full stack: Provides the runtime, memory management, and tool execution.
  • ChatKit Integration: Designed to ship conversational agents quickly, basically plugging ChatGPT wherever you want it.
  • Heavily tied to OpenAI's infrastructure and models.
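The code-level counterpart here is the OpenAI Agents SDK (`openai-agents` on PyPI), and the hello-world really is this short, provided you're happy living on OpenAI's models and keys:

```python
# Minimal agent with the OpenAI Agents SDK (pip install openai-agents).
# Requires OPENAI_API_KEY; that dependency is exactly the lock-in point.
from agents import Agent, Runner

agent = Agent(
    name="assistant",
    instructions="You are a concise assistant.",
)

result = Runner.run_sync(agent, "Summarise why vendor lock-in matters.")
print(result.final_output)
```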

Open Source Frameworks (For Builders):

  • PydanticAI: Type-safe, modern Python, great DX (by the Pydantic team). The best choice imo if you want to write code (see the sketch after this list).
  • CrewAI: Good for structured multi-agent workflows, but can be rigid.
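A taste of PydanticAI's pitch: typed output straight into a Pydantic model. This sketch assumes a recent version (older releases used `result_type` and `.data` instead of `output_type` and `.output`), and the model string is illustrative:

```python
# PydanticAI sketch: the agent's answer is validated into a typed model.
from pydantic import BaseModel
from pydantic_ai import Agent

class Verdict(BaseModel):
    platform: str
    locked_in: bool
    reasoning: str

agent = Agent(
    "openai:gpt-4o",  # illustrative model string
    output_type=Verdict,
    system_prompt="Assess AI platforms for vendor lock-in risk.",
)

result = agent.run_sync("Closed-source platform, no data export. Verdict?")
print(result.output)
```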

Open Source Platforms (The Real Dust Alternatives): If you want the "Platform" experience (UI, RAG, Integrations) but want to own your data:

  • Dify.AI: The closest open-source equivalent to Dust.

    • Full RAG pipeline and LLM orchestration.
    • Visual builder + API access.
    • Self-hostable (Docker).
    • Use it if you want Dust's features without the Dust lock-in.
  • LangFlow: Visual drag-and-drop builder, great for prototyping flows.

Avoid:

  • LangChain: Massive ecosystem, but lowkey trash and bloated.

Read this: Why we no longer use LangChain

Or Just Build Your Own: Like I did. Took less than two weeks. Complete control. Zero per-user fees. Exact features I need. No vendor drama.
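If that sounds like bravado: the core of a self-built agent is a loop around one API call plus a tool dispatch table. A stripped-down sketch with the Anthropic Python client (the tool is a toy, error handling omitted, model ID illustrative):

```python
# Core of a self-built agent: loop until the model stops asking for tools.
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY

TOOLS = [{
    "name": "word_count",
    "description": "Count the words in a piece of text.",
    "input_schema": {
        "type": "object",
        "properties": {"text": {"type": "string"}},
        "required": ["text"],
    },
}]

def word_count(text: str) -> int:
    return len(text.split())

def run(user_input: str) -> str:
    messages = [{"role": "user", "content": user_input}]
    while True:
        resp = client.messages.create(
            model="claude-sonnet-4-20250514",  # illustrative model ID
            max_tokens=1024,
            tools=TOOLS,
            messages=messages,
        )
        if resp.stop_reason != "tool_use":
            return "".join(b.text for b in resp.content if b.type == "text")
        # Echo the assistant turn back, then answer each tool call.
        messages.append({"role": "assistant", "content": resp.content})
        messages.append({"role": "user", "content": [
            {
                "type": "tool_result",
                "tool_use_id": block.id,
                # One tool here; real code dispatches on block.name.
                "content": str(word_count(**block.input)),
            }
            for block in resp.content if block.type == "tool_use"
        ]})
```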


The Bottom Line

I'm not saying don't use Dust if you're a non-technical person who wants to waste your money and ensure none of your agents are portable (blog post on context portability soon). Just ask the right questions first.

Red flags for ANY AI SaaS platform:

  1. Vague "AI-powered" marketing
  2. Opaque pricing that scales with users
  3. Limited customization options
  4. Unclear data residency and access controls
  5. Difficult or impossible data export
  6. No evaluation framework for measuring agent performance
  7. No clear approach to testing and version control

The evaluation question alone should give you pause. If they can't clearly explain how you'll measure success and prevent regressions, that's a problem.

A Note on "Enterprise Moats"

To be fair to Dust, they do offer things that are annoying to build yourself: SOC 2 compliance, managed data connectors (Notion, Slack, etc.), and user management.

If you are a large enterprise, paying for these "boring" infrastructure pieces might make sense. But don't confuse infrastructure with intelligence.

  • When building it yourself you own the data, so compliance is often simpler (it stays in your VPC). You can use open-source libraries for data connectors.
  • You are paying Dust a premium for the convenience of not managing these pipes.

The only real pain point to build yourself is user management. Handling permissions and sharing agents across teams is tedious. If you don't want to build an auth system, Dust solves that. But is that worth the lock-in? Probably not.


Disclaimer: This post reflects my personal experience and opinions from attending a public promotional event. I encourage readers to do their own research and evaluation.

The full Dust Buster system prompt is available if anyone wants it. I've written it to ask every uncomfortable question before you lock into a vendor.